Ben Lorica, O’Reilly’s chief data scientist, has posted slides and notes from his talk at last December’s Strata Data Conference in Singapore, “We need to build machine learning tools to augment machine learning engineers.”
Lorica describes a new job emerging in IT departments: “machine learning engineers,” whose job is to adapt machine learning models for production environments. These new engineers run the risk of embedding algorithmic bias into their systems, which unfairly discriminates, creates liability, and reduces the quality of the recommendations the systems produce.
He presents a set of technical and procedural steps to take to minimize these risks, with links to the relevant papers and code. It’s really required reading for anyone implementing a machine learning system in a production environment.
Another example has to do with error: once we are satisfied with a certain error rate, aren’t we done and ready to deploy our model to production? Consider a scenario where you have a machine learning model used in health care: in the course of model building, your training data for millennials (in red) is quite large compared to the number of labeled examples from senior citizens (in blue). Since accuracy tends to be correlated with the size of your training set, chances are the error rate for senior citizens will be higher than for millennials.
For situations like this, a group of researchers introduced a concept called “equal opportunity” that can help alleviate disproportionate error rates and ensure that the “true positive rates” for the two groups are similar. See their paper and accompanying interactive visualization.
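To make the "true positive rate" comparison concrete, here is a minimal sketch in Python. The function names, the toy labels, and the group names are all made up for illustration; none of them come from Lorica's talk or the researchers' paper. The sketch only measures the gap between groups and does not show how the researchers correct it.

```python
from collections import defaultdict


def true_positive_rates(y_true, y_pred, groups):
    """Return {group: TPR}, where TPR = true positives / actual positives."""
    positives = defaultdict(int)  # actual positives seen per group
    hits = defaultdict(int)       # actual positives predicted positive per group
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            if pred == 1:
                hits[group] += 1
    # Groups with no actual positives are omitted: their TPR is undefined.
    return {g: hits[g] / positives[g] for g in positives}


def equal_opportunity_gap(y_true, y_pred, groups):
    """Largest difference in TPR between any two groups (0.0 means equal)."""
    rates = true_positive_rates(y_true, y_pred, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: 1 = condition present / predicted present.
y_true = [1, 1, 1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0]
groups = ["millennial"] * 6 + ["senior"] * 4

rates = true_positive_rates(y_true, y_pred, groups)
gap = equal_opportunity_gap(y_true, y_pred, groups)
print(rates)
print(gap)
```

On this toy data the model catches three of four actual positives among millennials but only one of two among seniors, so a single overall error rate would hide the disparity that a per-group check exposes.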
We need to build machine learning tools to augment machine learning engineers [Ben Lorica/O’Reilly]
(via 4 Short Links)