Research

Interests: explainable AI, NLP, data privacy, and climate tech.

Select projects

Uncertainty Estimation for Boosted Regression Trees

Gradient-boosted regression trees (GBRTs) are hugely popular for solving tabular regression problems, such as weather forecasting and clinical mortality prediction. However, GBRTs produce only point predictions, with no estimate of the uncertainty in each prediction. To quantify that uncertainty and increase the interpretability of GBRT predictions, we developed IBUG (Instance-Based Uncertainty estimation for Gradient-boosted regression trees), a simple method for extending any GBRT point predictor to produce probabilistic predictions. IBUG computes a non-parametric distribution around a prediction using the k-nearest training instances, where distance is measured with a tree-ensemble kernel. Despite its simplicity, IBUG achieves performance similar to or better than the previous state of the art. [paper] [code]
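To make the idea concrete, here is a minimal sketch of IBUG-style uncertainty estimation using scikit-learn. Counting shared leaves is one simple instantiation of a tree-ensemble kernel, and names like ibug_predict are illustrative rather than taken from our released code:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
gbrt = GradientBoostingRegressor(random_state=0).fit(X, y)

def ibug_predict(gbrt, X_train, y_train, x, k=20):
    """Return a (mean, std) predictive distribution for one test instance."""
    # Leaf index of every instance in every tree of the ensemble.
    train_leaves = gbrt.apply(X_train).reshape(len(X_train), -1)
    test_leaves = gbrt.apply(x.reshape(1, -1)).reshape(1, -1)
    # Tree-ensemble kernel (one simple choice): affinity = number of trees
    # in which a training instance lands in the same leaf as the test instance.
    affinity = (train_leaves == test_leaves).sum(axis=1)
    neighbors = y_train[np.argsort(affinity)[-k:]]  # k highest-affinity instances
    # Keep the GBRT point prediction as the mean; use the neighbors' spread
    # to quantify uncertainty.
    return gbrt.predict(x.reshape(1, -1))[0], neighbors.std()

mean, std = ibug_predict(gbrt, X, y, X[0])
print(f"prediction: {mean:.2f} +/- {std:.2f}")
```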

Reverse Engineering Adversarially Constructed Text

Methods for automatically attacking NLP models are proliferating. These methods perturb the input text in subtle ways to manipulate the model's predicted output. For example, strategically adding, removing, or replacing a single word or character in a hateful message can flip a hate-speech classifier's prediction on that message from toxic to non-toxic without altering the original meaning (i.e., the message remains toxic). We aim to automatically detect these kinds of attacks and, further, to identify which method was used to perturb the input. To do so, we use a rich set of features that capture surface-level and linguistic characteristics of the perturbed text, as well as characteristics of the victim model; together, these features provide an attack signature that can help uniquely identify existing and potentially new, unseen attacks. [paper1] [paper2] [paper3]
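As a loose illustration of the attribution setup (not the feature set from our papers), the sketch below featurizes perturbed text with a few toy surface-level statistics and trains a classifier to predict which hypothetical attack produced each input:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def surface_features(text):
    """A few toy surface-level features; stand-ins for the full attack signature."""
    words = text.split() or [""]
    return [
        len(text),                                           # character count
        len(words),                                          # word count
        sum(not c.isascii() for c in text),                  # non-ASCII characters
        sum(c.isupper() for c in text) / max(len(text), 1),  # capitalization rate
        float(np.mean([len(w) for w in words])),             # mean word length
    ]

# Perturbed inputs labeled with the attack that produced them (toy data).
texts = ["s0me h4teful text", "some hatefull texd", "\u0455ome hateful text"]
labels = ["char-substitution", "char-swap", "homoglyph"]
clf = RandomForestClassifier(random_state=0)
clf.fit([surface_features(t) for t in texts], labels)
print(clf.predict([surface_features("an0ther p3rturbed message")]))
```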

Explaining Model Predictions via the Training Data

The field of explainable AI (XAI) has grown rapidly alongside the widespread adoption of machine-learning models across many domains and applications. Gradient-boosted decision trees are a powerful class of models that typically outperform other model classes on tabular data and are regularly used to win competitions on websites such as Kaggle and DrivenData. Unfortunately, these highly predictive "black-box" models are often quite complex, making their decision-making processes opaque. To better understand their behavior, we use influence estimation, a general approach that traces a given prediction back through the model to the training examples most responsible for that prediction. To this end, we have adapted several influence-estimation techniques originally designed for deep-learning models to gradient-boosted decision trees. [paper] [code]
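To make the notion of influence concrete, the brute-force baseline below retrains a GBRT once per training example and measures how much each removal shifts a test prediction. The techniques we adapt in the paper approximate this far more efficiently, so this sketch is purely illustrative:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=100, n_features=5, noise=5.0, random_state=0)
x_test = X[:1]  # explain the prediction for this instance

base_pred = GradientBoostingRegressor(random_state=0).fit(X, y).predict(x_test)[0]

influence = np.empty(len(X))
for i in range(len(X)):
    mask = np.arange(len(X)) != i  # drop training example i
    loo = GradientBoostingRegressor(random_state=0).fit(X[mask], y[mask])
    # Influence of example i: how much its removal changes the test prediction.
    influence[i] = base_pred - loo.predict(x_test)[0]

# The training examples most responsible for this prediction.
print(np.argsort(np.abs(influence))[::-1][:5])
```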

Efficiently Removing Data with Machine Unlearning

The GDPR's "Right to be Forgotten" gives users the right to have their data removed from a company's systems upon request. This ordinance also applies to machine-learning models, which essentially contain a "memory" of the training examples stored in their learnt representations. With increasing dataset sizes and model complexities, retraining a model after every deletion request is often intractable. Machine unlearning aims to remove specific training examples from a learnt model without retraining from scratch, saving valuable time and compute. To this end, we developed a machine-unlearning approach for random-forest models that removes examples orders of magnitude faster than retraining from scratch while sacrificing little to no predictive power. [paper] [code]
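As a loose illustration of why unlearning can be much cheaper than full retraining (a toy sketch, not our published method), the bagged forest below tracks each tree's bootstrap sample and, on deletion, retrains only the trees that actually saw the deleted example:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class UnlearnableForest:
    """Toy bagged forest that retrains only the trees affected by a deletion."""

    def __init__(self, n_trees=25, seed=0):
        self.n_trees = n_trees
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        self.X, self.y = np.asarray(X), np.asarray(y)
        n = len(self.X)
        # Remember each tree's bootstrap indices: who saw which examples.
        self.samples = [self.rng.integers(0, n, n) for _ in range(self.n_trees)]
        self.trees = [DecisionTreeClassifier(random_state=0).fit(self.X[s], self.y[s])
                      for s in self.samples]
        return self

    def delete(self, i):
        """Remove training example i, retraining only the trees that used it."""
        affected = [t for t, s in enumerate(self.samples) if i in s]
        self.X = np.delete(self.X, i, axis=0)
        self.y = np.delete(self.y, i)
        for t, s in enumerate(self.samples):
            s = s[s != i]                                # drop the deleted example
            self.samples[t] = np.where(s > i, s - 1, s)  # remap indices after removal
        for t in affected:
            self.trees[t] = DecisionTreeClassifier(random_state=0).fit(
                self.X[self.samples[t]], self.y[self.samples[t]])

forest = UnlearnableForest().fit(np.random.rand(200, 4), np.random.randint(0, 2, 200))
forest.delete(17)  # only trees whose bootstrap sample contained example 17 retrain
```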

Detecting Image and Video Manipulations

Image and video manipulations are increasingly common in the digital world and can be particularly harmful. For example, deepfakes have gained international attention by superimposing celebrity faces onto entertainers in adult films, potentially tarnishing the reputations of those celebrities; manipulated images and videos also add to the misinformation that circulates through social network feeds. Possible manipulations are numerous, including blurring, content-aware fill, and object removal; in response, an equally large number of specially designed forensics algorithms have been developed, each analyzing pixel data for tell-tale signs of a particular manipulation. In this international, multi-university collaboration, we developed methods for combining numerous forensics algorithms into a single model that better detects and localizes image and video manipulations.
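One simple way to combine black-box detectors is late fusion: stack each algorithm's manipulation score into a feature vector and learn a meta-classifier over it. The sketch below uses synthetic scores and is only schematic; our actual fusion models are more sophisticated and also localize manipulations:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_images, n_detectors = 200, 5
labels = rng.integers(0, 2, n_images)  # 1 = manipulated (synthetic ground truth)
# Synthetic detector scores: each forensics algorithm is a black box whose
# score is noisily correlated with the true label.
scores = 0.4 * labels[:, None] + 0.6 * rng.random((n_images, n_detectors))

# Meta-classifier over the stacked detector scores.
fusion = LogisticRegression().fit(scores, labels)
print(fusion.predict_proba(scores[:3])[:, 1])  # fused manipulation probabilities
```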

Jointly Classifying Social Network Spam

Spam is pervasive on social networks and takes many different forms, from fraudulent ad campaigns to hate speech to coordinated botnet messages. Machine learning offers an automated approach to detecting new spam content, but many machine-learning algorithms assume messages are independent of one another. Social networks, however, are rich with relational information that can help better detect spam. For example, a spammer may post the same spam message on one or multiple accounts, or be affiliated with other spammers. We make it easy to capture relationships like these using a statistical-relational learning algorithm that lets users write intuitive first-order logical rules, which are then instantiated as full probabilistic graphical models; joint inference then propagates information between related entities, helping to better detect spammers and their spam content. [paper] [code]
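For intuition, here is how a single grounded rule contributes to inference in hinge-loss frameworks such as Probabilistic Soft Logic; the rule, weight, and truth values below are hypothetical:

```python
# Soft truth values for atoms (toy data; these might come from a text classifier).
spammer = {"alice": 0.9, "bob": 0.1}      # belief that each account is a spammer
spam = {"m1": 0.3, "m2": 0.8, "m3": 0.2}  # current belief that each message is spam
posted = {("alice", "m1"), ("alice", "m2"), ("bob", "m3")}  # observed postings

# Rule: spammer(A) & posted(A, M) -> spam(M), with weight 2.0.
# Under Lukasiewicz semantics, each grounding whose posted(A, M) is observed
# true contributes a hinge-loss potential max(0, spammer(A) - spam(M)).
weight = 2.0
loss = sum(weight * max(0.0, spammer[a] - spam[m]) for a, m in posted)
print(loss)  # joint inference adjusts the spam(...) values to minimize this
```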