Adversarial Robustness Through Local Lipschitzness
Neural networks are highly susceptible to adversarial examples: small perturbations of natural inputs that cause a classifier to output the wrong label. The standard defense against adversarial examples is Adversarial Training, which trains the classifier on adversarial examples generated close to the training inputs. This improves test accuracy on adversarial examples, but it often lowers clean accuracy, sometimes substantially.
Several recent papers investigate whether an accuracy-robustness trade-off is necessary. Some pessimistic work suggests that it may be, possibly due to high dimensionality or computational infeasibility.
If a trade-off is unavoidable, then we face a dilemma: should we aim for higher accuracy, higher robustness, or somewhere in between? Our recent paper explores an optimistic perspective: we posit that robustness and accuracy should be attainable together for real image classification tasks.
The main idea is that we should use a locally smooth classifier, one that doesn’t change its value too quickly around the data. Let’s walk through some theory about why this is a good idea. Then, we will explore how to use this in practice.
The problem with natural training
One reason we see a trade-off between robustness and accuracy is the way networks are trained. The best neural network optimization methods lead to functions that change very rapidly, because this allows the network to closely fit the training data.
Since we care about robustness, we actually want the function to move as slowly as possible from class to class. This is especially true for separated data. Think about an image dataset. Cats look different from dogs, and pandas look different from gibbons. Quantitatively, different animals should be far apart (for example, in $\ell_\infty$ distance in pixel space).
So why do adversarial perturbations lead to a high error rate? This is a very active area of research, and there’s no easy answer. As a step towards a better understanding, we present theoretical results on achieving perfect accuracy and robustness by using a locally smooth function. We also explore how well this works in practice.
As a motivating example, consider a simple 2D binary classification dataset. The goal is to find a decision boundary that has 100% training accuracy without passing closely to any individual input. The orange curve in the following picture shows such a boundary. In contrast, the black curve comes very close to some data points. Even though both boundaries correctly classify all of the examples, the black curve is susceptible to adversarial examples, while the orange curve is not.
Perfect accuracy and robustness, at least in theory
We propose designing a classifier using the sign of a relatively smooth function. For separated data, this ensures that it’s impossible to change the label by slightly perturbing a true input. In other words, if the function value doesn’t change very quickly, then neither does the label.
More formally, we consider classifiers of the form $f(x) = \mathrm{sign}(g(x))$, where $g : \mathbb{R}^d \to \mathbb{R}$ is a real-valued function. We say that $g$ is $L$-locally Lipschitz at radius $r$ if, for every input $x$ and every $x'$ with $\|x - x'\| \le r$, we have $|g(x) - g(x')| \le L \, \|x - x'\|$.
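To spell out why a small local Lipschitz constant keeps the label stable, here is a one-line derivation in the notation above (the margin condition $g(x) > Lr$ is our simplifying assumption for this sketch): if $g$ is $L$-locally Lipschitz at radius $r$ around $x$, then for any $x'$ with $\|x - x'\| \le r$,

$$
g(x') \;\ge\; g(x) - |g(x) - g(x')| \;\ge\; g(x) - L\,\|x - x'\| \;\ge\; g(x) - Lr \;>\; 0,
$$

so $f(x') = \mathrm{sign}(g(x')) = \mathrm{sign}(g(x))$, and the symmetric argument covers the case $g(x) < -Lr$.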
Previous work by Hein and Andriushchenko has shown that local Lipschitzness indeed guarantees robustness. In fact, variants of Lipschitzness have been the main tool in certifying robustness with randomized smoothing as well. However, we are the first to identify a natural condition (data separation) that ensures both robustness and high test accuracy.
Our main theoretical result says that if the two classes are separated – in the sense that points from different classes are at distance at least $2r$ from each other – then there exists a classifier that is both perfectly accurate and robust to perturbations of size up to $r$, obtained as the sign of a locally Lipschitz function (with Lipschitz constant on the order of $1/r$).
For many real-world datasets, the separation assumption in fact holds.


For example, consider the CIFAR-10 and Restricted ImageNet datasets (for the latter, we removed a handful of images that appeared twice with different labels).
The figure shows the histogram of the $\ell_\infty$ distance from each training image to its nearest neighbor with a different label. For both datasets, these distances are much larger than the perturbation sizes commonly used in adversarial attacks, so the separation assumption holds with room to spare.
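As a rough illustration of how such a histogram can be computed, here is a brute-force numpy sketch (the function name `separation_histogram` and the toy data are our own; the actual experiments use the full image datasets and a faster nearest-neighbor search):

```python
import numpy as np

def separation_histogram(X, y):
    """For each example, the l_inf distance to the nearest example with a
    *different* label (brute force, O(n^2); fine for a toy illustration)."""
    X = X.reshape(len(X), -1).astype(np.float64)
    dists = np.empty(len(X))
    for i in range(len(X)):
        other = X[y != y[i]]                                     # differently-labeled points
        dists[i] = np.min(np.max(np.abs(other - X[i]), axis=1))  # nearest l_inf distance
    return dists

# Toy usage: two well-separated clusters in 10 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.05, (100, 10)),
               rng.normal(1.0, 0.05, (100, 10))])
y = np.array([0] * 100 + [1] * 100)
d = separation_histogram(X, y)
print("min separation:", d.min(), "median separation:", np.median(d))
```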
We basically use a scaled version of the 1-nearest neighbor classifier in the infinite sample limit. The proof just uses the data separation along with a few applications of the triangle inequality. The next figure shows our theorem in action on the Spiral dataset. The classifier labels every point correctly while keeping its decision boundary far from both spirals, so small perturbations cannot change any label.
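Here is a small numpy sketch of this construction (the names `X_pos`, `X_neg`, and the radius `r` are our own; this is the idealized classifier from the theorem, not a training procedure):

```python
import numpy as np

def smooth_nn_classifier(X_pos, X_neg, r):
    """g(x) = (d(x, X_neg) - d(x, X_pos)) / (2r), with l_inf distances.

    Each distance-to-a-set function is 1-Lipschitz, so g is (1/r)-locally
    Lipschitz. If every cross-class pair is at distance >= 2r, then g >= 1 on
    the positive class and g <= -1 on the negative class, so a perturbation of
    l_inf size at most r cannot push g below 0 at a positive point (and
    symmetrically for the negative class)."""
    def g(x):
        d_pos = np.min(np.max(np.abs(X_pos - x), axis=1))
        d_neg = np.min(np.max(np.abs(X_neg - x), axis=1))
        return (d_neg - d_pos) / (2.0 * r)
    return g

# Toy usage: two 2D clusters that are more than 2r apart in l_inf distance.
r = 0.2
X_pos = np.array([[0.0, 0.0], [0.1, 0.1]])
X_neg = np.array([[1.0, 1.0], [0.9, 1.1]])
g = smooth_nn_classifier(X_pos, X_neg, r)
x = X_pos[0]
print(g(x), g(x + r * np.array([1.0, -1.0])))  # both positive: the label is stable
```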

Encouraging the smoothness of neural networks
Now that we’ve made a big deal of local Lipschitzness, and provided some theory to back it up, we want to see how well this holds up in practice. Two questions drive our experiments:
- Is local Lipschitzness correlated with robustness and accuracy in practice?
- Which training methods produce locally Lipschitz functions?
We also need to explain how we measure Lipschitzness on real data. For simplicity, we consider the average local Lipschitzness over the test set, computed empirically as

$$
\frac{1}{n} \sum_{i=1}^{n} \; \max_{x_i' \in \mathbb{B}_\infty(x_i,\, \epsilon)} \; \frac{\| f(x_i) - f(x_i') \|_1}{\| x_i - x_i' \|_\infty}.
$$

The benefit of averaging is that we only need the function to be smooth on average around the data, even if there are a few outliers. In practice, the inner maximization is approximated with a PGD-style search, as in the sketch below.
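Here is a minimal PyTorch-style sketch of such an estimate for one batch, assuming a model with inputs in $[0,1]$ and soft (probability) outputs; the function name `empirical_lipschitz`, the step size, and the number of steps are our choices rather than the exact settings used in the experiments.

```python
import torch

def empirical_lipschitz(model, x, eps=8/255, alpha=2/255, steps=10):
    """Estimate (1/n) * sum_i max_{||x'-x_i||_inf <= eps} ||f(x_i)-f(x')||_1 / ||x_i-x'||_inf
    for one batch x, using a PGD-style search over x'. `model` maps inputs in
    [0, 1] to soft (probability) outputs."""
    model.eval()
    with torch.no_grad():
        fx = model(x)
    # Start from a random point inside the l_inf ball around x.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()

    def ratio(x_prime):
        num = (model(x_prime) - fx).abs().sum(dim=1)
        den = (x_prime - x).abs().flatten(1).max(dim=1).values + 1e-12
        return num / den

    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad = torch.autograd.grad(ratio(x_adv).sum(), x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()           # ascend the Lipschitz ratio
            x_adv = x + (x_adv - x).clamp(-eps, eps)      # project back into the eps-ball
            x_adv = x_adv.clamp(0, 1)
    with torch.no_grad():
        return ratio(x_adv).mean().item()
```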
One of the best methods for defending against adversarial examples is TRADES, which encourages local Lipschitzness by minimizing the following loss function:

$$
\mathbb{E}_{(x, y)}\Big[\, \mathcal{L}\big(f(x),\, y\big) \;+\; \beta \max_{x' \in \mathbb{B}(x,\, \epsilon)} \mathcal{L}\big(f(x),\, f(x')\big) \,\Big],
$$

where $\mathcal{L}$ is a classification loss (TRADES uses the KL divergence between the two output distributions for the second term) and $\beta$ controls the strength of the smoothness penalty.
TRADES is different from Adversarial Training (AT), which optimizes the following:

$$
\mathbb{E}_{(x, y)}\Big[\, \max_{x' \in \mathbb{B}(x,\, \epsilon)} \mathcal{L}\big(f(x'),\, y\big) \,\Big].
$$
AT directly minimizes the classification loss on adversarial examples, while TRADES encourages the output of the network to stay nearly constant in a small ball around each input – exactly a local Lipschitzness condition – with $\beta$ trading off clean accuracy against smoothness. A simplified sketch of both objectives appears below.
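To make the difference concrete, here is a simplified per-batch sketch of the two objectives in PyTorch. The helper `find_adv` is a stand-in for the inner maximization (a single FGSM step here, where the real methods use multi-step PGD), and the function names are our own rather than those of the reference implementations.

```python
import torch
import torch.nn.functional as F

def find_adv(model, x, y, eps):
    """Single FGSM step: a cheap stand-in for the multi-step PGD attack
    that AT and TRADES use to approximate the inner maximization."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return (x + eps * grad.sign()).clamp(0, 1).detach()

def at_loss(model, x, y, eps):
    """Adversarial Training: cross-entropy at a worst-case point near x."""
    x_adv = find_adv(model, x, y, eps)
    return F.cross_entropy(model(x_adv), y)

def trades_loss(model, x, y, eps, beta):
    """TRADES: clean cross-entropy plus beta times a smoothness term that
    compares the prediction at x with the prediction at a nearby worst-case point."""
    logits = model(x)
    x_adv = find_adv(model, x, y, eps)
    smooth = F.kl_div(F.log_softmax(model(x_adv), dim=1),
                      F.softmax(logits, dim=1),
                      reduction="batchmean")
    return F.cross_entropy(logits, y) + beta * smooth
```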
We also consider two other plausible methods for achieving accuracy and robustness, along with local Lipschitzness: Local Linear Regularization (LLR) and Gradient Regularization (GR).
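For reference, a common form of gradient regularization penalizes the squared norm of the input gradient of the loss, which is another way of encouraging the network to change slowly near the data. The following minimal sketch is our own simplified version with a hypothetical weight `lam`, not the exact regularizer used in the experiments:

```python
import torch
import torch.nn.functional as F

def gr_loss(model, x, y, lam=0.1):
    """Cross-entropy plus a penalty on the squared norm of the input gradient.
    A small input gradient means the loss changes slowly around x."""
    x = x.clone().detach().requires_grad_(True)
    ce = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(ce, x, create_graph=True)[0]  # keep graph for backprop
    penalty = grad.pow(2).flatten(1).sum(dim=1).mean()
    return ce + lam * penalty
```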
Comparing five different training methods
Here we provide experimental results for CIFAR-10 and Restricted ImageNet. In each table, the three TRADES rows correspond to increasing values of the regularization parameter $\beta$ (written $\beta_1 < \beta_2 < \beta_3$ below). See our paper for the other datasets (MNIST and SVHN) and for the exact hyperparameters.
CIFAR-10 | train accuracy (%) | test accuracy (%) | adv. test accuracy (%) | test Lipschitz constant |
---|---|---|---|---|
Natural | 100.00 | 93.81 | 0.00 | 425.71 |
GR | 94.90 | 80.74 | 21.32 | 28.53 |
LLR | 100.00 | 91.44 | 22.05 | 94.68 |
AT | 99.84 | 83.51 | 43.51 | 26.23 |
TRADES ($\beta_1$) | 99.76 | 84.96 | 43.66 | 28.01 |
TRADES ($\beta_2$) | 99.78 | 85.55 | 46.63 | 22.42 |
TRADES ($\beta_3$) | 98.93 | 84.46 | 48.58 | 13.05 |
Restricted ImageNet | train accuracy (%) | test accuracy (%) | adv. test accuracy (%) | test Lipschitz constant |
---|---|---|---|---|
Natural | 97.72 | 93.47 | 7.89 | 32228.51 |
GR | 91.12 | 88.51 | 62.14 | 886.75 |
LLR | 98.76 | 93.44 | 52.65 | 4795.66 |
AT | 96.22 | 90.33 | 82.25 | 287.97 |
TRADES ($\beta_1$) | 97.39 | 92.27 | 79.90 | 2144.66 |
TRADES ($\beta_2$) | 95.74 | 90.75 | 82.28 | 396.67 |
TRADES ($\beta_3$) | 93.34 | 88.92 | 82.13 | 200.90 |
For both datasets, we see a correlation between clean accuracy, Lipschitzness, and adversarial accuracy. For example, on CIFAR-10, TRADES with the largest regularization parameter achieves both the smallest empirical Lipschitz constant and the highest adversarial test accuracy.
Natural training has the lowest adversarial accuracy and also the highest Lipschitz constant. GR has fairly low training accuracy (possibly due to underfitting).
For LLR, AT, and TRADES, we see that smoother classifiers also have higher adversarial test accuracy. However, this is only true up to a point. Increased local Lipschitzness helps, but with very high smoothness requirements the networks start underfitting, which leads to a loss in accuracy; for example, with the largest TRADES parameter on Restricted ImageNet, the Lipschitz constant keeps shrinking while both clean and adversarial test accuracy dip slightly.
Robustness requires some local Lipschitzness
Our experimental results provide many insights into the role that Lipschitzness plays in classifier accuracy and robustness.
- A clear takeaway is that very high Lipschitz constants imply that the classifier is vulnerable to adversarial examples. We see this most clearly with natural training, but it is also evidenced by GR and LLR.
- For both CIFAR-10 and Restricted ImageNet, the experiments show that minimizing the Lipschitz constant goes hand in hand with maximizing the adversarial accuracy. This highlights that Lipschitzness is just as important as training with adversarial examples when it comes to improving adversarial robustness.
- TRADES leads to significantly smaller Lipschitz constants than most other methods, and the smoothness increases with the TRADES parameter $\beta$. However, the correlation between smoothness and robustness suffers from diminishing returns: it is not optimal to minimize the Lipschitz constant as much as possible.
- The main downside of AT and TRADES is that the clean accuracy suffers. This issue may not be inherent to robustness; rather, it may be possible to achieve the best of both worlds. For example, LLR is consistently more robust than natural training, while simultaneously achieving state-of-the-art clean test accuracy. This leaves open the possibility of combining the benefits of LLR and AT/TRADES into a classifier that does well across the board. This is the main direction for future work!
More Details
See our paper on arXiv or our code repository.