Continuity
This article assumes familiarity with the ideas of real-valued functions defined on the real line and distances between real numbers at a late high school or early undergraduate level. It is aimed at a student who has just been introduced to the \(\epsilon - \delta\) definition of continuity and is intended to help provide motivation for it. When I first encountered this definition, I was encouraged to generalise it as far as I could. I quite enjoyed that feeling, and so I have attempted to share a taste of it with this article. This is my entry for SoME4, held in 2025.
Motivation
So, we’ve just discovered functions. A natural question is to wonder if we can classify these mathematical objects in some (useful) way. Since, at this stage, we do not really know much about functions, let us begin by looking at some examples of functions. Perhaps this will highlight some special property that only a few functions seem to have, which we can then use to classify functions. To keep things simple, we restrict ourselves to functions from \([0, 1]\) to \([0, 1]\). This will let us draw the complete graphs of the functions.
Graphs of some functions
Instructions: Click and drag to sketch function graphs on the canvas above. Some example functions to sketch:
- \(f_1(x)=x\)
- \(f_2(x)=x^2\)
- \(f_3(x)=\begin{cases} x &\colon 0\leq x\leq 0.5\\ x-0.5 & \colon 0.5<x \leq 1 \end{cases}\)
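For readers who prefer to experiment away from the canvas, here is a minimal Python sketch of the three example functions. The names `f1`, `f2` and `f3` are simply transliterations of \(f_1\), \(f_2\) and \(f_3\), and the sample points around \(x=0.5\) are an arbitrary choice made for illustration.

```python
def f1(x: float) -> float:
    """f_1(x) = x on [0, 1]."""
    return x


def f2(x: float) -> float:
    """f_2(x) = x^2 on [0, 1]."""
    return x * x


def f3(x: float) -> float:
    """f_3(x) = x on [0, 0.5] and x - 0.5 on (0.5, 1]."""
    return x if x <= 0.5 else x - 0.5


# Evaluate each function just before, at, and just after x = 0.5.
for x in (0.49, 0.5, 0.51):
    print(f"x={x}: f1={f1(x):.4f}  f2={f2(x):.4f}  f3={f3(x):.4f}")
```

The values of `f1` and `f2` change only slightly across the three sample points, while `f3` drops sharply just after \(x=0.5\).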
Now, some of these graphs can be drawn without releasing the pen tool, while others require it to be released at some point. So this notion of ‘not needing to lift the pen’ is satisfied by only some functions, and seems to be a good starting point to classify functions.
Candidate classification 1. A function \(f \colon [0, 1] \to [0, 1]\) is single stroke graphable if its graph can be drawn on a sheet of paper without lifting the pen (assuming that we use only one pen).
The function \(f\) is not single stroke graphable if its graph requires us to lift the pen at least once (again, assuming we use only one pen).
An immediate issue with this classification is that this requires us to draw the graphs on a sheet of paper. However, we may not be able to draw the graphs of some functions with a pen, and consequently, cannot apply this classification to such functions. Consider, for instance, the function \(f_1 \colon [0, 1] \to [0, 1]\) defined as:
So, we now look for a better classification, one that we can apply to all functions, and not just functions that can be graphed. Perhaps we can capture the essence of this notion and express that in precise mathematical language which can then be applied to all functions.
A Deeper Look
Before we begin studying the behaviour of single stroke graphable functions, it helps to establish some terminology.
Some helpful terminology
Definition 1. A function \(f\colon [0, 1] \to [0, 1]\) is graphable if we can draw its complete graph on a sheet of paper using a pen.
While this definition is not rigorous, we will nevertheless use it as a handhold in our study of single stroke graphable functions.
Classification 1 applies to every graphable function, so we can now use it to classify graphable functions.
Definition 2. Fix \(x \in [0, 1]\) and \(\delta \in \mathbb{R}_{>0}\). A neighbourhood \(N(x,\delta)\) is the set of all points that are less than \(\delta\) distance away from \(x\). So a neighbourhood of \(x\) in \([0, 1]\) is the set \(N(x, \delta) = \{y \in [0, 1] \mid |y - x| < \delta\}\).
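As a rough illustration of Definition 2, membership in a neighbourhood can be written as a small predicate. The function name `in_neighbourhood` is hypothetical and introduced only for this sketch.

```python
def in_neighbourhood(y: float, x: float, delta: float) -> bool:
    """Return True if y lies in N(x, delta) as a subset of [0, 1]."""
    if delta <= 0:
        raise ValueError("delta must be a positive real number")
    # y must be a point of [0, 1] at distance strictly less than delta from x.
    return 0.0 <= y <= 1.0 and abs(y - x) < delta


print(in_neighbourhood(0.55, 0.5, 0.1))  # a point close to 0.5
print(in_neighbourhood(0.7, 0.5, 0.1))   # a point too far from 0.5
print(in_neighbourhood(1.05, 1.0, 0.1))  # close to 1, but outside [0, 1]
```

Note that a neighbourhood of a point near the boundary, such as \(N(1, 0.1)\), is cut off by the interval \([0, 1]\).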
Definition 3. Let \(f \colon [0, 1] \to [0, 1]\) be a graphable function. Any point \(x_0 \in [0, 1]\) such that the graph \(y=f(x)\) requires us to lift the pen at \(x=x_0\) is a jump.
This means any graphable function that does not have a jump is single stroke graphable. Equivalently, any non single stroke graphable function has at least one jump.
What happens at a jump
Let us now study what happens at a jump and hopefully gain some insight into how we can express that in mathematical language. We look at prototype non single stroke graphable functions with exactly one jump at \(x=0.5\). To construct such prototypes, we begin with a single stroke graphable function and 'push' or 'pull' the graph beyond \(x=0.5\) to create a non single stroke graphable function.
Studying gaps
Instructions: Move the slider and experiment with different values of the jump distance. Notice how the functions are non single stroke graphable whenever the jump is non zero, and single stroke graphable whenever it is zero.
Notice how there is always a ‘gap’ at \(x=0.5\). In fact, it is precisely because of this ‘gap’ that we are required to lift the pen at \(x=0.5\) when drawing the graph. So such ‘gaps’ are not just artefacts of the way we are representing non single stroke graphable functions, but are defining features of such functions.
Our goal now is to attempt to capture the idea of such ‘gaps’ in precise mathematical language that can then be extended to non graphable functions as well. So what happens at these ‘gaps’? We study what happens in the neighbourhoods of \(x=0.5\). Fix a neighbourhood \(N(0.5,\delta)\). Now, because we constructed the non single stroke graphable function \(f\) by creating a jump at \(x=0.5\), there will always be some \(x \in N(0.5,\delta)\) such that \(f(x)\) and \(f(0.5)\) are at least the ‘gap distance’ apart. In fact, if we choose any \(\epsilon \in \mathbb{R}_{> 0}\) such that \(\epsilon\) is less than this jump distance, there will be some point \(x \in N(0.5,\delta)\) such that \(f(x)\) and \(f(0.5)\) are at least \(\epsilon\) distance apart.
Now, this is true for all neighbourhoods of \(x=0.5\) because \(f\) was constructed by pushing the values at the immediate neighbours of \(0.5\) away by the jump distance.
Worded differently, this means: There is \(\epsilon \in \mathbb{R}_{>0}\) such that for any neighbourhood \(N(0.5,\delta)\), there is \(x \in N(0.5,\delta)\) such that \(|f(x)-f(0.5)| \geq \epsilon\).
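The statement above can be checked numerically for the example \(f_3\), whose jump distance at \(x=0.5\) is \(0.5\). The sketch below assumes \(\epsilon = 0.25\) and uses a hypothetical helper `witness` that picks a point just to the right of \(0.5\) inside \(N(0.5,\delta)\). It is an illustration for a handful of values of \(\delta\), not a proof.

```python
def f3(x: float) -> float:
    """f_3(x) = x on [0, 0.5] and x - 0.5 on (0.5, 1]."""
    return x if x <= 0.5 else x - 0.5


EPSILON = 0.25  # assumed value, chosen to be less than the jump distance 0.5


def witness(delta: float) -> float:
    """Return a point of N(0.5, delta) lying just to the right of the jump."""
    # Capping at 0.5 keeps the point inside [0, 1] and far from f3(0.5).
    return 0.5 + min(delta, 0.5) / 2


for delta in (0.5, 0.1, 0.01, 0.001):
    x = witness(delta)
    gap = abs(f3(x) - f3(0.5))
    print(f"delta={delta}: x={x}, |f3(x) - f3(0.5)|={gap:.4f}, at least epsilon: {gap >= EPSILON}")
```

However small \(\delta\) is made, the witness point stays at least \(\epsilon\) away in the output.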
Notice that there is nothing special about \(0.5\), so we can extend the above idea to any jump of a non single stroke graphable function.
This is the property of jumps we were looking for!
Now, for any non single stroke graphable function \(f \colon [0,1] \to [0,1]\) there is at least one jump. This means there is at least one point \(x_0 \in [0,1]\) such that there is some \(\epsilon \in \mathbb{R}_{>0}\) for which every neighbourhood \(N(x_0,\delta)\) has a point \(p \in N(x_0,\delta)\) such that \(|f(p)-f(x_0)| \geq \epsilon\).
We can now negate the above to get:
For any single stroke graphable function \(f \colon [0,1] \to [0,1]\), there is no jump. This means for all points \(x_0 \in [0,1]\) there is no \(\epsilon \in \mathbb{R}_{>0}\) for which every neighbourhood \(N(x_0,\delta)\) has a point \(p \in N(x_0,\delta)\) such that \(|f(p)-f(x_0)| \geq \epsilon\).
That \(\dots\) is a mess to get through, so let us attempt to simplify it. Let \(P\) be the property: ‘There is some \(\epsilon \in \mathbb{R}_{>0}\) for which every neighbourhood \(N(x_0,\delta)\) has a point \(p \in N(x_0,\delta)\) such that \(|f(p)-f(x_0)| \geq \epsilon\)’. This lets us say that for any single stroke graphable function \(f \colon [0,1] \to [0,1]\), there is no point \(x_0 \in [0,1]\) such that \(P\) is true.
Equivalently, for any single stroke graphable function \(f \colon [0,1] \to [0,1]\), at each \(x_0 \in [0,1]\), \(P\) is false.
Now, the negation of \(P\) is: ‘For every \(\epsilon \in \mathbb{R}_{>0}\), there is a neighbourhood \(N(x_0,\delta_0)\) with no point \(p \in N(x_0,\delta_0)\) such that \(|f(p)-f(x_0)| \geq \epsilon\)’. This means, for every \(\epsilon \in \mathbb{R}_{>0}\), there is a neighbourhood \(N(x_0,\delta_0)\) such that \(|f(x)-f(x_0)|< \epsilon\) for all \(x \in N(x_0,\delta_0)\).
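It may help to see the same negation written with quantifiers. This is only a restatement of the sentences above in symbols, where \(\exists\) reads ‘there is’ and \(\forall\) reads ‘for every’. Negating swaps each \(\exists\) with \(\forall\) and reverses the final inequality:

\[
P \colon \quad \exists\, \epsilon > 0 \;\; \forall\, \delta > 0 \;\; \exists\, p \in N(x_0,\delta) \;\colon\; |f(p)-f(x_0)| \geq \epsilon
\]

\[
\neg P \colon \quad \forall\, \epsilon > 0 \;\; \exists\, \delta_0 > 0 \;\; \forall\, x \in N(x_0,\delta_0) \;\colon\; |f(x)-f(x_0)| < \epsilon
\]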
Using this, we get that for any single stroke graphable function \(f \colon [0,1] \to [0,1]\), at each \(x_0 \in [0,1]\), for every \(\epsilon \in \mathbb{R}_{>0}\), there is a neighbourhood \(N(x_0,\delta_0)\), such that \(|f(x)-f(x_0)|< \epsilon\) for all \(x \in N(x_0,\delta_0)\).
Continuity
The results in the previous section give us a way to classify all functions, for the ideas of distances and neighbourhoods do not require a graph. So we can apply them to functions that are not graphable as well.
Definition 4 (Continuity at a point). Let \(f \colon [0,1] \to [0,1]\) be a function and \(x_0 \in [0,1]\). The function \(f\) is continuous at \(x_0\) if, for every \(\epsilon \in \mathbb{R}_{>0}\), there is a neighbourhood \(N(x_0,\delta_0)\), such that \(|f(x)-f(x_0)|< \epsilon\) for all \(x \in N(x_0,\delta_0)\).
Candidate classification 2. A function \(f \colon [0, 1] \to [0, 1]\) is continuous if it is continuous at each point \(x_0 \in [0,1]\).
The function \(f\) is discontinuous if it is not continuous.
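To see the definition at work, consider \(f_2(x)=x^2\) on \([0,1]\). For \(x, x_0 \in [0,1]\) we have \(|x^2 - x_0^2| = |x-x_0|\,|x+x_0| \leq 2|x-x_0|\), so the choice \(\delta_0 = \epsilon/2\) should work at every \(x_0\). The sketch below samples points of \(N(x_0,\delta_0)\) and reports the largest output distance it finds. The helper names `delta_for` and `worst_gap` are hypothetical, and sampling finitely many points illustrates the definition rather than proving anything.

```python
def f2(x: float) -> float:
    """f_2(x) = x^2 on [0, 1]."""
    return x * x


def delta_for(epsilon: float) -> float:
    """Candidate delta_0 for f2, from |x^2 - x0^2| <= 2|x - x0| on [0, 1]."""
    return epsilon / 2


def worst_gap(x0: float, epsilon: float, samples: int = 10_001) -> float:
    """Largest |f2(x) - f2(x0)| over sampled points x of N(x0, delta_0)."""
    delta = delta_for(epsilon)
    lo, hi = max(0.0, x0 - delta), min(1.0, x0 + delta)
    worst = 0.0
    for i in range(samples):
        x = lo + (hi - lo) * i / (samples - 1)
        if abs(x - x0) < delta:  # keep only points strictly inside the neighbourhood
            worst = max(worst, abs(f2(x) - f2(x0)))
    return worst


for epsilon in (0.5, 0.1, 0.001):
    gap = worst_gap(0.5, epsilon)
    print(f"epsilon={epsilon}: worst sampled gap={gap:.6f}, below epsilon: {gap < epsilon}")
```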
Note that this classification only uses the fact that \(f\) is a well defined function, and does not depend on whether \(f\) is graphable. Note also that if \(f\) were to be graphable, the definitions for continuous functions and single stroke graphable functions coincide.
A word of caution
It is only when \(f\) is graphable that \(f\) is continuous if and only if \(f\) is single stroke graphable (i.e. its graph has no jumps). It is not true in general that a function \(f\) is continuous if and only if its graph has no jumps.
Consider the function \(f(x) = \begin{cases} \sin(\frac{1}{x}) & x \in (0,1]\\ 0 & x=0 \end{cases}\), defined on the domain \([0,1]\). It is not continuous at \(x=0\), and yet it has no jumps. It is not graphable since it oscillates too much near \(x=0\).
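One way to convince ourselves that this function fails Definition 4 at \(x_0=0\) is to look at the points \(x_n = 1/(\pi/2 + 2\pi n)\), where \(\sin(1/x_n)=1\). These points get arbitrarily close to \(0\), so every neighbourhood of \(0\) contains one of them. The sketch below assumes \(\epsilon = 0.5\), and the helper name `witness` is hypothetical.

```python
import math


def f(x: float) -> float:
    """sin(1/x) on (0, 1], and 0 at x = 0."""
    return math.sin(1.0 / x) if x > 0 else 0.0


EPSILON = 0.5  # assumed value; any epsilon in (0, 1] leads to the same conclusion


def witness(delta: float) -> float:
    """Return a point x_n = 1/(pi/2 + 2*pi*n) with 0 < x_n < delta."""
    # Choose n large enough that 2*pi*n > 1/delta, which forces x_n < delta.
    n = math.floor(1.0 / (2 * math.pi * delta)) + 1
    return 1.0 / (math.pi / 2 + 2 * math.pi * n)


for delta in (0.1, 1e-3, 1e-6):
    x = witness(delta)
    gap = abs(f(x) - f(0.0))
    print(f"delta={delta}: x={x:.3e}, |f(x) - f(0)|={gap:.6f}, at least epsilon: {gap >= EPSILON}")
```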
So there we have it! A classification that we can extend to all functions from \([0, 1]\) to \([0,1]\). One is encouraged to see how this means that a function is continuous if and only if a tiny ‘wiggle’ in the input does not result in a sudden ‘jump’ in the function’s output. Over the years, we have found that preserving this idea of tiny wiggles in the input not resulting in jumps in the output turns out to be useful in several (real world) contexts. So we use Classification 2 to classify functions as continuous and discontinuous ones.
Extensions
Using closed intervals let us draw the complete graphs of the functions, but the definition of continuity in Classification 2 does not need graphs anyway! So let us do away with that restriction and apply this definition to functions from \(\mathbb{R}\) to \(\mathbb{R}\).
Definition 5 (Continuity on the real line). Let \(f \colon \mathbb{R} \to \mathbb{R}\) be a function and \(x_0 \in \mathbb{R}\). The function \(f\) is continuous at \(x_0\) if, for every \(\epsilon \in \mathbb{R}_{>0}\), there is a neighbourhood \(N(x_0,\delta_0)=\{y \in \mathbb{R} \mid |y-x_0| < \delta_0\}\), such that \(|f(x)-f(x_0)|< \epsilon\) for all \(x \in N(x_0,\delta_0)\).
A function \(f \colon \mathbb{R} \to \mathbb{R}\) is continuous if it is continuous at each point \(x_0 \in \mathbb{R}\).
Upon some reflection, we see that the notion of continuity is still well defined.
In fact, we can take this further. Notice that all we needed to define neighbourhoods and jumps is some notion of distance between two points of the set. We call any set \(S\) that comes equipped with a distance function \(d_S \colon S \times S \to \mathbb{R}\) a metric space. (This means the function \(d_S\) satisfies some axioms, but we shall not get into that here; for our purposes, we assume that \(d_S\) is a valid notion of distance.) We simply use \(d_S\) in place of the distance function \(|x-y|\) on \(\mathbb{R}\) to get an extension of continuity to metric spaces.
Definition 6 (Continuity in metric spaces). Suppose \((S,d_S)\) is a metric space. Let \(f \colon S \to S\) be a function and \(x_0 \in S\). The function \(f\) is continuous at \(x_0\) if, for every \(\epsilon \in \mathbb{R}_{>0}\), there is a neighbourhood \(N(x_0,\delta_0)=\{y \in S \mid d_S(y,x_0) < \delta_0\}\), such that \(d_S(f(x),f(x_0))< \epsilon\) for all \(x \in N(x_0,\delta_0)\).
A function \(f \colon S \to S\) is continuous if it is continuous at each point \(x_0 \in S\).
There’s no special reason for the two metric spaces in the above to be the same, so we can remove that restriction as well!
Definition 7. Suppose \((S,d_S)\) and \((T,d_T)\) are two metric spaces. Let \(f \colon S \to T\) be a function and \(x_0 \in S\). The function \(f\) is continuous at \(x_0\) if, for every \(\epsilon \in \mathbb{R}_{>0}\), there is a neighbourhood \(N(x_0,\delta_0)=\{y \in S \mid d_S(y,x_0) < \delta_0\}\), such that \(d_T(f(x),f(x_0))< \epsilon\) for all \(x \in N(x_0,\delta_0)\).
A function \(f \colon S \to T\) is continuous if it is continuous at each point \(x_0 \in S\).
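As a small illustration of the definition for two different metric spaces, take \(S\) to be the plane with the usual Euclidean distance, \(T\) to be the real line with \(|a-b|\), and \(f(x,y)=x+y\). These choices are assumptions made only for this sketch and do not come from the discussion above. Since \(|(x+y)-(x_0+y_0)| \leq |x-x_0|+|y-y_0| \leq 2\, d_S((x,y),(x_0,y_0))\), the choice \(\delta_0=\epsilon/2\) should work. The code samples random points of the neighbourhood, which is an illustration and not a proof.

```python
import math
import random


def d_plane(p, q):
    """Euclidean distance on S = the plane (an assumed example metric)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def d_line(a, b):
    """Usual distance on T = the real line."""
    return abs(a - b)


def f(p):
    """Example function from the plane to the line: f(x, y) = x + y."""
    return p[0] + p[1]


def worst_output_distance(x0, epsilon, trials=10_000, seed=0):
    """Largest d_T(f(x), f(x0)) over random points x of N(x0, epsilon / 2)."""
    rng = random.Random(seed)
    delta = epsilon / 2
    worst = 0.0
    for _ in range(trials):
        r = delta * rng.random()  # radius in [0, delta)
        theta = 2 * math.pi * rng.random()
        x = (x0[0] + r * math.cos(theta), x0[1] + r * math.sin(theta))
        if d_plane(x, x0) < delta:  # guard against rounding at the boundary
            worst = max(worst, d_line(f(x), f(x0)))
    return worst


for epsilon in (1.0, 0.1, 0.01):
    worst = worst_output_distance((1.0, -2.0), epsilon)
    print(f"epsilon={epsilon}: worst sampled distance={worst:.6f}, below epsilon: {worst < epsilon}")
```

Only the two distance functions would need to change to try the same experiment with other metric spaces.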
Recap
We started with a definition of continuity that was only applicable to graphable functions from \([0, 1]\) to \([0, 1]\) (Classification 1). We then observed that what really mattered was a lack of ‘jumps’, which let us extend the definition to all functions from \([0, 1]\) to \([0, 1]\) (Classification 2). This extension did not depend on the intervals being closed, so it applied to all functions from \(\mathbb{R}\) to \(\mathbb{R}\) as well. Not just that, this definition only required a well defined concept of distance, so we were able to extend it further to functions between any two metric spaces!
Further Study
We can continue this process of generalising continuity. It turns out that a notion of distance is not essential to have a well defined idea of continuity. What we need is an idea of ‘open sets’ being preserved in some way, which leads to the study of point-set topology.
Corrections:
- The function \(f(x)=\sin(\frac{1}{x})\) was used as an example here. It is, however, not defined at \(x=0\). This is now fixed. Thanks to an anonymous reader for letting me know.
Acknowledgements:
I was first exposed to the \(\epsilon -\delta\) definition of continuity in my first course on calculus. The IISERB Maths Club conducted an open hour then, in which club members (Dhawal and Raj) encouraged me to attempt generalising the definition in this manner.