We are going to have a look at some advanced topics on convexity, gradient descent, and other related material.
Additional Mathematics Background
Fréchet differentiation provides a way to extend differentiability to general normed vector spaces.

A mapping $F \colon X \to Y$ between normed spaces is said to be Fréchet differentiable at $x \in X$ if there is a bounded linear map $A \colon X \to Y$ such that
$$\lim_{\|h\| \to 0} \frac{\|F(x + h) - F(x) - Ah\|}{\|h\|} = 0.$$
The map $A =: DF(x)$ is the derivative of $F$ at $x$, and for $F \colon \mathbb{R}^n \to \mathbb{R}^m$ it holds that
$$DF(x) = J_F(x) = \left( \frac{\partial F_i}{\partial x_j}(x) \right)_{i = 1, \dots, m;\ j = 1, \dots, n},$$
where $J_F(x)$ is the Jacobian matrix.
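To make the definition concrete, here is a small numerical sketch (the map $F$ and its hand-computed Jacobian are my own toy example, not from the notes) checking that the Jacobian satisfies the defining limit:

```python
import numpy as np

# Check the Frechet limit ||F(x+h) - F(x) - A h|| / ||h|| -> 0
# for a toy map F: R^2 -> R^2, with A taken as the Jacobian.
def F(x):
    return np.array([x[0] ** 2 + x[1], np.sin(x[1])])

def J(x):
    # Hand-computed Jacobian of F at x
    return np.array([[2 * x[0], 1.0],
                     [0.0, np.cos(x[1])]])

x = np.array([1.0, 0.5])
d = np.array([0.3, -0.7])                 # fixed direction
for eps in [1e-1, 1e-2, 1e-3, 1e-4]:
    h = eps * d
    ratio = np.linalg.norm(F(x + h) - F(x) - J(x) @ h) / np.linalg.norm(h)
    print(f"||h|| = {np.linalg.norm(h):.1e}  ->  ratio = {ratio:.2e}")
```

The ratio shrinks roughly linearly in $\|h\|$, as expected for a smooth map.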
Optimality Conditions for Unconstrained Optimization Problems
The basics of FONC, SONC, and SOSC have already been covered in Optimization Learning Notes 5, so here I only provide additional proofs and remarks.
Quadratic Growth Condition
The SOSC turns out to have an equivalent form, known as the quadratic growth condition (QGC).
Let $f$ be twice continuously differentiable on the open set $\Omega \subseteq \mathbb{R}^n$. Then, the SOSC at $x^* \in \Omega$ is equivalent to the quadratic growth condition: there exist $\varepsilon > 0$ and $\mu > 0$ such that
$$f(x) \ge f(x^*) + \frac{\mu}{2} \|x - x^*\|^2 \quad \text{for all } x \in B_\varepsilon(x^*).$$
Notice that in the proof of the SOSC, we have already shown exactly this growth estimate, so one direction is done.

Now we prove the other direction. Since the QGC implies that $x^*$ is a local minimizer, the FONC gives $\nabla f(x^*) = 0$.

By Taylor expansion, for any unit vector $d$ and $t \downarrow 0$,
$$f(x^* + t d) = f(x^*) + \frac{t^2}{2} d^\top \nabla^2 f(x^*) d + o(t^2) \ge f(x^*) + \frac{\mu}{2} t^2.$$
Dividing by $t^2 / 2$ and letting $t \downarrow 0$ gives $d^\top \nabla^2 f(x^*) d \ge \mu$. Since $\Omega$ is open, $d$ can be an arbitrary direction, so $\nabla^2 f(x^*)$ must be positive definite.
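As a quick sanity check of the growth estimate, here is a minimal numerical sketch; the quadratic below is my own example, and for it the bound holds globally with $\mu = \lambda_{\min}(Q)$:

```python
import numpy as np

# f(x) = 0.5 * x^T Q x with Q positive definite has the unique
# minimizer x* = 0 and satisfies the quadratic growth condition
# f(x) >= f(x*) + (mu/2) * ||x - x*||^2 with mu = lambda_min(Q).
Q = np.array([[3.0, 1.0],
              [1.0, 2.0]])
f = lambda x: 0.5 * x @ Q @ x
x_star = np.zeros(2)
mu = np.linalg.eigvalsh(Q).min()

rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.normal(size=2)
    assert f(x) >= f(x_star) + 0.5 * mu * np.linalg.norm(x - x_star) ** 2 - 1e-12
print("quadratic growth holds with mu =", mu)
```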
Extended Definitions for Convexity
Besides the usual definition of convex functions from Optimization Learning Notes 6, we now introduce two stronger notions of convexity:
Let $f \colon \mathbb{R}^n \to \mathbb{R}$ be given and let $C \subseteq \mathbb{R}^n$ be a convex set. The function $f$ is

- strictly convex (on $C$) if for all $x, y \in C$ with $x \ne y$ and all $\lambda \in (0, 1)$ we have
$$f(\lambda x + (1 - \lambda) y) < \lambda f(x) + (1 - \lambda) f(y);$$
- strongly convex (with parameter $\mu$) if there exists $\mu > 0$ such that for all $x, y \in C$ and all $\lambda \in [0, 1]$, we have
$$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y) - \frac{\mu}{2} \lambda (1 - \lambda) \|x - y\|^2.$$
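The strong convexity inequality can be spot-checked numerically; here is a minimal sketch using a hypothetical quadratic, for which the sharpest parameter is $\mu = \lambda_{\min}(Q)$:

```python
import numpy as np

# Spot-check the strong convexity inequality
# f(lx + (1-l)y) <= l f(x) + (1-l) f(y) - (mu/2) l (1-l) ||x-y||^2
# for f(x) = 0.5 * x^T Q x with mu = lambda_min(Q).
Q = np.array([[3.0, 1.0],
              [1.0, 2.0]])
f = lambda x: 0.5 * x @ Q @ x
mu = np.linalg.eigvalsh(Q).min()

rng = np.random.default_rng(1)
for _ in range(1000):
    x, y = rng.normal(size=2), rng.normal(size=2)
    lam = rng.uniform()
    lhs = f(lam * x + (1 - lam) * y)
    rhs = (lam * f(x) + (1 - lam) * f(y)
           - 0.5 * mu * lam * (1 - lam) * np.linalg.norm(x - y) ** 2)
    assert lhs <= rhs + 1e-10
print("strong convexity inequality verified with mu =", mu)
```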
First Order Characterization
Let $f$ be continuously differentiable on an open set $\Omega$ that contains the convex set $C$. Then, the following hold:

1. The function $f$ is convex on $C$ if and only if
$$f(y) \ge f(x) + \nabla f(x)^\top (y - x) \quad \text{for all } x, y \in C.$$
2. The mapping $f$ is strictly convex on $C$ if and only if
$$f(y) > f(x) + \nabla f(x)^\top (y - x) \quad \text{for all } x, y \in C \text{ with } x \ne y.$$
3. The function $f$ is strongly convex on $C$ (with parameter $\mu$) if and only if there exists $\mu > 0$ such that
$$f(y) \ge f(x) + \nabla f(x)^\top (y - x) + \frac{\mu}{2} \|y - x\|^2 \quad \text{for all } x, y \in C.$$
Proof of 1
Suppose first that $f$ is convex. From the definition, for $\lambda \in (0, 1]$ we have
$$f(x + \lambda (y - x)) = f((1 - \lambda) x + \lambda y) \le f(x) + \lambda (f(y) - f(x)),$$
i.e.,
$$\frac{f(x + \lambda (y - x)) - f(x)}{\lambda} \le f(y) - f(x).$$
Taking $\lambda \downarrow 0$, we get
$$\nabla f(x)^\top (y - x) \le f(y) - f(x).$$
Conversely, suppose the gradient inequality holds. Setting $z = \lambda x + (1 - \lambda) y$ and applying the inequality at $z$ to both $x$ and $y$, we get
$$f(x) \ge f(z) + \nabla f(z)^\top (x - z), \qquad f(y) \ge f(z) + \nabla f(z)^\top (y - z).$$
Multiplying by $\lambda$ and $1 - \lambda$ respectively and adding, the gradient terms cancel (since $\lambda (x - z) + (1 - \lambda)(y - z) = 0$), so
$$\lambda f(x) + (1 - \lambda) f(y) \ge f(z) = f(\lambda x + (1 - \lambda) y).$$
Proof of 2
First, suppose $f$ is strictly convex. By letting $z = \frac{1}{2}(x + y)$ for $x \ne y$, we have
$$f(z) < \frac{1}{2} f(x) + \frac{1}{2} f(y).$$
Using 1, it holds that
$$\nabla f(x)^\top (z - x) \le f(z) - f(x) < \frac{1}{2} \big( f(y) - f(x) \big),$$
and since $z - x = \frac{1}{2}(y - x)$, this yields $f(y) > f(x) + \nabla f(x)^\top (y - x)$.
The other direction can be proved in the same way as in 1: the strict gradient inequality turns the final estimate there into a strict one.
Proof of 3
Following the same steps as in 1, the definition of strong convexity gives
$$\frac{f(x + \lambda (y - x)) - f(x)}{\lambda} \le f(y) - f(x) - \frac{\mu}{2}(1 - \lambda) \|x - y\|^2,$$
and letting $\lambda \downarrow 0$ yields
$$f(y) \ge f(x) + \nabla f(x)^\top (y - x) + \frac{\mu}{2} \|y - x\|^2.$$
For the other direction, consider $z = \lambda x + (1 - \lambda) y$ and apply the inequality at $z$ to both $x$ and $y$:
$$f(x) \ge f(z) + \nabla f(z)^\top (x - z) + \frac{\mu}{2} \|x - z\|^2, \qquad f(y) \ge f(z) + \nabla f(z)^\top (y - z) + \frac{\mu}{2} \|y - z\|^2.$$
Multiplying by $\lambda$ and $1 - \lambda$ and adding, using $\|x - z\| = (1 - \lambda) \|x - y\|$ and $\|y - z\| = \lambda \|x - y\|$, we then obtain
$$\lambda f(x) + (1 - \lambda) f(y) \ge f(z) + \frac{\mu}{2} \lambda (1 - \lambda) \|x - y\|^2,$$
which is exactly the definition of strong convexity.
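To see part 1 in action, here is a short numerical sketch of the gradient inequality on the smooth convex log-sum-exp function (my own choice of example):

```python
import numpy as np

# Check the first order characterization f(y) >= f(x) + grad f(x)^T (y - x)
# for the convex function f(x) = log(sum(exp(x))).
def f(x):
    return np.log(np.sum(np.exp(x)))

def grad_f(x):
    e = np.exp(x)
    return e / e.sum()   # the softmax of x

rng = np.random.default_rng(2)
for _ in range(1000):
    x, y = rng.normal(size=3), rng.normal(size=3)
    assert f(y) >= f(x) + grad_f(x) @ (y - x) - 1e-10
print("gradient inequality verified for log-sum-exp")
```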
Second Order Characterization
Let $f$ be twice continuously differentiable on an open set $\Omega$ and let $C \subseteq \Omega$ be an open convex set. Then, the following hold:

1. The function $f$ is convex on $C$ if and only if the Hessian $\nabla^2 f(x)$ is positive semidefinite for all $x \in C$.
2. If the Hessian is positive definite for all $x \in C$, i.e., if we have
$$d^\top \nabla^2 f(x) d > 0 \quad \text{for all } x \in C,\ d \in \mathbb{R}^n \setminus \{0\},$$
then $f$ is strictly convex. (The converse fails: $f(x) = x^4$ is strictly convex, yet $f''(0) = 0$.)
3. The mapping $f$ is strongly convex (with parameter $\mu$) iff $\nabla^2 f$ is uniformly positive definite (with parameter $\mu$) on $C$, i.e., there is $\mu > 0$ such that
$$d^\top \nabla^2 f(x) d \ge \mu \|d\|^2 \quad \text{for all } x \in C,\ d \in \mathbb{R}^n.$$
Proof of 1
Suppose the Hessian is positive semidefinite on $C$. By the second order mean value theorem, for $x, y \in C$ we can find $\xi$ on the segment $[x, y]$ such that
$$f(y) = f(x) + \nabla f(x)^\top (y - x) + \frac{1}{2} (y - x)^\top \nabla^2 f(\xi) (y - x) \ge f(x) + \nabla f(x)^\top (y - x).$$
This gives us the first order characterization, so $f$ is convex.

To prove the other direction, consider $x \in C$ and $d \in \mathbb{R}^n$. Since $C$ is open, we can find an $\varepsilon > 0$ such that $x + t d \in C$ for all $t \in (0, \varepsilon)$.

Now consider the following Taylor expansion:
$$f(x + t d) = f(x) + t \nabla f(x)^\top d + \frac{t^2}{2} d^\top \nabla^2 f(x) d + o(t^2)$$
for sufficiently small $t$. Since $f$ is convex, we can use the first order characterization:
$$f(x + t d) \ge f(x) + t \nabla f(x)^\top d.$$
Combining the two displays, dividing by $t^2 / 2$, and letting $t \downarrow 0$, we can infer that $d^\top \nabla^2 f(x) d \ge 0$, i.e., $\nabla^2 f(x)$ is positive semidefinite.
Proof of 2
The same approach as the proof of 1: when the Hessian is positive definite, the mean value remainder is strictly positive for $y \ne x$, which gives the strict first order characterization and hence strict convexity.
Proof of 3
Similarly to the proof of 1, we use the MVT together with the uniform bound $d^\top \nabla^2 f(\xi) d \ge \mu \|d\|^2$ to get
$$f(y) = f(x) + \nabla f(x)^\top (y - x) + \frac{1}{2} (y - x)^\top \nabla^2 f(\xi) (y - x) \ge f(x) + \nabla f(x)^\top (y - x) + \frac{\mu}{2} \|y - x\|^2.$$
This already gives us the first order characterization of strong convexity.

Conversely, we start with the first order characterization:
$$f(x + t d) \ge f(x) + t \nabla f(x)^\top d + \frac{\mu}{2} t^2 \|d\|^2.$$
Reuse the Taylor expansion:
$$f(x + t d) = f(x) + t \nabla f(x)^\top d + \frac{t^2}{2} d^\top \nabla^2 f(x) d + o(t^2).$$
We now get
$$\frac{t^2}{2} d^\top \nabla^2 f(x) d + o(t^2) \ge \frac{\mu}{2} t^2 \|d\|^2,$$
and dividing by $t^2 / 2$ and letting $t \downarrow 0$ yields $d^\top \nabla^2 f(x) d \ge \mu \|d\|^2$.
This finishes the proof.
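In computations these criteria reduce to eigenvalue checks on the Hessian; here is a minimal sketch (the quadratics are my own examples, and their Hessian is simply $Q$ everywhere):

```python
import numpy as np

# For f(x) = 0.5 * x^T Q x the Hessian is Q everywhere, so:
# lambda_min(Q) > 0  -> strongly convex with mu = lambda_min(Q),
# lambda_min(Q) = 0  -> convex but not strongly convex,
# lambda_min(Q) < 0  -> not convex.
def classify(Q, tol=1e-12):
    lam_min = np.linalg.eigvalsh(Q).min()
    if lam_min > tol:
        return f"strongly convex, mu = {lam_min:.3f}"
    if lam_min >= -tol:
        return "convex, not strongly convex"
    return "not convex"

print(classify(np.array([[2.0, 0.0], [0.0, 1.0]])))    # positive definite
print(classify(np.array([[1.0, 1.0], [1.0, 1.0]])))    # PSD, singular
print(classify(np.array([[1.0, 0.0], [0.0, -1.0]])))   # indefinite
```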
Convex Optimization Problems
In Optimization Learning Notes 6, we already showed that for a typical convex minimization problem, any local minimum must be a global minimum. Two things remain here:

- If $f$ is strictly convex, then the minimization problem possesses at most one local minimizer, which (if it exists) is also the strict global minimum of $f$ on $C$.
- If the feasible set $C$ is open, $f$ is convex and continuously differentiable, and $x^*$ is a stationary point of $f$, then $x^*$ is a global minimizer of $f$ on $C$. (We provide a proof here.)
Proof of the first part
Suppose $f$ possesses two distinct local minimizers $x_1 \ne x_2$. We then know that they are both global solutions, so $f(x_1) = f(x_2) = \min_{x \in C} f(x)$. By strict convexity,
$$f\Big( \frac{1}{2} x_1 + \frac{1}{2} x_2 \Big) < \frac{1}{2} f(x_1) + \frac{1}{2} f(x_2) = \min_{x \in C} f(x).$$
This is a contradiction.
Proof of the second part
Just use the first order characterization: since $\nabla f(x^*) = 0$, for every $y \in C$,
$$f(y) \ge f(x^*) + \nabla f(x^*)^\top (y - x^*) = f(x^*).$$
Hence, $x^*$ is a global solution.
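Since the introduction mentioned gradient descent, here is a minimal sketch (step size and example are my own choices) of how this fact is used: driving the gradient to zero on a smooth convex function lands at the global minimizer.

```python
import numpy as np

# Gradient descent on the convex quadratic f(x) = 0.5 x^T Q x - b^T x.
# Its unique stationary point solves Q x = b and, by the result above,
# is the global minimizer.
Q = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad = lambda x: Q @ x - b

x = np.zeros(2)
step = 1.0 / np.linalg.eigvalsh(Q).max()   # 1/L, L = largest eigenvalue
for _ in range(200):
    x -= step * grad(x)

print("gradient descent :", x)
print("exact stationary :", np.linalg.solve(Q, b))
```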
Some Additional Properties
We show that $f$ is strongly convex with parameter $\mu$ iff $g(x) := f(x) - \frac{\mu}{2} \|x\|^2$ is convex.

First suppose $f$ is strongly convex. For $z = \lambda x + (1 - \lambda) y$ we have
$$g(z) = f(z) - \frac{\mu}{2} \|z\|^2 \le \lambda f(x) + (1 - \lambda) f(y) - \frac{\mu}{2} \lambda (1 - \lambda) \|x - y\|^2 - \frac{\mu}{2} \|z\|^2.$$
Using the identity
$$\|\lambda x + (1 - \lambda) y\|^2 = \lambda \|x\|^2 + (1 - \lambda) \|y\|^2 - \lambda (1 - \lambda) \|x - y\|^2,$$
the right-hand side equals $\lambda g(x) + (1 - \lambda) g(y)$. Therefore, $g$ is convex.

Conversely, if $g$ is convex,
$$f(z) - \frac{\mu}{2} \|z\|^2 = g(z) \le \lambda g(x) + (1 - \lambda) g(y) = \lambda f(x) + (1 - \lambda) f(y) - \frac{\mu}{2} \big( \lambda \|x\|^2 + (1 - \lambda) \|y\|^2 \big).$$
Then, by the same identity as in the above proof, we see
$$f(z) \le \lambda f(x) + (1 - \lambda) f(y) - \frac{\mu}{2} \lambda (1 - \lambda) \|x - y\|^2.$$
This finishes the proof.
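For a quadratic this equivalence is just an eigenvalue shift; a quick numerical sketch (the matrix is my own example):

```python
import numpy as np

# For f(x) = 0.5 x^T Q x, g(x) = f(x) - (mu/2)||x||^2 has Hessian
# Q - mu*I, so g is convex exactly when mu <= lambda_min(Q), i.e.
# exactly when f is strongly convex with parameter mu.
Q = np.array([[3.0, 1.0],
              [1.0, 2.0]])
lam_min = np.linalg.eigvalsh(Q).min()

for mu in [0.5 * lam_min, lam_min, 1.5 * lam_min]:
    shifted = np.linalg.eigvalsh(Q - mu * np.eye(2)).min()
    print(f"mu = {mu:.3f}: g convex? {shifted >= -1e-12}")
```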
Convexity of Epigraph
The epigraph of a function $f \colon \mathbb{R}^n \to \mathbb{R}$ is defined as
$$\operatorname{epi} f = \{ (x, t) \in \mathbb{R}^n \times \mathbb{R} : f(x) \le t \}.$$
$f$ is convex iff its epigraph is convex. First suppose $f$ is convex, and consider $(x, s), (y, t) \in \operatorname{epi} f$ and $\lambda \in [0, 1]$. We then have
$$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y) \le \lambda s + (1 - \lambda) t.$$
Therefore, $\lambda (x, s) + (1 - \lambda) (y, t)$ is in the epigraph and the epigraph is convex.

To prove the other direction, we know that $(x, f(x))$ and $(y, f(y))$ are in the epigraph. By the convexity of the epigraph,
$$\big( \lambda x + (1 - \lambda) y,\ \lambda f(x) + (1 - \lambda) f(y) \big) \in \operatorname{epi} f, \quad \text{i.e.,} \quad f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y).$$
Therefore, the function $f$ is also convex.
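A tiny numerical sketch of the forward direction, using $f(x) = x^2$ (my own example): random epigraph points are sampled and their convex combinations are checked to stay in the epigraph.

```python
import numpy as np

# For convex f(x) = x^2, convex combinations of points of
# epi f = {(x, t) : f(x) <= t} must remain in epi f.
f = lambda x: x ** 2

rng = np.random.default_rng(3)
for _ in range(1000):
    x, y = rng.normal(), rng.normal()
    s = f(x) + rng.uniform(0.0, 1.0)   # (x, s) in epi f
    t = f(y) + rng.uniform(0.0, 1.0)   # (y, t) in epi f
    lam = rng.uniform()
    xc = lam * x + (1 - lam) * y
    tc = lam * s + (1 - lam) * t
    assert f(xc) <= tc + 1e-12
print("convex combinations stay in the epigraph")
```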
Existence of Directional Derivative
Sometimes the objective function is not differentiable. However, if it is convex, the directional derivative always exists.
Suppose $f \colon \mathbb{R}^n \to \mathbb{R}$ is convex. We then have:

1. The difference quotient
$$\varphi(t) = \frac{f(x + t d) - f(x)}{t}$$
is a non-decreasing function of $t$ on $(0, \infty)$.
2. For every $x$ and every direction $d$, the directional derivative always exists and is given by
$$f'(x; d) = \lim_{t \downarrow 0} \frac{f(x + t d) - f(x)}{t} = \inf_{t > 0} \frac{f(x + t d) - f(x)}{t}.$$
Proof of 1
Let $0 < t_1 \le t_2$ and let $\lambda = t_1 / t_2 \in (0, 1]$. Then $x + t_1 d = (1 - \lambda) x + \lambda (x + t_2 d)$, so by convexity
$$f(x + t_1 d) \le (1 - \lambda) f(x) + \lambda f(x + t_2 d),$$
which rearranges to
$$\frac{f(x + t_1 d) - f(x)}{t_1} \le \frac{f(x + t_2 d) - f(x)}{t_2}.$$
Proof of 2
The definition of the directional derivative is
$$f'(x; d) = \lim_{t \downarrow 0} \frac{f(x + t d) - f(x)}{t}.$$
By 1, the difference quotient decreases monotonically as $t \downarrow 0$, and it is bounded below by $f(x) - f(x - d)$ (again by convexity, comparing slopes on either side of $x$). A monotone, bounded quantity converges, so the limit exists and is necessarily given by the infimum of these ratios.
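A short numerical sketch on the non-differentiable convex function $f(x) = |x|$ (my own example), at a point left of the kink: the quotients shrink monotonically as $t \downarrow 0$ and their infimum is the directional derivative $f'(-0.5; 1) = -1$.

```python
# Difference quotients (f(x + t d) - f(x)) / t for f(x) = |x| at
# x = -0.5 in direction d = 1: non-decreasing in t, so the limit
# as t -> 0+ equals the infimum over t > 0.
f = abs
x, d = -0.5, 1.0

ts = [4.0, 2.0, 1.0, 0.5, 0.25, 0.125]
quotients = [(f(x + t * d) - f(x)) / t for t in ts]
# as t shrinks, the quotient never increases
assert all(b <= a for a, b in zip(quotients, quotients[1:]))
print("quotients for t = 4, ..., 1/8:", quotients)
print("f'(-0.5; 1) =", min(quotients))   # -1
```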
Local Lipschitz Continuity
Let $f$ be a convex function such that the interior of its domain is not empty. Then, $f$ is locally Lipschitz continuous on the interior of its domain.
[Proof for Local Lipschitz Continuity may be provided later]