A late morning on information geometry
(1) Luigi Montrucchio and Giovanni Pistone
“Kantorovich distance on a finite metric space”
ABSTRACT: Kantorovich distance (or 1-Wasserstein distance) on the probability simplex of a finite metric space is the value of a Linear Programming problem for which a closed-form expression is known in some cases. When the ground distance is defined by a graph, a few examples have already been studied. In the present talk, after rederiving the result for trees with different tools, we prove that, for an arbitrary weighted graph, the K-distance is the minimum of the K-distances over all the spanning trees associated with the graph. We work in the dual LP-problem by using the Arens-Eells norm associated with the metric space. Finally, we introduce new norms which are naturally related to ℓ1-embeddable distances and allow for a partial extension of our results to this new setting.
(2) Jesse Van Oostrum
“Bures-Wasserstein geometry for optimal transport and quantum information”
ABSTRACT: The Bures-Wasserstein distance is a distance function arising naturally in both optimal transport and quantum information theory. The geometrical properties of this distance are investigated using an extension of a classical geometrical construction by Rao.
(3) Goffredo Chirco
“Bregman-Lagrangian Formalism on the non-parametric Statistical Bundle”
ABSTRACT: I will discuss some preliminary results on the derivation of a variational approach to accelerated methods for optimization, in the context of non-parametric information geometry. A Bregman-Lagrangian system is defined on the maximal exponential manifold, where it provides a generative framework for second-order accelerated natural gradient dynamics on the affine geometry of the manifold. A dictionary between Lagrangian mechanics and information geometry is explored, with a focus on the symplectic structure of the statistical bundle for the exponential model. The research aims at defining a geometric framework for adapting variational optimization algorithms to the training of deep and convolutional neural networks, with emphasis on accelerated methods, such as Nesterov’s accelerated gradient.