Democratization of AI involves training and deploying machine learning models across heterogeneous and potentially massive environments. While diverse data can open new possibilities for advancing AI systems, it also restricts how much information can be shared across environments, owing to pressing concerns such as privacy, security, and equity. This work establishes a minimal set of intuitive and reasonable axioms under which empirical risk minimization (ERM) is the only rational learning algorithm in heterogeneous environments. Our impossibility result rests on a novel characterization of learning algorithms as choice correspondences over a hypothesis space. It implies that Collective Intelligence (CI), the ability of algorithms to learn successfully across heterogeneous environments, cannot be achieved without sacrificing at least one of these basic properties. More importantly, it reveals the incomparability of performance metrics across environments as a fundamental obstacle in critical areas of machine learning, including out-of-distribution generalization, federated learning, algorithmic fairness, and multi-modal learning.
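As a minimal sketch of the viewpoint in the abstract, the snippet below treats ERM as a choice correspondence: given a finite hypothesis space and a dataset, it returns the *set* of hypotheses attaining minimal empirical risk. All names here (`empirical_risk`, `erm_choice`, the toy threshold classifiers) are illustrative assumptions, not the paper's actual formalism.

```python
def empirical_risk(h, data):
    """Average 0-1 loss of hypothesis h on labeled data."""
    return sum(h(x) != y for x, y in data) / len(data)

def erm_choice(hypotheses, data):
    """Choice correspondence: the set of hypotheses with minimal empirical risk."""
    risks = {h: empirical_risk(h, data) for h in hypotheses}
    best = min(risks.values())
    return [h for h, r in risks.items() if r == best]

# Toy example: three 1-D threshold classifiers on four labeled points.
data = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
hypotheses = [lambda x, t=t: int(x > t) for t in (0.2, 0.5, 0.8)]
chosen = erm_choice(hypotheses, data)  # the threshold-0.5 classifier fits perfectly
```

Viewing a learner as a correspondence (a set-valued choice rule) rather than a single-valued function is what lets the paper apply axioms from choice theory to learning algorithms.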

| Link | BibTeX | Poster | Slides | Video | Code |