On the Kolmogorov-Chaitin Complexity for Short Sequences

Hector Zenil
Wolfram Research

Jean-Paul Delahaye
Lille I University

Abstract

A drawback to Kolmogorov-Chaitin complexity (K) is that it is uncomputable in general, and that limits its range of applicability. Moreover, when strings are short, the dependence of K on a particular universal Turing machine U can be arbitrary. In practice one can approximate it by computable compression methods. However, such compression methods do not provide a good approximation for short sequences—shorter for example than typical compiler lengths. We will suggest an empirical approach to overcome this difficulty and to obtain a stable definition of the Kolmogorov-Chaitin complexity for short sequences. Additionally, a correlation in terms of distribution frequencies was found across the output of several systems, including abstract devices such as cellular automata and Turing machines as well as real-world data sources such as images and human DNA fragments.