Here's a simple modification to the Hutter Prize <http://prize.hutter1.net/> and the Large Text Compression Benchmark <http://mattmahoney.net/dc/text.html> to illustrate my point:
Split the Wikipedia corpus into separate files, one per Wikipedia article. An entry qualifies only if the set of checksums of the files produced by the self-extracting archive matches the set of checksums of the original per-article files. This reduces the over-constraint imposed by the strictly serialized corpus. A sketch of such a qualification check appears after the quoted message below.

On Sun, Jan 5, 2020 at 12:12 PM James Bowery <[email protected]> wrote:

> In reality, sensors and effectors exist in space as well as time.
> Serializing the spatial dimension of observations to formalize their
> Kolmogorov Complexity, so that they conform to the serialized input of a
> Universal Turing machine, over-constrains the observations, introducing
> order not relevant to their natural information content and hence
> artificially inflating the so-defined KC.
>
> Since virtually all models in machine learning are based on tabular data,
> even when they can be cast as time series row-indexed by a timestamp, each
> row is an observation with multiple dimensions. So it seems rather
> interesting, if not frustrating, that the default assumption in Algorithmic
> Information Theory is of a serial UTM.
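
To make the qualification rule concrete, here is a minimal sketch, assuming the original corpus has already been split into one file per article and the self-extracting archive has written its output into its own directory. The directory names and the choice of SHA-256 are illustrative assumptions, not part of the benchmark rules:

    import hashlib
    from pathlib import Path

    def checksum_set(directory: str) -> set[str]:
        """Return the set of SHA-256 digests of every regular file under directory."""
        digests = set()
        for path in Path(directory).rglob("*"):
            if path.is_file():
                digests.add(hashlib.sha256(path.read_bytes()).hexdigest())
        return digests

    def entry_qualifies(original_dir: str, extracted_dir: str) -> bool:
        """Qualify iff the two checksum sets are identical."""
        return checksum_set(original_dir) == checksum_set(extracted_dir)

    # Hypothetical directory names, for illustration only.
    print(entry_qualifies("corpus_articles", "self_extracted_articles"))

Because sets are compared, the entry may reproduce the articles in any order and under any file names; if exact duplicate articles need to be counted, a multiset of digests (e.g. collections.Counter) would be the stricter check.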
