Aside from the fact that the Ref.zip metadata shows the years associated with the column identifiers (so a contestant may fold that temporal information into the compressed representation if doing so makes it smaller), consider the case where longitudinal measurements (i.e., time-series data) are presented without any metadata at all, let alone metadata that marks a temporal dimension on any of the measurements.
If these data come from a dynamical system, applying dynamical system identification <https://www.nature.com/articles/s41467-021-26434-1> will minimize the size of the compressed representation by specifying the boundary condition and a system of differential equations. This is not because there is a "time" dimension anywhere, except the implicit dimension across which the differentials are identified. Now take the utterly atemporal case where a single-year snapshot is taken across a wide range of counties (or other geographic statistical areas) on a wide range of measures. It may still make sense to identify a dynamical system, where processes at work across time leave spatial structures at different stages of that system's progression. Urbanization is one obvious case; deforestation is another. There will be covariates of these measures that may be interpreted as caused by them, in the sense of a latent temporal dimension.

On Tue, Aug 8, 2023 at 5:23 PM Matt Mahoney <mattmahone...@gmail.com> wrote:

> ...
> I see that BMLiNGAM is based on the LINGAM model of causality, so I found
> the paper on LINGAM by Shimizu. It extends Pearl's covariance matrix model
> of causality to non-Gaussian data. But it assumes (like Pearl) that you
> still know which variables are dependent and which are independent.
>
> But a table of numbers like LaboratoryOfTheCounties doesn't tell you this.
> We can assume that causality is directional from past to future, so using
> an example from the data, increasing 1990 population causes 2000 population
> to increase as well. But knowing this doesn't help compression. I can just
> as easily predict 1990 population from 2000 population as the other way
> around.
>
> As a more general example, suppose I have the following data over 3
> variables:
>
> A B C
> 0 0 0
> 0 1 0
> 1 0 1
> 1 1 1
>
> I can see there is a correlation between A and C but not B.
> I can compress
> just as well by eliminating column A or C, since they are identical. This
> does not tell us whether A causes C, or C causes A, or both are caused by
> some other variable.
>
> What would be an example of determining causality with generic labels?

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T772759d8ceb4b92c-M6e82e89e73fc2d713315846f
Delivery options: https://agi.topicbox.com/groups/agi/subscription
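Matt's symmetry point in the quoted message can be checked directly. Since columns A and C are identical, you can store either (A, B) with the rule "C := A" or (B, C) with the rule "A := C"; both codings reconstruct the table exactly and store the same amount of data, so compressed size carries no information about the direction of the arrow. A minimal sketch (zlib as a stand-in compressor and the text serialization are my choices for illustration; the table is the one from the quote):

```python
import zlib

# The 4-row table from the quoted example: columns A, B, C (A and C are identical).
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]

def size(table):
    """zlib-compressed size of a simple text serialization of the table."""
    text = "\n".join(" ".join(map(str, r)) for r in table)
    return len(zlib.compress(text.encode()))

# Two rival "causal" codings of the same data:
#   1. store (A, B) plus the rule C := A
#   2. store (B, C) plus the rule A := C
ab = [(a, b) for a, b, c in rows]
bc = [(b, c) for a, b, c in rows]

# Both codings reconstruct the original table exactly...
assert [(a, b, a) for a, b in ab] == rows
assert [(c, b, c) for b, c in bc] == rows

# ...and each stores the same residual data, so compression alone
# cannot tell A -> C from C -> A.
print(size(rows), size(ab), size(bc))
```

Either reduced table plus its one-line rule is a lossless code for the original, which is exactly why eliminating column A or column C compresses equally well.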
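Returning to the dynamical-system-identification point earlier in the message: the sense in which a recovered system of differential equations plus a boundary condition acts as a compressed representation can be shown with a toy sparse-regression fit in the SINDy style of the linked paper. Everything here is an illustrative assumption (a one-dimensional system x' = -0.5x, a two-term candidate library, a 0.05 pruning threshold), not anything from the contest data:

```python
import math

# Simulate a trajectory of the hidden system x' = -0.5 * x, x(0) = 1.
dt = 0.01
ts = [i * dt for i in range(1001)]
xs = [math.exp(-0.5 * t) for t in ts]

# Estimate derivatives by central differences (the implicit "time" axis).
dx = [(xs[i + 1] - xs[i - 1]) / (2 * dt) for i in range(1, len(xs) - 1)]
xm = xs[1:-1]

# Candidate library {x, x**2}: fit dx/dt = c1*x + c2*x**2 by least squares,
# solving the 2x2 normal equations with Cramer's rule.
s11 = sum(x * x for x in xm)
s12 = sum(x ** 3 for x in xm)
s22 = sum(x ** 4 for x in xm)
b1 = sum(x * y for x, y in zip(xm, dx))
b2 = sum(x * x * y for x, y in zip(xm, dx))
det = s11 * s22 - s12 * s12
c1 = (b1 * s22 - b2 * s12) / det
c2 = (s11 * b2 - s12 * b1) / det

# Sparsify: prune coefficients below the (assumed) threshold 0.05.
coeffs = [0.0 if abs(c) < 0.05 else c for c in (c1, c2)]

# The sparse fit recovers x' = -0.5*x (the x**2 term is pruned to zero),
# so the 1001-sample trajectory is summarized by one coefficient plus x(0).
print(coeffs)
```

The point for compression: the ~1000 measurements are regenerated from two numbers (the surviving coefficient and the initial condition), and no explicit time metadata was needed to find them.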