Time Series databases

2007-02-08 Thread michael.dillon

  Going back to this thread, http://www.kx.com/ deals in financial
  transaction databases where they store millions of ticks.  They appear
  to have a transactional based language with a solution that appears to
  be robust and fail resistant.

 hmm, that is quite interesting. and apparently people out there _are_
 using it for things like counter values and what not - based on their
 FAQ. I'd absolutely love to know more about the algorithms and math
 behind something like kdb+

KX publish a bunch of information about their product. Their lineage
goes back to APL and the J language, both of which found most of their
users in financial services.

However, the general issue of time-series databases is more interesting.
Google will take you to lots of research using keywords like:

time-series database delta wavelet search indexing maxima

Of course, don't use them all at once. To give you a flavor of the stuff
that people have done, here is a slide presentation on compression and
indexing that does not use averages like RRD does:
http://www.cs.cmu.edu/~eugene/research/talks/major-extrema.ppt
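
To make the idea concrete, here is a rough Python sketch of compressing a series by keeping only its turning points instead of averaging. It is a deliberately simplified illustration, not the algorithm from the slides: the function name local_extrema and the sample data are my own, and flat plateaus are ignored.

```python
def local_extrema(values):
    """Return (index, value) pairs for both endpoints and every strict
    local minimum or maximum. Simplified sketch: points on flat plateaus
    are not detected as extrema."""
    if len(values) < 3:
        return list(enumerate(values))
    kept = [(0, values[0])]
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        is_max = cur > prev and cur > nxt
        is_min = cur < prev and cur < nxt
        if is_max or is_min:
            kept.append((i, cur))
    kept.append((len(values) - 1, values[-1]))
    return kept


# Hypothetical sample data: 8 points reduce to 5.
series = [10, 12, 15, 13, 11, 14, 18, 17]
compressed = local_extrema(series)
```

Unlike an RRD-style average, the peaks and troughs survive exactly; what is lost is the shape of the curve between them.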

In addition to Google, it is a good idea to search CiteSeer
http://citeseer.ist.psu.edu/ because it allows you to quickly track down
references to other papers so you can read them all as a set.

I don't think there are any full-blown open-source implementations that
you could integrate into your own systems. There is stuff like Metakit
http://www.equi4.com/metakit.html which stores data by column rather
than by row. And people who have thought about how to efficiently store
time-series probably cobbled together their own systems using bsddb or
HDF5. 
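
As a rough illustration of what "by column rather than by row" buys you, here is a small Python sketch using only the standard array module. It is not Metakit's API; the names rows, timestamps, prices, delta_encode and delta_decode are made up for the example. Once the timestamps sit together in one column, delta encoding turns them into small numbers that compress well.

```python
from array import array

# Row-oriented: one (timestamp, price) tuple per tick (hypothetical data).
rows = [(1170921600, 101.5), (1170921601, 101.7), (1170921603, 101.6)]

# Column-oriented: one typed array per field.
timestamps = array("q", (t for t, _ in rows))
prices = array("d", (p for _, p in rows))


def delta_encode(column):
    """Store the first value as-is, then each value as the difference
    from the one before it."""
    out = array(column.typecode)
    prev = 0
    for v in column:
        out.append(v - prev)
        prev = v
    return out


def delta_decode(deltas):
    """Invert delta_encode by keeping a running sum."""
    out = array(deltas.typecode)
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


encoded = delta_encode(timestamps)  # first value, then small gaps
```

Scanning a single field, say all prices in a time range, also touches only that one array rather than every row.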

If you are stuck in the SQL world, then check out these articles on star
and snowflake schemas, and follow up the references at the bottom of
each page:
http://en.wikipedia.org/wiki/Snowflake_schema
http://en.wikipedia.org/wiki/Star_schema
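
For a feel of what a star schema looks like for tick data, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (dim_instrument, dim_time, fact_tick) and the sample rows are assumptions for illustration, not taken from those articles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one row per instrument, one row per timestamp.
CREATE TABLE dim_instrument (
    instrument_id INTEGER PRIMARY KEY,
    symbol        TEXT UNIQUE
);
CREATE TABLE dim_time (
    time_id    INTEGER PRIMARY KEY,
    ts         TEXT UNIQUE,
    trade_date TEXT
);
-- Fact table: one narrow row per tick, pointing at the dimensions.
CREATE TABLE fact_tick (
    instrument_id INTEGER REFERENCES dim_instrument(instrument_id),
    time_id       INTEGER REFERENCES dim_time(time_id),
    price         REAL,
    size          INTEGER
);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO dim_instrument VALUES (?, ?)",
                 [(1, "AAA"), (2, "BBB")])
conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)",
                 [(1, "2007-02-08T09:30:00", "2007-02-08"),
                  (2, "2007-02-08T09:30:01", "2007-02-08")])
conn.executemany("INSERT INTO fact_tick VALUES (?, ?, ?, ?)",
                 [(1, 1, 10.0, 100), (1, 2, 10.5, 200), (2, 1, 20.0, 50)])


def volume_by_symbol(conn, trade_date):
    """Total traded size per symbol for one day, joining facts to dimensions."""
    return conn.execute("""
        SELECT i.symbol, SUM(f.size)
        FROM fact_tick f
        JOIN dim_instrument i ON i.instrument_id = f.instrument_id
        JOIN dim_time t ON t.time_id = f.time_id
        WHERE t.trade_date = ?
        GROUP BY i.symbol
        ORDER BY i.symbol
    """, (trade_date,)).fetchall()


totals = volume_by_symbol(conn, "2007-02-08")
```

The fact table stays narrow (two integer keys plus the measurements), which is the main point when it holds millions of ticks.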



Re: Time Series databases

2007-02-08 Thread Rodrick Brown


There have been numerous technical discussions over at EliteTrader.com
about tick database implementations, covering the pros and cons of
technologies such as SQL, KX, Vhayu, TimesTen, Hibernate, and HDF5.
They are a must-read for anyone interested.

The threads can be found on the EliteTrader automated trading forums:
http://www.elitetrader.com/vb/showthread.php?s=&threadid=81345&perpage=6&pagenumber=1


--
Rodrick R. Brown