On Monday 12 May 2008, Nick Bower wrote:
> Hi - just investigating PyTables for storing data from many
> distributed remote logging stations, each logging about 100 channels
> at 1 second frequency (a fair bit).
>
> My questions;
>
> 1. How does one handle ordering timeseries data within a table?
> *Does* one actually order on the way in (eg re-shuffling data) or
> simply ignore it? Coming from a NetCDF background, the answer would
> be the former because at some point you'd want to efficiently
> serially read and plot it, but I've little experience with HDF and
> don't know if this needs to be considered at all.
Well, I think the HDF5 case is similar to the NetCDF one in this
scenario: if you need to efficiently retrieve measurements that are
close in time, the best approach is to save them in that order. However,
to take advantage of this (disk-sorted) arrangement, you will need to
build a map {table_indices} <--> {time_range} so that you do not have
to walk the entire table to get the time slice of interest.
One way to avoid building such a map yourself is to use the
indexing capabilities of PyTables Pro (in fact, this is what an index
provides: a map between sorted values and the indices of those values).
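
To make the hand-built map a bit more concrete, here is a rough sketch
in plain Python (standard library only, not PyTables code). The names
build_time_map and row_range are made up for this example, and it
assumes the timestamps are stored in non-decreasing order. It keeps the
first timestamp of every block of rows and uses bisection to find the
row range that must contain a given time slice:

```python
from bisect import bisect_left, bisect_right

def build_time_map(timestamps, block_size):
    """Keep the first timestamp of every block of `block_size` rows.

    Assumes `timestamps` is in non-decreasing (disk-sorted) order.
    """
    return [timestamps[i] for i in range(0, len(timestamps), block_size)]

def row_range(time_map, block_size, n_rows, t_start, t_stop):
    """Return (first_row, stop_row) bounding all rows with
    t_start <= t <= t_stop. The range may include extra rows at the
    edges, so the caller still filters inside it.
    """
    # Last block that starts strictly before t_start (or block 0).
    first_block = max(bisect_left(time_map, t_start) - 1, 0)
    # First block that starts after t_stop.
    stop_block = bisect_right(time_map, t_stop)
    return first_block * block_size, min(stop_block * block_size, n_rows)

# Hypothetical data: 100 rows sampled once per second.
timestamps = list(range(1000, 1100))
BLOCK = 10
time_map = build_time_map(timestamps, BLOCK)

lo, hi = row_range(time_map, BLOCK, len(timestamps), 1025, 1042)
# Only rows lo..hi-1 need to be read, instead of the whole table.
selected = [t for t in timestamps[lo:hi] if 1025 <= t <= 1042]
```

With a real table, the slice timestamps[lo:hi] would be replaced by
reading only that row range from disk; the map itself stays small
because it holds one entry per block rather than one per row.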
Cheers,
--
Francesc Alted
_______________________________________________
Pytables-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pytables-users