[algogeeks] Re: Efficient data structure to store sorted timestamped data

Gene Fri, 02 Dec 2005 15:42:31 -0800

If you want to exploit the fact that the data are sorted, you can read
the file once to build an array of integers holding the byte offset of
the start of each line. (If all the lines are the same length, you
don't even need to do this: the offset of line i is just i times the
line length.)  Then use file seek (direct access) and binary search on
the timestamps to find the start of a range, and read lines
sequentially until you pass the end of the range.
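
Here is a rough sketch of that idea in Python. It is only an illustration
under some assumptions that are mine, not part of the original question:
every line starts with an integer timestamp followed by whitespace, there
are no blank lines, and the file is sorted in ascending timestamp order.
The function names are made up for the example.

```python
from array import array


def build_line_offsets(path):
    """Read the file once; return the byte offset of the start of each line."""
    offsets = array("q")  # signed 64-bit integers, 8 bytes per line
    pos = 0
    with open(path, "rb") as f:
        for line in f:
            offsets.append(pos)
            pos += len(line)
    return offsets


def timestamp_at(f, offset):
    """Seek to a line start and parse its leading integer timestamp (assumed format)."""
    f.seek(offset)
    return int(f.readline().split(None, 1)[0])


def find_first_at_or_after(f, offsets, target):
    """Binary search for the index of the first line with timestamp >= target."""
    lo, hi = 0, len(offsets)
    while lo < hi:
        mid = (lo + hi) // 2
        if timestamp_at(f, offsets[mid]) < target:  # one seek per probe
            lo = mid + 1
        else:
            hi = mid
    return lo


def read_range(path, offsets, start, end):
    """Return the lines whose timestamps satisfy start <= timestamp <= end."""
    result = []
    with open(path, "rb") as f:
        i = find_first_at_or_after(f, offsets, start)
        if i == len(offsets):
            return result  # every timestamp is before the range
        f.seek(offsets[i])
        for line in f:  # sequential read from the range start
            if int(line.split(None, 1)[0]) > end:
                break
            result.append(line.decode().rstrip("\n"))
    return result
```

You would call build_line_offsets once per file and then reuse the
returned array for every read_range call. For fixed-length lines the
array can be dropped and offsets[mid] replaced by mid times the line
length.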


This way you avoid the overhead of building a huge data structure in
memory (a file offset is only 6 or 8 bytes per line).  But each lookup
costs about log_2(n) seeks to find the start of the range, which could
be too slow if you need to do a great many lookups.
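
To get a feel for the tradeoff, here is a small back-of-the-envelope
calculation in Python. The figure of ten million lines is just an
illustrative assumption, and index_cost is a made-up helper name. It
counts the worst-case number of probes (one seek each) for a standard
lower-bound binary search over n line offsets.

```python
def index_cost(num_lines, bytes_per_offset=8):
    """Return (index size in bytes, worst-case seeks per range lookup)."""
    index_bytes = num_lines * bytes_per_offset
    # A lower-bound binary search over n items needs at most
    # floor(log2(n)) + 1 probes, which is n.bit_length() for n >= 1.
    seeks_per_lookup = num_lines.bit_length()
    return index_bytes, seeks_per_lookup


# Assumed example: 10 million lines with 8-byte offsets.
size, seeks = index_cost(10_000_000)
print(size, seeks)  # prints: 80000000 24
```

So the index stays small (about 80 MB here), but a workload of many
thousands of lookups pays a couple dozen seeks for each one.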
