On 07/20/2013 01:00 AM, Sivaram Neelakantan wrote:
On Sat, Jul 20 2013, Dave Angel wrote:
<snip>
These are small, fixed line extracts.
Once you determine the offset in the file for those 180, 90, and 30
day points, it's a simple matter to just seek to one such spot and
process all the records following. Most records need never be read
from disk at all.
Will this work when the trading days and calendar days are not the
same? My 30 days is the calendar days while the 30 trading days could
mean an extra 1-2 calendar weeks.
Certainly it'll work. Once you've done your binary search to find one
of the 3 starting places, you can process the data sequentially.
If I can describe your file spec, you have a file of fixed-length
records, each with a date stamp. You have a variable number of records
per day (zero or one, the way you describe it, but that doesn't affect
our algorithm). You want to make a list of all of the records since
todays_date-N, where N is 30, 90, 180, or whatever.
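
If you did want to work on the file directly, here is a rough sketch of the
seek-and-binary-search idea. The record layout is purely my assumption, not
something from your description: RECORD_SIZE, DATE_WIDTH and the function
names are hypothetical, and I'm assuming each record starts with an ISO date
(YYYY-MM-DD) and that the file is sorted oldest first.

```python
import datetime
import io

RECORD_SIZE = 32   # hypothetical: every record is exactly 32 bytes
DATE_WIDTH = 10    # hypothetical: record starts with an ISO date, e.g. 2013-07-20


def record_date(f, index):
    """Read only the date field of record number `index`."""
    f.seek(index * RECORD_SIZE)
    return f.read(DATE_WIDTH).decode("ascii")


def first_index_on_or_after(f, target):
    """Binary search for the first record whose date is >= target.

    `target` is an ISO date string; ISO dates sort the same way as text
    and as dates, so plain string comparison is enough.
    """
    f.seek(0, io.SEEK_END)
    lo, hi = 0, f.tell() // RECORD_SIZE
    while lo < hi:
        mid = (lo + hi) // 2
        if record_date(f, mid) < target:
            lo = mid + 1
        else:
            hi = mid
    return lo


def records_since(f, today, age_days):
    """Return the raw records dated today - age_days or later."""
    target = (today - datetime.timedelta(days=age_days)).isoformat()
    start = first_index_on_or_after(f, target)
    f.seek(start * RECORD_SIZE)
    data = f.read()
    return [data[i:i + RECORD_SIZE] for i in range(0, len(data), RECORD_SIZE)]
```

Open the file in binary mode ('rb') for this. Note that the target date
doesn't have to exist in the file: if it falls on a weekend or holiday, the
search simply lands on the next record after it.
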
Since you don't have tons of data, you could load all of it into a list
ahead of time. Manipulating that list will be easier than manipulating
the file. But because the records are fixed size, the two are isomorphic.
So: read all the records into a list. (Each item in the list is a
tuple of date and data)
For a particular value of "age", create a sublist of those records newer
than age: target is today-age. Do a binary search in the list for
target. Using that index as a starting point, return a slice in the list.
Now just call that function 3 times, for your three different values of age.
Since you've got an in-memory list, it's straightforward to use
bisect.bisect_left() to do the binary search. But since your data is
small, a simple linear search isn't too much slower either. See the
following links:
http://docs.python.org/2/library/bisect.html#searching-sorted-lists
http://docs.python.org/3.3/library/bisect.html#searching-sorted-lists
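
Something along these lines is what I have in mind for the in-memory
version. It's an untested-against-your-data sketch: the name newer_than and
the sample rows are made up, and I'm assuming the list holds (date, data)
tuples sorted oldest first, with datetime.date objects for the dates.

```python
import bisect
import datetime


def newer_than(records, today, age_days):
    """Return the records dated today - age_days or later.

    records: list of (date, data) tuples, sorted by date, oldest first.
    """
    target = today - datetime.timedelta(days=age_days)
    # The 1-tuple (target,) sorts before any (target, data) tuple, so
    # bisect_left lands on the first record dated target or later,
    # whether or not target itself appears in the list.
    start = bisect.bisect_left(records, (target,))
    return records[start:]


if __name__ == "__main__":
    today = datetime.date(2013, 7, 20)
    # Made-up sample data: one row per weekday over the last 200 days.
    all_days = (today - datetime.timedelta(days=n) for n in range(200, -1, -1))
    records = [(d, "row for %s" % d) for d in all_days if d.weekday() < 5]

    for age in (30, 90, 180):
        print(age, len(newer_than(records, today, age)))
```

Because the search is on the date and not on a record count, gaps for
weekends and holidays don't matter; age is always in calendar days.
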
--
DaveA