I think I'm having a major understanding failure.

So, having discovered that my Unix sort breaks on the last day of the
month, I've gone ahead and implemented a per-log sort, using heapq.

I've tested it with various data, and it produces a sorted logfile, per log.

So in essence this:

logs = [ LogFile( "/home/stephen/qa/ded1353/quick_log.gz", "04/Nov/2009" ),
         LogFile( "/home/stephen/qa/ded1408/quick_log.gz", "04/Nov/2009" ),
         LogFile( "/home/stephen/qa/ded1409/quick_log.gz", "04/Nov/2009" ) ]

Gives me a list of LogFiles, each of which has a getline() method
that returns a (timestamp, line) tuple.

I thought I could merge iterables using Kent's recipe, or just with
heapq.merge().

But how do I get from a method that produces one tuple per call to
some mergeable iterables?

for log in logs:
  l = log.getline()
  print l

This gives me three log lines, one per file.  How do I get more, other
than with a while True: loop?

Of course tuples are iterables, but that doesn't help, as I want to
sort on timestamp... so a list of tuples would be ok....  But how do I
construct that, bearing in mind I am trying not to use up too much
memory?
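
To make the question concrete, here is a minimal sketch of the shape I
think I'm after. I'm not sure it's the right approach. FakeLog is a
made-up stand-in for my LogFile class, and it assumes getline() returns
None once the log is exhausted, which my real getline() does not do.
The helper names log_entries and merge_logs are invented for this
example too.

```python
import heapq


class FakeLog(object):
    """Hypothetical stand-in for LogFile: hands out (stamp, line) tuples in order."""

    def __init__(self, entries):
        self._entries = list(entries)
        self._pos = 0

    def getline(self):
        # Assumption: return None at end of log (my real getline() doesn't).
        if self._pos >= len(self._entries):
            return None
        entry = self._entries[self._pos]
        self._pos += 1
        return entry


def log_entries(log):
    """Return an iterator over the tuples produced by log.getline()."""
    # iter(callable, sentinel) keeps calling getline() until it returns None.
    return iter(log.getline, None)


def merge_logs(logs):
    """Lazily merge several already-sorted logs, ordered by (stamp, line)."""
    return heapq.merge(*[log_entries(log) for log in logs])


logs = [
    FakeLog([("04/Nov/2009:00:00:01", "a1"), ("04/Nov/2009:00:00:05", "a2")]),
    FakeLog([("04/Nov/2009:00:00:02", "b1"), ("04/Nov/2009:00:00:04", "b2")]),
    FakeLog([("04/Nov/2009:00:00:03", "c1")]),
]

# list() is only here to show the result; looping over merge_logs(logs)
# directly would keep just one pending entry per log in memory.
merged = list(merge_logs(logs))
```

If something like this is sound, only one pending tuple per log would
be held at a time, which is the memory behaviour I'm hoping for.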

I think there's a piece of the jigsaw I just don't get.  Please help!

The code in full is here:

import gzip, heapq, re

class LogFile:
   def __init__(self, filename, date):
       self.logfile = gzip.open(filename, 'r')
       for logline in self.logfile:
           self.line = logline
           self.stamp = self.timestamp(self.line)
           if self.stamp.startswith(date):
               break
       self.initialise_heap()

   def timestamp(self, line):
       stamp = re.search(r'\[(.*?)\]', line).group(1)
       return stamp

   def initialise_heap(self):
       # Prime a 10-entry look-ahead heap of (stamp, line) tuples.
       initlist=[]
       for x in xrange(10):
           self.line=self.logfile.readline()
           self.stamp=self.timestamp(self.line)
           initlist.append((self.stamp,self.line))
       heapq.heapify(initlist)
       self.heap=initlist


   def getline(self):
       # Push the next line from the file, then pop the smallest
       # (stamp, line) tuple currently in the heap.
       self.line=self.logfile.readline()
       stamp=self.timestamp(self.line)
       heapq.heappush(self.heap, (stamp, self.line))
       pop = heapq.heappop(self.heap)
       return pop

logs = [ LogFile( "/home/stephen/qa/ded1353/quick_log.gz", "04/Nov/2009" ),
         LogFile( "/home/stephen/qa/ded1408/quick_log.gz", "04/Nov/2009" ),
         LogFile( "/home/stephen/qa/ded1409/quick_log.gz", "04/Nov/2009" ) ]