Re: [Tutor] parsing a "chunked" text file

Christian Witts Tue, 02 Mar 2010 04:07:45 -0800

Andrew Fithian wrote:

Hi tutor,
I have a large text file that has chunks of data like this:

headerA n1
line 1
line 2
...
line n1
headerB n2
line 1
line 2
...
line n2
Where each chunk is a header and the lines that follow it (up to the next header). A header has the number of lines in the chunk as its second field.
I would like to turn this file into a dictionary like:
dict = {'headerA':[line 1, line 2, ... , line n1], 'headerB':[line 1, line 2, ... , line n2]}
Is there a way to do this with a dictionary comprehension or do I have to iterate over the file with a "while 1" loop?
-Drew
------------------------------------------------------------------------

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Something like this could work for you...

dict([(z.splitlines()[0].split()[0], z.splitlines()[1:]) for z in [x for x in open(filename).read().split('header') if x.strip()]])

{'A': ['line 1', 'line 2', '...', 'line n1'], 'B': ['line 1', 'line 2', '...', 'line n2']}

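Of course that doesn't look very pretty, and it only works for the specific case shown in your sample data: it relies on every header starting with the literal text 'header' (which is also stripped from the keys), and it ignores the line count in the header.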
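
For a more general approach, a plain loop that uses the line count from each header is probably easier to read than a comprehension. The following is only a sketch under a few assumptions: the function name parse_chunks is made up for illustration, each header is assumed to be "name count" separated by whitespace, and blank lines between chunks are assumed to be safe to skip.

```python
from itertools import islice


def parse_chunks(lines):
    """Map each header name to the list of lines in its chunk.

    `lines` is any iterable of strings, e.g. an open file object.
    Assumes each header looks like "<name> <count>".
    """
    lines = iter(lines)  # one shared iterator for headers and chunk bodies
    chunks = {}
    for header in lines:
        if not header.strip():
            continue  # skip blank lines between chunks
        name, count = header.split()[:2]
        # Pull exactly `count` lines off the same iterator, so the
        # next pass of the for loop lands on the next header.
        chunks[name] = [line.rstrip("\n") for line in islice(lines, int(count))]
    return chunks


sample = ["headerA 2\n", "line 1\n", "line 2\n", "headerB 1\n", "line 1\n"]
print(parse_chunks(sample))
```

With a real file you would pass the open file object directly, e.g. parse_chunks(open(filename)). Because it reads the count rather than searching for the word 'header', the headers can be called anything, and the file is read one line at a time instead of all at once.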


--
Kind Regards,
Christian Witts
Business Intelligence


