[Tutor] splits and pops

Eric Abrahamsen Sat, 12 Jul 2008 05:56:07 -0700

I have a horribly stupid text parsing problem that is driving me crazy, and making me think my Python skills have a long, long way to go...

What I've got is a poorly-thought-out SQL dump, in the form of a text file, where each record is separated by a newline, and each field in each record is separated by a tab. BUT, and this is what sinks me, there are also newlines within some of the fields. Newlines are not 'safe' – they could appear anywhere – but tabs are 'safe' – they only appear as field delimiters.

There are nine fields per record. All I can think to do is read the file in as a string, then split on tabs. That gives me a list where every eighth item is a string like this: u'last-field\nfirst-field'. Now I want to iterate through the list of strings, taking every eighth item, splitting it on '\n', and replacing it with the two resulting strings. Then I'll have the proper flat list where every nine list items constitute one complete record, and I'm good to go from there.

I've been fooling around with variations on the following (assuming splitlist = fullstring.split('\t')):

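import sys

# Python 2: visit indices 8, 16, 24, ... until the list runs out.
for x in xrange(8, sys.maxint, 8):
    try:
        # Remove item x, split it on newlines, and insert the pieces at x.
        splitlist[x:x] = splitlist.pop(x).split('\n')
    except IndexError:
        break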

The first line correctly steps over all the list items that need to be split, but I can't come up with a line that correctly replaces those list items with the two strings I want. Either the cycle goes off and splits the wrong strings, or I get nested list items, which is not what I want. Can someone please point me in the right direction here?
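
One possible direction, offered only as a sketch: the merged items sit at indices 8, 16, 24, ... in the freshly split list, but each in-place replacement makes the list one item longer, so the next merged item ends up nine positions further on (8, 17, 26, ...) rather than eight. The names flatten_fields, group_records and fields_per_record below are made up for illustration, and the code is written for Python 3 rather than the Python 2 used above. It also assumes that the first field of every record never contains a newline (for example, because it is an ID), so that the last newline in a merged item is the record separator; if that does not hold for the real data, the split point cannot be recovered this way.

```python
def flatten_fields(fullstring, fields_per_record=9):
    """Return a flat list of fields from a tab-delimited dump.

    Assumption: the first field of a record never contains a newline,
    so the LAST newline in a merged item is the record separator.
    """
    splitlist = fullstring.split('\t')
    x = fields_per_record - 1  # index of the first 'last-field\nfirst-field' item
    # Stop before the final item: it is the last record's last field,
    # not a merged pair, and is left untouched.
    while x < len(splitlist) - 1:
        # Replace the one merged item with its two halves, in place.
        # rsplit with maxsplit=1 cuts only at the final newline, so
        # newlines inside the last field are preserved.
        splitlist[x:x + 1] = splitlist[x].rsplit('\n', 1)
        # The list just grew by one, so the next merged item is
        # fields_per_record positions ahead, not fields_per_record - 1.
        x += fields_per_record
    return splitlist


def group_records(flat, fields_per_record=9):
    """Chunk the flat field list into one list per record."""
    return [flat[i:i + fields_per_record]
            for i in range(0, len(flat), fields_per_record)]
```

Assigning to the slice splitlist[x:x + 1] swaps a single item for the two strings without nesting a list inside the list, which is the nested-list problem described above.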


Thanks,
Eric
_______________________________________________
Tutor maillist  -  [email protected]
http://mail.python.org/mailman/listinfo/tutor
