On Aug 7, 2:21 am, Steve Holden <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > Hello everybody, I'm new to python (...I work with cobol...)
>
> > I have to parse a file (that is a dbIII file) whose structure looks like
> > this:
> > |string|, |string|, |string that may contain commas inside|, 1, 2, 3, |other string|
>
As Steve mentioned pyparsing, here is a pyparsing version for cracking
your data:

from pyparsing import Word, nums, QuotedString, delimitedList

data = ("|string|, |string|, |string that may contain commas inside|, "
        "1, 2, 3, |other string|")

integer = Word(nums)
# change unquoteResults to True to omit '|' chars from results
string = QuotedString("|", unquoteResults=False)
itemList = delimitedList( integer | string )

# parse the data and print out the results as a simple list
print(itemList.parseString(data).asList())

# add a parse action to convert integer strings to actual integers
integer.setParseAction(lambda t:int(t[0]))

# reparse the data and now get converted integers in results
print(itemList.parseString(data).asList())

Prints:

['|string|', '|string|', '|string that may contain commas inside|',
'1', '2', '3', '|other string|']
['|string|', '|string|', '|string that may contain commas inside|', 1,
2, 3, '|other string|']
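
For comparison, here is a minimal sketch of the unquoteResults=True variant mentioned in the code comment, with the integer conversion attached up front. The sample row is made up for illustration (it is not from the original dbIII file) and is chosen so that one field really does contain a comma; the names sample, quoted and item_list are likewise only illustrative.

```python
from pyparsing import Word, nums, QuotedString, delimitedList

# Hypothetical sample row (an assumption, not data from the original file);
# the first field has a comma inside the |...| quotes.
sample = "|Smith, John|, |Rome|, 42, |other string|"

# Integers are converted to int by the parse action.
integer = Word(nums).setParseAction(lambda t: int(t[0]))
# unquoteResults=True drops the surrounding '|' chars from each string.
quoted = QuotedString("|", unquoteResults=True)
item_list = delimitedList(integer | quoted)

result = item_list.parseString(sample).asList()
print(result)
```

This should print ['Smith, John', 'Rome', 42, 'other string']: the comma inside the first field is kept because the quoted string is matched as a whole before delimitedList looks for the next separator.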

-- Paul

-- 
http://mail.python.org/mailman/listinfo/python-list
