Re: [Tutor] how to parse a multiple character words from plaintext

Kent Johnson Sat, 23 Feb 2008 03:45:36 -0800

John Gunderman wrote:
> I am looking to parse a plaintext from a document. However, I am 
> confused about the actual methodology of it. This is because some of the 
> words will be multiple digits or characters. However, I don't know the 
> length of the words before the parse. Is there a way to somehow have 
> open() grab something until it sees a /t or ' '? I was thinking I could 
> have it count ahead the number of spaces till the stopping point and 
> then parse till that point using read(), but that seems sort of 
> inefficient. Is there a better way to pull this off? Thanks in advance.


How big is the file? Can you just read the whole document into a string 
and parse that? Or read it line by line?
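
Either way you don't need to know the word lengths ahead of time. Here is 
a minimal sketch, assuming the words are separated by spaces, tabs or 
newlines. The function names and the sample text are made up for 
illustration; with a real file you would pass open(path).read() to the 
first function or the open file object to the second.

```python
import io

def words_from_text(text):
    # str.split() with no arguments splits on any run of whitespace
    # (spaces, tabs, newlines), so word length never has to be known.
    return text.split()

def words_by_line(lines):
    # Accepts any iterable of lines, including an open file object,
    # so the whole file does not have to be held in memory at once.
    for line in lines:
        for word in line.split():
            yield word

# Hypothetical sample data standing in for the contents of a file.
sample = "alpha 42\tbeta\n7 gamma  delta\n"

print(words_from_text(sample))
print(list(words_by_line(io.StringIO(sample))))
# both print ['alpha', '42', 'beta', '7', 'gamma', 'delta']
```

The line-by-line version is the one to reach for if the file is large.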

Depending on how complex your parsing is, you might want to use 
pyparsing or one of the other Python parser libraries.
http://pyparsing.wikispaces.com/
http://nedbatchelder.com/text/python-parsers.html
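
If plain whitespace splitting is not quite enough but a parser library 
seems like overkill, the standard re module sits somewhere in between. 
This is only a rough sketch and it is not pyparsing. The pattern, the 
token names and the sample string are my own assumptions about what your 
data might look like.

```python
import re

# Hypothetical token pattern: a run of digits, or a word that starts
# with a letter or underscore. Adjust it to match your real data.
TOKEN = re.compile(r"(?P<number>\d+)|(?P<word>[A-Za-z_]\w*)")

def tokenize(text):
    # finditer() scans left to right and skips anything that matches
    # neither alternative, such as spaces and tabs.
    for match in TOKEN.finditer(text):
        kind = match.lastgroup  # name of the group that matched
        value = match.group()
        yield (kind, int(value) if kind == "number" else value)

for token in tokenize("width 120\theight 45 x2"):
    print(token)
```

Each token comes back as a (kind, value) pair, with numbers already 
converted to int.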

Kent