On Feb 16, 3:48 pm, Imaginationworks <xiaju...@gmail.com> wrote:
> Hi,
>
> I am trying to read object information from a text file (approx.
> 30,000 lines) with the following format, each line corresponds to a
> line in the text file. Currently, the whole file was read into a
> string list using readlines(), then use for loop to search the "= {"
> and "};" to determine the Object, SubObject,and SubSubObject. My
> questions are
>
> 1) Is there any efficient method that I can search the whole string
> list to find the location of the tokens(such as '= {' or '};'
>
> 2) Is there any efficient ways to extract the object information you
> may suggest?
Parse it! Go full-bore with a real parser. You may want to consider one of the many fine Pythonic implementations of modern parsers, or break out more traditional parsing tools.

This format is nested, which means regexes on their own can't parse what you want out of it: a regular expression can't keep track of arbitrarily deep "= {" and "};" pairs. You're going to need a real, full-bore, no-holds-barred parser for this.

Don't worry, the road is not easy but the destination is absolutely worth it. Once you come to appreciate and understand parsing, you have earned the right to call yourself a red-belt programmer. To get your black belt, you'll need to write your own compiler. Having mastered these two tasks, there is no problem you cannot tackle.

And once you realize that every program is really a compiler, then you have truly mastered the Zen of Programming in Any Programming Language That Will Ever Exist. With this understanding, you will judge programming language utility based solely on how hard it is to write a compiler in it, and complexity based on how hard it is to write a compiler for it. (Notice that quite a few parsers are written in Python, and that Jython, PyPy, and others are written for Python!)
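To make the "real parser" suggestion concrete, here is a minimal recursive-descent sketch. The sample file from the original post did not survive in this thread, so the line shapes are assumptions: "Name = {" opens a block, "};" closes it, and "key = value;" is a plain attribute. The names parse, parse_block, OPEN, ATTR and CLOSE are made up for illustration; adjust the patterns to the real format.

```python
import re

# Assumed line shapes (hypothetical; the original sample was not preserved):
#   "Name = {"      opens a nested block
#   "};"            closes the current block
#   "key = value;"  sets a plain attribute
OPEN = re.compile(r"^(\w+)\s*=\s*\{$")
ATTR = re.compile(r"^(\w+)\s*=\s*(.*?);$")
CLOSE = "};"


def parse_block(lines, nested=True):
    """Consume lines from a shared iterator and return one block as a dict.

    When nested is True, parsing stops at the matching "};".
    When nested is False (top level), parsing stops at end of input.
    """
    block = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line == CLOSE:
            if not nested:
                raise ValueError("unexpected '};' at top level")
            return block
        match = OPEN.match(line)
        if match:
            # Recurse: the child call consumes lines up to its own "};",
            # then this loop resumes with whatever follows.
            block[match.group(1)] = parse_block(lines)
            continue
        match = ATTR.match(line)
        if match:
            block[match.group(1)] = match.group(2).strip()
            continue
        raise ValueError("cannot parse line: %r" % line)
    if nested:
        raise ValueError("input ended before a closing '};'")
    return block


def parse(text_lines):
    """Parse an iterable of lines (e.g. an open file) into nested dicts."""
    return parse_block(iter(text_lines), nested=False)
```

Usage would look something like tree = parse(open("objects.txt")), where the file name is a placeholder. Each line is visited once, so 30,000 lines is not a problem, and the nesting is handled by the call stack instead of by searching the list for tokens. The regexes here only classify a single line; the recursion is what tracks the nesting.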