On Thu, 03 Mar 2005 09:54:02 -0700, Steven Bethard <[EMAIL PROTECTED]> wrote:
A possible solution, using the re module:
py> s = """\
... Gibberish
... 53
... MoreGarbage
... 12
... RelevantInfo1
... 10/10/04
... NothingImportant
... ThisDoesNotMatter
... 44
... RelevantInfo2
... 22
... BlahBlah
... 343
... RelevantInfo3
... 23
... Hubris
... Crap
... 34
... """
py> import re
py> m = re.compile(r"""^RelevantInfo1\n([^\n]*)
...                    .*
...                    ^RelevantInfo2\n([^\n]*)
...                    .*
...                    ^RelevantInfo3\n([^\n]*)""",
...                re.DOTALL | re.MULTILINE | re.VERBOSE)
py> score = {}
py> for info1, info2, info3 in m.findall(s):
...     score.setdefault(info1, {})[info3] = info2
...
py> score
{'10/10/04': {'23': '22'}}
Note that I use DOTALL to allow .* to cross line boundaries, MULTILINE to have ^ apply at the start of each line, and VERBOSE to allow me to write the re in a more readable form.
If I didn't get your dict update quite right, hopefully you can see how to fix it!
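To see what each of the three flags described above contributes, here is a minimal sketch that switches them on one at a time. The sample string `text` is made up for illustration and is much shorter than the real data; only the header names come from the example above.

```python
import re

# Made-up sample data: two headers, each followed by its value on the next line.
text = "junk\nRelevantInfo1\n10/10/04\nmore junk\nRelevantInfo2\n22\n"

# Without MULTILINE, ^ matches only at the very start of the string,
# so a header on the second line is not found.
print(re.search(r"^RelevantInfo1", text))                    # None
print(re.search(r"^RelevantInfo1", text, re.MULTILINE) is not None)  # True

# Without DOTALL, . does not match a newline, so .* cannot get from the
# end of one record line to the header that starts a later line.
pattern = r"^RelevantInfo1\n([^\n]*).*^RelevantInfo2\n([^\n]*)"
no_dotall = re.search(pattern, text, re.MULTILINE)
with_dotall = re.search(pattern, text, re.MULTILINE | re.DOTALL)
print(no_dotall)             # None
print(with_dotall.groups())  # ('10/10/04', '22')

# VERBOSE ignores whitespace and newlines inside the pattern, so the same
# expression can be laid out one piece per line. The \n escapes still work
# because the pattern is a raw string.
verbose = re.compile(r"""^RelevantInfo1\n([^\n]*)
                         .*
                         ^RelevantInfo2\n([^\n]*)""",
                     re.MULTILINE | re.DOTALL | re.VERBOSE)
verbose_match = verbose.search(text)
print(verbose_match.groups())  # ('10/10/04', '22')
```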
Thanks! That was very helpful. Unfortunately, I wasn't completely clear when describing the problem. Is there any way to extract multiple scores from the same file, and from multiple files?
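I think if you use the non-greedy .*? instead of the greedy .*, you'll get this behavior. The greedy .* grabs as much as it can, so a single match stretches from the first RelevantInfo1 to the last RelevantInfo3 and findall returns only one tuple; .*? stops at the nearest following header instead. For example: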
py> s = """\
... Gibberish
... 53
... MoreGarbage
[snip a whole bunch of stuff]
... RelevantInfo3
... 60
... Lalala
... """
py> import re
py> m = re.compile(r"""^RelevantInfo1\n([^\n]*)
... .*?
... ^RelevantInfo2\n([^\n]*)
... .*?
... ^RelevantInfo3\n([^\n]*)""",
... re.DOTALL | re.MULTILINE | re.VERBOSE)
py> score = {}
py> for info1, info2, info3 in m.findall(s):
...     score.setdefault(info1, {})[info3] = info2
...
py> score
{'10/10/04': {'44': '33', '23': '22'}, '10/11/04': {'60': '45'}}

If you might have multiple info2 values for the same (info1, info3) pair, you can try something like:
py> score = {}
py> for info1, info2, info3 in m.findall(s):
...     score.setdefault(info1, {}).setdefault(info3, []).append(info2)
...
py> score
{'10/10/04': {'44': ['33'], '23': ['22']}, '10/11/04': {'60': ['45']}}

HTH,
STeVe
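
On the "multiple files" half of the question: a minimal sketch, assuming each file can be read into memory as one string and that records never span two files. The names `collect_scores` and `read_files` are hypothetical helpers, not anything from the re module, and the two sample strings stand in for file contents.

```python
import re

# Same pattern as in the session above, with non-greedy .*? so that each
# RelevantInfo1/2/3 group is matched as its own record.
RECORD = re.compile(r"""^RelevantInfo1\n([^\n]*)
                        .*?
                        ^RelevantInfo2\n([^\n]*)
                        .*?
                        ^RelevantInfo3\n([^\n]*)""",
                    re.DOTALL | re.MULTILINE | re.VERBOSE)


def collect_scores(texts):
    """Merge the records found in several strings into one nested dict.

    Hypothetical helper: info2 values are collected in a list so that
    repeated (info1, info3) pairs are kept rather than overwritten.
    """
    score = {}
    for text in texts:
        for info1, info2, info3 in RECORD.findall(text):
            score.setdefault(info1, {}).setdefault(info3, []).append(info2)
    return score


def read_files(paths):
    """Yield the full contents of each file (hypothetical helper)."""
    for path in paths:
        with open(path) as f:
            yield f.read()


# Made-up stand-ins for the contents of two files.
file_a = ("Gibberish\nRelevantInfo1\n10/10/04\nNoise\nRelevantInfo2\n22\n"
          "Noise\nRelevantInfo3\n23\nNoise\n"
          "RelevantInfo1\n10/10/04\nRelevantInfo2\n33\nRelevantInfo3\n44\n")
file_b = "RelevantInfo1\n10/11/04\nRelevantInfo2\n45\nNoise\nRelevantInfo3\n60\n"

scores = collect_scores([file_a, file_b])
print(scores)
```

With real files, `collect_scores(read_files(paths))` would do the same thing, where `paths` is whatever list of file names you have.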
