On 2015-09-30 11:34, massi_...@msn.com wrote:
> firstly the description of my problem. I have a string in the
> following form:
> s = "name1 name2(1) name3 name4 (1, 4) name5(2) ..."
> that is a string made up of groups in the form 'name' (letters
> only) plus possibly a tuple containing 1 or 2 integer values.
> Blanks can be placed between names and tuples or not, but they
> surely are placed between two groups. I would like to process this
> string in order to get a dictionary like this:
> d = {
>     "name1":(0, 0),
>     "name2":(1, 0),
>     "name3":(0, 0),
>     "name4":(1, 4),
>     "name5":(2, 0),
> }
> I guess this problem can be tackled with regular expressions, b

First out of the gate, I suggest you follow Emile's advice and try
using string expressions.  However, if you *want* to do it with
regular expressions, you can.  It's ugly and might be fragile, but
here's one way:

import re
s = "name1 name2(1) name3 name4 (1, 4) name5(2) ..."
r = re.compile(r"""
    \b       # start at a word boundary
    (\w+)    # capture the word
    \s*      # optional whitespace
    (?:      # start an optional grouping for things in the parens
     \(      # a literal open-paren
      \s*    # optional whitespace
      (\d+)  # capture the number in those parens
      (?:    # start a second optional grouping for the stuff after a comma
       \s*   # optional whitespace
       ,     # a literal comma
       \s*   # optional whitespace
       (\d+) # the second number
      )?     # make the comma and following number optional
     \)      # a literal close-paren
    )?       # make that stuff in parens optional
    """, re.X)
d = {}
for m in r.finditer(s):
    a, b, c = m.groups()
    d[a] = (int(b or 0), int(c or 0))

from pprint import pprint
pprint(d)

I'd stick with the commented version of the regexp if you were to use
this anywhere so that others can follow what you're doing.
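
For comparison, here is a rough sketch of what the same pattern looks
like once the comments and layout are stripped out.  The names COMPACT
and parse_groups are made up for this illustration, and the sample
string drops the trailing "..." from your example:

```python
import re

# The verbose pattern above, minus the comments and whitespace.
# Illustration only: same groups, same behaviour, much harder to read.
COMPACT = re.compile(r"\b(\w+)\s*(?:\(\s*(\d+)(?:\s*,\s*(\d+))?\))?")

def parse_groups(text):
    """Return {name: (first, second)}; missing numbers default to 0."""
    result = {}
    for m in COMPACT.finditer(text):
        name, first, second = m.groups()
        result[name] = (int(first or 0), int(second or 0))
    return result

print(parse_groups("name1 name2(1) name3 name4 (1, 4) name5(2)"))
# {'name1': (0, 0), 'name2': (1, 0), 'name3': (0, 0), 'name4': (1, 4), 'name5': (2, 0)}
```

As for the fragile part: neither form allows whitespace before the
closing paren, so input like "name6( 3 )" falls through and the bare
"3" gets picked up as a name of its own.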
