[email protected] writes:
> I have a string like:
> {'the','dog\'s','bite'}
> or maybe:
> {'the'}
> or sometimes:
> {}
>
> [FYI: this is postgresql database "array" field output format]
>
> which I'm trying to parse with the re module.
> A single quoted string would, I think, be:
> r"\{'([^']|\\')*'\}"
What about {'dog \\', ...}? There the backslash before the closing quote is
itself escaped, so that quote really does end the string.
If you don't need to validate anything, you can just forget about the commas
etc. and extract all the 'strings' with findall.
The regexp below is a bit too complicated (adapted from something else), but I
think it will work:
In [90]: rex = re.compile(r"'(?:[^\n]|(?<!\\)(?:\\)(?:\\\\)*\n)*?(?<!\\)(?:\\\\)*?'")
In [91]: rex.findall(r"{'the','dog\'s','bite'}")
Out[91]: ["'the'", "'dog\\'s'", "'bite'"]
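For comparison, here is a shorter pattern that I believe does the same job for this input. It is only a sketch, not the regexp above: it assumes that every backslash escapes exactly one following character, and the names ITEM, extract_items and unescape are made up for the example.

```python
import re

# Assumption: an item is a single-quoted run of either ordinary characters
# (anything but a quote or a backslash) or a backslash plus one character.
ITEM = re.compile(r"'((?:[^'\\]|\\.)*)'", re.DOTALL)


def extract_items(text):
    """Return the contents of every quoted item, escapes left intact."""
    return ITEM.findall(text)


def unescape(item):
    """Drop the backslash in front of each escaped character."""
    return re.sub(r"\\(.)", r"\1", item, flags=re.DOTALL)


items = extract_items(r"{'the','dog\'s','bite'}")
print(items)                         # ['the', "dog\\'s", 'bite']
print([unescape(s) for s in items])  # ['the', "dog's", 'bite']
```

Because the two alternatives can never match the same character, an item ending in an escaped backslash such as {'dog \\'} is closed at the right quote.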
Otherwise (if you do want to validate), just add something like ",|}$" after
each string, so that the last one can be followed by the final } instead of a
comma.
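If validation matters, one way to do it is to match the whole string in one go rather than tacking ",|}$" onto each item. This is a different technique from the one suggested above, shown only as a sketch: it assumes Python 3 (for fullmatch) and the same one-character escape rule, and ITEM, ARRAY and is_valid_array are hypothetical names.

```python
import re

# Assumption: same item syntax as before, a quoted run of ordinary
# characters or backslash-escaped characters.
ITEM = r"'(?:[^'\\]|\\.)*'"

# Either an empty {} or one item followed by any number of ",item".
ARRAY = re.compile(r"\{(?:%s(?:,%s)*)?\}" % (ITEM, ITEM), re.DOTALL)


def is_valid_array(text):
    """True if text is exactly a brace-wrapped, comma-separated item list."""
    return ARRAY.fullmatch(text) is not None


print(is_valid_array(r"{'the','dog\'s','bite'}"))  # True
print(is_valid_array("{'the',}"))                  # False
```

Once the string has passed this check, findall can pull the items out without worrying about stray commas or a missing brace.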
Alternatively, you could write a regexp to split on the "','" bit and trim
the leftover {' and '} from the first and the last pieces.
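A rough sketch of the split idea, assuming the input is always well formed and that no item contains the three characters ',' itself (if one does, this gives the wrong answer). The name split_items is made up for the example.

```python
import re


def split_items(text):
    """Split a {'a','b'} style string on the ',' separators."""
    text = text.strip()
    if text == "{}":
        return []
    # Trim the leading {' and the trailing '} before splitting.
    inner = text[2:-2]
    return re.split(r"','", inner)


print(split_items(r"{'the','dog\'s','bite'}"))  # ['the', "dog\\'s", 'bite']
print(split_items("{'the'}"))                   # ['the']
```

The items still carry their backslash escapes, so they would need unescaping afterwards.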
'as