i wonder what fraction of people posting with "bug?" in their titles here actually find bugs?
anyway, how about:

  re.findall('[A-Z]?[a-z]*', 'fooBarBaz')

or

  re.findall('([A-Z][a-z]*|[a-z]+)', 'fooBarBaz')

(you have to specify what you're matching and lookahead/back doesn't do that).

andrew

Ron Garret wrote:
> I'm trying to split a CamelCase string into its constituent components.
> This kind of works:
>
> >>> re.split('[a-z][A-Z]', 'fooBarBaz')
> ['fo', 'a', 'az']
>
> but it consumes the boundary characters. To fix this I tried using
> lookahead and lookbehind patterns instead, but it doesn't work:
>
> >>> re.split('((?<=[a-z])(?=[A-Z]))', 'fooBarBaz')
> ['fooBarBaz']
>
> However, it does seem to work with findall:
>
> >>> re.findall('(?<=[a-z])(?=[A-Z])', 'fooBarBaz')
> ['', '']
>
> So the regular expression seems to be doing the Right Thing. Is this a
> bug in re.split, or am I missing something?
>
> (BTW, I tried looking at the source code for the re module, but I could
> not find the relevant code. re.split calls sre_compile.compile().split,
> but the string 'split' does not appear in sre_compile.py. So where does
> this method come from?)
>
> I'm using Python2.5.
>
> Thanks,
> rg
> --
> http://mail.python.org/mailman/listinfo/python-list
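
For reference, here is a small sketch of the two findall patterns suggested at the top of this reply, run side by side. The helper name split_camel is made up for illustration and is not part of the re module. The first pattern can match the empty string, so it leaves a trailing '' that the second avoids.

```python
import re

def split_camel(name):
    # Hypothetical helper: each piece is either a capitalised word
    # or a run of lowercase letters (e.g. the leading 'foo').
    return re.findall(r'[A-Z][a-z]*|[a-z]+', name)

# The alternation form returns exactly the three pieces.
print(split_camel('fooBarBaz'))  # ['foo', 'Bar', 'Baz']

# The optional-capital form also matches the empty string at the end
# of the input, so the result carries a trailing ''.
print(re.findall(r'[A-Z]?[a-z]*', 'fooBarBaz'))  # ['foo', 'Bar', 'Baz', '']
```

On the re.split question itself: in Python 2.5, re.split does not split on empty matches, which is why the lookaround pattern returns the string whole. Since Python 3.7, re.split does split on them, so the same lookaround pattern without the capturing group works there.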