trailingPattern = r'(\S*)\ +?\n'
line = re.sub(trailingPattern, r'\1\n', line)
What happens with this?
trailingPattern = r'\s+$'
line = re.sub(trailingPattern, '', line)
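One catch: \s also matches \n, so the greedy \s+ snarfs the trailing \n along
with the spaces, and $ then matches at the very end of the string. If the \n
must survive, '[ \t]+$' should do it, since $ also matches just before a final
\n. I'm guessing that the lack of a \1 replacer will help the sub work faster
with less internal string shuffling.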
line = line.rstrip()
is probably faster still (it drops the \n too), but there might be a technical
reason to avoid it.
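A quick way to check what each variant does with the newline, as a rough sketch. The function names and the sample string are made up for illustration; only the patterns come from this thread, plus a '[ \t]+$' variant that is my own assumption about what "keep the \n" would need.

```python
import re

def strip_original(line):
    # Pattern from the earlier post: a lazy run of spaces right before \n.
    return re.sub(r'(\S*)\ +?\n', r'\1\n', line)

def strip_greedy(line):
    # \s matches \n as well, so the newline is removed along with the spaces.
    return re.sub(r'\s+$', '', line)

def strip_blanks_only(line):
    # Only spaces and tabs; $ also matches just before a final \n.
    return re.sub(r'[ \t]+$', '', line)

sample = 'spam   \n'  # made-up input line
for func in (strip_original, strip_greedy, strip_blanks_only, str.rstrip):
    print(func.__name__, repr(func(sample)))
```

The first and third variants leave 'spam\n'; the \s+$ version and rstrip() both leave 'spam' with no newline, so they are only interchangeable if losing the \n is acceptable.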
But these uncertainties are why I write unit tests, including tests for the edge
cases. (What if it's a \r\n? What if the \n is missing? etc.) That way I don't
need to memorize re's exact behavior, and if I find a reason to swap in a
.rstrip(), I can pass all the tests and make sure the substitution works the same.
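For example, a minimal sketch of such tests. strip_trailing is a hypothetical helper, and I am assuming the goal is to drop trailing spaces and tabs while leaving whatever line ending is there; adjust the expected values if the real requirement differs.

```python
import re
import unittest

def strip_trailing(line):
    # Hypothetical helper: remove spaces/tabs that sit right before the
    # line ending (\n or \r\n) or before the end of the string.
    return re.sub(r'[ \t]+(?=\r?\n|$)', '', line)

class TrailingWhitespaceTest(unittest.TestCase):
    def test_unix_newline(self):
        self.assertEqual(strip_trailing('spam  \n'), 'spam\n')

    def test_windows_newline(self):
        self.assertEqual(strip_trailing('spam  \r\n'), 'spam\r\n')

    def test_missing_newline(self):
        self.assertEqual(strip_trailing('spam  '), 'spam')

    def test_interior_spaces_survive(self):
        self.assertEqual(strip_trailing('spam  eggs\n'), 'spam  eggs\n')

    def test_nothing_to_strip(self):
        self.assertEqual(strip_trailing('spam\n'), 'spam\n')
```

Run it with python -m unittest. If the body of strip_trailing later changes to some .rstrip() variant, the same tests say whether the behavior still matches.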
--
Phlip
http://c2.com/cgi/wiki?ZeekLand