Tim Chase wrote:
>>>> this didn't work elegantly as expected:
>>>>
>>>> >>> ss
>>>> 'owi\nweoifj\nfheu\n'
>>>> >>> re.split(r'(?m)$',ss)
>>>> ['owi\nweoifj\nfheu\n']
>>> Do you have a need to use a regexp?
>> I'd like the general case - split without consumption.
>
> I'm not sure there's a one-pass regex solution to the problem
> using Python's regex engine.  If pre-processing was allowed, one
> could do it.
>
I only found a partial solution, using inverse logic with findall:

>>> re.findall(r'(?s).*?(?:\n|$)','owi\nweoifj\nfheu\nxx')
['owi\n', 'weoifj\n', 'fheu\n', 'xx', '']
>>> re.findall(r'(?s).*?(?:\n|$)','owi\nweoifj\nfheu\n')
['owi\n', 'weoifj\n', 'fheu\n', '']
>>>

but it's also wrong regarding partial last lines: it always appends an
empty string at the end.

re.split obviously doesn't split on empty matches such as \A \Z ^ $
and also \b etc.

>>> re.split(r'\b(?=\n)','owi\nweoifj\nfheu\n\nxx')
['owi\nweoifj\nfheu\n\nxx']

>>> >>> ss.splitlines(True)
>>> ['owi\n', 'weoifj\n', 'fheu\n']
>>>
>> thanks. Yet this does not work "naturally" consistent in my line
>> processing algorithm - the further buffering. Compare e.g.
>> ss.split('\n')  ..
>
> well, one can do
>
>   >>> [line + '\n' for line in ss.splitlines()]
>   ['owi\n', 'weoifj\n', 'fheu\n']
>   >>> [line + '\n' for line in (ss+'xxx').splitlines()]
>   ['owi\n', 'weoifj\n', 'fheu\n', 'xxx\n']
>
> as another try for your edge case.  It's understandable and
> natural-looking
>

That is nice for some display purposes, but "wrong" regarding a general
logic. The 'xxx' is not a complete line in the general case. It's an
(open) part and should appear as such, without an added '\n'.

Robert
-- 
http://mail.python.org/mailman/listinfo/python-list
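
For reference, a minimal sketch of one way to get a "split without
consumption" by cutting at match end positions with re.finditer instead
of using re.split. The helper name split_keep is hypothetical (it is not
part of the re module), and the sketch assumes that pieces should end
right after each match and that an unterminated tail is returned as-is:

```python
import re

def split_keep(pattern, text):
    """Cut text after each match of pattern, consuming nothing.

    Hypothetical helper, not part of the re module. Zero-width
    patterns such as r'(?<=\n)' work as well as r'\n'.
    """
    pieces = []
    last = 0
    for m in re.finditer(pattern, text):
        cut = m.end()
        if cut > last:          # skip cuts that would yield an empty piece
            pieces.append(text[last:cut])
            last = cut
    if last < len(text):        # open (partial) last part, kept unchanged
        pieces.append(text[last:])
    return pieces

print(split_keep(r'\n', 'owi\nweoifj\nfheu\nxx'))
# ['owi\n', 'weoifj\n', 'fheu\n', 'xx']
print(split_keep(r'(?<=\n)', 'owi\nweoifj\nfheu\n'))
# ['owi\n', 'weoifj\n', 'fheu\n']
```

The partial last line 'xx' stays open (no '\n' is added), and no empty
string is appended at the end. Note that since Python 3.7 re.split
itself also splits on empty matches, so the re.split results shown above
apply to older versions.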