Re: splitting by double newline

Peter Otten Mon, 07 Feb 2011 10:28:13 -0800

Nikola Skoric wrote:

> Hello everybody,
> 
> I'd like to split a file by double newlines, but portably. Now,
> splitting by one or more newlines is relatively easy:
> 
> self.tables = re.split("[\r\n]+", bulk)
> 
> But, how can I split on double newlines? I tried several approaches,
> but none worked...


If you open the file in universal newline mode with

with open(filename, "U") as f:  # on Python 3 omit "U"; text mode already translates newlines
    bulk = f.read()

your data will only contain "\n". You can then split with 

blocks = bulk.split("\n\n") # exactly one empty line

or 

blocks = re.compile(r"\n{2,}").split(bulk) # one or more empty lines
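
To make the difference between the two splits concrete, here is a rough sketch (not part of the original exchange). It writes a small temporary file with Windows-style line endings, reads it back, and splits it both ways. It assumes Python 3, where plain text mode already translates "\r\n" and "\r" to "\n" and the "U" flag is no longer accepted (it was removed in Python 3.11). The helper name read_text and the sample data are made up for the example.

```python
import os
import re
import tempfile


def read_text(path):
    # Python 3 text mode with newline=None (the default) performs
    # universal-newline translation, so no "U" flag is needed.
    with open(path, newline=None) as f:
        return f.read()


# Hypothetical sample data: Windows line endings, one gap of a single
# empty line and one gap of two empty lines.
raw = b"a\r\nb\r\n\r\nc\r\n\r\n\r\nd\r\n"

fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(raw)
    bulk = read_text(path)
finally:
    os.remove(path)

exact = bulk.split("\n\n")           # exactly one empty line
relaxed = re.split(r"\n{2,}", bulk)  # one or more empty lines

print(repr(bulk))
print(exact)
print(relaxed)
```

Where two empty lines separate the blocks, the plain str.split leaves a stray leading "\n" on the block that follows, while the regex version swallows the whole run of newlines.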

One last variant that doesn't read in the whole file and accepts lines with 
only whitespace as empty:

import itertools

with open(filename, "U") as f:
    blocks = ("".join(group) for empty, group in
              itertools.groupby(f, key=str.isspace) if not empty)
    # blocks is lazy: iterate over it here, before the file is closed
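
As a small sketch of how the groupby variant behaves, the snippet below uses io.StringIO in place of a real file so that it runs on its own. The sample text and the name iter_blocks are invented for the example. The helper accepts any iterable of lines, so in practice one would pass it an open file and consume the result inside the with block.

```python
import io
import itertools


def iter_blocks(lines):
    """Yield blocks of consecutive non-blank lines from an iterable of lines."""
    # groupby clusters adjacent lines by whether they are whitespace-only;
    # the whitespace-only clusters act as separators and are skipped.
    for empty, group in itertools.groupby(lines, key=str.isspace):
        if not empty:
            yield "".join(group)


# Hypothetical input: the second gap contains lines holding spaces and a tab.
sample = io.StringIO("a\nb\n\nc\n   \n\t\nd\n")
blocks = list(iter_blocks(sample))
print(blocks)
```

Each block keeps the trailing newline of its lines, and a run of several blank or whitespace-only lines counts as a single separator.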
-- 
http://mail.python.org/mailman/listinfo/python-list
