On Tue, 20 Oct 2009 15:22:55 -0700, Mensanator wrote:

> On Oct 20, 1:51 pm, David C Ullrich <dullr...@sprynet.com> wrote:
>> On Thu, 15 Oct 2009 18:18:09 -0700, Mensanator wrote:
>> > All I wanted to do is split a binary number into two lists, a list of
>> > blocks of consecutive ones and another list of blocks of consecutive
>> > zeroes.
>>
>> > But no, you can't do that.
>>
>> > >>> c = '0010000110'
>> > >>> c.split('0')
>> > ['', '', '1', '', '', '', '11', '']
>>
>> > Ok, the consecutive delimiters appear as empty strings for reasons
>> > unknown (except for the first one). Except when they start or end the
>> > string in which case the first one is included.
>>
>> > Maybe there's a reason for this inconsistent behaviour but you won't
>> > find it in the documentation.
>>
>> Wanna bet? I'm not sure whether you're claiming that the behavior is
>> not specified in the docs or the reason for it. The behavior certainly
>> is specified. I conjecture you think the behavior itself is not
>> specified,
>
> The problem is that the docs give a single example
>
> >>> '1,,2'.split(',')
> ['1','','2']
>
> ignoring the special case of leading/trailing delimiters. Yes, if you
> think it through, ',1,,2,'.split(',') should return ['','1','','2','']
> for exactly the reasons you give.
>
> Trouble is, we often find ourselves doing ' 1 2 '.split() which returns
> ['1','2'].
>
> I'm not saying either behaviour is wrong, it's just not obvious that the
> one behaviour doesn't follow from the other and the documentation could
> be a little clearer on this matter. It might make a bit more sense to
> actually mention the split(sep) behavior that split() doesn't do.
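To put the two cases being compared side by side, here is a small sketch. The variable names are mine and purely illustrative; the calls themselves are plain str.split:

```python
# Explicit separator: every delimiter counts, so an empty string shows up
# wherever two delimiters are adjacent or a delimiter starts/ends the string.
with_sep = ',1,,2,'.split(',')

# The same rule applies when the explicit separator happens to be a space.
space_sep = ' 1  2 '.split(' ')

# No separator: a run of whitespace is one delimiter, and leading/trailing
# whitespace produces no empty strings.
no_sep = ' 1  2 '.split()

print(with_sep)   # ['', '1', '', '2', '']
print(space_sep)  # ['', '1', '', '2', '']
print(no_sep)     # ['1', '2']
```

So split(' ') and split() differ on the very same input, which is the distinction at issue.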
Have you _read_ the docs? They're quite clear on the difference between
no sep (or sep=None) and sep=something:

"If sep is given, consecutive delimiters are not grouped together and are
deemed to delimit empty strings (for example, '1,,2'.split(',') returns
['1', '', '2']). The sep argument may consist of multiple characters (for
example, '1<>2<>3'.split('<>') returns ['1', '2', '3']). Splitting an
empty string with a specified separator returns [''].

If sep is not specified or is None, a different splitting algorithm is
applied: runs of consecutive whitespace are regarded as a single
separator, and the result will contain no empty strings at the start or
end if the string has leading or trailing whitespace. Consequently,
splitting an empty string or a string consisting of just whitespace with
a None separator returns []."

>
>> because your description of what's happening,
>>
>> "consecutive delimiters appear as empty strings for reasons
>> > unknown (except for the first one). Except when they start or end the
>> > string in which case the first one is included"
>>
>> is at best an awkward way to look at it. The delimiters are not
>> appearing as empty strings.
>>
>> You're asking to split '0010000110' on '0'. So you're asking for
>> strings a, b, c, etc such that
>>
>> (*) '0010000110' = a + '0' + b + '0' + c + '0' + etc
>>
>> The sequence of strings you're getting as output satisfies (*) exactly;
>> the first '' is what appears before the first delimiter, the second ''
>> is what's between the first and second delimiters, etc.
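The identity (*) is easy to check, and the original goal (one list of runs of ones, one list of runs of zeroes) doesn't need split at all. A minimal sketch, assuming itertools.groupby is acceptable; the helper name runs is hypothetical and not something proposed in this thread:

```python
from itertools import groupby

c = '0010000110'
parts = c.split('0')

# The identity (*): putting the delimiter back between the pieces
# restores the original string exactly.
assert '0'.join(parts) == c


def runs(bits):
    """Return (ones, zeros): the maximal runs of '1' and of '0' in bits."""
    ones, zeros = [], []
    # groupby yields one group per maximal run of equal characters.
    for digit, group in groupby(bits):
        (ones if digit == '1' else zeros).append(''.join(group))
    return ones, zeros


ones, zeros = runs(c)
print(ones)   # ['1', '11']
print(zeros)  # ['00', '0000', '0']
```

This assumes the input contains only '0' and '1'; any other character would land in the zeros list.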