On Oct 20, 1:51 pm, David C Ullrich <dullr...@sprynet.com> wrote:
> On Thu, 15 Oct 2009 18:18:09 -0700, Mensanator wrote:
> > All I wanted to do is split a binary number into two lists, a list of
> > blocks of consecutive ones and another list of blocks of consecutive
> > zeroes.
> >
> > But no, you can't do that.
> >
> > >>> c = '0010000110'
> > >>> c.split('0')
> > ['', '', '1', '', '', '', '11', '']
> >
> > Ok, the consecutive delimiters appear as empty strings for reasons
> > unknown (except for the first one). Except when they start or end the
> > string in which case the first one is included.
> >
> > Maybe there's a reason for this inconsistent behaviour but you won't
> > find it in the documentation.
>
> Wanna bet? I'm not sure whether you're claiming that the behavior
> is not specified in the docs or the reason for it. The behavior
> certainly is specified. I conjecture you think the behavior itself
> is not specified,

The problem is that the docs give a single example

>>> '1,,2'.split(',')
['1','','2']

ignoring the special case of leading/trailing delimiters. Yes, if you
think it through, ',1,,2,'.split(',') should return ['','1','','2','']
for exactly the reasons you give.

Trouble is, we often find ourselves doing ' 1 2 '.split() which returns
['1','2'].

I'm not saying either behaviour is wrong, it's just not obvious that the
one behaviour doesn't follow from the other, and the documentation could
be a little clearer on this matter. It might make a bit more sense to
actually mention the split(sep) behavior that split() doesn't do.

> because your description of what's happening,
>
> > "consecutive delimiters appear as empty strings for reasons
> > unknown (except for the first one). Except when they start or end the
> > string in which case the first one is included"
>
> is at best an awkward way to look at it. The delimiters
> are not appearing as empty strings.
>
> You're asking to split '0010000110' on '0'.
> So you're asking for strings a, b, c, etc such that
>
> (*) '0010000110' = a + '0' + b + '0' + c + '0' + etc
>
> The sequence of strings you're getting as output satisfies
> (*) exactly; the first '' is what appears before the first
> delimiter, the second '' is what's between the first and
> second delimiters, etc.
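
For what it's worth, here is a minimal sketch that puts the two behaviours side by side and checks the (*) identity quoted above. The blocks() helper is a hypothetical name of mine, not anything from the standard library, and filtering out the empty strings is just one way to get the lists of ones and zeroes the thread started with.

```python
s = '0010000110'

# split(sep): every delimiter counts, so the pieces satisfy (*).
# Joining them back with the separator reproduces the original string.
parts = s.split('0')
print(parts)                  # ['', '', '1', '', '', '', '11', '']
assert '0'.join(parts) == s

# Leading/trailing delimiters give leading/trailing empty strings.
print(',1,,2,'.split(','))    # ['', '1', '', '2', '']

# split() with no argument: runs of whitespace act as one separator
# and leading/trailing whitespace is dropped.
print(' 1 2 '.split())        # ['1', '2']
print(' 1 2 '.split(' '))     # ['', '1', '2', '']


def blocks(bits, digit):
    """Hypothetical helper: return the runs of `digit` in `bits`.

    Splits on the other binary digit and drops the empty strings.
    """
    other = '1' if digit == '0' else '0'
    return [run for run in bits.split(other) if run]


ones = blocks(s, '1')      # ['1', '11']
zeroes = blocks(s, '0')    # ['00', '0000', '0']
print(ones, zeroes)
```

The point of keeping the filter separate is that split(sep) itself stays faithful to (*); throwing away the empty strings is an extra step you opt into.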