On Oct 22, 7:47 am, David C. Ullrich <dullr...@sprynet.com> wrote:
> On Wed, 21 Oct 2009 14:43:48 -0700 (PDT), Mensanator
> <mensana...@aol.com> wrote:
>
> >On Oct 21, 2:46 pm, David C Ullrich <dullr...@sprynet.com> wrote:
> >> On Tue, 20 Oct 2009 15:22:55 -0700, Mensanator wrote:
> >> > On Oct 20, 1:51 pm, David C Ullrich <dullr...@sprynet.com> wrote:
> >> >> On Thu, 15 Oct 2009 18:18:09 -0700, Mensanator wrote:
> >> >> > All I wanted to do is split a binary number into two lists, a list of
> >> >> > blocks of consecutive ones and another list of blocks of consecutive
> >> >> > zeroes.
> >
> >> >> > But no, you can't do that.
> >
> >> >> > >>> c = '0010000110'
> >> >> > >>> c.split('0')
> >> >> > ['', '', '1', '', '', '', '11', '']
> >
> >> >> > Ok, the consecutive delimiters appear as empty strings for reasons
> >> >> > unknown (except for the first one). Except when they start or end the
> >> >> > string in which case the first one is included.
> >
> >> >> > Maybe there's a reason for this inconsistent behaviour but you won't
> >> >> > find it in the documentation.
> >
> >> >> Wanna bet? I'm not sure whether you're claiming that the behavior is
> >> >> not specified in the docs or the reason for it. The behavior certainly
> >> >> is specified. I conjecture you think the behavior itself is not
> >> >> specified,
> >
> >> > The problem is that the docs give a single example
> >
> >> > >>> '1,,2'.split(',')
> >> > ['1','','2']
> >
> >> > ignoring the special case of leading/trailing delimiters. Yes, if you
> >> > think it through, ',1,,2,'.split(',') should return ['','1','','2','']
> >> > for exactly the reasons you give.
> >
> >> > Trouble is, we often find ourselves doing ' 1  2  '.split() which
> >> > returns ['1','2'].
> >
> >> > I'm not saying either behaviour is wrong, it's just not obvious that the
> >> > one behaviour doesn't follow from the other and the documentation could
> >> > be a little clearer on this matter.
> >> > It might make a bit more sense to actually mention the split(sep)
> >> > behavior that split() doesn't do.
> >
> >> Have you _read_ the docs?
> >
> >Yes.
> >
> >> They're quite clear on the difference
> >> between no sep (or sep=None) and sep=something:
> >
> >I disagree that they are "quite clear". The first paragraph makes no
> >mention of leading or trailing delimiters and they show no example
> >of such usage. An example would at least force me to think about it
> >if it isn't specifically mentioned in the paragraph.
> >
> >One could infer from the second paragraph that, as it doesn't return
> >empty strings from leading and trailing whitespace, split(sep) does
> >for leading/trailing delimiters. Of course, why would I even be
> >reading this paragraph when I'm trying to understand split(sep)?
>
> Now there you have an excellent point.
>
> At the start of the documentation for every function and method
> they should include the following:
>
> Note: If you want to understand completely how this
> function works you may need to read the entire documentation.
When I took Calculus, I wasn't required to read the entire book before
doing the chapter 1 homework. Has teaching changed since I was in school?

> And of course they should precede that in every instance with
>
> Note: Read the next sentence.

And don't forget to add:

We can't be bothered to show any examples of how this actually works,
work out all the special cases for yourself.

> >The splitting of real strings is just as important, if not more so,
> >than the behaviour of splitting empty strings. Especially when the
> >behaviour is radically different.
> >
> > >>> '010000110'.split('0')
> >['', '1', '', '', '', '11', '']
> >
> >is a perfect example. It shows the empty strings generated from the
> >leading and trailing delimiters, and also that you get 3 empty
> >strings between the '1's, not 4. When creating documentation, it is
> >always a good idea to document such cases.
> >
> >And you'll then want to compare this to the equivalent whitespace
> >case:
> > >>> ' 1    11 '.split()
> >['1', '11']
> >
> >And it wouldn't hurt to point this out:
> > >>> c = '010000110'.split('0')
> > >>> '0'.join(c)
> >'010000110'
> >
> >and note that it won't work with the whitespace version.
> >
> >No, I have not submitted a request to change the documentation, I was
> >looking for some feedback here. And it seems that no one else
> >considers the documentation wanting.
> >
> >> "If sep is given, consecutive delimiters are not grouped together and are
> >> deemed to delimit empty strings (for example, '1,,2'.split(',') returns
> >> ['1', '', '2']). The sep argument may consist of multiple characters (for
> >> example, '1<>2<>3'.split('<>') returns ['1', '2', '3']). Splitting an
> >> empty string with a specified separator returns [''].
> >
> >> If sep is not specified or is None, a different splitting algorithm is
> >> applied: runs of consecutive whitespace are regarded as a single
> >> separator, and the result will contain no empty strings at the start or
> >> end if the string has leading or trailing whitespace. Consequently,
> >> splitting an empty string or a string consisting of just whitespace with
> >> a None separator returns []."
> >
> >> >> because your description of what's happening,
> >
> >> >> "consecutive delimiters appear as empty strings for reasons
> >> >> > unknown (except for the first one). Except when they start or end the
> >> >> > string in which case the first one is included"
> >
> >> >> is at best an awkward way to look at it. The delimiters are not
> >> >> appearing as empty strings.
> >
> >> >> You're asking to split '0010000110' on '0'. So you're asking for
> >> >> strings a, b, c, etc such that
> >
> >> >> (*) '0010000110' = a + '0' + b + '0' + c + '0' + etc
> >
> >> >> The sequence of strings you're getting as output satisfies (*) exactly;
> >> >> the first '' is what appears before the first delimiter, the second ''
> >> >> is what's between the first and second delimiters, etc.
>
> David C. Ullrich
>
> "Understanding Godel isn't about following his formal proof.
> That would make a mockery of everything Godel was up to."
> (John Jones, "My talk about Godel to the post-grads."
> in sci.logic.)

--
http://mail.python.org/mailman/listinfo/python-list
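
For anyone who wants to check the behaviours argued over above, here is a minimal sketch. It assumes Python 3 (hence the print calls), and the helper name runs is hypothetical, not something from the thread or from the str API. It uses itertools.groupby as one possible way to get the two lists of blocks that were originally asked for; it is not a claim about what split() ought to do.

```python
from itertools import groupby

c = '0010000110'

# With an explicit separator every delimiter is a cut point, so leading,
# trailing, and adjacent delimiters all produce empty strings.
parts = c.split('0')
print(parts)  # ['', '', '1', '', '', '', '11', '']

# This is identity (*) from the thread: joining on the separator
# restores the original string exactly.
assert '0'.join(parts) == c

# With no separator, a run of whitespace counts as one delimiter and no
# empty strings are produced at the ends, so the round trip is lossy.
words = ' 1    11 '.split()
print(words)  # ['1', '11']
assert ' '.join(words) != ' 1    11 '


def runs(bits):
    """Hypothetical helper: return (blocks of ones, blocks of zeros)."""
    ones, zeros = [], []
    for digit, group in groupby(bits):
        # groupby yields one group per maximal run of equal characters.
        (ones if digit == '1' else zeros).append(''.join(group))
    return ones, zeros


ones, zeros = runs(c)
print(ones)   # ['1', '11']
print(zeros)  # ['00', '0000', '0']
```

The join round trip is what makes the empty strings from split(sep) useful rather than arbitrary: each one marks a position the input can be rebuilt from, which the whitespace form deliberately gives up.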