On Mon, Apr 27, 2009 at 9:55 AM, Robert Fraser <fraseroftheni...@gmail.com> wrote:
> Andrei Alexandrescu wrote:
>> Jason House wrote:
>>> Before reading your post, I was going to say that I'd expect 4, would
>>> accept 1, and consider 2 or 3 to be buggy! Notice how under your new
>>> proposal everyone would still get the behavior wrong when reading the
>>> code.
>>
>> everyone posting heavily in this group != everyone
>
> Yes, but it's a representative (albeit small) sample of the user base.
>
>> Andrei
>>
>> P.S. I scrolled down your post looking for counter-evidence that you might
>> have brought, but found only the please-don't-do-this-again empty quote. It
>> wastes everybody's time looking in vain for nuggets of responses within the
>> quoted text.
>
> Is consistency a good argument? std.string.split currently does (4). Java and
> C#'s split() methods work like (4). strtok does (4). Is there any other
> language/function besides Perl that does (2)?

Python's split is rather more nuanced than has been intimated here:

"""
str.split([sep[, maxsplit]])

Return a list of the words in the string, using sep as the delimiter
string. If maxsplit is given, at most maxsplit splits are done (thus, the
list will have at most maxsplit+1 elements). If maxsplit is not specified,
then there is no limit on the number of splits (all possible splits are
made).

If sep is given, consecutive delimiters are not grouped together and are
deemed to delimit empty strings (for example, '1,,2'.split(',') returns
['1', '', '2']). The sep argument may consist of multiple characters (for
example, '1<>2<>3'.split('<>') returns ['1', '2', '3']). Splitting an empty
string with a specified separator returns [''].

If sep is not specified or is None, a different splitting algorithm is
applied: runs of consecutive whitespace are regarded as a single separator,
and the result will contain no empty strings at the start or end if the
string has leading or trailing whitespace. Consequently, splitting an empty
string or a string consisting of just whitespace with a None separator
returns [].

For example, ' 1 2 3 '.split() returns ['1', '2', '3'], and
' 1 2 3 '.split(None, 1) returns ['1', '2 3 '].
"""

So the default (no separator given) acts like Perl, but that only applies
when splitting on whitespace. With an explicit separator it acts like (4).
re.split in Python seems to do (4) pretty much all the time.

--bb
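
For anyone who wants to check the behavior described in the quoted documentation, here is a small sketch. It uses only str.split and re.split from the standard library. The input strings and variable names are just illustrations, and it deliberately does not try to map each result onto every numbered option from earlier in the thread.

```python
import re

# Explicit separator: consecutive delimiters are NOT grouped,
# so empty strings show up in the result.
with_sep = '1,,2'.split(',')
multi_char_sep = '1<>2<>3'.split('<>')
empty_with_sep = ''.split(',')

# No separator (or None): runs of whitespace count as one separator,
# and leading/trailing whitespace produces no empty strings.
no_sep = ' 1  2   3 '.split()
blank_no_sep = '   '.split()
limited = ' 1 2 3 '.split(None, 1)

# re.split keeps the empty strings, even when splitting on whitespace runs.
re_commas = re.split(',', '1,,2')
re_spaces = re.split(r'\s+', ' 1 2 3 ')

for name, value in [
    ('with_sep', with_sep),
    ('multi_char_sep', multi_char_sep),
    ('empty_with_sep', empty_with_sep),
    ('no_sep', no_sep),
    ('blank_no_sep', blank_no_sep),
    ('limited', limited),
    ('re_commas', re_commas),
    ('re_spaces', re_spaces),
]:
    print(name, '=', value)
```

The last case is the interesting one: re.split(r'\s+', ' 1 2 3 ') returns ['', '1', '2', '3', ''], so the empty-string-free result really is specific to str.split with no separator.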