On Sun, Oct 7, 2012 at 1:46 PM, Arnej Duranovic <arne...@gmail.com> wrote:
>
> When I type this in the python idle shell ( version 3...) :
> '0' <= '10' <= '9'
> The interpreter evaluates this as true, WHY? 10 is greater than 0 but not 9
> Notice I am not using the actual numbers, they are strings...I thought that
> numbers being string were ordered by their numerical value but obviously
> they are not?
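For reference, here is a small sketch that reproduces the comparison from the question. Strings are compared one character at a time by code point, so only the first characters matter here: '0' < '1' and '1' < '9'. The variable names are just for illustration.

```python
# Strings compare lexicographically, character by character, by code point.
a, b, c = '0', '10', '9'

# A chained comparison is shorthand for (a <= b) and (b <= c).
chained = a <= b <= c

# The first characters decide both halves of the chain.
code_points = [ord(ch) for ch in ('0', '1', '9')]

# Converting to int first gives the numeric ordering instead.
numeric = int(a) <= int(b) <= int(c)

print(chained)      # True
print(code_points)  # [48, 49, 57]
print(numeric)      # False
```

So the interpreter is doing exactly what it should; the strings just aren't being treated as numbers.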
As a supplement to what's already been stated about string comparisons,
here's a possible solution if you need a more 'natural' sort order such
as '1', '5', '10', '50', '100'.

You can use a regular expression to split the string into a list of
(digits, nondigits) tuples using re.findall. The two groups are mutually
exclusive, so exactly one field of each tuple is non-empty. For example:

    >>> import re
    >>> dndre = re.compile('([0-9]+)|([^0-9]+)')
    >>> re.findall(dndre, 'a1b10')
    [('', 'a'), ('1', ''), ('', 'b'), ('10', '')]

Use a list comprehension to choose int(digits) if digits is non-empty,
else nondigits, for each item. For example:

    >>> [int(d) if d else nd for d, nd in re.findall(dndre, 'a1b10')]
    ['a', 1, 'b', 10]

Now you have a list of strings and integers that will sort 'naturally'
compared to other such lists, since lists compare corresponding items
starting at index 0. (In Python 3 this relies on the strings having the
same digit/non-digit layout, because comparing an int with a str raises
TypeError.)

All that's left to do is to define this operation as a key function for
use as the "key" argument of sort/sorted. For example:

    import re

    def natural(item, dndre=re.compile('([0-9]+)|([^0-9]+)')):
        # Only strings are split; other items are returned unchanged.
        if isinstance(item, str):
            # Digit runs become ints, everything else stays lowercase text.
            item = [int(d) if d else nd for d, nd in
                    re.findall(dndre, item.lower())]
        return item

The above transforms each string into a list of integers and lowercase
strings (drop the lower() call if you do want letter case to affect the
sort order). In Python 2.x, use "basestring" instead of "str". If you're
working with bytes in Python 3.x, make sure to decode() the items before
sorting, since a str pattern can't be applied to bytes.

Regular sort:

    >>> sorted(['s1x', 's10x', 's5x', 's50x', 's100x'])
    ['s100x', 's10x', 's1x', 's50x', 's5x']

Natural sort:

    >>> sorted(['s1x', 's10x', 's5x', 's50x', 's100x'], key=natural)
    ['s1x', 's5x', 's10x', 's50x', 's100x']

Disclaimer: This is only meant to demonstrate the idea. You'll want to
search around for a 'natural' sort recipe or package that handles the
complexities of Unicode.
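One concrete limitation of the demonstration: in Python 3 an int and a str can't be ordered against each other, so names whose digit and non-digit runs don't line up (say 'a1' and '1a') raise TypeError with a key that returns plain ints and strings. Below is a rough sketch of one way around that, not a polished recipe. The name natural_tagged and the (0, ...)/(1, ...) tagging are my own assumptions; the pattern is the same one used above.

```python
import re

# Same digit/non-digit pattern as in the post.
DNDRE = re.compile('([0-9]+)|([^0-9]+)')

def natural_tagged(item):
    # Hypothetical variant of natural(): every chunk is wrapped in a small
    # tuple whose first field says what kind of chunk it is. Tuples with
    # different tags are ordered by the tag alone, so an int is never
    # compared directly with a str.
    if isinstance(item, str):
        item = tuple((0, int(d)) if d else (1, nd)
                     for d, nd in DNDRE.findall(item.lower()))
    return item

names = ['a10', '1a', 'a2', 'a1']
print(sorted(names, key=natural_tagged))  # ['1a', 'a1', 'a2', 'a10']
```

With this tagging, digit runs sort ahead of text when the two meet at the same position. That ordering is an arbitrary choice; swap the 0 and 1 if you'd rather have text first.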
It's probably not true that everything the 3.x re module considers to be
a digit (the \d character class is Unicode category [Nd]) will work with
the int() constructor, so instead I used [0-9] and [^0-9].

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor