Re: [Tutor] string rules for 'number'
On 08/10/12 11:51, Arnej Duranovic wrote:

> Alright guys, I appreciate all your help SO much. I now understand, as the gentleman above said, "A string is a string is a string": it doesn't matter what is in it, and they are ordered the same way... BUT this is what was going through my head. Since letters are ordered in such a way that A is less than B, for example, I thought the same applied to numbers,

It does. '1' comes before '2', just like 'A' comes before 'B'. And '12345' comes before '2', just like 'Apple' comes before 'B'.

--
Steven
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] string rules for 'number'
On 8 October 2012 03:19, eryksun eryk...@gmail.com wrote:

> As a supplement to what's already been stated about string comparisons, here's a possible solution if you need a more 'natural' sort order such as '1', '5', '10', '50', '100'.
>
> You can use a regular expression to split the string into a list of (digits, nondigits) tuples (mutually exclusive) using re.findall. For example:
>
>     >>> import re
>     >>> dndre = re.compile('([0-9]+)|([^0-9]+)')
>     >>> re.findall(dndre, 'a1b10')
>     [('', 'a'), ('1', ''), ('', 'b'), ('10', '')]
>
> Use a list comprehension to choose either int(digits) if digits is non-empty else nondigits for each item. For example:
>
>     >>> [int(d) if d else nd for d, nd in re.findall(dndre, 'a1b10')]
>     ['a', 1, 'b', 10]
>
> Now you have a list of strings and integers that will sort 'naturally' compared to other such lists, since they compare corresponding items starting at index 0. All that's left to do is to define this operation as a key function for use as the key argument of sort/sorted. For example:
>
>     import re
>
>     def natural(item, dndre=re.compile('([0-9]+)|([^0-9]+)')):
>         if isinstance(item, str):
>             item = [int(d) if d else nd for d, nd in
>                     re.findall(dndre, item.lower())]
>         return item
>
> The above transforms all strings into a list of integers and lowercase strings (if you don't want letter case to affect sort order). In Python 2.x, use basestring instead of str. If you're working with bytes in Python 3.x, make sure to first decode() the items before sorting since the regular expression is only defined for strings.
>
> Regular sort:
>
>     >>> sorted(['s1x', 's10x', 's5x', 's50x', 's100x'])
>     ['s100x', 's10x', 's1x', 's50x', 's5x']
>
> Natural sort:
>
>     >>> sorted(['s1x', 's10x', 's5x', 's50x', 's100x'], key=natural)
>     ['s1x', 's5x', 's10x', 's50x', 's100x']

For simple cases like this example I tend to use:

    >>> natural = lambda x: (len(x), x)
    >>> sorted(['s1x', 's10x', 's5x', 's50x', 's100x'], key=natural)
    ['s1x', 's5x', 's10x', 's50x', 's100x']

Oscar
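A hedged note on the (len(x), x) key above: it appears to rely on every item sharing the same non-digit parts and having no leading zeros, as in the 's1x' ... 's100x' example. The sketch below uses a made-up list, `names`, with differing prefixes to show where the shortcut and a chunk-based key diverge. The helper `chunk_key` is illustrative only and is not code from the thread.

```python
import re

# Hypothetical sample data: the prefixes differ, unlike the 's..x' example.
names = ['b2', 'a10', 'a9']

# The length-based shortcut: shorter strings first, ties broken by plain
# string comparison.
by_length = sorted(names, key=lambda x: (len(x), x))

# A chunk-based key for comparison (illustrative helper, not from the thread).
# Every name here starts with letters followed by digits, so the chunks line
# up as [str, int] and can be compared safely.
def chunk_key(text):
    return [int(d) if d else nd
            for d, nd in re.findall(r'([0-9]+)|([^0-9]+)', text)]

by_chunks = sorted(names, key=chunk_key)

print(by_length)   # ['a9', 'b2', 'a10']
print(by_chunks)   # ['a9', 'a10', 'b2']
```

With a shared prefix and suffix the two keys agree, which is why the shortcut is enough for the simple case.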
[Tutor] string rules for 'number'
When I type this in the Python IDLE shell (version 3...):

    '0' <= '10' <= '9'

The interpreter evaluates this as True, WHY? 10 is greater than 0 but not 9. Notice I am not using the actual numbers, they are strings... I thought that numbers being strings were ordered by their numerical value, but obviously they are not? Can anyone explain this to me and explain how strings with numbers in them are ordered?

Thx in advance.
Re: [Tutor] string rules for 'number'
On Sun, Oct 7, 2012 at 1:46 PM, Arnej Duranovic arne...@gmail.com wrote:

> When I type this in the Python IDLE shell (version 3...):
>
>     '0' <= '10' <= '9'
>
> The interpreter evaluates this as True, WHY? 10 is greater than 0 but not 9. Notice I am not using the actual numbers, they are strings... I thought that numbers being strings were ordered by their numerical value, but obviously they are not? Can anyone explain this to me and explain how strings with numbers in them are ordered?
>
> Thx in advance.

Because '0' is less than '1', and '1' is less than '9'. The comparison is done on a character-by-character basis.

--
Joel Goldstick
Re: [Tutor] string rules for 'number'
On Oct 7, 2012 12:47 PM, Arnej Duranovic arne...@gmail.com wrote:

> When I type this in the Python IDLE shell (version 3...):
>
>     '0' <= '10' <= '9'
>
> The interpreter evaluates this as True, WHY? 10 is greater than 0 but not 9.

Since they are strings, it looks at these character by character. Since '0' < '1' < '9', the 0 in '10' has no effect on the order. Compare 'a' < 'ba' < 'i'.

boB
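The character-by-character rule can be sketched by hand. This is a minimal illustration, not how CPython implements it internally; the function name `lexicographic_less` is made up for the example. The first position where the two strings differ decides the result, and if one string is a prefix of the other, the shorter one comes first.

```python
def lexicographic_less(a, b):
    """Return True if string a sorts strictly before string b."""
    for ca, cb in zip(a, b):
        if ca != cb:
            # The first differing character decides; compare code points.
            return ord(ca) < ord(cb)
    # No difference in the shared positions: the shorter string comes first.
    return len(a) < len(b)

print(lexicographic_less('0', '10'))   # True: '0' < '1' at index 0
print(lexicographic_less('10', '9'))   # True: '1' < '9' at index 0
print(lexicographic_less('a', 'ba'))   # True: 'a' < 'b' at index 0
print(lexicographic_less('1', '10'))   # True: '1' is a prefix of '10'
```

For these inputs the result matches the built-in `<` operator on strings.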
Re: [Tutor] string rules for 'number'
On 07/10/2012 18:46, Arnej Duranovic wrote:

> When I type this in the Python IDLE shell (version 3...):
>
>     '0' <= '10' <= '9'
>
> The interpreter evaluates this as True, WHY? 10 is greater than 0 but not 9. Notice I am not using the actual numbers, they are strings... I thought that numbers being strings were ordered by their numerical value, but obviously they are not? Can anyone explain this to me and explain how strings with numbers in them are ordered?

You thought wrong :) A string is a string is a string. They might look like numbers here but that's completely irrelevant. They'll be compared lexicographically, something I'm not inclined to attempt to explain, so see here: http://en.wikipedia.org/wiki/Lexicographical_order

Please also be careful with your terminology. Note that I've used "compared". "Ordered" is very different, e.g. FIFO is often used for first in, first out.

> Thx in advance.

--
Cheers.

Mark Lawrence.
Re: [Tutor] string rules for 'number'
On 08/10/12 04:46, Arnej Duranovic wrote:

> When I type this in the Python IDLE shell (version 3...):
>
>     '0' <= '10' <= '9'
>
> The interpreter evaluates this as True, WHY? 10 is greater than 0 but not 9. Notice I am not using the actual numbers, they are strings... I thought that numbers being strings were ordered by their numerical value, but obviously they are not?

Others have already explained that strings are ordered on a character-by-character basis, not by numeric value.

But I'm interested to ask how you came to the conclusion that they were ordered by numeric value. Was there something you read somewhere that gave you that (incorrect) impression, perhaps something in the tutorial or the reference manual? If so, please tell us, and we'll have it fixed.

--
Steven
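If numeric order is what is wanted, one common approach is to convert the strings to int before comparing or sorting. A small sketch, using a made-up list called `values` and assuming every string holds a valid integer:

```python
values = ['0', '10', '9']

# Compared as strings: character by character.
as_strings = '0' <= '10' <= '9'

# Compared as numbers: convert each string first.
as_numbers = int('0') <= int('10') <= int('9')

print(as_strings)               # True
print(as_numbers)               # False
print(sorted(values))           # ['0', '10', '9']
print(sorted(values, key=int))  # ['0', '9', '10']
```

Passing `key=int` leaves the items as strings in the result and uses the integer values only to decide the order.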
Re: [Tutor] string rules for 'number'
On 08/10/12 05:20, Mark Lawrence wrote:

> [...]
> They'll be compared lexicographically, something I'm not inclined to attempt to explain, so see here: http://en.wikipedia.org/wiki/Lexicographical_order
>
> Please also be careful with your terminology. Note that I've used "compared". "Ordered" is very different, e.g. FIFO is often used for first in, first out.

Actually "ordered" is perfectly fine in this context. Notice that the page you link to is called "lexicographical ORDER".

"Compared" is a verb and refers to the act of examining the items in some sense. There are many different comparisons in Python:

    <  <=  >  >=  ==  !=  is  is not

"Ordered" can mean one of two things:

- that the items in question are *capable* of being placed into some order, e.g. real numbers can be ordered by value; complex numbers cannot;

- that the items in question actually *have been* ordered.

"Some order" includes: numeric order, date order, lexicographical order, insertion order, even random order!

You may be conflating this with the difference between ordered dict and sorted dict, where the order referred to in the first case is insertion order rather than sorted order.

--
Steven
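Two of the points above can be checked in a few lines. This is only an illustrative sketch: the variable names and the sample dict contents are invented. It shows that complex numbers support equality but not ordering comparisons, and that an ordered dict keeps insertion order rather than sorted order.

```python
from collections import OrderedDict

# Equality is a comparison that complex numbers support.
equal = (1 + 2j) == (1 + 2j)

# Ordering comparisons are not defined for them, so < raises TypeError.
try:
    (1 + 2j) < (3 + 4j)
    ordering_failed = False
except TypeError:
    ordering_failed = True

# "Ordered" in OrderedDict means insertion order, not sorted order.
d = OrderedDict()
d['pear'] = 1
d['apple'] = 2
d['fig'] = 3

insertion_order = list(d)   # keys in the order they were added
sorted_order = sorted(d)    # keys in lexicographical order

print(equal, ordering_failed)   # True True
print(insertion_order)          # ['pear', 'apple', 'fig']
print(sorted_order)             # ['apple', 'fig', 'pear']
```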
Re: [Tutor] string rules for 'number'
Alright guys, I appreciate all your help SO much. I now understand, as the gentleman above said, "A string is a string is a string": it doesn't matter what is in it, and they are ordered the same way... BUT this is what was going through my head. Since letters are ordered in such a way that A is less than B, for example, I thought the same applied to numbers. I was very, very wrong, as you guys have pointed out.

I did not read it anywhere; it was just the logic going through my head when I was writing the code, and when it said that '10' is less than '9' I was like... WUT?

But thanks again for all your help, I understand how STRINGS are ordered now :P
Re: [Tutor] string rules for 'number'
On Sun, Oct 7, 2012 at 1:46 PM, Arnej Duranovic arne...@gmail.com wrote:

> When I type this in the Python IDLE shell (version 3...):
>
>     '0' <= '10' <= '9'
>
> The interpreter evaluates this as True, WHY? 10 is greater than 0 but not 9. Notice I am not using the actual numbers, they are strings... I thought that numbers being strings were ordered by their numerical value, but obviously they are not?

As a supplement to what's already been stated about string comparisons, here's a possible solution if you need a more 'natural' sort order such as '1', '5', '10', '50', '100'.

You can use a regular expression to split the string into a list of (digits, nondigits) tuples (mutually exclusive) using re.findall. For example:

    >>> import re
    >>> dndre = re.compile('([0-9]+)|([^0-9]+)')
    >>> re.findall(dndre, 'a1b10')
    [('', 'a'), ('1', ''), ('', 'b'), ('10', '')]

Use a list comprehension to choose either int(digits) if digits is non-empty else nondigits for each item. For example:

    >>> [int(d) if d else nd for d, nd in re.findall(dndre, 'a1b10')]
    ['a', 1, 'b', 10]

Now you have a list of strings and integers that will sort 'naturally' compared to other such lists, since they compare corresponding items starting at index 0. All that's left to do is to define this operation as a key function for use as the key argument of sort/sorted. For example:

    import re

    def natural(item, dndre=re.compile('([0-9]+)|([^0-9]+)')):
        if isinstance(item, str):
            item = [int(d) if d else nd for d, nd in
                    re.findall(dndre, item.lower())]
        return item

The above transforms all strings into a list of integers and lowercase strings (if you don't want letter case to affect sort order). In Python 2.x, use basestring instead of str. If you're working with bytes in Python 3.x, make sure to first decode() the items before sorting since the regular expression is only defined for strings.

Regular sort:

    >>> sorted(['s1x', 's10x', 's5x', 's50x', 's100x'])
    ['s100x', 's10x', 's1x', 's50x', 's5x']

Natural sort:

    >>> sorted(['s1x', 's10x', 's5x', 's50x', 's100x'], key=natural)
    ['s1x', 's5x', 's10x', 's50x', 's100x']

Disclaimer: This is only meant to demonstrate the idea. You'll want to search around for a 'natural' sort recipe or package that handles the complexities of Unicode. It's probably not true that everything the 3.x re module considers to be a digit (the \d character class is Unicode category [Nd]) will work with the int() constructor, so instead I used [0-9] and [^0-9].
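One further caveat, offered as an observation rather than as part of the recipe above: in Python 3 an int and a str cannot be compared with `<`, so keys such as ['a', 1] and [1, 'a'] raise TypeError when one item starts with a letter and another with a digit. A possible variation, sketched below, tags each chunk so that digit runs always sort before non-digit runs. The name `natural_key`, the tagging scheme, and the sample list are assumptions made for illustration.

```python
import re

_CHUNKS = re.compile(r'([0-9]+)|([^0-9]+)')

def natural_key(text):
    """Split text into tagged chunks: (0, int) for digits, (1, str) otherwise.

    The leading tag means a number chunk and a text chunk at the same
    position are ordered by the tag alone, so int and str are never
    compared with each other.
    """
    return [(0, int(d)) if d else (1, nd)
            for d, nd in _CHUNKS.findall(text.lower())]

# Hypothetical sample mixing digit-first and letter-first names.
names = ['a1', '1a', 'a10', 'a2']
result = sorted(names, key=natural_key)
print(result)   # ['1a', 'a1', 'a2', 'a10']
```

Placing numbers before text is a design choice here; swapping the tags would put text first instead.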