Re: [Tutor] string rules for 'number'

2012-10-08 Thread Steven D'Aprano

On 08/10/12 11:51, Arnej Duranovic wrote:

> Alright guys, I appreciate all your help SO much. I now understand, as the
> gentleman above said, "A string is a string is a string": it doesn't matter
> what is in it, and they are ordered the same way... BUT this is what was
> going through my head. Since letters are ordered in such a way that A is
> less than B, for example, I thought the same applied to numbers,


It does.

"1" comes before "2", just like "A" comes before "B".

And "12345" comes before "2", just like "Apple" comes before "B".
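
A quick way to see this for yourself is the short sketch below. The list names are made up for illustration; the point is only that sorting strings puts the longer "12345" first, while sorting with key=int gives numeric order.

```python
# Strings are compared character by character, so a longer string can sort first.
words = ["B", "Apple"]
numerals = ["2", "12345"]

sorted_words = sorted(words)                 # 'Apple' sorts before 'B'
sorted_numerals = sorted(numerals)           # '12345' sorts before '2'
sorted_as_numbers = sorted(numerals, key=int)  # numeric order instead

print(sorted_words)
print(sorted_numerals)
print(sorted_as_numbers)
```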



--
Steven
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] string rules for 'number'

2012-10-08 Thread Oscar Benjamin
On 8 October 2012 03:19, eryksun eryk...@gmail.com wrote:
> As a supplement to what's already been stated about string
> comparisons, here's a possible solution if you need a more 'natural'
> sort order such as '1', '5', '10', '50', '100'.
>
> You can use a regular expression to split the string into a list of
> (digits, nondigits) tuples (mutually exclusive) using re.findall. For
> example:
>
>     >>> import re
>     >>> dndre = re.compile('([0-9]+)|([^0-9]+)')
>
>     >>> re.findall(dndre, 'a1b10')
>     [('', 'a'), ('1', ''), ('', 'b'), ('10', '')]
>
> Use a list comprehension to choose either int(digits) if digits is
> non-empty else nondigits for each item. For example:
>
>     >>> [int(d) if d else nd for d, nd in re.findall(dndre, 'a1b10')]
>     ['a', 1, 'b', 10]
>
> Now you have a list of strings and integers that will sort 'naturally'
> compared to other such lists, since they compare corresponding items
> starting at index 0. All that's left to do is to define this operation
> as a key function for use as the key argument of sort/sorted. For
> example:
>
>     import re
>
>     def natural(item, dndre=re.compile('([0-9]+)|([^0-9]+)')):
>         if isinstance(item, str):
>             item = [int(d) if d else nd for d, nd in
>                     re.findall(dndre, item.lower())]
>         return item
>
> The above transforms all strings into a list of integers and lowercase
> strings (if you don't want letter case to affect sort order). In
> Python 2.x, use basestring instead of str. If you're working with
> bytes in Python 3.x, make sure to first decode() the items before
> sorting since the regular expression is only defined for strings.
>
> Regular sort:
>
>     >>> sorted(['s1x', 's10x', 's5x', 's50x', 's100x'])
>     ['s100x', 's10x', 's1x', 's50x', 's5x']
>
> Natural sort:
>
>     >>> sorted(['s1x', 's10x', 's5x', 's50x', 's100x'], key=natural)
>     ['s1x', 's5x', 's10x', 's50x', 's100x']

For simple cases like this example I tend to use:

    >>> natural = lambda x: (len(x), x)
    >>> sorted(['s1x', 's10x', 's5x', 's50x', 's100x'], key=natural)
    ['s1x', 's5x', 's10x', 's50x', 's100x']
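
The length-based key relies on every item having the same shape around the number. The sketch below (the name by_length and the second list are invented for illustration) shows the simple case working and a case where items with different prefixes end up out of natural order.

```python
def by_length(s):
    """Same key as the lambda above: shorter strings first, then plain string order."""
    return (len(s), s)

same_shape = ['s1x', 's10x', 's5x', 's50x', 's100x']
mixed_shape = ['b2', 'a10', 'a9']

same_shape_sorted = sorted(same_shape, key=by_length)
mixed_shape_sorted = sorted(mixed_shape, key=by_length)

print(same_shape_sorted)   # numbers come out in increasing order
print(mixed_shape_sorted)  # 'b2' lands before 'a10' because it is shorter
```

So the shortcut is fine when only the number varies, and a chunk-based key is the safer choice otherwise.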


Oscar


[Tutor] string rules for 'number'

2012-10-07 Thread Arnej Duranovic
When I type this in the Python IDLE shell (version 3...):
'0' <= '10' <= '9'
The interpreter evaluates this as true, WHY? 10 is greater than 0 but not 9.
Notice I am not using the actual numbers, they are strings... I thought that
numbers being strings were ordered by their numerical value, but obviously
they are not? Can anyone explain this to me and explain how strings with
numbers in them are ordered?

Thx in advance.


Re: [Tutor] string rules for 'number'

2012-10-07 Thread Joel Goldstick
On Sun, Oct 7, 2012 at 1:46 PM, Arnej Duranovic arne...@gmail.com wrote:
> When I type this in the Python IDLE shell (version 3...):
> '0' <= '10' <= '9'
> The interpreter evaluates this as true, WHY? 10 is greater than 0 but not 9.
> Notice I am not using the actual numbers, they are strings... I thought that
> numbers being strings were ordered by their numerical value, but obviously
> they are not? Can anyone explain this to me and explain how strings with
> numbers in them are ordered?
>
> Thx in advance.

Because '0' is less than '1', and '1' is less than '9'.

The comparison is done on a character-by-character basis.
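
To make "character by character" concrete, here is a rough sketch of the rule written out by hand. The function name lex_less_equal is invented for this illustration, and this is not how the interpreter implements it internally; it just mirrors the behaviour of <= on str.

```python
def lex_less_equal(a, b):
    """Return True if string a sorts at or before string b (illustrative helper)."""
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            # The first differing character decides the result.
            return ord(ch_a) < ord(ch_b)
    # No difference found: one string is a prefix of the other (or they are
    # equal), so the shorter one sorts first.
    return len(a) <= len(b)

by_hand = lex_less_equal('0', '10') and lex_less_equal('10', '9')
built_in = '0' <= '10' <= '9'
print(by_hand, built_in)
```

In '10' versus '9' the comparison stops at '1' versus '9', so the trailing '0' is never examined.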


-- 
Joel Goldstick


Re: [Tutor] string rules for 'number'

2012-10-07 Thread boB Stepp
On Oct 7, 2012 12:47 PM, Arnej Duranovic arne...@gmail.com wrote:

> When I type this in the Python IDLE shell (version 3...):
> '0' <= '10' <= '9'
> The interpreter evaluates this as true, WHY? 10 is greater than 0 but not 9.

Since they are strings, it looks at these character by character. Since
'0' < '1' < '9', the 0 in '10' has no effect on the order. Compare
'a' < 'ba' < 'i'.

boB


Re: [Tutor] string rules for 'number'

2012-10-07 Thread Mark Lawrence

On 07/10/2012 18:46, Arnej Duranovic wrote:

> When I type this in the Python IDLE shell (version 3...):
> '0' <= '10' <= '9'
> The interpreter evaluates this as true, WHY? 10 is greater than 0 but not 9.
> Notice I am not using the actual numbers, they are strings... I thought that
> numbers being strings were ordered by their numerical value, but obviously
> they are not? Can anyone explain this to me and explain how strings with
> numbers in them are ordered?


You thought wrong :) A string is a string is a string.  They might look 
like numbers here but that's completely irrelevant.  They'll be compared 
lexicographically, something I'm not inclined to attempt to explain so 
see here http://en.wikipedia.org/wiki/Lexicographical_order


Please also be careful with your terminology.  Note that I've used
"compared".  "Ordered" is very different, e.g. FIFO is often used for first
in, first out.




--
Cheers.

Mark Lawrence.



Re: [Tutor] string rules for 'number'

2012-10-07 Thread Steven D'Aprano

On 08/10/12 04:46, Arnej Duranovic wrote:

> When I type this in the Python IDLE shell (version 3...):
> '0' <= '10' <= '9'
> The interpreter evaluates this as true, WHY? 10 is greater than 0 but not 9.
> Notice I am not using the actual numbers, they are strings... I thought that
> numbers being strings were ordered by their numerical value, but obviously
> they are not?


Others have already explained that strings are ordered on a character-by-
character basis, not by numeric value. But I'm interested to ask how you
came to the conclusion that they were ordered by numeric value. Was there
something you read somewhere that gave you that (incorrect) impression,
perhaps something in the tutorial or the reference manual?

If so, please tell us, and we'll have it fixed.



--
Steven


Re: [Tutor] string rules for 'number'

2012-10-07 Thread Steven D'Aprano

On 08/10/12 05:20, Mark Lawrence wrote:

[...]

> They'll be compared lexicographically, something I'm not inclined
> to attempt to explain so see here
> http://en.wikipedia.org/wiki/Lexicographical_order
>
> Please also be careful with your terminology. Note that I've used
> "compared". "Ordered" is very different, e.g. FIFO is often used for
> first in, first out.


Actually "ordered" is perfectly fine in this context. Notice that the
page you link to is called "lexicographical ORDER".

"Compared" is a verb and refers to the act of examining the items in
some sense. There are many different comparisons in Python: < <= >
>= == != `is` `is not`.


"Ordered" can mean one of two things:

- that the items in question are *capable* of being placed into some
  order, e.g. numbers can be ordered by value; complex numbers cannot;

- that the items in question actually *have been* ordered.

"Some order" includes: numeric order, date order, lexicographical
order, insertion order, even random order!
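
The first sense can be checked directly. In this small sketch (variable names are only for illustration, and Python 3 is assumed), real numbers sort without complaint while complex numbers raise TypeError because they do not support "<".

```python
reals_sorted = sorted([3, 1, 2])   # real numbers can be ordered by value

try:
    sorted([2 + 1j, 1 + 2j])       # complex numbers do not define "<"
    complex_orderable = True
except TypeError:
    complex_orderable = False

print(reals_sorted, complex_orderable)
```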


You may be conflating this with the difference between "ordered
dict" and "sorted dict", where the order referred to in the first
case is insertion order rather than sorted order.
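
For instance, with collections.OrderedDict from the standard library the two notions give different results. The dict contents below are made up for the example.

```python
from collections import OrderedDict

scores = OrderedDict()
scores['b'] = 2
scores['a'] = 1
scores['c'] = 3

insertion_order = list(scores)   # keys in the order they were added
sorted_order = sorted(scores)    # keys in lexicographical order

print(insertion_order)
print(sorted_order)
```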


--
Steven


Re: [Tutor] string rules for 'number'

2012-10-07 Thread Arnej Duranovic
Alright guys, I appreciate all your help SO much. I now understand, as the
gentleman above said, "A string is a string is a string": it doesn't matter
what is in it, and they are ordered the same way... BUT this is what was
going through my head. Since letters are ordered in such a way that A is
less than B, for example, I thought the same applied to numbers. I was very,
very wrong, as you guys have pointed out. I did not read it anywhere; it
was just the logic going through my head when I was writing the code,
and when it said that '10' is less than '9' I was like... WUT? But thanks
again for all your help, I understand how STRINGS are ordered now :P


Re: [Tutor] string rules for 'number'

2012-10-07 Thread eryksun
On Sun, Oct 7, 2012 at 1:46 PM, Arnej Duranovic arne...@gmail.com wrote:

> When I type this in the Python IDLE shell (version 3...):
> '0' <= '10' <= '9'
> The interpreter evaluates this as true, WHY? 10 is greater than 0 but not 9.
> Notice I am not using the actual numbers, they are strings... I thought that
> numbers being strings were ordered by their numerical value, but obviously
> they are not?

As a supplement to what's already been stated about string
comparisons, here's a possible solution if you need a more 'natural'
sort order such as '1', '5', '10', '50', '100'.

You can use a regular expression to split the string into a list of
(digits, nondigits) tuples (mutually exclusive) using re.findall. For
example:

    >>> import re
    >>> dndre = re.compile('([0-9]+)|([^0-9]+)')

    >>> re.findall(dndre, 'a1b10')
    [('', 'a'), ('1', ''), ('', 'b'), ('10', '')]

Use a list comprehension to choose either int(digits) if digits is
non-empty else nondigits for each item. For example:

    >>> [int(d) if d else nd for d, nd in re.findall(dndre, 'a1b10')]
    ['a', 1, 'b', 10]

Now you have a list of strings and integers that will sort 'naturally'
compared to other such lists, since they compare corresponding items
starting at index 0. All that's left to do is to define this operation
as a key function for use as the key argument of sort/sorted. For
example:

    import re

    def natural(item, dndre=re.compile('([0-9]+)|([^0-9]+)')):
        if isinstance(item, str):
            item = [int(d) if d else nd for d, nd in
                    re.findall(dndre, item.lower())]
        return item

The above transforms all strings into a list of integers and lowercase
strings (if you don't want letter case to affect sort order). In
Python 2.x, use basestring instead of str. If you're working with
bytes in Python 3.x, make sure to first decode() the items before
sorting since the regular expression is only defined for strings.

Regular sort:

    >>> sorted(['s1x', 's10x', 's5x', 's50x', 's100x'])
    ['s100x', 's10x', 's1x', 's50x', 's5x']

Natural sort:

    >>> sorted(['s1x', 's10x', 's5x', 's50x', 's100x'], key=natural)
    ['s1x', 's5x', 's10x', 's50x', 's100x']
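
One caveat on Python 3: the key lists mix int and str, so two items that disagree about where the digits sit (say 'a1' and '1a') end up comparing 'a' with 1, which raises TypeError. A possible workaround, sketched below under that assumption, is to tag every chunk so that only like types are ever compared. The names natural_safe and _CHUNKS are invented for this sketch, and putting numeric chunks before text chunks is an arbitrary choice, not part of the recipe above.

```python
import re

_CHUNKS = re.compile(r'([0-9]+)|([^0-9]+)')

def natural_safe(text):
    """Illustrative variant of natural(): tag chunks so int and str never meet."""
    key = []
    for digits, other in _CHUNKS.findall(text.lower()):
        if digits:
            key.append((0, int(digits), ''))   # numeric chunk, sorts first
        else:
            key.append((1, 0, other))          # text chunk, sorts after numbers
    return key

names = ['s10x', 'S5x', '1a', 'a1', 's1x']
names_sorted = sorted(names, key=natural_safe)
print(names_sorted)
```

Each tuple starts with a type tag, so the tag decides the order whenever a number meets text, and the int or str payload is only compared against its own kind.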


Disclaimer: This is only meant to demonstrate the idea. You'll want to
search around for a 'natural' sort recipe or package that handles the
complexities of Unicode. It's probably not true that everything the
3.x re module considers to be a digit (the \d character class is
Unicode category [Nd]) will work with the int() constructor, so
instead I used [0-9] and [^0-9].
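
If you want to probe that concern on your own interpreter, a small check along these lines may help. It only tests two hand-picked characters, so it is a spot check rather than an answer to the general question: U+0663 is in category Nd, while U+00B2 is in category No and is included to show that str.isdigit() is broader than \d.

```python
import re

arabic_three = "\u0663"      # ARABIC-INDIC DIGIT THREE, Unicode category Nd
superscript_two = "\u00b2"   # SUPERSCRIPT TWO, Unicode category No

# Does \d match the character, and does int() accept it?
nd_matches_re = re.fullmatch(r"\d", arabic_three) is not None
nd_value = int(arabic_three)

no_matches_re = re.fullmatch(r"\d", superscript_two) is not None
no_isdigit = superscript_two.isdigit()
try:
    int(superscript_two)
    no_converts = True
except ValueError:
    no_converts = False

print(nd_matches_re, nd_value, no_matches_re, no_isdigit, no_converts)
```

Sticking to [0-9] as in the recipe sidesteps the whole issue, at the cost of treating non-ASCII digits as ordinary text.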