Re: Case-insensitive sorting of strings (Python newbie)

2015-01-23 Thread Chris Angelico
On Sat, Jan 24, 2015 at 6:14 AM, Marko Rauhamaa wrote:
> Well, if Python can't, then who can? Probably nobody in the world, not
> generically, anyway.
>
> Example:
>
> >>> print("re\u0301sume\u0301")
> résumé
> >>> print("r\u00e9sum\u00e9")
> résumé
> >>> print("re\u0301sume\u0
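The two strings visible in the quoted example print identically but are built from different code points. A minimal sketch of that difference, assuming Python 3 and the standard library's unicodedata module (the variable names are made up for illustration, and this is not necessarily where the truncated example was heading):

```python
import unicodedata

# 'e' followed by U+0301 COMBINING ACUTE ACCENT (decomposed form)
decomposed = "re\u0301sume\u0301"
# U+00E9 LATIN SMALL LETTER E WITH ACUTE (precomposed form)
precomposed = "r\u00e9sum\u00e9"

# Same rendering, different code point sequences.
print(decomposed == precomposed)          # False
print(len(decomposed), len(precomposed))  # 8 6

# Normalizing both to one form (NFC here) makes them compare equal.
nfc_a = unicodedata.normalize("NFC", decomposed)
nfc_b = unicodedata.normalize("NFC", precomposed)
print(nfc_a == nfc_b)                     # True
```

Normalization only settles the representation question; it does not by itself decide case folding or collation order.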

Re: Case-insensitive sorting of strings (Python newbie)

2015-01-23 Thread Marko Rauhamaa
Peter Otten <__pete...@web.de>:
> The standard recommendation is to convert bytes to unicode as early as
> possible and only manipulate unicode.

Unicode doesn't get you off the hook (as you explain later in your post). Upper/lowercase as well as collation order is ambiguous. Python even with dece

Re: Case-insensitive sorting of strings (Python newbie)

2015-01-23 Thread Steven D'Aprano
John Sampson wrote:
> I notice that the string method 'lower' seems to convert some strings
> (input from a text file) to Unicode but not others.

I don't think so. You're going to have to show an example. I *think* what you might be running into is an artifact of printing to a terminal, which ma

Re: Case-insensitive sorting of strings (Python newbie)

2015-01-23 Thread Chris Angelico
On Sat, Jan 24, 2015 at 4:53 AM, Peter Otten <__pete...@web.de> wrote:
> Now the same with unicode. To read text with a specific encoding use either
> codecs.open() or io.open() instead of the built-in (replace utf-8 with your
> actual encoding):
>
> import io
> for line in io.open("tmp.txt",
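For reference, a self-contained sketch of the io.open() approach quoted above. It assumes UTF-8 and writes its own sample file first, so the file contents and the temporary path are hypothetical and not taken from the original post:

```python
import io
import os
import tempfile

# Hypothetical sample data; the quoted post reads an existing "tmp.txt".
path = os.path.join(tempfile.mkdtemp(), "tmp.txt")
with io.open(path, "w", encoding="utf-8") as f:
    f.write("Äpfel\nbanana\nCherry\n")

# Decode at the boundary: every line read here is already text,
# so nothing downstream has to deal with raw bytes.
with io.open(path, encoding="utf-8") as f:
    lines = [line.rstrip("\n") for line in f]

print(lines)
```

In Python 3 the built-in open() is the same function as io.open(), so the explicit import mainly matters for code that also has to run on Python 2.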

Re: Case-insensitive sorting of strings (Python newbie)

2015-01-23 Thread Peter Otten
John Sampson wrote:
> I notice that the string method 'lower' seems to convert some strings
> (input from a text file) to Unicode but not others.
> This messes up sorting if it is used on arguments of 'sorted' since
> Unicode strings come before ordinary ones.
>
> Is there a better way of case-in

Re: Case-insensitive sorting of strings (Python newbie)

2015-01-23 Thread Michael Ströder
John Sampson wrote:
> I notice that the string method 'lower' seems to convert some strings (input
> from a text file) to Unicode but not others.
> This messes up sorting if it is used on arguments of 'sorted' since Unicode
> strings come before ordinary ones.

I doubt that. Can you provide a short

Case-insensitive sorting of strings (Python newbie)

2015-01-23 Thread John Sampson
I notice that the string method 'lower' seems to convert some strings (input from a text file) to Unicode but not others. This messes up sorting if it is used on arguments of 'sorted' since Unicode strings come before ordinary ones. Is there a better way of case-insensitive sorting of strings i
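One common way to sort case-insensitively, sketched here under the assumption of Python 3 (3.3 or later, where str.casefold exists) and with a made-up word list. It is not the answer given in the replies, and it ignores locale-specific collation:

```python
# Hypothetical input; in the original question the strings come from a text file.
words = ["banana", "Cherry", "apple", "Straße", "STRASSE"]

# Plain sorting compares code points, so uppercase letters sort first.
by_code_point = sorted(words)
print(by_code_point)   # ['Cherry', 'STRASSE', 'Straße', 'apple', 'banana']

# Sorting on a case-folded key ignores case; the original strings are kept.
ignoring_case = sorted(words, key=str.casefold)
print(ignoring_case)   # ['apple', 'banana', 'Cherry', 'Straße', 'STRASSE']
```

casefold() is a more aggressive lower(): it maps "ß" to "ss", so "Straße" and "STRASSE" get the same key and keep their input order because sorted() is stable.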