Re: [Tutor] UTF-8 title() string method

2007-07-05 Thread Jon Crump
On Thu, 5 Jul 2007, Kent Johnson wrote:
>> First, don't confuse unicode and utf-8.
>
> Too late ;-) already pitifully confused.
> This is a good place to start correcting that:
> http://www.joelonsoftware.com/articles/Unicode.html

Thanks for this, it's just what I needed!

> if s is your utf-8 s

Re: [Tutor] UTF-8 title() string method

2007-07-05 Thread Kent Johnson
Jon Crump wrote:
> On Wed, 4 Jul 2007, Kent Johnson wrote:
>> First, don't confuse unicode and utf-8.
>
> Too late ;-) already pitifully confused.

This is a good place to start correcting that:
http://www.joelonsoftware.com/articles/Unicode.html

>> Second, convert the string to unicode and then
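The advice quoted above (convert the string to unicode first, then work on it) might look roughly like the sketch below. It is written in Python 3 syntax, which is an assumption; the thread itself dates from Python 2, where the same idea would start from `s.decode('utf-8')` on a plain `str`. The helper name `title_utf8` is hypothetical and does not come from the thread.

```python
def title_utf8(raw: bytes) -> bytes:
    """Title-case UTF-8 encoded text (hypothetical helper, not from the thread)."""
    text = raw.decode("utf-8")      # UTF-8 bytes -> unicode string
    titled = text.title()           # case rules now treat Ê as a letter
    return titled.encode("utf-8")   # unicode string -> UTF-8 bytes


# Sample lines taken from the original post.
lines = [
    "ANVERS-LE-HOMONT, Maine.",
    "ANGOULÊME, Angoumois.",
    "ANDELY (le Petit), Normandie.",
]

place = []
for line in lines:
    raw = line.encode("utf-8")  # stand-in for the bytes read from the file
    place.append(title_utf8(raw.strip()).decode("utf-8"))

for entry in place:
    print(entry)
```

Note that title() capitalizes every word, so "(le Petit)" comes out as "(Le Petit)"; keeping the lowercase article would need extra handling that this sketch does not attempt.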

Re: [Tutor] UTF-8 title() string method

2007-07-04 Thread Kent Johnson
Terry Carroll wrote:
> I think setting the locale is the trick:
>
> >>> s1 = open("text.txt").readline()
> >>> print s1
> ANGOUL.ME, Angoumois.
> >>> print s1.title()
> Angoul.Me, Angoumois.
> >>> import locale
> >>> locale.setlocale(locale.LC_ALL,('french'))
> 'French_France.1252'
> >>> print s1.title(

Re: [Tutor] UTF-8 title() string method

2007-07-04 Thread Kent Johnson
Jon Crump wrote:
> Dear All,
>
> I have some utf-8 unicode text with lines like this:
>
> ANVERS-LE-HOMONT, Maine.
> ANGOULÊME, Angoumois.
> ANDELY (le Petit), Normandie.
>
> which I'm using as-is in this line of code:
>
> place.append(line.strip())
>
> What I would prefer would be something l

Re: [Tutor] UTF-8 title() string method

2007-07-04 Thread Jon Crump
Terry, thanks. Sadly, I'm still missing something. I've tried all the aliases in locale.py, most return

locale.Error: unsupported locale setting

one that doesn't is:

locale.setlocale(locale.LC_ALL, ('fr_fr'))
'fr_fr'

but if I set it thus it returns:

Angoul?äMe, Angoumois.

I'm running pyth

Re: [Tutor] UTF-8 title() string method

2007-07-03 Thread Terry Carroll
On Tue, 3 Jul 2007, Jon Crump wrote:
> but where there are diacritics involved, title() gives me:
>
> AngoulMe, Angoumois.
>
> Can anyone give the clueless a clue on how to manage such unicode strings
> more effectively?

I think setting the locale is the trick:

>>> s1 = open("text.txt").readlin

[Tutor] UTF-8 title() string method

2007-07-03 Thread Jon Crump
Dear All,

I have some utf-8 unicode text with lines like this:

ANVERS-LE-HOMONT, Maine.
ANGOULÊME, Angoumois.
ANDELY (le Petit), Normandie.

which I'm using as-is in this line of code:

place.append(line.strip())

What I would prefer would be something like this:

place.append(line.title().strip
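To illustrate why title() can misbehave on text like this, here is a small sketch of what happens when the method runs on the raw UTF-8 bytes rather than on decoded text. It uses Python 3 `bytes` as a stand-in for the Python 2 byte strings of the time, which is an assumption about the setup and not something stated in the post.

```python
raw = "ANGOULÊME, Angoumois.".encode("utf-8")

# Ê is a single character but two bytes in UTF-8.
e_circumflex = "Ê".encode("utf-8")
print(len(e_circumflex), e_circumflex)

# title() on bytes only knows ASCII letters. The two bytes of Ê count as
# non-letters, so the M that follows is treated as the start of a new word
# and keeps its capital, while the Ê bytes pass through unchanged.
byte_titled = raw.title()
print(byte_titled)

# The byte count and the character count differ for the same line.
print(len(raw), len(raw.decode("utf-8")))
```

The byte-level result still decodes as valid UTF-8, but the capital M in the middle of the word shows that the method never saw Ê as a letter.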