On Sat, Jul 12, 2014 at 11:27:17AM +0100, Alan Gauld wrote:
> On 12/07/14 10:28, Steven D'Aprano wrote:
>
> >If you're using Python 3.3 or higher, it is better to use
> >message.casefold rather than lower. For English, there's no real
> >difference:
> >...
> >but it can make a difference for non-English languages:
> >
> >py> "Große".lower()  # German for "great" or "large"
> >'große'
> >py> "Große".casefold()
> >'grosse'
>
> You learn something new etc...
>
> But I'm trying to figure out what difference this makes in
> practice?
>
> If you were targeting a German audience wouldn't you just test
> against the German alphabet? After all you still have to expect 'grosse'
> which isn't English, so if you know to expect grosse
> why not just test against große instead?

Because the person might have typed any of:

grosse
GROSSE
gROSSE
große
Große
GROßE
GROẞE

etc., and you want to accept them all, just like in English you'd want
to accept any of GREAT great gREAT Great gReAt etc. Hence you want to
fold everything to a single, known, canonical version. Case-fold will do
that, while lowercasing won't.

(The last of the German spellings includes a character which might not
be visible to many people, since it is quite unusual and not supported
by many fonts yet. If it looks like a box or empty space for you, it is
supposed to be capital sharp-s, matching the small sharp-s ß.)

Oh, here's another example of the difference, this one from Greek:

py> 'Σσς'.lower()  # three versions of sigma
'σσς'
py> 'Σσς'.upper()
'ΣΣΣ'
py> 'Σσς'.casefold()
'σσσ'

I suspect that there probably aren't a large number of languages where
casefold and lower do something different, since most languages don't
distinguish between upper and lower case at all. But there's no harm in
using it, since at worst it returns the same as lower().


-- 
Steven
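
To make the "fold everything to one canonical version" idea concrete, here is a minimal sketch, assuming Python 3.3 or higher. The helper name same_word is made up for illustration; only str.casefold() and str.lower() come from Python itself.

```python
def same_word(typed, expected):
    """Compare two strings ignoring case (hypothetical helper).

    Both sides are case-folded, so it does not matter which
    spelling the caller treats as the "expected" one.
    """
    return typed.casefold() == expected.casefold()


# The spellings a user might plausibly type.
variants = ["grosse", "GROSSE", "gROSSE", "große", "Große", "GROßE", "GROẞE"]

# Folding both sides accepts every spelling.
matches_casefold = [same_word(v, "große") for v in variants]

# Lowercasing only accepts spellings that already contain a sharp-s.
matches_lower = [v.lower() == "große".lower() for v in variants]

for v, folded, lowered in zip(variants, matches_casefold, matches_lower):
    print(v, folded, lowered)
```

With casefold() all seven spellings compare equal to große, whereas with lower() the three spellings written with "ss" do not match.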