Sorry Karen, my mistake for leaving that out. reTagnormalizer just filters out everything that isn't alphanumeric; the full code is below.
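As a rough illustration of the two steps involved (strip diacritics, then drop anything that isn't alphanumeric), here is a minimal Python 3 sketch. It is not the code from the application, which is Python 2. The names strip_diacritics and normalize_tag are hypothetical, and the sketch checks each character's Unicode combining class instead of matching explicit ranges with a regex.

```python
import re
import unicodedata

# Hypothetical names for illustration only; the application code uses
# remove_diacritics() and normalize().
NON_ALNUM = re.compile(r'[^a-zA-Z0-9]')


def strip_diacritics(text):
    """Decompose to NFD, then drop every combining mark."""
    decomposed = unicodedata.normalize('NFD', text)
    # unicodedata.combining() returns 0 for non-combining characters.
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_tag(tag):
    """Strip diacritics, remove non-alphanumerics, then lowercase."""
    return NON_ALNUM.sub('', strip_diacritics(tag)).lower()


print(normalize_tag(' café '))   # cafe
print(normalize_tag('%sss%s'))   # ssss
```

Given a correctly decoded Unicode string, ' café ' comes out as 'cafe', which is the result the doctest expects.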
Also, here's the error from manage.py test:

File "/restaurant/models.py", line 33, in mealadvisor.restaurant.models.normalize
Failed example:
    normalize(u' café ')
Expected:
    u'cafe'
Got:
    u'cafa'

***

import unicodedata, re

reTagnormalizer = re.compile(r'[^a-zA-Z0-9]')
reCombining = re.compile(u'[\u0300-\u036f\u1dc0-\u1dff\u20d0-\u20ff\ufe20-\ufe2f]', re.U)

def remove_diacritics(s):
    " Decomposes string, then removes combining characters "
    return reCombining.sub('', unicodedata.normalize('NFD', unicode(s)))

# tag normalizer
def normalize(tag):
    """
    >>> normalize(u'cafe')
    u'cafe'
    >>> normalize(u'caf e')
    u'cafe'
    >>> normalize(u' cafe ')
    u'cafe'

    For now this one is wrong. I think it's an error with doctest, not the
    actual function.

    >>> normalize(u' café ')
    u'cafe'
    >>> normalize(u'cAFe')
    u'cafe'
    >>> normalize(u'%sss%s')
    u'ssss'
    """
    try:
        tag = remove_diacritics(tag)
    except:
        pass

    tag = reTagnormalizer.sub('', tag).lower()
    return tag

On Dec 6, 9:42 pm, "Karen Tracey" <[EMAIL PROTECTED]> wrote:
> On Sat, Dec 6, 2008 at 9:00 PM, Dave Dash <[EMAIL PROTECTED]> wrote:
>
> > Okay I think that fixes one fundamental issue... I've got a unittest,
> > however that fails for a function:
> >
> > def normalize(tag):
> >     """
> >     >>> normalize(u'cafe')
> >     u'cafe'
> >     >>> normalize(u'caf e')
> >     u'cafe'
> >     >>> normalize(u' cafe ')
> >     u'cafe'
> >     >>> normalize(u' café ')
> >     u'cafe'
> >     >>> normalize(u'cAFe')
> >     u'cafe'
> >     >>> normalize(u'%sss%s')
> >     u'ssss'
> >     """
> >     try:
> >         tag = remove_diacritics(tag)
> >     except:
> >         pass
> >
> >     tag = reTagnormalizer.sub('', tag).lower()
> >     return tag
> >
> > It fails on the ' café' and translates it to cafa instead of cafe.
> > This is only through the unittest framework (doctest) since I can run
> > it from django shell and it works as intended.
> >
> > Is this just an issue with doctest?
>
> If I cut and paste your code and take out reTagnormalizer (since you didn't
> post that) and all the tests that seem to depend on what it does vs.
> remove_diacritics, and just test:
>
> """
> >>> normalize(u'café')
> u'cafe'
> """
>
> plain Python doctesting it works fine, as does 'manage.py test someapp' (if
> I put the code in someapp's models.py file).
>
> So I can't recreate the error you are reporting based on what you have
> posted. What's in reTagnormalizer?
>
> Karen