The Precise Pangolin has reached end of life, so this bug will not be
fixed for that release.

** Changed in: software-center (Ubuntu Precise)
       Status: Triaged => Won't Fix

-- 
You received this bug notification because you are a member of Ubuntu
Translations Coordinators, which is subscribed to Ubuntu Translations.
Matching subscriptions: Ubuntu Translations bug mail
https://bugs.launchpad.net/bugs/744914

Title:
  transliterate text/use collation before adding to xapian db and when
  searching

Status in Ubuntu Translations:
  New
Status in software-center package in Ubuntu:
  Triaged
Status in software-center source package in Precise:
  Won't Fix

Bug description:
  Binary package hint: software-center

  As of now software center uses str.lower() when searching in the
  xapian db:

  utils/query.py
  22:            s = search_term.lower()
  33:            query = xapian.Query(str_to_prefix[search_prefix]+search_term.lower())

  There are two problems with this:
  * many languages have diacritical marks on characters, but for fast
  typing users usually write just the base character (in Romanian,
  ăâșțî and ĂÂȘȚÎ are typed as aasti and AASTI by some users).

  * characters in Unicode can appear in two forms, composed and
  decomposed: the character U+00C7 (LATIN CAPITAL LETTER C WITH
  CEDILLA) can also be expressed as the sequence U+0043 (LATIN CAPITAL
  LETTER C) U+0327 (COMBINING CEDILLA).

  To solve both problems, both the text entered in the xapian db and
  the user's query text must be normalized.
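
  As a rough illustration of the composed/decomposed problem, here is a
  minimal sketch using only the standard unicodedata module. It assumes
  Python 3 strings, and normalize_term is a hypothetical helper name,
  not something that exists in software-center:

```python
import unicodedata


def normalize_term(text):
    """Hypothetical helper: bring text to NFC form, then lowercase it."""
    return unicodedata.normalize("NFC", text).lower()


composed = "\u00c7"          # LATIN CAPITAL LETTER C WITH CEDILLA
decomposed = "\u0043\u0327"  # LATIN CAPITAL LETTER C + COMBINING CEDILLA

# The two spellings render identically but compare unequal as raw strings.
print(composed == decomposed)

# After applying the same normalization to both, they compare equal.
print(normalize_term(composed) == normalize_term(decomposed))
```

  The same helper would have to be applied both when indexing and when
  building the query, otherwise the two sides still would not match.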

  The search function in Chromium uses ICU rules to achieve this:
  - http://code.google.com/p/chromium/issues/detail?id=1100
  - http://www.google.com/codesearch/p?hl=en#OAMlx_jo-ck/src/third_party/WebKit/Source/WebCore/editing/TextIterator.cpp&q=file:TextIterator.cpp&l=1882

  There is a python-icu library that could help achieve this. See for
  example
  http://lists.osafoundation.org/pipermail/pyicu-dev/2010-October/000214.html

  Or one could just remove the diacritical marks from the string
  altogether:
  http://stackoverflow.com/questions/517923/what-is-the-best-way-to-remove-accents-in-a-python-unicode-string
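
  A possible sketch of the "remove the marks" approach, again with only
  the standard library and assuming Python 3. The names strip_diacritics
  and fold_for_search are made up for this example. It decomposes the
  text (NFKD) and drops the combining marks, so it only helps for
  letters that have a Unicode decomposition; letters such as ø or ł
  would pass through unchanged:

```python
import unicodedata


def strip_diacritics(text):
    """Decompose text (NFKD) and drop every combining mark."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold_for_search(text):
    """Hypothetical folding step for both indexed text and queries."""
    return strip_diacritics(text).lower()


# Romanian letters from the description, written as escapes:
# a-breve, a-circumflex, s-comma-below, t-comma-below, i-circumflex.
lower_ro = "\u0103\u00e2\u0219\u021b\u00ee"
upper_ro = "\u0102\u00c2\u0218\u021a\u00ce"

print(strip_diacritics(lower_ro))
print(fold_for_search(upper_ro))
```

  In utils/query.py this kind of folding could take the place of the
  plain search_term.lower() call, provided the indexed text is folded
  the same way.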

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-translations/+bug/744914/+subscriptions


_______________________________________________
Mailing list: https://launchpad.net/~ubuntu-translations-coordinators
Post to     : ubuntu-translations-coordinators@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-translations-coordinators
More help   : https://help.launchpad.net/ListHelp
