Re case sensitivity, it was just a very simple example of the principle.
If a search doesn’t find every instance, then the search request and the index
were not normalized the same way. NFC and NFD are different normalizations.
Note, stripping diacritics may be an appropriate normalization.
JSword doesn’t properly h
Thanks, DM.
My question was not about case-sensitivity, but about Unicode normalization.
The main issue is composition vs decomposition and the canonical ordering of
diacritics in each glyph.
e.g. Suppose the module contains 181 instances of the name "Efraím" which has 6
characters.
Suppose a u
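To make the composition vs decomposition point concrete, here is a minimal sketch using Python's standard unicodedata module. It is not code from SWORD or any front-end. The name "Efraím" comes from the example above; the two-mark sequence at the end is a made-up illustration of canonical ordering.

```python
import unicodedata

# "Efraím" with a precomposed í (U+00ED): the NFC form, 6 code points.
name_nfc = unicodedata.normalize("NFC", "Efra\u00edm")
# The NFD form splits í into i + U+0301 (combining acute): 7 code points.
name_nfd = unicodedata.normalize("NFD", name_nfc)

print(len(name_nfc), len(name_nfd))   # 6 7
print(name_nfc == name_nfd)           # False: same appearance, different code points

# A plain substring search for one form does not find the other.
found = name_nfc in ("the tribe of " + name_nfd)
print(found)                          # False

# Canonical ordering: the same two marks typed in a different order
# (hypothetical input) only compare equal after normalization.
typed_a = "a\u0301\u0323"   # acute, then dot below
typed_b = "a\u0323\u0301"   # dot below, then acute
same_raw = typed_a == typed_b
same_nfd = unicodedata.normalize("NFD", typed_a) == unicodedata.normalize("NFD", typed_b)
print(same_raw, same_nfd)             # False True
```

So a module stored in one form and a search string typed in the other can miss every instance, even though both display identically.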
It doesn’t matter that a search doesn’t use Lucene. The principle is the same.
The search request has to be normalized to the same form as the searched text.
For example, a case-insensitive search normalizes both to a single case. If that
isn’t done, even if only on the fly, then the search will fail at times.
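The principle can be sketched in a few lines of Python for a search that does not use an index at all. This is only an illustration under assumptions: the function names are hypothetical, the choice of NFC plus case folding is arbitrary, and it is not how any particular front-end implements its search.

```python
import unicodedata

def normalize_for_search(s, form="NFC"):
    """Fold case, then apply one Unicode normalization form.

    Which form is chosen matters less than using the same one on both sides.
    """
    return unicodedata.normalize(form, s.casefold())

def count_matches(text, query):
    """Count occurrences of query in text after normalizing both on the fly."""
    return normalize_for_search(text).count(normalize_for_search(query))

# Sample text mixing a composed, a decomposed and an upper-case spelling.
text = "Efra\u00edm ... Efrai\u0301m ... EFRA\u00cdM"
query = "efrai\u0301m"   # decomposed, lower case, as a user might type it

raw_hits = text.count(query)
normalized_hits = count_matches(text, query)
print(raw_hits, normalized_hits)   # 0 3
```

The raw comparison finds nothing, while the normalized comparison finds all three spellings.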
Thanks, DM.
Not all searches make use of the Lucene index!
e.g. In Xiphos, the advanced search panel gives the user a choice of which type
of search.
Lucene is only one of these mutually exclusive options.
btw. Where is it documented that the creation of a Lucene search index
normalizes the Un
The requirement is not that the search is normalized to NFC but rather that it
is normalized the same way as the index. This should not be a front-end issue.
Btw, it doesn’t matter how Hebrew is stored in the module. Indexing should
normalize it to a form that is internal to the engine.
— DM Smith
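As a rough illustration of keeping normalization internal to the engine, here is a toy inverted index in Python. Everything in it is hypothetical (the class, the helper, the keys "v1" and "v2"), and it is not how Lucene, SWORD or JSword work. The internal form used here (NFD, combining marks stripped, case folded) is just one possible choice; for pointed Hebrew it would also drop vowel points and accents, which may or may not be what is wanted.

```python
import unicodedata
from collections import defaultdict

def _internal_form(word):
    """Engine-internal form: decompose, drop combining marks, fold case."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

class ToyIndex:
    """Toy inverted index; callers never see or choose the internal form."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, key, text):
        # Index time: every word is stored under its internal form.
        for word in text.split():
            self._postings[_internal_form(word)].add(key)

    def search(self, word):
        # Query time: the request goes through the same function.
        return sorted(self._postings.get(_internal_form(word), set()))

index = ToyIndex()
index.add("v1", "Efra\u00edm")     # stored composed in the module
index.add("v2", "Efrai\u0301m")    # stored decomposed in the module
hits = index.search("EFRAIM")      # front-end passes the string untouched
print(hits)                        # ['v1', 'v2']
```

Because the same function runs at index time and at query time, the front-end can pass the search string through unchanged, however the module text happens to be stored.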
Dear all,
Not all front-ends automatically normalize the search string to Unicode NFC.
e.g.
- Eloquent does
- Xiphos does not
The data is incomplete for this feature in the table in our wiki page.
https://wiki.crosswire.org/Choosing_a_SWORD_program#Search_and_Dictionary
Please would other front