We discovered in Wikisource that a search for
kvibille doesn't produce hits for the older
spelling qvibille (with Q instead of K).

Also, searches for jern and järn produce two
non-overlapping result sets, whereas a search
for järnvåg produces hits for järnväg, so
apparently å-ä are treated as similar (harmful!)
while e-ä are not treated as similar (bad!).

The missing e-ä and q-k matching is a problem in
Wikisource, since its old texts predate these spelling
reforms. It is not a problem in the all-modern
Wikipedia, but making e-ä and q-k match there too
would most probably be harmless.

Is there any way we can change the behaviour?
Per language and project? Or does it need to be
changed in the Solr/Lucene distribution?

What is the current configuration and what sort
of input is needed for an improvement? A short
list of matching letters, or a full dictionary?
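
To illustrate what I mean by a "short list of matching
letters", here is a minimal sketch in Python. It is not
how Lucene is configured today (I don't know that), and
the names FOLD and fold are made up for this example. It
only assumes that the same folding would be applied both
when indexing and when querying.

```python
import unicodedata

# Hypothetical folding table (an assumption for illustration only):
# each letter on the left is replaced by the letter on the right.
# Only the two pairs discussed in this mail are listed; å is left alone.
FOLD = str.maketrans({
    "q": "k",        # older spelling qvibille -> kvibille
    "\u00e4": "e",   # ä -> e, so that järn and jern share one key
})


def fold(term: str) -> str:
    """Return a normalized key to use for both indexing and searching."""
    # NFC makes sure ä is one code point, not a + combining diaeresis.
    composed = unicodedata.normalize("NFC", term)
    return composed.lower().translate(FOLD)


if __name__ == "__main__":
    for word in ["qvibille", "kvibille", "jern", "järn", "järnvåg", "järnväg"]:
        print(word, "->", fold(word))
```

With such a table, qvibille and kvibille get the same
key, and so do jern and järn, while järnvåg and järnväg
stay apart because å is not in the list.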


-- 
  Lars Aronsson (l...@aronsson.se)
  Aronsson Datateknik - http://aronsson.se



_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
