dcausse added a comment.

While reading the Elasticsearch 5 breaking-change notes, I realized that they've added a hard limit on the number of fields in a mapping. The limit is 1000 by default. It can be raised in the config (the `index.mapping.total_fields.limit` index setting), but we might still want to think of an alternative here, just in case.
The idea would be to move the language bits to a lower level, directly into the content: F5304881: completion_with_no_specific_language_field.txt
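To get a feel for how quickly per-language fields approach that limit, here is a rough sketch in Python. The field names (`labels`, `prefix`), the figure of 600 languages and the raised value of 2000 are made up for illustration, and the counting only approximates what Elasticsearch does internally. `index.mapping.total_fields.limit` is the index setting that controls the limit.

```python
DEFAULT_TOTAL_FIELDS_LIMIT = 1000  # Elasticsearch 5 default


def count_mapping_fields(properties):
    """Roughly count fields in a mapping: every field, object and multi-field counts as one."""
    total = 0
    for field in properties.values():
        total += 1
        total += count_mapping_fields(field.get("properties", {}))
        total += count_mapping_fields(field.get("fields", {}))
    return total


def per_language_properties(languages):
    """Hypothetical mapping with one text field (plus one sub-field) per language."""
    return {
        "labels": {
            "properties": {
                lang: {"type": "text", "fields": {"prefix": {"type": "text"}}}
                for lang in languages
            }
        }
    }


languages = ["lang%d" % i for i in range(600)]  # made-up language codes
total_fields = count_mapping_fields(per_language_properties(languages))
fits_default_limit = total_fields <= DEFAULT_TOTAL_FIELDS_LIMIT

# Index settings body that would raise the limit (2000 is an arbitrary example value).
raised_limit_settings = {"index.mapping.total_fields.limit": 2000}
```

With two mapping entries per language, the default is exceeded well before 600 languages, so raising the setting only buys headroom rather than removing the problem.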
It's unclear to me what would be the best approach here.
The advantage of language-specific fields is that we are able to tweak the analysis chain. For completion search, the only thing I can think of is tuning the list of diacritics we want to fold (e.g. do not fold ö to o for Finnish), but because of the fallbacks I'm not sure it makes sense anyway.
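As an illustration of what tuning the folding list means, here is a minimal Python sketch using only the standard library. It is not the Elasticsearch analysis chain; the function name `fold_diacritics` and the `keep` parameter are hypothetical, and the set of characters preserved for Finnish is an assumption made for the example.

```python
import unicodedata


def fold_diacritics(text, keep=""):
    """Strip combining marks from every character except those listed in `keep`."""
    folded = []
    for ch in unicodedata.normalize("NFC", text):
        if ch in keep:
            # Language-specific exception: leave this character untouched.
            folded.append(ch)
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        folded.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(folded)


FINNISH_KEEP = "äöÄÖ"  # assumed exception list, for illustration only

generic = fold_diacritics("Jyväskylä café")
finnish = fold_diacritics("Jyväskylä café", keep=FINNISH_KEEP)
```

Here `generic` folds everything, while `finnish` keeps ä and ö but still folds é.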
The advantage of non-specific fields is that we do not have to change the mapping when we add a new language; everything is data.
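The two options can be contrasted with a small Python sketch. All field names (`suggest_<lang>`, `suggest`) and the use of a `language` category context on a completion field are assumptions for illustration; this is not necessarily the layout proposed in the attached file.

```python
def per_language_mapping(languages):
    """One completion field per language: the mapping grows with each new language."""
    return {"suggest_" + lang: {"type": "completion"} for lang in languages}


def generic_mapping():
    """A single completion field; the language travels with the data as a context."""
    return {
        "suggest": {
            "type": "completion",
            "contexts": [{"name": "language", "type": "category"}],
        }
    }


def per_language_doc(labels):
    """Document shape for the per-language mapping."""
    return {"suggest_" + lang: {"input": [label]} for lang, label in labels.items()}


def generic_doc(labels):
    """Document shape for the generic mapping: language is part of each entry."""
    return {
        "suggest": [
            {"input": [label], "contexts": {"language": [lang]}}
            for lang, label in labels.items()
        ]
    }


labels = {"en": "Germany", "fi": "Saksa"}

# Adding a language changes the per-language mapping...
mapping_changed = per_language_mapping(["en", "fi"]) != per_language_mapping(["en", "fi", "sv"])
# ...while the generic mapping has a fixed number of fields regardless of languages.
generic_field_count = len(generic_mapping())
```

The trade-off is the one described above: the generic layout never needs a mapping update, but every language shares the same analysis chain.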

Concerning fulltext, nothing is settled yet, but the idea would be to create language-specific fields only for languages where we know we have a "good" analysis chain. The details of fulltext ranking with regard to language are still unclear to me.


TASK DETAIL
https://phabricator.wikimedia.org/T150891

To: aude, dcausse
Cc: Lydia_Pintscher, Jan_Dittrich, EBernhardson, dcausse, hoo, Ricordisamoa, aude, Deskana, StudiesWorld, Aklapper, Smalyshev, Tobi_WMDE_SW, thiemowmde, JanZerebecki, gerritbot, Jonas, daniel, EBjune, mschwarzer, Avner, debt, Gehel, D3r1ck01, FloNight, Izno, Wikidata-bugs, jayvdb, Mbch331, jeremyb