https://bugs.documentfoundation.org/show_bug.cgi?id=95274
--- Comment #58 from Jonathan Clark <jonat...@libreoffice.org> ---
(In reply to Heiko Tietze from comment #57)
> Jonathan, do you have an opinion on this?

My vote is for the following, in this order:

1.) Languages that already exist in the current document. If a language has already been used in a document, I think it's likely that it will be used again. Those languages should be positioned prominently.

2.) Languages that the user has *explicitly* specified as languages they understand and intend to use with LibreOffice. The user is the best judge of this, so we should give them the opportunity to tell us. I don't think it will ever be possible to guess this right every time for every user.

3.) Languages derived heuristically from LO configuration: default languages for documents, user interface language. I think installed spellcheck dictionaries are a weak signal. My LO Snap install was bundled with dictionaries for many languages I don't know, and I'd rather not see them in this list.

4.) Languages derived heuristically from system configuration. This is a reasonable starting point, but is only a rough guess. There are many reasons why a user's system configuration might not reflect all of their languages. For an extreme example, consider Linux: English-primary users can't set a second language on Linux. If you try, it will break localization in most gettext programs. An English-locale Linux user with a US international keyboard might need to regularly work in dozens of languages, but we'd have no way of knowing which ones.

Regarding libexttextcat: Based on the above discussion, am I correct that this is being used on individual words? Given cognates and loanwords, I don't expect that classifying individual words can ever be reliable. The docs say they expect "hundreds of bytes" of input, which seems more reasonable. Instead of using it to generate confusing recommendations, perhaps it could be used to recommend the best match from among the high-signal candidates?
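
To make the proposed ordering concrete, here is a rough Python sketch of a tiered merge, plus one possible reading of the "best match from the high-signal candidates" idea. This is only an illustration, not LibreOffice code: the function names, the four input lists, and the `guess_scores` mapping are all hypothetical, and nothing here reflects the actual libexttextcat API.

```python
def rank_languages(document_langs, user_langs, config_langs, system_langs):
    """Merge the four sources into one list, highest-priority source first.

    Each argument is a list of language tags (hypothetical inputs).
    A language that appears in several sources keeps its earliest position.
    """
    ranked = []
    seen = set()
    for tier in (document_langs, user_langs, config_langs, system_langs):
        for lang in tier:
            if lang not in seen:
                seen.add(lang)
                ranked.append(lang)
    return ranked


def best_match(guess_scores, candidates):
    """Pick the candidate with the highest guesser score.

    guess_scores is a hypothetical {language tag: score} mapping produced by
    some text classifier. Languages outside the candidate list are ignored,
    and ties go to the candidate that ranks earlier.
    """
    best = None
    best_score = float("-inf")
    for lang in candidates:
        score = guess_scores.get(lang)
        if score is not None and score > best_score:
            best, best_score = lang, score
    return best


candidates = rank_languages(
    ["de-DE", "fr-FR"],   # 1.) already used in the document
    ["fr-FR", "ja-JP"],   # 2.) explicitly chosen by the user
    ["en-US"],            # 3.) derived from LO configuration
    ["en-US", "de-DE"],   # 4.) derived from system configuration
)
suggestion = best_match({"nl-NL": 0.9, "de-DE": 0.6, "fr-FR": 0.6}, candidates)
```

With these made-up inputs the guesser's top pick (nl-NL) is dropped because it is not a high-signal candidate, and the suggestion falls back to the highest-ranked candidate that the guesser also considers plausible.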