[
https://issues.apache.org/jira/browse/SOLR-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736161#comment-14736161
]
Hoss Man commented on SOLR-8014:
--------------------------------
bq. ...too many options will only confuse people, one implementation that does
the job properly should be enough.
I think Jan's point is that _this_ issue should remain focused on updating
LangDetectLanguageIdentifierUpdateProcessor to use the more up-to-date fork of
the library it's already using, because that should be fairly straightforward.
We already ship/support 2 implementations of LanguageIdentifierUpdateProcessor;
adding a 3rd based on langid-java (and/or deprecating/removing either of the
ones we have already) should be done as distinct issue(s) ... particularly
since deciding which ones are the best will probably take more
time/effort/consideration than just upgrading the langdetect lib.
> Replace langdetect lib by more updated fork
> -------------------------------------------
>
> Key: SOLR-8014
> URL: https://issues.apache.org/jira/browse/SOLR-8014
> Project: Solr
> Issue Type: Improvement
> Components: contrib - LangId
> Reporter: Jan Høydahl
>
> The language-detection library we use is
> https://code.google.com/p/language-detection/ version 1.1 from 2012. The
> project has stalled with no new development, not even in the [github
> repo](https://github.com/shuyo/language-detection) the original author put up.
> Looks like the most promising fork is this one
> https://github.com/optimaize/language-detector/ which is also being selected
> by the Tika project to replace Tika's old detector.