[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721159#comment-14721159
]
Ken Krugler commented on TIKA-369:
--
Initial results from integrating language-detector (see
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342531#comment-14342531
]
Ken Krugler commented on TIKA-369:
--
Hi Tyler - detection speed is an issue, but Tika also
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342533#comment-14342533
]
Tyler Palsulich commented on TIKA-369:
--
Thanks, Ken! In that case, I definitely agree.
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341994#comment-14341994
]
Tyler Palsulich commented on TIKA-369:
--
Is there any update on this? Language detection
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581315#comment-13581315
]
Ken Krugler commented on TIKA-369:
--
Some questions then about integrating
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573492#comment-13573492
]
Michael McCandless commented on TIKA-369:
-
The language-detection lib is now in
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573541#comment-13573541
]
Ted Dunning commented on TIKA-369:
--
It is hard to object, but it would be good to replicate
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573670#comment-13573670
]
Ken Krugler commented on TIKA-369:
--
I've been using language-detection in another project
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573753#comment-13573753
]
Robert Muir commented on TIKA-369:
--
The DetectorFactory is definitely gnarly, but you can
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499838#comment-13499838
]
Michael McCandless commented on TIKA-369:
-
+1 to cut over to
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499847#comment-13499847
]
Pander Musubi commented on TIKA-369:
language-detection uses variable-length n-grams.
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499479#comment-13499479
]
Pander Musubi commented on TIKA-369:
+1 for using
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211424#comment-13211424
]
Christian Moen commented on TIKA-369:
-
Does anyone have any thoughts on how we should
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143436#comment-13143436
]
Joseph Vychtrle commented on TIKA-369:
--
Imho the CERTAINTY_LIMIT is too rigorous. I was
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143676#comment-13143676
]
Joseph Vychtrle commented on TIKA-369:
--
Wouldn't it be better if the field wasn't
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055242#comment-13055242
]
Jan Høydahl commented on TIKA-369:
--
Any new thoughts on this one? Seems like LUCENE-826
16 matches