Hello!
21.03.2012 19:51, Julien Nioche wrote:
Just wondering about the best way to make the language detection pluggable
instead of having it hard-wired as it is now. We know that the resources
that are currently in Tika are both slow and inaccurate [1] and there are
other libraries that we could leverage. Why not have the option to select
a different implementation just like we do for parsers? Obviously we'd need
a common interface for the parsers etc...
What do you think?
I think it would be nice to have this API, since we are currently using a patched Tika
with our own charset/language detector.
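To make the idea concrete, here is a rough sketch of what a pluggable detector could look like. Everything in it is an assumption for discussion only: the LanguageDetector interface, the Registry class and the toy StopWordDetector are hypothetical names, not existing Tika APIs, and a real version would probably discover implementations the same way parsers are discovered rather than through a hand-filled map.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class PluggableDetection {

    /** Hypothetical common interface; NOT an existing Tika API. */
    public interface LanguageDetector {
        /** Returns a language code such as "en", or "unknown". */
        String detect(String text);
    }

    /** Toy implementation that counts a few stop words. Illustration only. */
    public static class StopWordDetector implements LanguageDetector {
        @Override
        public String detect(String text) {
            // Pad with spaces so words at the start and end also match.
            String padded = " " + text.toLowerCase(Locale.ROOT) + " ";
            int en = count(padded, " the ") + count(padded, " and ");
            int fr = count(padded, " le ") + count(padded, " et ");
            if (en == fr) {
                return "unknown";
            }
            return en > fr ? "en" : "fr";
        }

        // Counts occurrences of needle in haystack, allowing shared spaces.
        private static int count(String haystack, String needle) {
            int n = 0;
            int i = haystack.indexOf(needle);
            while (i >= 0) {
                n++;
                i = haystack.indexOf(needle, i + 1);
            }
            return n;
        }
    }

    /** Hypothetical registry that lets callers select a detector by name. */
    public static class Registry {
        private final Map<String, LanguageDetector> detectors = new LinkedHashMap<>();

        public void register(String name, LanguageDetector detector) {
            detectors.put(name, detector);
        }

        public LanguageDetector get(String name) {
            LanguageDetector d = detectors.get(name);
            if (d == null) {
                throw new IllegalArgumentException("No detector registered as: " + name);
            }
            return d;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register("stopword", new StopWordDetector());
        // The caller depends only on the interface, so implementations can be swapped.
        LanguageDetector detector = registry.get("stopword");
        System.out.println(detector.detect("the cat and the dog"));
    }
}
```

With something like this, our own detector would become just another registered implementation and we would not need to patch Tika.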
best wishes, Max