Tim Allison created TIKA-3213:
---------------------------------

             Summary: Consider migrating universalcharsetdetector to a live fork
                 Key: TIKA-3213
                 URL: https://issues.apache.org/jira/browse/TIKA-3213
             Project: Tika
          Issue Type: Task
            Reporter: Tim Allison

I just came across this living fork of the aged juniversalchardet (2011!!!): https://github.com/albfernandez/juniversalchardet

It has a Mozilla license, has a decent star count, and is published on Maven Central.

Obviously, we'll want to run a comparison on our corpus before making this change, but I wanted to open this issue for discussion.