[ https://issues.apache.org/jira/browse/TIKA-2484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison updated TIKA-2484:
--
Attachment: charset.zip
File from [~AndreasMeier]
> Improve CharsetDetector to recognize UTF-16LE/BE,UTF
[ https://issues.apache.org/jira/browse/TIKA-2484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andreas Meier updated TIKA-2484:
Description:
I would like to help improve the recognition accuracy of the CharsetDetector.
Theref
[ https://issues.apache.org/jira/browse/TIKA-2484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andreas Meier updated TIKA-2484:
Attachment: IUC10-ar.UTF-7.with-BOM
IUC10-ar.UTF-7.without-BOM
IUC10-a