[ https://issues.apache.org/jira/browse/TIKA-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison updated TIKA-4196:
------------------------------
    Description: The ICU4j and the StandardHtmlEncodingDetector detectors include a bom detector, but for some use cases it would be useful to factor that out and allow users to configure bom detection on their own.  (was: The ICU4j detector uses a bom detector, but for some use cases it would be useful to factor that out and allow users to configure bom detection on their own.)

> Add a BOM charset detector
> --------------------------
>
>                 Key: TIKA-4196
>                 URL: https://issues.apache.org/jira/browse/TIKA-4196
>             Project: Tika
>          Issue Type: New Feature
>            Reporter: Tim Allison
>            Priority: Trivial
>
> The ICU4j and the StandardHtmlEncodingDetector detectors include a bom
> detector, but for some use cases it would be useful to factor that out and
> allow users to configure bom detection on their own.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)