Support for IBM866 (CP866) encoding in TXTParser
Key: TIKA-574
URL: https://issues.apache.org/jira/browse/TIKA-574
Project: Tika
Issue Type: Improvement
Components: parser
Affect
[ https://issues.apache.org/jira/browse/TIKA-574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
gross updated TIKA-574:
---
Attachment: tika-0.8-cp866.patch
I've used the ngrams from cp1251 and written a custom byteMap. All Russian letters,
used in c
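
To illustrate the idea of reusing cp1251 ngrams through a custom byte map, here is a minimal sketch. It is not the content of the attached patch: the class and method names are hypothetical, and it assumes the byte map folds every CP866 Russian letter onto the lowercase windows-1251 byte for the same letter, with all other non-ASCII bytes mapped to a space.

```java
import java.util.Arrays;

// Hypothetical sketch, not the attached patch.
final class Cp866ByteMapSketch {

    // Builds a 256-entry table: CP866 byte -> lowercase windows-1251 byte.
    static int[] buildByteMap() {
        int[] map = new int[256];
        Arrays.fill(map, 0x20);                 // default: treat as a space
        for (int b = 'a'; b <= 'z'; b++) {
            map[b] = b;                         // ASCII lowercase stays as-is
            map[b - 32] = b;                    // ASCII uppercase is folded to lowercase
        }
        for (int i = 0; i < 32; i++) {
            map[0x80 + i] = 0xE0 + i;           // CP866 uppercase Cyrillic (0x80-0x9F)
        }
        for (int i = 0; i < 16; i++) {
            map[0xA0 + i] = 0xE0 + i;           // CP866 lowercase, first half (0xA0-0xAF)
            map[0xE0 + i] = 0xF0 + i;           // CP866 lowercase, second half (0xE0-0xEF)
        }
        map[0xF0] = 0xB8;                       // CP866 capital IO -> cp1251 small io
        map[0xF1] = 0xB8;                       // CP866 small io   -> cp1251 small io
        return map;
    }

    // Applies the byte map so that ngram statistics built for cp1251 can be matched.
    static byte[] normalize(byte[] cp866) {
        int[] map = buildByteMap();
        byte[] out = new byte[cp866.length];
        for (int i = 0; i < cp866.length; i++) {
            out[i] = (byte) map[cp866[i] & 0xFF];
        }
        return out;
    }

    public static void main(String[] args) {
        // "Привет" encoded in CP866
        byte[] input = {(byte) 0x8F, (byte) 0xE0, (byte) 0xA8, (byte) 0xA2, (byte) 0xA5, (byte) 0xE2};
        for (byte b : normalize(input)) {
            System.out.printf("%02x ", b & 0xFF);
        }
        System.out.println();
    }
}
```

With a table like this, a detector only needs a new byte map per encoding, while the language-specific ngram list can be shared.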
[ https://issues.apache.org/jira/browse/TIKA-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035265#comment-13035265 ]
Henning Gross commented on TIKA-573:
I have a use case for getting all known valid exten
[ https://issues.apache.org/jira/browse/TIKA-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035317#comment-13035317 ]
Henning Gross commented on TIKA-573:
We will use a proxy class deserializing the XML for
The MimeType class contains a String with an accessor named Extension. This should
be a List Extensions, for several reasons.
Key: TIKA-661
[ https://issues.apache.org/jira/browse/TIKA-661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Henning Gross updated TIKA-661:
---
Attachment: MimeType.getExtensionsPatch.txt
Patch that adds getExtensions() and addExtension() as well
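
The shape of the change requested in TIKA-661 might look roughly like the sketch below. This is not the attached patch: MimeTypeExtensionsSketch is a hypothetical stand-in for MimeType, and the choice to keep a single-value getExtension() that returns the first registered extension is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for MimeType, not the real Tika class.
final class MimeTypeExtensionsSketch {
    private final String name;
    private final List<String> extensions = new ArrayList<>();

    MimeTypeExtensionsSketch(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Registers an extension; duplicates are ignored so order reflects first registration.
    void addExtension(String extension) {
        if (!extensions.contains(extension)) {
            extensions.add(extension);
        }
    }

    // Assumed behavior: the first registered extension is the preferred one.
    String getExtension() {
        return extensions.isEmpty() ? "" : extensions.get(0);
    }

    // Read-only view of all known extensions for this type.
    List<String> getExtensions() {
        return Collections.unmodifiableList(extensions);
    }

    public static void main(String[] args) {
        MimeTypeExtensionsSketch jpeg = new MimeTypeExtensionsSketch("image/jpeg");
        jpeg.addExtension(".jpg");
        jpeg.addExtension(".jpeg");
        System.out.println(jpeg.getName() + " -> " + jpeg.getExtensions());
    }
}
```

Returning an unmodifiable view keeps callers from changing the registered extensions behind the type's back.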
[ https://issues.apache.org/jira/browse/TIKA-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035463#comment-13035463 ]
Henning Gross commented on TIKA-573:
Well, the consumer is a portlet that allows uploads
[ https://issues.apache.org/jira/browse/TIKA-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035465#comment-13035465 ]
Henning Gross commented on TIKA-573:
https://issues.apache.org/jira/browse/TIKA-661
> M