[ https://issues.apache.org/jira/browse/TIKA-86?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186898#comment-13186898 ]
Nick Burch commented on TIKA-86:
I'm not sure if we still need this, as the Tika mimetypes fi
[ https://issues.apache.org/jira/browse/TIKA-86?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186938#comment-13186938 ]
Andrew Jackson commented on TIKA-86:
The file command comes with signatures for a lot mor
[ https://issues.apache.org/jira/browse/TIKA-86?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186948#comment-13186948 ]
Nick Burch commented on TIKA-86:
Turning the file magic into a Tika XML match shouldn't be to
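As a rough illustration of the conversion discussed here, the sketch below turns one simple `string` entry from a file(1) magic file into a Tika-style `<match>` element. The class and method names are hypothetical, only top-level `string` entries with a numeric offset are handled, and escape sequences in the value are passed through untouched, so this is a starting point under those assumptions rather than a complete converter.

```java
// Hypothetical helper, not part of Tika: converts one simple magic(5) line
// such as "0 string %PDF- PDF document" into a Tika-style <match> element.
class MagicToTikaMatch {

    // Escape the characters that are unsafe inside an XML attribute value.
    static String xmlEscape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Fields of a magic line: offset, type, test value, description.
    // Assumption: only top-level "string" tests are supported; anything
    // else (continuation lines, numeric types, masks) is rejected.
    static String toMatchElement(String magicLine) {
        String[] fields = magicLine.trim().split("\\s+", 4);
        if (fields.length < 3 || !fields[1].equals("string")) {
            throw new IllegalArgumentException("unsupported magic line: " + magicLine);
        }
        int offset = Integer.decode(fields[0]); // accepts decimal, 0x hex, and octal
        return "<match value=\"" + xmlEscape(fields[2])
                + "\" type=\"string\" offset=\"" + offset + "\"/>";
    }
}
```

For example, `MagicToTikaMatch.toMatchElement("0\tstring\t%PDF-\tPDF document")` yields `<match value="%PDF-" type="string" offset="0"/>`. The description field is dropped here; mapping it to a MIME type name would still need to be done by hand or from a lookup table.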
[ https://issues.apache.org/jira/browse/TIKA-86?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187018#comment-13187018 ]
Andrew Jackson commented on TIKA-86:
We've done some work in this area, and noticed that
[ https://issues.apache.org/jira/browse/TIKA-86?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187026#comment-13187026 ]
Nick Burch commented on TIKA-86:
RegEx magic could be interesting, with a bit of care to ensu
[ https://issues.apache.org/jira/browse/TIKA-86?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187049#comment-13187049 ]
Ken Krugler commented on TIKA-86:
For regex magic, I'd recommend compiling it into an FSM - e.g. u
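To make the FSM idea concrete, here is a minimal hand-built sketch of a deterministic state machine for one hypothetical pattern (`%PDF-`, a digit, a dot, a digit, anchored at offset 0). It is not generated by any regex library and is not the approach the comment goes on to name; it only shows why an FSM suits magic detection: each header byte is inspected once, with no backtracking, and matching stops as soon as the outcome is known.

```java
// Hypothetical example: a hand-written DFA for the anchored pattern
// "%PDF-" digit "." digit, run over the leading bytes of a file.
class MagicDfa {
    static final int REJECT = -1;
    static final int ACCEPT = 8;

    static boolean isDigit(int b) {
        return b >= '0' && b <= '9';
    }

    // Transition function: the current state plus one input byte
    // gives the next state.
    static int step(int state, int b) {
        switch (state) {
            case 0: return b == '%' ? 1 : REJECT;
            case 1: return b == 'P' ? 2 : REJECT;
            case 2: return b == 'D' ? 3 : REJECT;
            case 3: return b == 'F' ? 4 : REJECT;
            case 4: return b == '-' ? 5 : REJECT;
            case 5: return isDigit(b) ? 6 : REJECT;
            case 6: return b == '.' ? 7 : REJECT;
            case 7: return isDigit(b) ? ACCEPT : REJECT;
            default: return REJECT;
        }
    }

    // Single pass over the header; stops at the first accept or reject.
    static boolean matches(byte[] header) {
        int state = 0;
        for (byte raw : header) {
            state = step(state, raw & 0xFF); // treat the byte as unsigned
            if (state == ACCEPT) return true;
            if (state == REJECT) return false;
        }
        return false; // ran out of input before reaching ACCEPT
    }
}
```

A real implementation would generate the transition table from the regex rather than writing it by hand, but the matching loop would look much the same.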
[ https://issues.apache.org/jira/browse/TIKA-86?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188398#comment-13188398 ]
Andrew Jackson commented on TIKA-86:
I've added a new ticket concerning RegEx support her