[
https://issues.apache.org/jira/browse/TIKA-561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Antoni Mylka updated TIKA-561:
--
Attachment: tika-561.patch
A patch which contains the modifications and the test file. It overlaps with m
Support EMLX file detection
---
Key: TIKA-561
URL: https://issues.apache.org/jira/browse/TIKA-561
Project: Tika
Issue Type: Improvement
Reporter: Antoni Mylka
Apple Mail generates email files in .emlx fo
[
https://issues.apache.org/jira/browse/TIKA-560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Antoni Mylka updated TIKA-560:
--
Attachment: test-documents.zip
            tika-560.patch
A patch with my solution proposal, and the
Improve detection of .mht, Foxmail, and OOXML files
---
Key: TIKA-560
URL: https://issues.apache.org/jira/browse/TIKA-560
Project: Tika
Issue Type: Improvement
Reporter: Antoni Mylk
Hi Ben,
Great! I still haven't found the time to work on Nick's suggestions but you
can definitely work on the tests if you want to and add some of the emails
you mentioned. Having some cases of multipart with HTML and txt content +
images and attachments would be good.
Thanks
Julien
On 25 Nove
Hello,
I am working on a project with rfc-822 email messages and ran into the problem
discussed in TIKA-461. I'd be interested in helping this story along, if there
is anything more to be done. In particular, I have a pile of public domain
emails that might be useful for testing.
Thanks,
Ben D
[
https://issues.apache.org/jira/browse/TIKA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12935788#action_12935788
]
Staffan Olsson commented on TIKA-559:
-
Isn't this a duplicate of TIKA-548? Try trunk.
[
https://issues.apache.org/jira/browse/TIKA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Antoine L. updated TIKA-559:
Attachment: partition.pdf
> [PDF Parser] New paragraph not taken into account sometime
>
[PDF Parser] New paragraph not taken into account sometime
--
Key: TIKA-559
URL: https://issues.apache.org/jira/browse/TIKA-559
Project: Tika
Issue Type: Bug
Components: parse
[
https://issues.apache.org/jira/browse/TIKA-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12935773#action_12935773
]
Ken Krugler commented on TIKA-557:
--
By default the WriteOutContentHandler has a limit of 100
[
https://issues.apache.org/jira/browse/TIKA-558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Burch resolved TIKA-558.
-
Resolution: Duplicate
Duplicate of TIKA-556
> Problems/inconsistency with jar edu.ucar:netcdf:4.2 used by
[
https://issues.apache.org/jira/browse/TIKA-558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Guest updated TIKA-558:
---
Description:
I use Maven to build my application. Part of this application is Tika. I
previously used Tika 0.4 with no
Problems/inconsistency with jar edu.ucar:netcdf:4.2 used by Tika 0.8
Key: TIKA-558
URL: https://issues.apache.org/jira/browse/TIKA-558
Project: Tika
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TIKA-557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Burch resolved TIKA-557.
-
Resolution: Invalid
You've set a Write Limit on your ContentHandler, and the text in your PDF is
too big