[jira] [Commented] (TIKA-2814) Extracted content of EML file contains words like "FONT-SIZE: 9pt; FONT-FAMILY: arial"

2019-01-15 Thread Edwin Yeo Zheng Lin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743564#comment-16743564 ] Edwin Yeo Zheng Lin commented on TIKA-2814: --- I have uploaded a sample EML file here: 

[jira] [Commented] (TIKA-2224) Mime magic for OneNote formats

2019-01-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743485#comment-16743485 ] Tim Allison commented on TIKA-2224: --- Sorry. How about the old

[jira] [Commented] (TIKA-2224) Mime magic for OneNote formats

2019-01-15 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743415#comment-16743415 ] Nicholas DiPiazza commented on TIKA-2224: - I'm not concerned about the external call to a C++

[jira] [Comment Edited] (TIKA-2224) Mime magic for OneNote formats

2019-01-15 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743415#comment-16743415 ] Nicholas DiPiazza edited comment on TIKA-2224 at 1/15/19 9:37 PM: -- I'm

[jira] [Commented] (TIKA-2224) Mime magic for OneNote formats

2019-01-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743359#comment-16743359 ] Tim Allison commented on TIKA-2224: --- Given that the code is ASL 2.0, y, my pref would be to port it. I

[jira] [Commented] (TIKA-2816) Error when sending request to /tika with header X-Tika-OCRMinFileSizeToOcr

2019-01-15 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743337#comment-16743337 ] Hudson commented on TIKA-2816: -- SUCCESS: Integrated in Jenkins build Tika-trunk #1619 (See

[jira] [Commented] (TIKA-2224) Mime magic for OneNote formats

2019-01-15 Thread Nicholas DiPiazza (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743338#comment-16743338 ] Nicholas DiPiazza commented on TIKA-2224: - Well, we can use the C++ executable, or just port the

[jira] [Comment Edited] (TIKA-2814) Extracted content of EML file contains words like "FONT-SIZE: 9pt; FONT-FAMILY: arial"

2019-01-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743315#comment-16743315 ] Tim Allison edited comment on TIKA-2814 at 1/15/19 7:19 PM: Are you able to

[jira] [Commented] (TIKA-2816) Error when sending request to /tika with header X-Tika-OCRMinFileSizeToOcr

2019-01-15 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743309#comment-16743309 ] Hudson commented on TIKA-2816: -- SUCCESS: Integrated in Jenkins build tika-branch-1x #154 (See

[jira] [Commented] (TIKA-2224) Mime magic for OneNote formats

2019-01-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743319#comment-16743319 ] Tim Allison commented on TIKA-2224: --- If I understand correctly, the reason to go with a custom parser

[jira] [Commented] (TIKA-2814) Extracted content of EML file contains words like "FONT-SIZE: 9pt; FONT-FAMILY: arial"

2019-01-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743315#comment-16743315 ] Tim Allison commented on TIKA-2814: --- Are you able to share an example file? > Extracted content of EML

[jira] [Commented] (TIKA-2802) Out of memory issues when extracting large files (pst)

2019-01-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743311#comment-16743311 ] Tim Allison commented on TIKA-2802: --- 1.21 unless any fellow devs or anyone else objects to bundling

[jira] [Commented] (TIKA-2816) Error when sending request to /tika with header X-Tika-OCRMinFileSizeToOcr

2019-01-15 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743289#comment-16743289 ] Hudson commented on TIKA-2816: -- UNSTABLE: Integrated in Jenkins build tika-2.x-windows #375 (See

[jira] [Resolved] (TIKA-2816) Error when sending request to /tika with header X-Tika-OCRMinFileSizeToOcr

2019-01-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2816. --- Resolution: Fixed Fix Version/s: 1.21 2.0.0 Sorry about that, and thank

[jira] [Commented] (TIKA-2802) Out of memory issues when extracting large files (pst)

2019-01-15 Thread Abhijit Rajwade (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743200#comment-16743200 ] Abhijit Rajwade commented on TIKA-2802: --- In which Tika version will this get resolved? > Out of

[jira] [Assigned] (TIKA-2816) Error when sending request to /tika with header X-Tika-OCRMinFileSizeToOcr

2019-01-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison reassigned TIKA-2816: - Assignee: Tim Allison > Error when sending request to /tika with header

[jira] [Created] (TIKA-2816) Error when sending request to /tika with header X-Tika-OCRMinFileSizeToOcr

2019-01-15 Thread JIRA
Anssi Törmä created TIKA-2816: - Summary: Error when sending request to /tika with header X-Tika-OCRMinFileSizeToOcr Key: TIKA-2816 URL: https://issues.apache.org/jira/browse/TIKA-2816 Project: Tika

[ANNOUNCE] Apache Roadshow Chicago, Call for Presentations

2019-01-15 Thread Trevor Grant
Hello Devs! You're receiving this email because you are subscribed to one or more Apache developer email lists. I’m writing to let you know about an exciting event coming to the Chicago area: The Apache Roadshow Chicago. It will be held May 13th and 14th at three bars in the Logan Square

[jira] [Created] (TIKA-2815) Priority of processing EML file should be TEXT_PLAIN instead of TEXT_HTML

2019-01-15 Thread Edwin Yeo Zheng Lin (JIRA)
Edwin Yeo Zheng Lin created TIKA-2815: - Summary: Priority of processing EML file should be TEXT_PLAIN instead of TEXT_HTML Key: TIKA-2815 URL: https://issues.apache.org/jira/browse/TIKA-2815

[jira] [Updated] (TIKA-2814) Extracted content of EML file contains words like "FONT-SIZE: 9pt; FONT-FAMILY: arial"

2019-01-15 Thread Edwin Yeo Zheng Lin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edwin Yeo Zheng Lin updated TIKA-2814: -- Description: When we are indexing EML file, the priority setting of Tika is using

[jira] [Created] (TIKA-2814) Extracted content of EML file contains words like "FONT-SIZE: 9pt; FONT-FAMILY: arial"

2019-01-15 Thread Edwin Yeo Zheng Lin (JIRA)
Edwin Yeo Zheng Lin created TIKA-2814: - Summary: Extracted content of EML file contains words like "FONT-SIZE: 9pt; FONT-FAMILY: arial" Key: TIKA-2814 URL: https://issues.apache.org/jira/browse/TIKA-2814