[jira] [Created] (TIKA-2696) Support output of Tesseract OSD output for psm mode 0

2018-07-26 Thread August Valera (JIRA)
August Valera created TIKA-2696: --- Summary: Support output of Tesseract OSD output for psm mode 0 Key: TIKA-2696 URL: https://issues.apache.org/jira/browse/TIKA-2696 Project: Tika Issue Type:

[jira] [Commented] (TIKA-2692) Blanket upgrades in prep for 1.19

2018-07-26 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559028#comment-16559028 ] Hudson commented on TIKA-2692: -- SUCCESS: Integrated in Jenkins build Tika-trunk #1532 (See

[jira] [Commented] (TIKA-2693) Tika 1.17 uses the wrong classloader for reflection

2018-07-26 Thread Andreas Beeker (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558986#comment-16558986 ] Andreas Beeker commented on TIKA-2693: -- The NoClassDefFoundError happens, because I suggested to use

[jira] [Commented] (TIKA-2692) Blanket upgrades in prep for 1.19

2018-07-26 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558892#comment-16558892 ] Hudson commented on TIKA-2692: -- FAILURE: Integrated in Jenkins build tika-branch-1x #62 (See

[jira] [Resolved] (TIKA-2692) Blanket upgrades in prep for 1.19

2018-07-26 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2692. --- Resolution: Fixed Fix Version/s: 2.0.0 1.19 Can reopen if there's a better

[jira] [Commented] (TIKA-2692) Blanket upgrades in prep for 1.19

2018-07-26 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558849#comment-16558849 ] Hudson commented on TIKA-2692: -- FAILURE: Integrated in Jenkins build tika-2.x-windows #290 (See

tika-2.x-windows - Build # 290 - Still Failing

2018-07-26 Thread Apache Jenkins Server
The Apache Jenkins build system has built tika-2.x-windows (build #290) Status: Still Failing Check console output at https://builds.apache.org/job/tika-2.x-windows/290/ to view the results.

[jira] [Comment Edited] (TIKA-2692) Blanket upgrades in prep for 1.19

2018-07-26 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558744#comment-16558744 ] Tim Allison edited comment on TIKA-2692 at 7/26/18 6:50 PM: After further

[jira] [Commented] (TIKA-2692) Blanket upgrades in prep for 1.19

2018-07-26 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558744#comment-16558744 ] Tim Allison commented on TIKA-2692: --- After further general upgrades, I had problems with tika-bundle.

[jira] [Commented] (TIKA-2692) Blanket upgrades in prep for 1.19

2018-07-26 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558707#comment-16558707 ] Hudson commented on TIKA-2692: -- SUCCESS: Integrated in Jenkins build Tika-trunk #1528 (See

[jira] [Commented] (TIKA-2692) Blanket upgrades in prep for 1.19

2018-07-26 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558642#comment-16558642 ] Hudson commented on TIKA-2692: -- FAILURE: Integrated in Jenkins build tika-2.x-windows #289 (See

tika-2.x-windows - Build # 289 - Still Failing

2018-07-26 Thread Apache Jenkins Server
The Apache Jenkins build system has built tika-2.x-windows (build #289) Status: Still Failing Check console output at https://builds.apache.org/job/tika-2.x-windows/289/ to view the results.

[jira] [Commented] (TIKA-2692) Blanket upgrades in prep for 1.19

2018-07-26 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558468#comment-16558468 ] Hudson commented on TIKA-2692: -- FAILURE: Integrated in Jenkins build Tika-trunk #1527 (See

[jira] [Comment Edited] (TIKA-2694) "From" headers is not always extracted correctly on msg mails

2018-07-26 Thread Ross Johnson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558431#comment-16558431 ] Ross Johnson edited comment on TIKA-2694 at 7/26/18 3:31 PM: - Just adding some

[jira] [Commented] (TIKA-2692) Blanket upgrades in prep for 1.19

2018-07-26 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558432#comment-16558432 ] Hudson commented on TIKA-2692: -- FAILURE: Integrated in Jenkins build tika-2.x-windows #288 (See

[jira] [Commented] (TIKA-2694) "From" headers is not always extracted correctly on msg mails

2018-07-26 Thread Ross Johnson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558431#comment-16558431 ] Ross Johnson commented on TIKA-2694: Just adding some extra info. I checked the attached .msg file,

tika-2.x-windows - Build # 288 - Still Failing

2018-07-26 Thread Apache Jenkins Server
The Apache Jenkins build system has built tika-2.x-windows (build #288) Status: Still Failing Check console output at https://builds.apache.org/job/tika-2.x-windows/288/ to view the results.

[jira] [Commented] (TIKA-2694) "From" headers is not always extracted correctly on msg mails

2018-07-26 Thread Celpan Valeria (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558393#comment-16558393 ] Celpan Valeria commented on TIKA-2694: -- Okay, thank you  > "From" headers is not always extracted

[jira] [Comment Edited] (TIKA-2694) "From" headers is not always extracted correctly on msg mails

2018-07-26 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558384#comment-16558384 ] Tim Allison edited comment on TIKA-2694 at 7/26/18 3:02 PM: This is the way

[jira] [Commented] (TIKA-2694) "From" headers is not always extracted correctly on msg mails

2018-07-26 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558387#comment-16558387 ] Tim Allison commented on TIKA-2694: --- See: TIKA-1865 > "From" headers is not always extracted correctly

[jira] [Commented] (TIKA-2694) "From" headers is not always extracted correctly on msg mails

2018-07-26 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558384#comment-16558384 ] Tim Allison commented on TIKA-2694: --- I'm pretty sure this is the way that "addresses" can be stored in

[jira] [Commented] (TIKA-2368) Clean up SentimentParser dependencies

2018-07-26 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558368#comment-16558368 ] Tim Allison commented on TIKA-2368: --- ossindex-maven-plugin:audit identifies 20 vulnerable dependencies:

[jira] [Created] (TIKA-2695) Upgrade Lucene in tika-eval and tika-example

2018-07-26 Thread Tim Allison (JIRA)
Tim Allison created TIKA-2695: - Summary: Upgrade Lucene in tika-eval and tika-example Key: TIKA-2695 URL: https://issues.apache.org/jira/browse/TIKA-2695 Project: Tika Issue Type: Task

[jira] [Updated] (TIKA-2694) "From" headers is not always extracted correctly on msg mails

2018-07-26 Thread Celpan Valeria (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Celpan Valeria updated TIKA-2694: - Description: For some emails we get instead of the email address for "From" field a value which

[jira] [Created] (TIKA-2694) "From" headers is not always extracted correctly on msg mails

2018-07-26 Thread Celpan Valeria (JIRA)
Celpan Valeria created TIKA-2694: Summary: "From" headers is not always extracted correctly on msg mails Key: TIKA-2694 URL: https://issues.apache.org/jira/browse/TIKA-2694 Project: Tika

[jira] [Created] (TIKA-2693) Tika 1.17 uses the wrong classloader for reflection

2018-07-26 Thread Karl Wright (JIRA)
Karl Wright created TIKA-2693: - Summary: Tika 1.17 uses the wrong classloader for reflection Key: TIKA-2693 URL: https://issues.apache.org/jira/browse/TIKA-2693 Project: Tika Issue Type: Bug

[jira] [Commented] (TIKA-2672) Upgrade dl4j to 1.0.0-beta

2018-07-26 Thread Thejan Wijesinghe (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558262#comment-16558262 ] Thejan Wijesinghe commented on TIKA-2672: - [~talli...@apache.org] we get:1.0.0-SNAPSHOT from the

Re: improving Tika for web contents

2018-07-26 Thread Tim Allison
Y, we're waiting on dl4j so we have a week probably. On Thu, Jul 26, 2018 at 8:06 AM gbouchar wrote: > > Thank you very much, Tim! Do you think it will make it for the next release? > > ‐‐‐ Original Message ‐‐‐ > On 26 July 2018 1:58 PM, Tim Allison wrote: > > > Y. Sorry. At beach

[jira] [Created] (TIKA-2692) Blanket upgrades in prep for 1.19

2018-07-26 Thread Tim Allison (JIRA)
Tim Allison created TIKA-2692: - Summary: Blanket upgrades in prep for 1.19 Key: TIKA-2692 URL: https://issues.apache.org/jira/browse/TIKA-2692 Project: Tika Issue Type: Task

Re: improving Tika for web contents

2018-07-26 Thread Tim Allison
Y. Sorry. At beach last week. Took care of quick issues yesterday, will try to return to your PRs today. Thank you! On Thu, Jul 26, 2018 at 5:38 AM gbouchar wrote: > Greetings everyone! > > I have two pull requests related to the use of tika for web contents that > have been waiting for quite

Re: Tika dependencies audit

2018-07-26 Thread Tim Allison
+1. There are a handful where we need to keep the older versions because of regressions, but let’s update where we can for 1.19. On Thu, Jul 26, 2018 at 3:12 AM Maxim Solodovnik wrote: > Hello All, > > recently following plugin was announced on Maven mailing list: >

improving Tika for web contents

2018-07-26 Thread gbouchar
Greetings everyone! I have two pull requests related to the use of Tika for web contents that have been waiting for quite some time now. - [Improving html charset detection](https://github.com/apache/tika/pull/242) : None of the current charset detectors in Tika respect the web standards, and

[jira] [Updated] (TIKA-2691) Can't create a RPM

2018-07-26 Thread Celpan Valeria (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Celpan Valeria updated TIKA-2691: - Summary: Can't create a RPM (was: Can't create an RPM ) > Can't create a RPM >

Tika dependencies audit

2018-07-26 Thread Maxim Solodovnik
Hello All, recently the following plugin was announced on the Maven mailing list: https://sonatype.github.io/ossindex-maven/maven-plugin/ I tried to analyze our code and found that some libraries used by Tika are not passing. Maybe it is worth updating Tika dependencies? -- WBR Maxim aka