[jira] [Commented] (TIKA-3795) General upgrades for 2.4.2

2022-07-14 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17567090#comment-17567090 ] Hudson commented on TIKA-3795: -- FAILURE: Integrated in Jenkins build Tika » tika-main-jdk8 #6

[GitHub] [tika] dependabot[bot] opened a new pull request, #611: Bump exec-maven-plugin from 3.0.0 to 3.1.0

2022-07-14 Thread GitBox
dependabot[bot] opened a new pull request, #611: URL: https://github.com/apache/tika/pull/611 Bumps [exec-maven-plugin](https://github.com/mojohaus/exec-maven-plugin) from 3.0.0 to 3.1.0. Release notes Sourced from https://github.com/mojohaus/exec-maven-plugin/releases exec-maven

[GitHub] [tika] dependabot[bot] opened a new pull request, #610: Bump google-cloud-storage from 2.9.3 to 2.10.0

2022-07-14 Thread GitBox
dependabot[bot] opened a new pull request, #610: URL: https://github.com/apache/tika/pull/610 Bumps [google-cloud-storage](https://github.com/googleapis/java-storage) from 2.9.3 to 2.10.0. Release notes Sourced from https://github.com/googleapis/java-storage/releases google-cloud

[jira] [Closed] (TIKA-2492) Remove pdfdebugger from tika

2022-07-14 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr closed TIKA-2492. - Resolution: Fixed > Remove pdfdebugger from tika > > >

[jira] [Resolved] (TIKA-3818) Remove pdfdebugger from tika (2)

2022-07-14 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr resolved TIKA-3818. --- Resolution: Fixed > Remove pdfdebugger from tika (2) > > >

[jira] [Commented] (TIKA-3818) Remove pdfdebugger from tika (2)

2022-07-14 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17567080#comment-17567080 ] Hudson commented on TIKA-3818: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #6

[jira] [Commented] (TIKA-3817) Azure Graph conflict with Tika-app on (JsonGenerator) jackson version 2.13.3 - bug changes version to 2.4.1 (Tika's version)

2022-07-14 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17567076#comment-17567076 ] Tilman Hausherr commented on TIKA-3817: --- I notice you're using tika-app, this is mea

[jira] [Updated] (TIKA-3818) Remove pdfdebugger from tika (2)

2022-07-14 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-3818: -- Description: We already did this in 2017 in TIKA-2492, but it reappeared (see the graph I posted

[jira] [Updated] (TIKA-3818) Remove pdfdebugger from tika (2)

2022-07-14 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-3818: -- Affects Version/s: 2.4.1 (was: 1.28.4) > Remove pdfdebugger from tika

[jira] [Created] (TIKA-3818) Remove pdfdebugger from tika (2)

2022-07-14 Thread Tilman Hausherr (Jira)
Tilman Hausherr created TIKA-3818: - Summary: Remove pdfdebugger from tika (2) Key: TIKA-3818 URL: https://issues.apache.org/jira/browse/TIKA-3818 Project: Tika Issue Type: Task Comp

[jira] [Updated] (TIKA-3818) Remove pdfdebugger from tika (2)

2022-07-14 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-3818: -- Fix Version/s: 2.4.2 (was: 1.28.5) > Remove pdfdebugger from tika (2) > -

[jira] [Updated] (TIKA-3817) Azure Graph conflict with Tika-app on (JsonGenerator) jackson version 2.13.3 - bug changes version to 2.4.1 (Tika's version)

2022-07-14 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-3817: -- Attachment: screenshot-1.png > Azure Graph conflict with Tika-app on (JsonGenerator) jackson ver

[jira] [Commented] (TIKA-3817) Azure Graph conflict with Tika-app on (JsonGenerator) jackson version 2.13.3 - bug changes version to 2.4.1 (Tika's version)

2022-07-14 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17567061#comment-17567061 ] Tilman Hausherr commented on TIKA-3817: --- I don't understand this either. We use azur

[jira] [Comment Edited] (TIKA-3814) Extracted text from HTML file does not exclude newline chars from body

2022-07-14 Thread Sai Konuri (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566996#comment-17566996 ] Sai Konuri edited comment on TIKA-3814 at 7/14/22 10:12 PM: Th

[jira] [Commented] (TIKA-3814) Extracted text from HTML file does not exclude newline chars from body

2022-07-14 Thread Sai Konuri (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566996#comment-17566996 ] Sai Konuri commented on TIKA-3814: -- Thanks Nick and Tim! As suggested, I agree that

[jira] [Commented] (TIKA-3814) Extracted text from HTML file does not exclude newline chars from body

2022-07-14 Thread Nick Burch (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566991#comment-17566991 ] Nick Burch commented on TIKA-3814: -- I have a feeling that the Text content handler might

[jira] [Comment Edited] (TIKA-3814) Extracted text from HTML file does not exclude newline chars from body

2022-07-14 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566979#comment-17566979 ] Tim Allison edited comment on TIKA-3814 at 7/14/22 8:29 PM: I'

[jira] [Updated] (TIKA-3814) Extracted text from HTML file does not exclude newline chars from body

2022-07-14 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-3814: -- Priority: Minor (was: Critical) > Extracted text from HTML file does not exclude newline chars from bod

[jira] [Commented] (TIKA-3814) Extracted text from HTML file does not exclude newline chars from body

2022-07-14 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566979#comment-17566979 ] Tim Allison commented on TIKA-3814: --- I'm sorry for our team's delay. I haven't looked a

[jira] [Commented] (TIKA-3795) General upgrades for 2.4.2

2022-07-14 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566973#comment-17566973 ] Hudson commented on TIKA-3795: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #6

[jira] [Commented] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-07-14 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566974#comment-17566974 ] Hudson commented on TIKA-3812: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #6

[jira] [Updated] (TIKA-3814) Extracted text from HTML file does not exclude newline chars from body

2022-07-14 Thread Sai Konuri (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sai Konuri updated TIKA-3814: - Priority: Critical (was: Trivial) > Extracted text from HTML file does not exclude newline chars from bod

[jira] [Commented] (TIKA-3814) Extracted text from HTML file does not exclude newline chars from body

2022-07-14 Thread Sai Konuri (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566971#comment-17566971 ] Sai Konuri commented on TIKA-3814: -- This is impacting our customers for our feature, so m

[jira] [Updated] (TIKA-3814) Extracted text from HTML file does not exclude newline chars from body

2022-07-14 Thread Sai Konuri (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sai Konuri updated TIKA-3814: - Priority: Trivial (was: Critical) > Extracted text from HTML file does not exclude newline chars from bod

[jira] [Updated] (TIKA-3814) Extracted text from HTML file does not exclude newline chars from body

2022-07-14 Thread Sai Konuri (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sai Konuri updated TIKA-3814: - Priority: Critical (was: Trivial) > Extracted text from HTML file does not exclude newline chars from bod

[jira] [Commented] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-07-14 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566954#comment-17566954 ] Tilman Hausherr commented on TIKA-3812: --- Thanks, it works now. > Parser Order: imag

[jira] [Commented] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-07-14 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566951#comment-17566951 ] Hudson commented on TIKA-3812: -- UNSTABLE: Integrated in Jenkins build Tika » tika-main-jdk8 #

[jira] [Commented] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-07-14 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566946#comment-17566946 ] Tim Allison commented on TIKA-3812: --- Sorry. Just pushed fix. Tracking to see if that d

[jira] [Commented] (TIKA-3812) Parser Order: image get parsed by GDALParser instead of TesseractOCRParser

2022-07-14 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566943#comment-17566943 ] Tilman Hausherr commented on TIKA-3812: --- Build fails on my machine (W10): {noformat}

[jira] [Commented] (TIKA-3817) Azure Graph conflict with Tika-app on (JsonGenerator) jackson version 2.13.3 - bug changes version to 2.4.1 (Tika's version)

2022-07-14 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566884#comment-17566884 ] Tim Allison commented on TIKA-3817: --- What are we doing wrong at the tika level? As you

[jira] [Updated] (TIKA-3817) Azure Graph conflict with Tika-app on (JsonGenerator) jackson version 2.13.3 - bug changes version to 2.4.1 (Tika's version)

2022-07-14 Thread Andre Nel (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andre Nel updated TIKA-3817: Description: Azure Graph conflict on jackson. Both Tika-app 2.4.1 and Azure-core 1.30.0 use jars with class

[jira] [Updated] (TIKA-3817) Azure Graph conflict with Tika-app on (JsonGenerator) jackson version 2.13.3 - bug changes version to 2.4.1 (Tika's version)

2022-07-14 Thread Andre Nel (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andre Nel updated TIKA-3817: Summary: Azure Graph conflict with Tika-app on (JsonGenerator) jackson version 2.13.3 - bug changes version

[jira] [Updated] (TIKA-3817) Azure Graph conflict with Tika-app on jackson (JsonGenerator) version - bug changes version from 2.13.3 to 2.4.1

2022-07-14 Thread Andre Nel (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andre Nel updated TIKA-3817: Summary: Azure Graph conflict with Tika-app on jackson (JsonGenerator) version - bug changes version from 2.

[jira] [Updated] (TIKA-3817) Azure Graph conflict with Tika-app on jackson (JsonGenerator) version - changing from 2.13.3 to 2.4.1

2022-07-14 Thread Andre Nel (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andre Nel updated TIKA-3817: Description: Azure Graph conflict on jackson. Both Tika-app 2.4.1 and Azure-core 1.30.0 use jars with class

[jira] [Updated] (TIKA-3817) Azure Graph conflict with Tika-app on jackson (JsonGenerator) version - changing from 2.13.3 to 2.4.1

2022-07-14 Thread Andre Nel (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andre Nel updated TIKA-3817: Environment: Java 1.8 Maven Project tika-app 2.4.1 jackson 2.13.3 azure-core 1.30.0 (was: Java 1.8 Maven Po

[jira] [Updated] (TIKA-3817) Azure Graph conflict with Tika-app on jackson (JsonGenerator) version - changing from 2.13.3 to 2.4.1

2022-07-14 Thread Andre Nel (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andre Nel updated TIKA-3817: Environment: Java 1.8, Maven Project, tika-app 2.4.1, jackson 2.13.3, azure-core 1.30.0 (was: Java 1.8 Mave

[jira] [Updated] (TIKA-3817) Azure Graph conflict with Tika-app on jackson (JsonGenerator) version - changing from 2.13.3 to 2.4.1

2022-07-14 Thread Andre Nel (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andre Nel updated TIKA-3817: Description: Azure Graph conflict on jackson. Both Tika-app 2.4.1 and Azure-core 1.30.0 use jars with class

[jira] [Updated] (TIKA-3817) Azure Graph conflict with Tika-app on jackson (JsonGenerator) version - changing from 2.13.3 to 2.4.1

2022-07-14 Thread Andre Nel (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andre Nel updated TIKA-3817: Environment: Java 1.8 Maven Pom (was: Java 1.8 Maven Pom com.fast

[jira] [Updated] (TIKA-3817) Azure Graph conflict with Tika-app on jackson (JsonGenerator) version - changing from 2.13.3 to 2.4.1

2022-07-14 Thread Andre Nel (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andre Nel updated TIKA-3817: Description: Azure Graph conflict on jackson. Both Tika-app 2.4.1 and Azure-core 1.30.0 use jars with class

[jira] [Updated] (TIKA-3817) Azure Graph conflict with Tika-app on jackson (JsonGenerator) version - changing from 2.13.3 to 2.4.1

2022-07-14 Thread Andre Nel (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andre Nel updated TIKA-3817: Description: Azure Graph conflict on jackson. Both Tika-app 2.4.1 and Azure-core 1.30.0 use jars with class

[jira] [Created] (TIKA-3817) Azure Graph conflict with Tika-app on jackson (JsonGenerator) version - changing from 2.13.3 to 2.4.1

2022-07-14 Thread Andre Nel (Jira)
Andre Nel created TIKA-3817: --- Summary: Azure Graph conflict with Tika-app on jackson (JsonGenerator) version - changing from 2.13.3 to 2.4.1 Key: TIKA-3817 URL: https://issues.apache.org/jira/browse/TIKA-3817