[jira] [Commented] (TIKA-4199) commons-compress 1.26.0 breaks Apache Tika 2.9.1
[ https://issues.apache.org/jira/browse/TIKA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17826996#comment-17826996 ]

Tilman Hausherr commented on TIKA-4199:
---------------------------------------

The original error you reported wasn't really a bug in commons-compress but a change in behavior: more bytes were read than Tika expected (see my first comment in COMPRESS-661). It resulted in several fixes in Tika.

> commons-compress 1.26.0 breaks Apache Tika 2.9.1
> ------------------------------------------------
>
>                 Key: TIKA-4199
>                 URL: https://issues.apache.org/jira/browse/TIKA-4199
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 2.9.1
>            Reporter: Alexander Veit
>            Assignee: Tilman Hausherr
>            Priority: Major
>             Fix For: 2.9.2, 3.0.0
>
>
> An update to commons-compress 1.26.0 to fix CVE-2024-25710 and CVE-2024-26308
> breaks Tika.
>
> For more information see https://issues.apache.org/jira/browse/COMPRESS-661.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Commented] (TIKA-4199) commons-compress 1.26.0 breaks Apache Tika 2.9.1
[ https://issues.apache.org/jira/browse/TIKA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17826992#comment-17826992 ]

Alexander Veit commented on TIKA-4199:
--------------------------------------

The same error also occurs with Tika 2.9.1 and commons-compress 1.26.1.
[jira] [Commented] (TIKA-4199) commons-compress 1.26.0 breaks Apache Tika 2.9.1
[ https://issues.apache.org/jira/browse/TIKA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824933#comment-17824933 ]

Hudson commented on TIKA-4199:
------------------------------

SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1544 (See [https://ci-builds.apache.org/job/Tika/job/tika-main-jdk11/1544/])
TIKA-4199: revert "complete delegate class", field "in" is a dummy; remove workaround for commons-compress 1.26 (tilman: [https://github.com/apache/tika/commit/8b398201a969b952bfee3166cec1395ae409071b])
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pkg-module/src/main/java/org/apache/tika/parser/pkg/PackageParser.java
TIKA-4199: adjust test results now that commons compress bug has been fixed (tilman: [https://github.com/apache/tika/commit/5b259d60a490699252ea582aaec02a3575e4f7ff])
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/java/org/apache/tika/parser/microsoft/ooxml/TruncatedOOXMLTest.java
TIKA-4199: update commons-compress (tilman: [https://github.com/apache/tika/commit/4d6acfc109f842421030e05c33794bc8090caebb])
* (edit) tika-parent/pom.xml
[jira] [Commented] (TIKA-4199) commons-compress 1.26.0 breaks Apache Tika 2.9.1
[ https://issues.apache.org/jira/browse/TIKA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823577#comment-17823577 ]

Hudson commented on TIKA-4199:
------------------------------

SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1540 (See [https://ci-builds.apache.org/job/Tika/job/tika-main-jdk11/1540/])
TIKA-4199: add comment, print to stderr (tilman: [https://github.com/apache/tika/commit/32ef34ff49ccd6a8a7e595861216e6fdeded])
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/java/org/apache/tika/parser/pkg/Seven7ParserTest.java
[jira] [Commented] (TIKA-4199) commons-compress 1.26.0 breaks Apache Tika 2.9.1
[ https://issues.apache.org/jira/browse/TIKA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819093#comment-17819093 ]

Hudson commented on TIKA-4199:
------------------------------

SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1520 (See [https://ci-builds.apache.org/job/Tika/job/tika-main-jdk11/1520/])
TIKA-4199: replace deprecated (tilman: [https://github.com/apache/tika/commit/a305ab772277db6cdcbd60653e6cf1eb147a1df7])
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pkg-module/src/main/java/org/apache/tika/parser/pkg/PackageParser.java
[jira] [Commented] (TIKA-4199) commons-compress 1.26.0 breaks Apache Tika 2.9.1
[ https://issues.apache.org/jira/browse/TIKA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818937#comment-17818937 ]

Tilman Hausherr commented on TIKA-4199:
---------------------------------------

I tried another solution
{code:java}
if (archive.markSupported()) {
    archive = new ArchiveInputStreamWrapper(archive);
}
{code}
which also works. The wrapper delegates everything except markSupported.

I'll wait a few days to see whether the commons-compress people fix this. If not, I'll commit that solution.
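A wrapper that delegates everything except markSupported can be sketched with plain `java.io` classes. The names `MarkHidingInputStream` and `MarkHidingDemo` are hypothetical and this is not Tika's `ArchiveInputStreamWrapper`; the sketch assumes that downstream code checks `markSupported()` and adds its own buffering when it returns false.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for the wrapper described above; not Tika's class. */
class MarkHidingInputStream extends FilterInputStream {
    MarkHidingInputStream(InputStream in) {
        super(in);
    }

    // The only overridden method: report that mark/reset is unavailable.
    // FilterInputStream delegates every other call to the wrapped stream.
    @Override
    public boolean markSupported() {
        return false;
    }
}

class MarkHidingDemo {
    /** Assumed caller behavior: add buffering only when the stream is not markable. */
    static InputStream ensureMarkable(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    /** Reads the first byte through the wrapper to show that reads are delegated. */
    static int firstByteThroughWrapper(byte[] data) {
        try (InputStream in = new MarkHidingInputStream(new ByteArrayInputStream(data))) {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InputStream raw = new ByteArrayInputStream(new byte[] {42, 43});
        InputStream hidden = new MarkHidingInputStream(raw);
        System.out.println("raw.markSupported()    = " + raw.markSupported());
        System.out.println("hidden.markSupported() = " + hidden.markSupported());
        System.out.println("caller re-buffers      = "
                + (ensureMarkable(hidden) instanceof BufferedInputStream));
        System.out.println("first byte via wrapper = "
                + firstByteThroughWrapper(new byte[] {42, 43}));
    }
}
```

With the mark support hidden, a caller like `ensureMarkable` wraps the stream in its own `BufferedInputStream`, so mark/reset is handled by that buffer rather than by the archive stream.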
[jira] [Commented] (TIKA-4199) commons-compress 1.26.0 breaks Apache Tika 2.9.1
[ https://issues.apache.org/jira/browse/TIKA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818905#comment-17818905 ]

Hudson commented on TIKA-4199:
------------------------------

FAILURE: Integrated in Jenkins build Tika » tika-main-jdk11 #1517 (See [https://ci-builds.apache.org/job/Tika/job/tika-main-jdk11/1517/])
TIKA-4199: complete delegate class (tilman: [https://github.com/apache/tika/commit/a3a830359f088f216ffaca31bf640e296d72531a])
* (edit) tika-core/src/main/java/org/apache/tika/io/BoundedInputStream.java
[jira] [Commented] (TIKA-4199) commons-compress 1.26.0 breaks Apache Tika 2.9.1
[ https://issues.apache.org/jira/browse/TIKA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818877#comment-17818877 ]

Hudson commented on TIKA-4199:
------------------------------

SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1516 (See [https://ci-builds.apache.org/job/Tika/job/tika-main-jdk11/1516/])
TIKA-4199: complete delegate class (tilman: [https://github.com/apache/tika/commit/e5d57528d92fe41bd2c7ba4545323e8b9cae4883])
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pkg-module/src/main/java/org/apache/tika/parser/pkg/PackageParser.java
[jira] [Commented] (TIKA-4199) commons-compress 1.26.0 breaks Apache Tika 2.9.1
[ https://issues.apache.org/jira/browse/TIKA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818871#comment-17818871 ]

Tim Allison commented on TIKA-4199:
-----------------------------------

I opened TIKA-4201 to add a hard limit to the read in the IWorkPackageParser.
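One way to put a hard limit on a mark/reset read is to copy at most the marked number of bytes into a buffer and inspect only that buffer. The sketch below is an illustration under that assumption, not the change made in TIKA-4201; `BoundedPeek`, `peek`, and `streamUnchangedAfterPeek` are hypothetical names.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

class BoundedPeek {
    /**
     * Returns up to limit bytes from the current position and rewinds the stream.
     * Never reads more than limit bytes, so the mark set here stays valid.
     */
    static byte[] peek(InputStream in, int limit) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(limit);
        try {
            byte[] buf = new byte[limit];
            int total = 0;
            while (total < limit) {
                int n = in.read(buf, total, limit - total);
                if (n < 0) {
                    break; // end of stream before the limit was reached
                }
                total += n;
            }
            return Arrays.copyOf(buf, total);
        } finally {
            in.reset();
        }
    }

    /** Peeks, then checks that the next read still returns the first byte. */
    static boolean streamUnchangedAfterPeek(int bufferSize, int limit) {
        byte[] data = new byte[64];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i + 1);
        }
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(data), bufferSize)) {
            byte[] head = peek(in, limit);
            return head.length == Math.min(limit, data.length) && in.read() == 1;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The buffer (8 bytes) is smaller than the limit (16); reset still works
        // because the read never goes past the marked limit.
        System.out.println(streamUnchangedAfterPeek(8, 16));
    }
}
```

Any root-element detection would then run against the returned byte array, so it cannot consume more of the stream than the mark allows.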
[jira] [Commented] (TIKA-4199) commons-compress 1.26.0 breaks Apache Tika 2.9.1
[ https://issues.apache.org/jira/browse/TIKA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818867#comment-17818867 ]

Tilman Hausherr commented on TIKA-4199:
---------------------------------------

{quote}I'm not declaring this a problem with commons-compress!
{quote}
My bet was 51% that the problem is in Tika, but judging from the latest test code you inspired me to write in COMPRESS-661, it might be in commons-compress or in BufferedInputStream itself.

I also found another incomplete delegate class (BoundedInputStream); I'll complete that one too.
[jira] [Commented] (TIKA-4199) commons-compress 1.26.0 breaks Apache Tika 2.9.1
[ https://issues.apache.org/jira/browse/TIKA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818853#comment-17818853 ]

Tim Allison commented on TIKA-4199:
-----------------------------------

Looking at IWorkPackageParser and its detectType(), I think we should rework the mark/reset there. There's currently no hard limit on the number of bytes read when trying to extract the root element, so it is entirely possible that more than the mark() value is read. I think we happened to get lucky earlier, and we're relying on the same luck by doubling the mark value.
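The "luck" involved in reading past the mark() value can be reproduced with a plain `BufferedInputStream`. This is a minimal sketch, assuming the OpenJDK implementation, in which the mark survives as long as the read stays inside the internal buffer; `MarkResetDemo` and `resetSucceeds` are hypothetical names, and the `InputStream` contract itself gives no guarantee once more than the read limit has been consumed.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class MarkResetDemo {
    /** Marks with readLimit, reads bytesToRead single bytes, then reports whether reset() worked. */
    static boolean resetSucceeds(int bufferSize, int readLimit, int bytesToRead) {
        try (InputStream in =
                new BufferedInputStream(new ByteArrayInputStream(new byte[64]), bufferSize)) {
            in.mark(readLimit);
            for (int i = 0; i < bytesToRead; i++) {
                if (in.read() < 0) {
                    break;
                }
            }
            in.reset(); // throws IOException if the mark has been invalidated
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Read stays within the mark limit.
        System.out.println("4 bytes, limit 4: " + resetSucceeds(8, 4, 4));
        // Read exceeds the limit but still fits in the 8-byte buffer.
        System.out.println("8 bytes, limit 4: " + resetSucceeds(8, 4, 8));
        // Read exceeds the limit and forces the buffer to be refilled.
        System.out.println("9 bytes, limit 4: " + resetSucceeds(8, 4, 9));
    }
}
```

The middle case is the lucky one: the read overruns the limit, yet reset() still works because nothing forced the buffer to be refilled. Any change in how many bytes an upstream stream consumes can push the same code into the third case.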
[jira] [Commented] (TIKA-4199) commons-compress 1.26.0 breaks Apache Tika 2.9.1
[ https://issues.apache.org/jira/browse/TIKA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818846#comment-17818846 ]

Tim Allison commented on TIKA-4199:
-----------------------------------

Thank you [~tilman] for working on this! I'm sorry I opened a duplicate ticket.

To confirm, the current workaround is to write each embedded file to disk instead of handling it in memory --> {{tis.getPath()}}

If I have any time, I'll see if I can create a small reproducer for the commons-compress team that uses mark/reset on a wrapped ArchiveInputStream.

To be clear, without looking further, I'm not declaring this a problem with commons-compress! :D
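The idea behind the workaround, spooling an embedded stream to a file so that later reads no longer depend on mark/reset of the archive stream, can be sketched with `java.nio.file`. This only illustrates the general technique; it is not what `tis.getPath()` does internally, and `SpoolToDisk`, `spool`, and `spooledSize` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class SpoolToDisk {
    /** Copies the remaining bytes of the stream to a new temporary file and returns its path. */
    static Path spool(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("embedded-", ".bin");
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    /** Spools the given bytes, reports the file size, and removes the temporary file. */
    static long spooledSize(byte[] data) {
        try {
            Path tmp = spool(new ByteArrayInputStream(data));
            try {
                return Files.size(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(spooledSize(new byte[10]) + " bytes spooled");
    }
}
```

The trade-off is extra disk I/O per embedded file, which matches the later remark in this thread that the workaround makes things slower.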
[jira] [Commented] (TIKA-4199) commons-compress 1.26.0 breaks Apache Tika 2.9.1
[ https://issues.apache.org/jira/browse/TIKA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818823#comment-17818823 ]

Tilman Hausherr commented on TIKA-4199:
---------------------------------------

After merging I discovered that the SevenZWrapper class is incomplete (markSupported / mark / reset were missing, and many more). I tested reverting my one-line change, and some of the previously failing tests (e.g. the 7z tests) were now succeeding. So this kind of suggests that the cause is related to markSupported / mark / reset. If we ever find that cause, then the one-line change in {{PackageParser}} can be removed because it makes things slower.
[jira] [Commented] (TIKA-4199) commons-compress 1.26.0 breaks Apache Tika 2.9.1
[ https://issues.apache.org/jira/browse/TIKA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818774#comment-17818774 ]

Tilman Hausherr commented on TIKA-4199:
---------------------------------------

I'm working on it
https://github.com/apache/pdfbox/pull/180