Tom, I haven’t had time to look into this. I won’t before the 2.9.2 release. If anyone else is able to take a look, that’d be great. I pinged the original contributor, but open source doesn’t require warranty support. :/ Any and all help is appreciated!
Best,
Tim

On Tue, Mar 26, 2024 at 12:05 PM Tom Conlon <tomconlon...@gmail.com> wrote:

> Hi,
>
> https://issues.apache.org/jira/browse/TIKA-4152
>
> We should fix this.
>
> This item was created almost 6 months ago, is it possible for someone to
> take a look at it please?
>
> Many thanks
> Tom
>
> On Mon, 16 Oct 2023 at 11:46, Tom Conlon <tomconlon...@gmail.com> wrote:
>
>> Hi,
>> Would it be possible for the issue "Fix tika as a service"
>> https://issues.apache.org/jira/browse/TIKA-4152
>> to be reviewed before release?
>>
>> Thanks
>> Tom
>>
>> On Mon, 16 Oct 2023 at 11:19, Tim Allison <talli...@apache.org> wrote:
>>
>>> Y, I think that issue was raised during early regression tests, and it
>>> seemed to make sense.
>>>
>>> The new readPictures exception was caused by:
>>> https://svn.apache.org/viewvc/poi/trunk/poi-scratchpad/src/main/java/org/apache/poi/hslf/usermodel/HSLFSlideShowImpl.java?r1=1911524&r2=1911525&
>>> on August 7. I still can't explain why this didn't show up in the
>>> regression tests in late September. My only guess is that I didn't
>>> correctly swap out the tika-app jar version from the bin/ directory. :(
>>>
>>> I'm not sure if it is better to arbitrarily set the max override to a
>>> large value or revert the POI upgrade.
>>>
>>> On Sat, Oct 14, 2023 at 7:27 AM Tilman Hausherr <thaush...@t-online.de>
>>> wrote:
>>>
>>>> Also many changes in excel files, e.g.
>>>> ZDAC5OCEPVR6AHYY3BU3CZS7UX3F6J4Z, "false: 107382" becomes "0: 107382" so I
>>>> guess there has been a change about how to interpret that value. Also
>>>> "error" is now nothing.
>>>>
>>>> Tilman
>>>>
>>>> On 14.10.2023 13:16, Tim Allison wrote:
>>>>
>>>> Looks like we have a bunch of new
>>>> "org.apache.poi.util.RecordFormatException: Tried to allocate an array of
>>>> length 10,xxx,xxx, but the maximum length for this record type is
>>>> 10,000,000." triggered by:
>>>> org.apache.poi.hslf.usermodel.HSLFSlideShowImpl.readPictures ... I'm not
>>>> sure why the regression tests didn't pick this up.
>>>>
>>>> The changes in rfc822 detection have also had some effects. The few
>>>> handfuls that I've reviewed are actually positive changes. I'll review
>>>> systematically on Monday.
>>>>
>>>> On Sat, Oct 14, 2023 at 6:35 AM Tim Allison <talli...@apache.org>
>>>> wrote:
>>>>
>>>>> Reports are here:
>>>>> https://corpora.tika.apache.org/base/reports/tika-2.9.1-reports.tgz
>>>>>
>>>>> I haven't had a chance to look at them yet. :( Will take a look early
>>>>> Monday (ET).
>>>>>
>>>>> On Wed, Oct 11, 2023 at 10:24 AM Tim Allison <talli...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Unless there are objections, I'll kick off the 2.9.1 regression tests
>>>>>> shortly. I just cherry-picked TIKA-4153 into 2.x...will be interesting
>>>>>> to see how that works.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Tim
>>>>>>
>>>>>> On Tue, Oct 10, 2023 at 1:37 PM Tim Allison <talli...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> All,
>>>>>>> Nandita's email didn't go through for some reason.
>>>>>>> Seems reasonable to kick off a 2.9.1 release cycle? What do you
>>>>>>> think?
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Tim
>>>>>>>
>>>>>>> *From:* Nandita Mohan
>>>>>>> *Sent:* Monday, October 9, 2023 3:41 PM
>>>>>>> *To:* user@tika.apache.org
>>>>>>> *Subject:* Requesting Tika Server release: commons-compress
>>>>>>> vulnerability
>>>>>>>
>>>>>>> Hi there,
>>>>>>>
>>>>>>> I work on a service which needs to upgrade our images due to this
>>>>>>> vulnerability in Apache *commons-compress*: Apache Commons Compress
>>>>>>> denial of service vulnerability · CVE-2023-42503 · GitHub Advisory
>>>>>>> Database
>>>>>>> <https://github.com/advisories/GHSA-cgwf-w82q-5jrr>
>>>>>>>
>>>>>>> This is due to use of Tika Server 2.9.0 (Apache Tika – Apache Tika
>>>>>>> 1.27 <https://tika.apache.org/2.9.0/index.html>), which has
>>>>>>> commons-compress as a dependency. I saw that Tim Allison recently
>>>>>>> updated this *commons-compress* version in the Github mirror repo:
>>>>>>> TIKA-4123 -- general updates for 3.0.0-BETA -- upgrade
>>>>>>> commons-compress · apache/tika@3c88246 (github.com)
>>>>>>> <https://github.com/apache/tika/commit/3c882460838c818ab2aff310d1fba9a084fe4800>
>>>>>>>
>>>>>>> We would greatly appreciate if this could be released to tika-server
>>>>>>> package in the next week, so we can update our images soon from this
>>>>>>> vulnerability.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Nandita Mohan