I have also encountered the same issue in a simple test that tries to
identify an "application/vnd.google-earth.kmz" file. I can work around
the "invalid mark" problem by wrapping the InputStream passed to
Tika.detect(InputStream stream, String name) in a TikaInputStream.
Sadly, I still have problems with 3.2.0: the test now fails because the
"application/vnd.google-earth.kmz" file is detected as a plain
"application/zip".
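
For reference, here is a minimal sketch of how the "Resetting to invalid
mark" message can arise with a plain java.io.BufferedInputStream, using
only the JDK. The class name InvalidMarkDemo, the method
resetAfterReading, and the sizes used are made up for illustration. It
does not claim to reproduce what Tika 3.2.0 does internally; it only
shows that reading past both the read limit and the internal buffer
before calling reset() produces that exception.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class, not part of Tika.
class InvalidMarkDemo {

    // Marks the stream, reads bytesToRead bytes one at a time, then resets.
    // Returns "reset ok", or the message of the IOException thrown by reset().
    static String resetAfterReading(int bufferSize, int readLimit, int bytesToRead)
            throws IOException {
        byte[] data = new byte[64]; // arbitrary content, only the length matters
        try (InputStream in =
                new BufferedInputStream(new ByteArrayInputStream(data), bufferSize)) {
            in.mark(readLimit);
            for (int i = 0; i < bytesToRead; i++) {
                in.read();
            }
            try {
                in.reset();
                return "reset ok";
            } catch (IOException e) {
                return e.getMessage();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stays within the read limit, so the mark is still valid.
        System.out.println(resetAfterReading(8, 4, 2));
        // Reads past both the read limit and the 8-byte buffer,
        // so the mark is dropped and reset() fails.
        System.out.println(resetAfterReading(8, 4, 16));
    }
}
```

The mark is only guaranteed to survive while no more than readLimit
bytes have been read since mark() was called, so any caller that reads
further and then resets can hit this.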

Reverting to 3.1.0 makes the detection work with a plain InputStream.

/Pontus

On 2025/05/28 16:33:23 Craig Muchinsky via user wrote:
> I tested using the release, my dependency management tool made me aware
> that 3.2.0 was available so I decided to kick the tires and ran into this
> issue. I will have to spend some time on a reproduction scenario
>
> On Wed, May 28, 2025 at 1:33 AM Tilman Hausherr <[email protected]>
> wrote:
>
> >
> > Did you test with the release or with the candidate or with an earlier
> > build? A bug like you mentioned was fixed just a few days ago. Please
> > share the file and some minimal code.
> >
> > Tilman
> >
> > On 5/28/2025 2:08 AM, Craig Muchinsky via user wrote:
> >
> > > After upgrading to tika 3.2.0, I started seeing the following
> > > exception when attempting to detect the mime type for a given file,
> > > I'm wondering if something in the way input streams are handled has
> > > changed, or if this might be a regression?
> > >
> > > Caused by: java.io.IOException: Resetting to invalid mark
> > > at [email protected]/java.io.BufferedInputStream.implReset(BufferedInputStream.java:583)
> > > at [email protected]/java.io.BufferedInputStream.reset(BufferedInputStream.java:569)
> > > at app//org.apache.tika.io.BoundedInputStream.reset(BoundedInputStream.java:115)
> > > at app//org.apache.tika.detect.zip.DefaultZipContainerDetector.detectStreaming(DefaultZipContainerDetector.java:279)
> > > at app//org.apache.tika.detect.zip.DefaultZipContainerDetector.detect(DefaultZipContainerDetector.java:192)
> > > at app//org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:84)
> > > at app//org.apache.tika.Tika.detect(Tika.java:160)
> > > at app//org.apache.tika.Tika.detect(Tika.java:185)
> > >
>
