New reports are here: https://corpora.tika.apache.org/base/reports/tika-2.8.1-prerc1-b.tgz
The alignment in tika-eval is still not working as planned, but the content looks ok...more work to do on tika-eval.

We're getting many more exceptions with gz, and I noticed that we're also getting many fewer attachments in some OLE2-based Office files. Looking back, both of these were already happening in the last reports I ran.

On Fri, Jul 21, 2023 at 10:07 AM Tim Allison wrote:

> LOL...I knew about this 2 years ago:
> https://github.com/jai-imageio/jai-imageio-core/issues/57#issuecomment-870863133
>
> On Fri, Jul 21, 2023 at 6:17 AM Tim Allison wrote:
>
>> I'm still trying to figure this one out. I also noticed "sometimes a
>> problem and sometimes not."
>>
>> I did find that if the sun/java bmp parser is selected then there's no
>> NPE. If the jaiimageio BMP reader is selected, there's an NPE.
>>
>> The NPE comes from missing a "properties" file in the jaiimageio
>> library. I'm trying to figure out if that's a Tika packaging problem or
>> something else.
>>
>> On Thu, Jul 20, 2023 at 9:48 PM Tilman Hausherr
>> wrote:
>>
>>> On 20.07.2023 17:39, Tim Allison wrote:
>>> > How about this:
>>> > https://corpora.tika.apache.org/base/reports/tika-2.8.1-pre-rc1-v3.tgz
>>>
>>>
>>> Thank you this looks much better.
>>>
>>> Now the last file in new_exceptions_in_B_details.xlsx , r-cuda.bmp .
>>> When doing a drag and drop on tika-app, I SOMETIMES get the NPE. If I
>>> call tika app with the file from the command line, I don't.
>>>
>>> Btw why this for a bmp file?
>>>
>>> tiff:BitsPerSample: 5 65
>>> tiff:ImageLength: 384
>>> tiff:ImageWidth: 12
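
For anyone who wants to check which BMP readers are registered in their own environment (the sun/java one vs. the jaiimageio one mentioned in the thread), here is a minimal sketch. It is not Tika code: the class name BmpReaderCheck and the helper readerClassNames are made up for illustration, and it only uses the standard javax.imageio API. It lists the registered readers in the order ImageIO returns them; I have not verified how Tika itself picks among them, so treat the ordering as a hint only.

```java
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Tika or jai-imageio.
public class BmpReaderCheck {

    // Returns the class names of all ImageReaders registered for the
    // given format name, in the order ImageIO hands them back.
    static List<String> readerClassNames(String formatName) {
        List<String> names = new ArrayList<>();
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        while (readers.hasNext()) {
            names.add(readers.next().getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Print one reader class name per line for the "bmp" format.
        for (String name : readerClassNames("bmp")) {
            System.out.println(name);
        }
    }
}
```

Running this on the same classpath as tika-app, once from the command line and once however the GUI is launched, might show whether the set or order of readers differs between the two cases.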
