[ 
https://issues.apache.org/jira/browse/TIKA-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18051544#comment-18051544
 ] 

Tilman Hausherr edited comment on TIKA-4563 at 1/13/26 4:17 PM:
----------------------------------------------------------------

One new exception: SchemaTypeLoaderException: XML-BEANS compiled schema: Could 
not locate compiled schema resource 

 (not new) One NPE for govdocs1/099/099673.doc, because 
AbstractPOIFSExtractor.handleOLENative() does not perform a null check.

 (not new) Some NPEs in iwork.PagesContentHandler.startElement(), likely because 
"sf:page-number" occurs without an "sf:headers" element to initialize headers.
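
To illustrate the suspected cause, here is a minimal sketch of a guard that creates the header map on demand. The class PageHeaderSketch, its fields and its methods are hypothetical stand-ins, not the actual PagesContentHandler code, and I am assuming the NPE comes from dereferencing a map that only gets created when "sf:headers" is seen.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a SAX-style content handler; not Tika's actual class.
final class PageHeaderSketch {
    // Stays null until an "sf:headers" element is seen (the assumed cause of the NPE).
    private Map<String, String> headers;
    private int pageNumberFields;

    void startElement(String qName) {
        if ("sf:headers".equals(qName)) {
            headers = new HashMap<>();
        } else if ("sf:page-number".equals(qName)) {
            // Guard: create the map on demand instead of dereferencing null.
            if (headers == null) {
                headers = new HashMap<>();
            }
            pageNumberFields++;
            headers.put("page-number-" + pageNumberFields, "");
        }
    }

    // Number of header entries recorded so far; 0 if no map was ever created.
    int headerCount() {
        return headers == null ? 0 : headers.size();
    }

    public static void main(String[] args) {
        PageHeaderSketch handler = new PageHeaderSketch();
        handler.startElement("sf:page-number"); // no preceding sf:headers
        System.out.println(handler.headerCount());
    }
}
```

Whether lazily creating the map or simply skipping the element is the better fix would depend on how the real handler uses headers later on.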

Many content differences with "Gehrmann Copyright" / "GehrmannCopyright". Maybe 
a difference in the HTML parser? See [^kio5_perldoc.mo] 

bug_trackers/GHOSTSCRIPT/226943-694743/GHOSTSCRIPT-691476-0.pdf: this was 
detected as a PDF file, but is now detected as an MPEG file. It has a PDF file 
inside. It is a "wrapster" file; Wrapster is / was a tool to wrap files with an 
MP3 header so that they could be downloaded with Napster.
https://www.wallstreet-online.de/diskussion/391950-1/wie-funktioniert-wrapster
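
A quick way to confirm that such a file carries a PDF behind a foreign header is to look for the "%PDF-" marker at a non-zero offset. The sketch below is only an illustration: EmbeddedPdfSketch and indexOfPdfMagic are hypothetical names, the sample bytes are made up (I have not checked the actual Wrapster header layout), and this is not how Tika's detector works.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for inspecting a file by hand; not part of Tika.
final class EmbeddedPdfSketch {
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Returns the offset of the first "%PDF-" marker, or -1 if none is present.
    static int indexOfPdfMagic(byte[] data) {
        for (int i = 0; i + PDF_MAGIC.length <= data.length; i++) {
            int j = 0;
            while (j < PDF_MAGIC.length && data[i + j] == PDF_MAGIC[j]) {
                j++;
            }
            if (j == PDF_MAGIC.length) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Made-up bytes: an MPEG audio frame sync (0xFF 0xFB), padding, then a PDF header.
        byte[] wrapped = {(byte) 0xFF, (byte) 0xFB, 0x00, 0x00,
                          '%', 'P', 'D', 'F', '-', '1', '.', '4'};
        int offset = indexOfPdfMagic(wrapped);
        // Offset 0 would be a plain PDF; a positive offset means something precedes it.
        System.out.println(offset);
    }
}
```

An offset greater than zero fits the description above: the leading bytes look like MPEG audio, and the PDF starts only after them.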



> Prep for 3.3.0 release
> ----------------------
>
>                 Key: TIKA-4563
>                 URL: https://issues.apache.org/jira/browse/TIKA-4563
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>         Attachments: kio5_perldoc.mo, tika-3.3.0-20260110.tgz, tika-3.3.0.tgz
>
>




