InvalidFormatException on a PackagePart in OOXML
------------------------------------------------

                 Key: TIKA-530
                 URL: https://issues.apache.org/jira/browse/TIKA-530
             Project: Tika
          Issue Type: Bug
    Affects Versions: 0.8
            Reporter: Sjoerd Smeets


Hi,

I receive the following error when parsing an OOXML file:

Caused by: org.apache.poi.openxml4j.exceptions.InvalidFormatException: Absolute URI forbidden: file://///ravn.co.uk/London/Jobs/first%20introduction%20/Welcome%20day/1.avi
    at org.apache.poi.openxml4j.opc.PackagePartName.throwExceptionIfAbsoluteUri(PackagePartName.java:426) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
    at org.apache.poi.openxml4j.opc.PackagePartName.throwExceptionIfInvalidPartUri(PackagePartName.java:175) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
    at org.apache.poi.openxml4j.opc.PackagePartName.<init>(PackagePartName.java:83) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
    at org.apache.poi.openxml4j.opc.PackagingURIHelper.createPartName(PackagingURIHelper.java:470) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
    at org.apache.poi.POIXMLDocument.getTargetPart(POIXMLDocument.java:95) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
    at org.apache.poi.POIXMLDocument.getTargetPart(POIXMLDocument.java:84) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
    at org.apache.poi.xslf.XSLFSlideShow.<init>(XSLFSlideShow.java:89) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
    at org.apache.poi.xslf.extractor.XSLFPowerPointExtractor.<init>(XSLFPowerPointExtractor.java:45) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
    at org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorFactory.java:183) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
    at org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorFactory.java:150) ~[poi-ooxml-3.7-beta1.jar:3.7-beta1]
    at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:53) ~[tika-parsers-0.8-SNAPSHOT.jar:na]

I can see that an absolute URI is forbidden as a part name. However, shouldn't
POI just skip the offending PackagePartName and move on to the other parts?
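
To illustrate the behaviour being asked for, here is a minimal sketch. It is
not POI code and does not use the POI API: it only uses java.net.URI, and the
class name AbsoluteTargetFilter, its methods, and the two slide part names are
hypothetical, chosen for illustration. The assumption is that a target with a
scheme (such as the file: link above) points outside the package and can be
skipped with a warning instead of aborting the whole parse.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of POI or Tika.
public class AbsoluteTargetFilter {

    // True when the target carries a scheme (file:, http:, ...), which is
    // the condition the "Absolute URI forbidden" message complains about.
    static boolean isAbsolute(String target) {
        try {
            return new URI(target).isAbsolute();
        } catch (URISyntaxException e) {
            // Assumption: an unparseable target is skipped as well.
            return true;
        }
    }

    // Keeps only the targets that could name a part inside the package,
    // logging the ones that are skipped instead of throwing.
    static List<String> internalTargets(List<String> targets) {
        List<String> kept = new ArrayList<>();
        for (String target : targets) {
            if (isAbsolute(target)) {
                System.err.println("Skipping external target: " + target);
                continue;
            }
            kept.add(target);
        }
        return kept;
    }

    public static void main(String[] args) {
        // The slide names are made-up sample values; the file: URI is the
        // one from the stack trace.
        List<String> targets = Arrays.asList(
            "/ppt/slides/slide1.xml",
            "file://///ravn.co.uk/London/Jobs/first%20introduction%20/Welcome%20day/1.avi",
            "/ppt/slides/slide2.xml");
        System.out.println(internalTargets(targets));
    }
}
```

With a check like this in front of the part-name creation, the linked video
would be ignored and the text of the remaining parts could still be extracted.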

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
