https://bz.apache.org/bugzilla/show_bug.cgi?id=66878
Bug ID: 66878
Summary: Invalid URL entered by user as hyperlink target causes
exception when parsing.
Product: POI
Version: unspecified
Hardware: Other
OS: Linux
Status: NEW
Severity: minor
Priority: P2
Component: OPC
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: ---
An invalid value entered into the document as a hyperlink target causes an
exception when the package relationships are parsed.
The value is the destination of the hyperlink.
As this is a user-entered value, I'm not sure why it needs to be parsed at all.
I don't believe the field is validated, so it can contain any garbage.
POI should ignore it.
The POI version is whichever one ships with Apache Tika 2.7.0.
org.apache.poi.openxml4j.opc.PackageRelationshipCollection - Cannot convert
https://cloud.google.com/bigtable/docs/backups#what-for%20https://cloud.google.com/bigtable/docs/release-notes#December_08_2022
in a valid relationship URI-> dummy-URI used
java.net.URISyntaxException: Illegal character in fragment at index 110:
https://cloud.google.com/bigtable/docs/backups#what-for%20https://cloud.google.com/bigtable/docs/release-notes#December_08_2022
at java.base/java.net.URI$Parser.fail(URI.java:2974)
at java.base/java.net.URI$Parser.checkChars(URI.java:3145)
at java.base/java.net.URI$Parser.parse(URI.java:3189)
at java.base/java.net.URI.<init>(URI.java:623)
at org.apache.poi.openxml4j.opc.PackagingURIHelper.toURI(PackagingURIHelper.java:723)
at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.parseRelationshipsPart(PackageRelationshipCollection.java:358)
at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.<init>(PackageRelationshipCollection.java:160)
at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.<init>(PackageRelationshipCollection.java:130)
at org.apache.poi.openxml4j.opc.PackagePart.loadRelationships(PackagePart.java:565)
at org.apache.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:751)
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:322)
at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:123)
at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:115)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:195)
at org.apache.tika.Tika.parseToString(Tika.java:525)
at org.apache.tika.Tika.parseToString(Tika.java:495)
at com.purato.index.documenthandler.TikaDocumentHandler.getText(TikaDocumentHandler.java:52)
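
The parse failure can be reproduced without POI or Tika, using only
java.net.URI. The sketch below is illustrative and is not POI code: the class
name HyperlinkTargetCheck and the helpers firstIllegalIndex and parseOrNull are
hypothetical names made up for this example. It assumes the hyperlink target is
exactly the string shown in the log above. The second '#' is the character that
java.net.URI rejects, because a URI may carry only one fragment.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class HyperlinkTargetCheck {

    // Hyperlink target copied verbatim from the log output in this report.
    static final String TARGET =
        "https://cloud.google.com/bigtable/docs/backups#what-for%20"
        + "https://cloud.google.com/bigtable/docs/release-notes#December_08_2022";

    // Returns the index at which java.net.URI rejects the input,
    // or -1 if the input parses cleanly.
    static int firstIllegalIndex(String target) {
        try {
            new URI(target);
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    // Hypothetical lenient helper (an assumption, not POI's API):
    // returns null instead of throwing when the target is not a valid URI.
    static URI parseOrNull(String target) {
        try {
            return new URI(target);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        int index = firstIllegalIndex(TARGET);
        System.out.println("rejected at index: " + index);
        if (index >= 0) {
            System.out.println("offending character: " + TARGET.charAt(index));
        }
        System.out.println("lenient parse result: " + parseOrNull(TARGET));
    }
}
```

A caller that only needs the document text could treat a null result as "no
usable link" and carry on, which is roughly what the "dummy-URI used" message
suggests already happens at this point.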