[ https://issues.apache.org/jira/browse/JENA-394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13576966#comment-13576966 ]

Richard Cyganiak commented on JENA-394:
---------------------------------------

I see. I had only checked the 5th edition. So this may fix itself at some
point in the future with a Xerces upgrade. I agree that there's nothing that
can reasonably be done about it in Jena. Thanks for checking!

RDF/XML references the 2nd edition, so technically it's not even a bug. But
the RDF-WG should probably make it one.
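
For anyone who wants to check the edition difference themselves, here is a
rough Python sketch. The 5th-edition ranges are the NameStartChar and NameChar
productions of XML 1.0 Fifth Edition. The older-edition table is only a partial
subset (the U+3000 to U+30FF neighbourhood plus the main CJK ideograph range)
that I transcribed by hand from the Appendix B character classes, so treat it
as an assumption and verify it against the spec. The helper names are made up
for illustration and per-character checking ignores the stricter rules for the
first character of a name.

```python
import unicodedata

# XML 1.0 Fifth Edition: NameStartChar ranges (production [4]).
NAME_START_5E = [
    (0x3A, 0x3A), (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF), (0x370, 0x37D),
    (0x37F, 0x1FFF), (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF), (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]
# Additional characters allowed by NameChar (production [4a]).
NAME_EXTRA_5E = [
    (0x2D, 0x2E), (0x30, 0x39), (0xB7, 0xB7),
    (0x300, 0x36F), (0x203F, 0x2040),
]
CHARS_5E = NAME_START_5E + NAME_EXTRA_5E

# ASSUMPTION: partial, hand-transcribed subset of the pre-5th-edition
# Letter / CombiningChar / Extender classes. Only covers U+3005..U+30FE
# and the CJK ideograph range; not usable as a general name checker.
OLD_SUBSET = [
    (0x3005, 0x3005), (0x3007, 0x3007), (0x3021, 0x302F),
    (0x3031, 0x3035), (0x3041, 0x3094), (0x3099, 0x309A),
    (0x309D, 0x309E), (0x30A1, 0x30FA), (0x30FC, 0x30FE),
    (0x4E00, 0x9FA5),
]


def in_ranges(cp, ranges):
    """True if code point cp falls inside any (low, high) range."""
    return any(low <= cp <= high for low, high in ranges)


def all_name_chars(name, ranges):
    """True if every character of name is covered by the given ranges."""
    return all(in_ranges(ord(ch), ranges) for ch in name)


MIDDLE_DOT = "\u30fb"
# Property name from the example file in the issue description.
NAME = "隣接自治体" + MIDDLE_DOT + "行政区"

if __name__ == "__main__":
    print(unicodedata.name(MIDDLE_DOT))
    print("5th edition accepts name:", all_name_chars(NAME, CHARS_5E))
    print("older subset accepts name:", all_name_chars(NAME, OLD_SUBSET))
```

If the older table is right, the only character in the property name that
falls outside it is U+30FB, which sits in the gap between U+30FA and U+30FC.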

Virtuoso ticket that would need fixing to make Turtle work: 
https://sourceforge.net/p/virtuoso/bugs/86/
                
> RDF/XML parser incorrectly disallows some Unicode characters
> ------------------------------------------------------------
>
>                 Key: JENA-394
>                 URL: https://issues.apache.org/jira/browse/JENA-394
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: RDF/XML
>    Affects Versions: Jena 2.10.0
>            Reporter: Richard Cyganiak
>            Priority: Minor
>         Attachments: japanese-chars.xml, katakana-middle-dot.xml
>
>
> The Unicode character 'KATAKANA MIDDLE DOT' (U+30FB) in the local part of a 
> property name causes a parse exception in the RDF/XML parser. This seems to 
> be incorrect, as the character is allowed in IRIs and is allowed in XML local 
> names, as far as I can tell.
> Example file:
> <?xml version="1.0" encoding="utf-8" ?>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> xmlns="http://example.com/ns#">
>   <rdf:Description rdf:about="#this">
>     <隣接自治体・行政区 rdf:resource="#that"/>
>   </rdf:Description>
> </rdf:RDF>
> The offending character is the “dot” in the middle of the property name.
> rdfcat execution with stack trace:
> $ bin/rdfcat ~/katakana-middle-dot.xml 
> 18:09:37 ERROR riot                 :: Element type "?????" must be followed by either attribute specifications, ">" or "/>".
> Exception in thread "main" org.apache.jena.riot.RiotException: Element type "?????" must be followed by either attribute specifications, ">" or "/>".
>       at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:132)
>       at org.apache.jena.riot.lang.LangRDFXML$ErrorHandlerBridge.fatalError(LangRDFXML.java:242)
>       at com.hp.hpl.jena.rdf.arp.impl.ARPSaxErrorHandler.fatalError(ARPSaxErrorHandler.java:48)
>       at com.hp.hpl.jena.rdf.arp.impl.XMLHandler.warning(XMLHandler.java:209)
>       at com.hp.hpl.jena.rdf.arp.impl.XMLHandler.fatalError(XMLHandler.java:239)
>       at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
>       at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
>       at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
>       at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
>       at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
>       at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source)
>       at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
>       at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
>       at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
>       at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
>       at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
>       at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
>       at com.hp.hpl.jena.rdf.arp.impl.RDFXMLParser.parse(RDFXMLParser.java:151)
>       at com.hp.hpl.jena.rdf.arp.ARP.load(ARP.java:119)
>       at org.apache.jena.riot.lang.LangRDFXML.parse(LangRDFXML.java:141)
>       at org.apache.jena.riot.RDFParserRegistry$ReaderRIOTFactoryImpl$1.read(RDFParserRegistry.java:148)
>       at org.apache.jena.riot.RDFDataMgr.process(RDFDataMgr.java:749)
>       at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:258)
>       at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:244)
>       at org.apache.jena.riot.adapters.RDFReaderRIOT.read(RDFReaderRIOT.java:65)
>       at com.hp.hpl.jena.rdf.model.impl.ModelCom.read(ModelCom.java:276)
>       at com.hp.hpl.jena.util.FileManager.readModelWorker(FileManager.java:403)
>       at com.hp.hpl.jena.util.FileManager.readModel(FileManager.java:342)
>       at jena.rdfcat.readInput(rdfcat.java:375)
>       at jena.rdfcat$ReadAction.run(rdfcat.java:552)
>       at jena.rdfcat.go(rdfcat.java:278)
>       at jena.rdfcat.main(rdfcat.java:260)
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
