Hi Arne,

Thanks for testing 4.8.0-SNAPSHOT.

Part of #1773 is to change to the same IRI handling used elsewhere in Jena. While still based on jena-iri, the IRIx layer has its own set of scheme-specific rules. Pure jena-iri is not up-to-date with all the RFCs.

The RDF/XML file itself is fine. The issue is the base URI in the parser setup.

The URN scheme urn:uuid: defines the rest of the URI to match the syntax of a UUID: 671940cc-e6b5-47ad-9992-2d9185f53464

RFC 8141 defines URNs as urn:NID:NSS -- it tightened up on URN syntax to require at least two characters in the middle part (NID) and one in the final part (NSS). It also permitted fragments, which were not in the first URN RFC.


So <urn:uuid> --

* is legal by URI syntax,
* not correct by the details of a URN (must have 2 colons),
* not correct by the details of the urn:uuid namespace (RFC 4122).
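
The three levels above can be sketched in plain Java. This is only an illustration, not Jena's IRIx code: the class name UrnBaseCheck and its methods are hypothetical, and the patterns are simplified approximations of RFC 8141 (no r-, q- or f-components, loose NSS) and of the RFC 4122 textual UUID form.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jena: checks a base string at three levels.
final class UrnBaseCheck {
    // Simplified RFC 8141 shape: "urn:" NID ":" NSS,
    // with a NID of 2 to 32 characters and a non-empty NSS.
    private static final Pattern URN_SHAPE =
        Pattern.compile("(?i)urn:[a-z0-9][a-z0-9-]{0,30}[a-z0-9]:.+");

    // urn:uuid: followed by the 8-4-4-4-12 hex form of a UUID (RFC 4122).
    private static final Pattern UUID_URN =
        Pattern.compile("(?i)urn:uuid:[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}");

    // Level 1: generic URI syntax, as judged by java.net.URI.
    static boolean isUriSyntax(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Level 2: URN shape (two colons, NID and NSS present).
    static boolean isUrnShape(String s) {
        return URN_SHAPE.matcher(s).matches();
    }

    // Level 3: urn:uuid namespace rules.
    static boolean isUuidUrn(String s) {
        return UUID_URN.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {
            "urn:uuid",
            "urn:uid:abc",
            "urn:uuid:671940cc-e6b5-47ad-9992-2d9185f53464"
        };
        for (String s : candidates) {
            System.out.println(s + " uri=" + isUriSyntax(s)
                + " urn=" + isUrnShape(s) + " uuid=" + isUuidUrn(s));
        }
    }
}
```

With these simplified checks, <urn:uuid> passes only the first level, which matches the list above.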

If you use a legal base, the file parses OK.
Is that possible for you?

urn:uid:abc
http://example.org/

(UID isn't registered -- and also Jena only has scheme-specific rules for certain URI and URN registrations.)
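
To show what a legal hierarchical base does with the rdf:ID values in the file, here is a small sketch using java.net.URI rather than Jena's own resolver. The class name BaseResolution is hypothetical, and java.net.URI is only a stand-in: it does not apply Jena's scheme-specific checks, so it illustrates the http://example.org/ case only.

```java
import java.net.URI;

// Hypothetical helper, not part of Jena: resolves a relative reference
// against a base with java.net.URI (RFC 2396 style resolution).
final class BaseResolution {
    static String resolve(String base, String ref) {
        return URI.create(base).resolve(ref).toString();
    }

    public static void main(String[] args) {
        // The fragment reference taken from the error message in the report.
        System.out.println(
            resolve("http://example.org/", "#_5b5b515b-91bb-41c6-ba63-71a711139a86"));
    }
}
```

The fragment is attached to the base, giving http://example.org/#_5b5b515b-91bb-41c6-ba63-71a711139a86 as the resource name.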

   Andy

https://www.rfc-editor.org/rfc/rfc8141.html
https://www.rfc-editor.org/rfc/rfc4122.html

PS There will be a transitional legacy route to get to the 4.7.0 parser, but that is temporary.

On 03/03/2023 21:47, Arne Bernhardt wrote:
Hello,
the following code, which works fine under Jena 4.6, no longer works under
Jena 4.8.SNAPSHOT:

RDFParser.create()
         .source(graphUri)
         .base("urn:uuid")
         .lang(Lang.RDFXML)
         .parse(streamSink);

The graph looks like this:
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:cim="http://iec.ch/TC57/CIM100#" xmlns:md="http://iec.ch/TC57/61970-552/ModelDescription/1#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:eu="http://iec.ch/TC57/CIM100-European#">
   <cim:LoadArea rdf:ID="_5b5b515b-91bb-41c6-ba63-71a711139a86">
     <cim:IdentifiedObject.name>1555284823 LoadArea
</cim:IdentifiedObject.name>
     <cim:IdentifiedObject.mRID>5b5b515b-91bb-41c6-ba63-71a711139a86
</cim:IdentifiedObject.mRID>
   </cim:LoadArea>
   <cim:SubLoadArea rdf:ID="_27f108dd-e578-4921-8d3a-753e67bd718e">
     <cim:IdentifiedObject.name>1055343234 SubLoadArea
</cim:IdentifiedObject.name>
     <cim:SubLoadArea.LoadArea rdf:resource=
"#_5b5b515b-91bb-41c6-ba63-71a711139a86" />
     <cim:IdentifiedObject.mRID>27f108dd-e578-4921-8d3a-753e67bd718e
</cim:IdentifiedObject.mRID>
   </cim:SubLoadArea>
</rdf:RDF>

The error is: "org.apache.jena.riot.RiotException: [line: 3, col: 64]
{E214} Resolving against bad URI <urn:uuid>:
<#_5b5b515b-91bb-41c6-ba63-71a711139a86>"

The example is an extract from the CGMES Conformity Assessment Scheme v3 -
Test Configurations (
https://www.entsoe.eu/data/cim/cim-conformity-and-interoperability/ ->
https://www.entsoe.eu/Documents/CIM_documents/Grid_Model_CIM/ENTSO-E_Test_Configurations_v3.0.2.zip
).

Could my problem be related to the changes in
https://github.com/apache/jena/issues/1773?
Are my options or my base URI wrong?
Or if the format is wrong, what specification does it violate? (I haven't
figured out this URI/IRI thing yet, maybe I haven't found the right sources
for it).
How do I get Jena to accept the file, preferably as is?

Greetings
Arne
