Hi Arne,
Thanks for testing 4.8.0-SNAPSHOT.
Part of #1773 is to change to the same IRI handling used elsewhere in
Jena. While still based on jena-iri, the IRIx layer has a specific set
of scheme-specific rules. Pure jena-iri is not up-to-date with all the RFCs.
The RDF/XML file itself is fine. The issue is the base URI in the parser
setup.
The URN scheme urn:uuid: defines the rest of the URI to match the
syntax of a UUID: 671940cc-e6b5-47ad-9992-2d9185f53464
RFC 8141 defines URNs as urn:NID:NSS -- it tightened up on URN syntax to
require at least two characters in the middle part (NID) and one in the
final part (NSS). It also permitted fragments, which were not in the first
URN RFC.
So <urn:uuid> --
* is legal by URI syntax,
* is not correct by the details of a URN (must have 2 colons),
* is not correct by the details of the urn:uuid namespace (RFC 4122).
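
The three levels above can be sketched with the Java standard library alone. This is only an illustration, not how Jena's IRIx layer is implemented: the class name, the method names and the regular expressions are assumptions made for this example, and the URN pattern is a simplification of RFC 8141 (it ignores the r-, q- and f-components).

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jena.
class UrnBaseCheck {
    // Simplified RFC 8141 shape: "urn:" NID ":" NSS,
    // with a 2-32 character NID and a non-empty NSS.
    private static final Pattern URN = Pattern.compile(
        "(?i)urn:[a-z0-9][a-z0-9-]{0,30}[a-z0-9]:.+");

    // RFC 4122 textual UUID: 8-4-4-4-12 hexadecimal digits.
    private static final Pattern URN_UUID = Pattern.compile(
        "(?i)urn:uuid:[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}");

    // Level 1: generic URI syntax, as judged by java.net.URI.
    static boolean isUriSyntax(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Level 2: URN syntax (two colons, NID and NSS present).
    static boolean isUrnSyntax(String s) {
        return URN.matcher(s).matches();
    }

    // Level 3: rules of the urn:uuid namespace.
    static boolean isUrnUuid(String s) {
        return URN_UUID.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {
            "urn:uuid",
            "urn:uid:abc",
            "urn:uuid:671940cc-e6b5-47ad-9992-2d9185f53464"
        };
        for (String s : candidates) {
            System.out.println(s
                + " uri=" + isUriSyntax(s)
                + " urn=" + isUrnSyntax(s)
                + " urn:uuid=" + isUrnUuid(s));
        }
    }
}
```
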
If you use a legal base, the file parses OK.
Is that possible for you?
urn:uid:abc
http://example.org/
(UID isn't registered -- and also Jena only has scheme-specific rules
for certain URI and URN registrations.)
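
To illustrate why the base matters: each rdf:ID and each rdf:resource="#..." in the file is a relative reference that is resolved against the base. A minimal sketch of that resolution step follows, using java.net.URI as a stand-in for Jena's own resolver (an assumption for illustration only; the class and method names are hypothetical). It only shows the hierarchical http base, because java.net.URI does not resolve references against opaque URIs such as URNs.

```java
import java.net.URI;

// Hypothetical helper, not part of Jena.
class BaseResolveSketch {
    // Resolve a relative reference against a base with the JDK resolver.
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        // The fragment is the rdf:ID of the LoadArea in the example file.
        System.out.println(
            resolve("http://example.org/", "#_5b5b515b-91bb-41c6-ba63-71a711139a86"));
    }
}
```
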
Andy
https://www.rfc-editor.org/rfc/rfc8141.html
https://www.rfc-editor.org/rfc/rfc4122.html
PS There will be a transitional legacy route to get to the 4.7.0 parser,
but that is temporary.
On 03/03/2023 21:47, Arne Bernhardt wrote:
Hello,
the following code, which works fine under Jena 4.6, no longer works under
Jena 4.8.SNAPSHOT:
RDFParser.create()
.source(graphUri)
.base("urn:uuid")
.lang(Lang.RDFXML)
.parse(streamSink);
The graph looks like this:
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:cim="http://iec.ch/TC57/CIM100#"
         xmlns:md="http://iec.ch/TC57/61970-552/ModelDescription/1#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:eu="http://iec.ch/TC57/CIM100-European#">
<cim:LoadArea rdf:ID="_5b5b515b-91bb-41c6-ba63-71a711139a86">
<cim:IdentifiedObject.name>1555284823 LoadArea
</cim:IdentifiedObject.name>
<cim:IdentifiedObject.mRID>5b5b515b-91bb-41c6-ba63-71a711139a86
</cim:IdentifiedObject.mRID>
</cim:LoadArea>
<cim:SubLoadArea rdf:ID="_27f108dd-e578-4921-8d3a-753e67bd718e">
<cim:IdentifiedObject.name>1055343234 SubLoadArea
</cim:IdentifiedObject.name>
<cim:SubLoadArea.LoadArea rdf:resource=
"#_5b5b515b-91bb-41c6-ba63-71a711139a86" />
<cim:IdentifiedObject.mRID>27f108dd-e578-4921-8d3a-753e67bd718e
</cim:IdentifiedObject.mRID>
</cim:SubLoadArea>
</rdf:RDF>
The error is: "org.apache.jena.riot.RiotException: [line: 3, col: 64]
{E214} Resolving against bad URI <urn:uuid>:
<#_5b5b515b-91bb-41c6-ba63-71a711139a86>"
The example is an extract from the CGMES Conformity Assessment Scheme v3 -
Test Configurations (
https://www.entsoe.eu/data/cim/cim-conformity-and-interoperability/ ->
https://www.entsoe.eu/Documents/CIM_documents/Grid_Model_CIM/ENTSO-E_Test_Configurations_v3.0.2.zip
).
Could my problem be related to the changes in
https://github.com/apache/jena/issues/1773?
Are my options or my base URI wrong?
Or if the format is wrong, what specification does it violate? (I haven't
figured out this URI/IRI thing yet, maybe I haven't found the right sources
for it).
How do I get Jena to accept the file, preferably as is?
Greetings
Arne