https://github.com/apache/jena/pull/820 changes the way \u and \U escapes are handled in SPARQL queries.

\U is like \u but with 8 hex digits instead of 4.

It applies to Syntax.ARQ ("with extensions") which is the default in jena-arq and also used in Fuseki.

Syntax.SPARQL_11 remains untouched.

See discussion:

    w3c/rdf-tests#67
    w3c/sparql-12#77

tl;dr:

SPARQL and Turtle differ in the way they handle \u and \U escapes.

In SPARQL, they can appear anywhere (even for special characters like "{" or "#", or inside keywords, as in "SEL\u0045CT") because escape processing happens on the input stream before parsing. This is a JavaCC feature that isn't present in other parser tools.
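To make the "before parsing" point concrete, here is a minimal sketch in plain Java of what stream-level escape processing amounts to. It is not the ARQ or JavaCC code; the class and method names are made up for illustration, and it ignores corner cases such as escaped backslashes.

```java
// Illustrative only: hypothetical names, not part of Jena.
final class UnicodeEscapeSketch {

    // Replace backslash-u (4 hex digits) and backslash-U (8 hex digits)
    // wherever they occur in the raw text, before any tokenizing happens.
    static String decodeEverywhere(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '\\' && i + 1 < input.length()) {
                char kind = input.charAt(i + 1);
                int digits = (kind == 'u') ? 4 : (kind == 'U') ? 8 : 0;
                int end = i + 2 + digits;
                if (digits > 0 && end <= input.length()) {
                    String hex = input.substring(i + 2, end);
                    if (hex.matches("[0-9A-Fa-f]+")) {
                        long cp = Long.parseLong(hex, 16);
                        if (cp <= Character.MAX_CODE_POINT) {
                            out.appendCodePoint((int) cp);
                            i = end;
                            continue;
                        }
                    }
                }
            }
            // Not a well-formed escape: copy the character unchanged.
            out.append(c);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The parser would only ever see the decoded text.
        System.out.println(decodeEverywhere("SEL\\u0045CT * WHERE \\u007B ?s ?p ?o }"));
        // prints: SELECT * WHERE { ?s ?p ?o }
    }
}
```

Because the replacement runs over the whole input, the keyword and the "{" are indistinguishable from ones typed directly by the time the grammar sees them.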

ARQ has not supported \U up to now. RIOT does.

In Turtle, \u and \U can appear in strings and URIs, not in prefix names, and nowhere else. This is a better design.
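For contrast, a rough sketch of the Turtle-style rule: escapes are decoded only while inside a "..." string or a <...> IRI, and left alone everywhere else. Again this is plain Java with made-up names, not RIOT code, and it is deliberately naive: it treats every "<" as the start of an IRI and does not know about single-quoted or long strings.

```java
// Illustrative only: hypothetical names, not part of Jena.
final class TurtleStyleEscapeSketch {

    // Decode backslash-u / backslash-U escapes only inside double-quoted
    // strings and angle-bracket IRIs; copy everything else unchanged.
    static String decodeInStringsAndIris(String input) {
        StringBuilder out = new StringBuilder();
        char closer = 0; // 0 = outside; otherwise the char that ends the string or IRI
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (closer == 0) {
                if (c == '"') closer = '"';
                else if (c == '<') closer = '>';
                out.append(c);
                i++;
            } else if (c == '\\' && i + 1 < input.length()) {
                char kind = input.charAt(i + 1);
                int digits = (kind == 'u') ? 4 : (kind == 'U') ? 8 : 0;
                int end = i + 2 + digits;
                if (digits > 0 && end <= input.length()
                        && input.substring(i + 2, end).matches("[0-9A-Fa-f]+")) {
                    long cp = Long.parseLong(input.substring(i + 2, end), 16);
                    if (cp <= Character.MAX_CODE_POINT) {
                        out.appendCodePoint((int) cp);
                        i = end;
                        continue;
                    }
                }
                // Any other escape (for example an escaped quote) is copied as-is.
                out.append(c).append(kind);
                i += 2;
            } else {
                if (c == closer) closer = 0;
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String in = "SEL\\u0045CT \"caf\\u00E9\" <http://example/\\u0041>";
        // The escape in the keyword is left untouched; the other two are decoded.
        System.out.println(decodeInStringsAndIris(in));
    }
}
```

With this rule an escape can never produce a keyword or punctuation, so the token structure is the same before and after decoding.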

The general community consensus seems to be that the change should happen and that the original behaviour was a mistake in SPARQL 1.0.

Personally, I don't recall having seen a query in the wild that uses \u outside of strings or URIs. It is an obfuscation vector.

    Andy
