Hi,

I’ve been trying to get basic language ranges working in the RDF4J SHACL engine, and I’ve stumbled upon some differences between how RDF4J and Jena implement them.

The SPARQL spec points to: https://www.ietf.org/rfc/rfc4647.txt

Specifically sections
- 2.1. Basic Language Range
- 3.3.1. Basic Filtering

Looking at the ABNF in 2.1.

    language-range = (1*8ALPHA *("-" 1*8alphanum)) / "*"
    alphanum       = ALPHA / DIGIT

It looks like “*” is legal, “en” is legal and “en-gb” is legal (and even “a-ab-abc-12345678-a”). But “*-gb” is not legal and neither is “en-*”.

It seems like the range “en” would match a tag “en-gb” and a tag “en”.

I had a deep dive into the langMatch code in Jena and it seems to support “*” at any position in the range.

Is Jena supporting part of the extended range specification, or am I missing something? (I have been missing a lot of things lately :P so I wouldn’t be surprised).

Cheers,
Håvard

PS: From 2.2. Extended Language Range

    extended-language-range = (1*8ALPHA / "*") *("-" (1*8alphanum / "*"))
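
To make my reading of 2.1 and 3.3.1 concrete, here is a minimal Java sketch of it. The class and method names (BasicLangRange, isBasicRange, basicMatches) are made up for illustration; this is not the code in RDF4J or Jena, just the ABNF turned into a regex plus the prefix rule from Basic Filtering. The empty-tag check reflects my understanding that in SPARQL “*” only matches non-empty language tags.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper, not part of RDF4J or Jena.
class BasicLangRange {

    // RFC 4647 section 2.1: (1*8ALPHA *("-" 1*8alphanum)) / "*"
    private static final Pattern BASIC_RANGE =
            Pattern.compile("\\*|[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*");

    // True if the string is a syntactically valid basic language range.
    static boolean isBasicRange(String range) {
        return BASIC_RANGE.matcher(range).matches();
    }

    // RFC 4647 section 3.3.1 (Basic Filtering): the range matches if,
    // compared case-insensitively, it equals the tag or is a prefix of
    // the tag that is immediately followed by "-".
    static boolean basicMatches(String range, String tag) {
        if (!isBasicRange(range)) {
            throw new IllegalArgumentException("Not a basic language range: " + range);
        }
        if (tag.isEmpty()) {
            return false; // assumption: no language tag never matches, even for "*"
        }
        if (range.equals("*")) {
            return true;
        }
        String r = range.toLowerCase(Locale.ROOT);
        String t = tag.toLowerCase(Locale.ROOT);
        return t.equals(r) || t.startsWith(r + "-");
    }

    public static void main(String[] args) {
        for (String range : new String[] {"*", "en", "en-gb", "a-ab-abc-12345678-a", "*-gb", "en-*"}) {
            System.out.println(range + " valid basic range: " + isBasicRange(range));
        }
        System.out.println("en matches en-GB: " + basicMatches("en", "en-GB"));
        System.out.println("en matches eng: " + basicMatches("en", "eng"));
    }
}
```

With this reading, “*-gb” and “en-*” are rejected as basic ranges, and “en” matches “en” and “en-GB” but not “eng”.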
