[ https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stian Soiland-Reyes updated COMMONSRDF-51:
------------------------------------------
    Description: 
The RDF-1.1 specification states that the [value space of Literal language tags is lowercase|https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal], which does not conflict with the case-insensitive specification in BCP47. The Literal.equals and Literal.hashCode API contracts should specify that language tags must be compared using lowercase, even if they are otherwise stored and returned as uppercase by getLanguageTag. The API currently has incorrect language by saying "character-by-character" for language tag comparisons, as that implies case-sensitive comparisons are used.

The lowercasing must also be done using a locale that is consistent (a known example where lowercase and uppercase do not round-trip as expected for US-ASCII characters is Turkish [1]), so I would recommend actually stating that .toLowerCase(Locale.ENGLISH) is used.

  was:
The RDF-1.1 specification states that the value space of Literal language tags is lowercase, which does not conflict with the case-insensitive specification in BCP47. The Literal.equals and Literal.hashCode API contracts should specify that language tags must be compared using lowercase, even if they are otherwise stored and returned as uppercase by getLanguageTag. The API currently has incorrect language by saying "character-by-character" for language tag comparisons, as that implies case-sensitive comparisons are used.

The lowercasing must also be done using a locale that is consistent (a known example where lowercase and uppercase do not round-trip as expected for US-ASCII characters is Turkish [1]), so I would recommend actually stating that .toLowerCase(Locale.ENGLISH) is used.
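
For illustration only, the comparison rule proposed in the description could be sketched as follows. The class and method names (LangTagComparison, normalize, sameLanguageTag, languageTagHash) are hypothetical and are not part of the Commons RDF API; the sketch only assumes that equality and hashing both go through .toLowerCase(Locale.ENGLISH), as the description recommends.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of the Commons RDF API.
public final class LangTagComparison {

    private LangTagComparison() {
    }

    // Lowercase with a fixed locale so the result does not depend on
    // the JVM's default locale.
    public static String normalize(String languageTag) {
        return languageTag.toLowerCase(Locale.ENGLISH);
    }

    // Compare two language tags on their lowercased form.
    public static boolean sameLanguageTag(String a, String b) {
        return normalize(a).equals(normalize(b));
    }

    // Hash on the same lowercased form, so tags that compare equal
    // also hash equally.
    public static int languageTagHash(String languageTag) {
        return normalize(languageTag).hashCode();
    }

    public static void main(String[] args) {
        // Tags differing only in case compare equal.
        System.out.println(sameLanguageTag("en-GB", "en-gb"));

        // Under a Turkish locale, "I" lowercases to dotless i (U+0131),
        // so the result differs from the ASCII string "it".
        Locale turkish = Locale.forLanguageTag("tr");
        System.out.println("IT".toLowerCase(turkish).equals("it"));

        // With Locale.ENGLISH the ASCII mapping is used.
        System.out.println("IT".toLowerCase(Locale.ENGLISH).equals("it"));
    }
}
```

The value returned by getLanguageTag would be left untouched in this sketch; only the comparison and hash use the lowercased form.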

> RDF-1.1 specifies that language tags need to be compared using lower-case
> -------------------------------------------------------------------------
>
>                 Key: COMMONSRDF-51
>                 URL: https://issues.apache.org/jira/browse/COMMONSRDF-51
>             Project: Apache Commons RDF
>          Issue Type: Bug
>          Components: api
>    Affects Versions: 0.3.0
>            Reporter: Peter Ansell
>
> The RDF-1.1 specification states that the [value space of Literal language
> tags is
> lowercase|https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal], which
> does not conflict with the case-insensitive specification in BCP47. The
> Literal.equals and Literal.hashCode API contracts should specify that
> language tags must be compared using lowercase, even if they are otherwise
> stored and returned as uppercase by getLanguageTag. The API currently has
> incorrect language by saying "character-by-character" for language tag
> comparisons, as that implies case-sensitive comparisons are used.
> The lowercasing must also be done using a locale that is consistent (a known
> example where lowercase and uppercase do not round-trip as expected for
> US-ASCII characters is Turkish [1]), so I would recommend actually stating
> that .toLowerCase(Locale.ENGLISH) is used.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)