[jira] [Commented] (COMMONSRDF-51) RDF-1.1 specifies that language tags need to be compared using lower-case

ASF GitHub Bot (JIRA) Fri, 20 Jan 2017 09:26:46 -0800

    [ 
https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832139#comment-15832139
 ]


ASF GitHub Bot commented on COMMONSRDF-51:
------------------------------------------

Github user stain commented on the issue:

    https://github.com/apache/commons-rdf/pull/30
  
    Right, BCP47 normalisation would make sense, but sadly that is not directly
    permitted by RDF 1.1, only normalisation to lower case :-( - probably to
    avoid dependency on the registry.
    
    However I think we can try to make Commons RDF present a consistent RDF
    1.1-compliant view, which I would think includes that creating a literal
    with en-gb lowercase would return en-gb lowercase, also with RDF4J as back
    end. (Would this require our wrapper LiteralImpl to always lowercase for
    RDF4J?)
    
    Can we extend the RDF4J test to also cover the other settings? How are they
    provided? If the user is explicitly asking to go beyond the RDF standards,
    then they should not be surprised if Commons RDF's view goes along with
    that (or falls over), so then perhaps we don't need to worry about it here?
    (e.g. Jena can be configured to support generalized RDF, which doesn't work
    well with the normal TripleImpl).
    
    
    
    On 16 Jan 2017 9:52 pm, "Peter Ansell" <notificati...@github.com> wrote:
    
    *@ansell* commented on this pull request.
    
    Looks fairly good to me. I disagree with the test assertion that disallows
    normalisation using the BCP47 conventions (e.g., en-GB) in their
    constructors, but it is a minor issue.
    ------------------------------
    
    In api/src/test/java/org/apache/commons/rdf/api/AbstractRDFTest.java
    <https://github.com/apache/commons-rdf/pull/30#pullrequestreview-16887719>:
    
    > @@ -194,6 +194,114 @@ public void testCreateLiteralLangISO693_3() throws Exception {
             assertEquals("\"Herbert Van de Sompel\"@vls", vls.ntriplesString());
         }
    
    +    public void testCreateLiteralLangCaseInsensitive() throws Exception {
    
    Does this need the @Test annotation?
    ------------------------------
    
    In api/src/test/java/org/apache/commons/rdf/api/AbstractRDFTest.java
    <https://github.com/apache/commons-rdf/pull/30#pullrequestreview-16887719>:
    
    > @@ -194,6 +194,114 @@ public void testCreateLiteralLangISO693_3() throws Exception {
             assertEquals("\"Herbert Van de Sompel\"@vls", vls.ntriplesString());
         }
    
    +    public void testCreateLiteralLangCaseInsensitive() throws Exception {
    +        // COMMONSRDF-51: Literal langtag may not be in lowercase, but
    +        // must be COMPARED (aka .equals and .hashCode()) in lowercase
    +        // as the language space is lower case.
    +        final Literal lower = factory.createLiteral("Hello", "en-gb");
    +        final Literal upper = factory.createLiteral("Hello", "EN-GB");
    +        final Literal mixed = factory.createLiteral("Hello", "en-GB");
    +
    +
    +        assertEquals("en-gb", lower.getLanguageTag().get());
    
    RDF4J may not follow this in some cases. It may use the BCP47 normalisation
    conventions to obtain en-GB instead.
    



> RDF-1.1 specifies that language tags need to be compared using lower-case
> -------------------------------------------------------------------------
>
>                 Key: COMMONSRDF-51
>                 URL: https://issues.apache.org/jira/browse/COMMONSRDF-51
>             Project: Apache Commons RDF
>          Issue Type: Bug
>          Components: api
>    Affects Versions: 0.3.0
>            Reporter: Peter Ansell
>            Assignee: Stian Soiland-Reyes
>
> The RDF-1.1 specification states that the [value space of Literal language 
> tags is 
> lowercase|https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal], which 
> does not conflict with the case-insensitive specification in BCP47. The 
> Literal.equals and Literal.hashCode API contracts should specify that 
> language tags must be compared using lowercase, even if they are otherwise 
> stored and returned as upper-case by getLanguageTag. The API currently has 
> incorrect language by saying "character-by-character" for language tag 
> comparisons, as that implies case-sensitive comparisons are used.
> The lowercasing must also be done using a consistent locale (a known 
> example where lowercase and uppercase do not round-trip as expected for 
> US-ASCII characters is Turkish [1]), so I would recommend actually stating 
> that .toLowerCase(Locale.ENGLISH) is used.
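
The comparison rule described in the issue can be illustrated with a minimal
sketch. LangLiteralSketch and compareKey are hypothetical names made up for
this example; this is not the Commons RDF Literal implementation or its actual
API contract. The sketch assumes the tag is stored exactly as supplied and is
lowercased with Locale.ENGLISH, as the issue recommends, only when comparing
or hashing.

```java
import java.util.Locale;
import java.util.Objects;

/** Hypothetical literal holder for illustration; not the Commons RDF implementation. */
final class LangLiteralSketch {
    private final String lexicalForm;
    private final String languageTag; // kept exactly as the caller supplied it

    LangLiteralSketch(String lexicalForm, String languageTag) {
        this.lexicalForm = Objects.requireNonNull(lexicalForm);
        this.languageTag = Objects.requireNonNull(languageTag);
    }

    /** Returns the tag in its original case, e.g. "EN-GB". */
    String getLanguageTag() {
        return languageTag;
    }

    /**
     * Lowercases with a fixed locale so the result does not depend on the
     * JVM default. Under a Turkish locale, "FI".toLowerCase(...) would give
     * "f\u0131" (dotless i) rather than "fi".
     */
    static String compareKey(String tag) {
        return tag.toLowerCase(Locale.ENGLISH);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof LangLiteralSketch)) {
            return false;
        }
        LangLiteralSketch that = (LangLiteralSketch) other;
        // Lexical form is compared exactly; language tag is compared in lowercase.
        return lexicalForm.equals(that.lexicalForm)
                && compareKey(languageTag).equals(compareKey(that.languageTag));
    }

    @Override
    public int hashCode() {
        // Must hash the same lowercase key that equals() compares.
        return Objects.hash(lexicalForm, compareKey(languageTag));
    }
}
```

With this sketch, literals created with "en-gb", "EN-GB" and "en-GB" compare
equal and share a hash code, while getLanguageTag() still returns whatever case
was passed in. Whether getLanguageTag() should instead return the lowercased
form is the open question discussed in the pull request.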



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
