[ 
https://issues.apache.org/jira/browse/MARMOTTA-43?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691684#comment-13691684
 ] 

Peter Ansell commented on MARMOTTA-43:
--------------------------------------

There is now a static utility method for creating literals that may help with 
this issue: RDFParserHelper.createLiteral.

If Marmotta updates to Java-7 before Sesame-2.8 comes out, you can implement a 
LanguageHandler that works for you and reuse it by setting it on the parser 
config object:

    parser.getParserConfig().set(BasicParserSettings.LANGUAGE_HANDLERS, Arrays.asList(myBCP47LanguageHandlerImpl))

The fix for https://openrdf.atlassian.net/browse/SES-1826 will be to copy 
RFC3066LanguageHandler [1] and reimplement it as BCP47LanguageHandler using the 
Java-7 Locale methods.
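
As a rough illustration of what "using the Java-7 Locale methods" could look like, here is a minimal sketch built only on java.util.Locale. The class name Bcp47Tags and its two methods are hypothetical and are not part of Sesame; the lowercasing step follows the requirement stated in the issue description rather than anything Sesame is known to do.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

/** Hypothetical helper (not Sesame API) for BCP47 checks via Java-7 Locale. */
final class Bcp47Tags {
    private Bcp47Tags() {}

    /** Returns true if Locale.Builder accepts the tag as well-formed BCP47. */
    static boolean isWellFormed(String tag) {
        try {
            new Locale.Builder().setLanguageTag(tag);
            return true;
        } catch (IllformedLocaleException e) {
            // Thrown for ill-formed input such as underscore-separated tags.
            return false;
        }
    }

    /**
     * Canonicalizes the tag through Locale, then lowercases it.
     * Throws IllformedLocaleException if the tag is not well-formed.
     */
    static String normalize(String tag) {
        Locale locale = new Locale.Builder().setLanguageTag(tag).build();
        return locale.toLanguageTag().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("en-GB"));
        System.out.println(isWellFormed("en_GB"));
        System.out.println(normalize("EN-gb"));
    }
}
```

A real BCP47LanguageHandler would wrap checks like these behind the LanguageHandler interface, which is not reproduced here.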

If users want canonicalization/normalization in Sesame, they can optionally 
turn it on by calling:

    parser.getParserConfig().set(BasicParserSettings.NORMALIZE_LANGUAGE_TAGS, true)

If the parser uses the same algorithm as RDFParserHelper.createLiteral, 
normalization occurs after verification of the raw form, using any 
LanguageHandler implementations in the list returned by 
parser.getParserConfig().get(BasicParserSettings.LANGUAGE_HANDLERS).
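
The ordering described above (verify the raw tag first, then normalize through the configured handlers) can be sketched as follows. This is a simplified stand-in and not the actual RDFParserHelper.createLiteral code: TagHandler, SimpleTagHandler and LanguageTagPipeline are hypothetical names, the two boolean flags only mimic VERIFY_LANGUAGE_TAGS and NORMALIZE_LANGUAGE_TAGS, and error reporting is reduced to an exception.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical, simplified handler; NOT the Sesame LanguageHandler interface. */
interface TagHandler {
    boolean isRecognized(String tag);
    boolean verify(String tag);
    String normalize(String tag);
}

/** Illustrative handler: checks basic subtag shape, lowercases on normalize. */
final class SimpleTagHandler implements TagHandler {
    public boolean isRecognized(String tag) {
        return tag != null && !tag.isEmpty();
    }

    public boolean verify(String tag) {
        return tag.matches("[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*");
    }

    public String normalize(String tag) {
        return tag.toLowerCase(Locale.ROOT);
    }
}

final class LanguageTagPipeline {
    private LanguageTagPipeline() {}

    static String process(String rawTag, List<? extends TagHandler> handlers,
                          boolean verifyTags, boolean normalizeTags) {
        for (TagHandler handler : handlers) {
            if (!handler.isRecognized(rawTag)) {
                continue;
            }
            // Verification always looks at the raw, un-normalized tag.
            if (verifyTags && !handler.verify(rawTag)) {
                throw new IllegalArgumentException("Invalid language tag: " + rawTag);
            }
            // Normalization, if enabled, happens only after verification.
            return normalizeTags ? handler.normalize(rawTag) : rawTag;
        }
        return rawTag; // no handler recognized the tag
    }

    public static void main(String[] args) {
        List<TagHandler> handlers = Arrays.<TagHandler>asList(new SimpleTagHandler());
        System.out.println(process("EN-gb", handlers, true, true));
        System.out.println(process("en_GB", handlers, false, false));
    }
}
```

With both flags off, the tag passes through untouched, which is the behaviour users who do not want verification would be asking for.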

Not everyone will want that though, and some users even complained when we 
switched on verification by default, even though they weren't using RFC3066 
tags.

To switch verification off for those users (for parsers using the same 
algorithm as RDFParserHelper.createLiteral), one can use:

    parser.getParserConfig().set(BasicParserSettings.VERIFY_LANGUAGE_TAGS, false)

[1] 
https://bitbucket.org/openrdf/sesame/src/8457ab71c179ba4573d60922a7d26cf0dd0809ec/core/rio/languages/src/main/java/org/openrdf/rio/languages/RFC3066LanguageHandler.java?at=master

                
> RDF1.1 Language Tags
> --------------------
>
>                 Key: MARMOTTA-43
>                 URL: https://issues.apache.org/jira/browse/MARMOTTA-43
>             Project: Marmotta
>          Issue Type: Bug
>          Components: Triple Store
>    Affects Versions: 2.6
>            Reporter: Jakob Frank
>            Assignee: Sebastian Schaffert
>            Priority: Minor
>              Labels: kiwi, rdf
>             Fix For: 3.1-incubating
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> In RDF 1.1, language tags MUST conform to the BCP47 standard.
> We are currently using the format: xx_YY_zzz, but BCP47 requires xx-YY-zzz.
> Furthermore, RDF 1.1 requires lang tags to be all lowercase.
> Also, in the DB model, the lang-tag column is fixed to length 5, which might 
> not be sufficient for some special lang-tags.
> http://www.w3.org/TR/2013/WD-rdf11-concepts-20130115/#section-Graph-Literal
> http://tools.ietf.org/html/bcp47#section-2.2.9
