[
https://issues.apache.org/jira/browse/MARMOTTA-43?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691684#comment-13691684
]
Peter Ansell commented on MARMOTTA-43:
--------------------------------------
There is now a static utility method for creating literals that may help with
this issue: RDFParserHelper.createLiteral.
If Marmotta updates to Java-7 before Sesame-2.8 comes out, you can implement a
LanguageHandler that works for you and reuse it by setting it on the parser
config object:
parser.getParserConfig().set(BasicParserSettings.LANGUAGE_HANDLERS,
Arrays.asList(myBCP47LanguageHandlerImpl))
The fix for https://openrdf.atlassian.net/browse/SES-1826 will be to copy
RFC3066LanguageHandler [1] and reimplement it as BCP47LanguageHandler using the
Java-7 Locale methods.
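
As a rough illustration of what "using the Java-7 Locale methods" could look
like, here is a minimal standalone sketch. It does not implement Sesame's
LanguageHandler interface; the class name Bcp47Tags and its method names are
made up for this example. Only Locale.Builder.setLanguageTag,
IllformedLocaleException and Locale.toLanguageTag are real JDK 7 APIs. Note
that Locale.Builder checks well-formedness only, not whether the subtags are
registered.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

// Hypothetical helper, for illustration only (not part of Sesame or Marmotta).
final class Bcp47Tags {

    private Bcp47Tags() {
    }

    /** Returns true if Locale.Builder accepts the tag as well-formed BCP47. */
    static boolean isWellFormed(String tag) {
        try {
            new Locale.Builder().setLanguageTag(tag);
            return true;
        } catch (IllformedLocaleException e) {
            return false;
        }
    }

    /** Returns the tag with the casing that Locale.toLanguageTag() produces. */
    static String canonicalize(String tag) {
        return new Locale.Builder().setLanguageTag(tag).build().toLanguageTag();
    }

    /** Returns the canonical tag in all lowercase, as the issue asks for RDF 1.1. */
    static String toLowerCaseForm(String tag) {
        return canonicalize(tag).toLowerCase(Locale.ROOT);
    }
}
```

With this sketch, a tag in the xx_YY form mentioned in the issue is rejected
by isWellFormed, while the xx-YY form is accepted and can be canonicalized.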
If users want canonicalization/normalization in Sesame, they can optionally
turn it on by calling:
parser.getParserConfig().set(BasicParserSettings.NORMALIZE_LANGUAGE_TAGS,
true)
If the parser uses the same algorithm as RDFParserHelper.createLiteral, the
normalization occurs after verification of the raw form, using the
LanguageHandler implementations in the list returned by
parser.getParserConfig().get(BasicParserSettings.LANGUAGE_HANDLERS).
Not everyone will want that though, and some users even complained when we
switched on verification by default, even though they weren't using RFC3066
tags.
To switch verification off for those users (for parsers using the same
algorithm as RDFParserHelper.createLiteral), one can use:
parser.getParserConfig().set(BasicParserSettings.VERIFY_LANGUAGE_TAGS,
false)
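
To make the interplay of the two settings above concrete, here is a small
standalone sketch of the order described (verify the raw form first, then
normalize). This is not Sesame's code: TagHandler, SimpleHandler and process
are hypothetical names, the regular expression is only a crude stand-in for a
real check, and throwing IllegalArgumentException on a failed verification is
an assumption made for the example.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of the verify-then-normalize flow; not Sesame's implementation.
final class LanguageTagFlow {

    /** Stand-in for a language handler; deliberately simpler than Sesame's LanguageHandler. */
    public interface TagHandler {
        boolean recognizes(String tag);

        boolean verify(String tag);

        String normalize(String tag);
    }

    /** Illustrative handler: claims every tag, accepts hyphen-separated subtags, lowercases. */
    public static final class SimpleHandler implements TagHandler {
        public boolean recognizes(String tag) {
            return true;
        }

        public boolean verify(String tag) {
            return tag.matches("[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*");
        }

        public String normalize(String tag) {
            return tag.toLowerCase(Locale.ROOT);
        }
    }

    private LanguageTagFlow() {
    }

    /**
     * Applies each handler that recognizes the raw tag. Verification (if enabled)
     * always runs on the raw tag before normalization (if enabled).
     */
    static String process(String rawTag, List<? extends TagHandler> handlers,
            boolean verify, boolean normalize) {
        String result = rawTag;
        for (TagHandler handler : handlers) {
            if (!handler.recognizes(rawTag)) {
                continue; // assumption: unrecognized tags pass through unchanged
            }
            if (verify && !handler.verify(rawTag)) {
                throw new IllegalArgumentException("Invalid language tag: " + rawTag);
            }
            if (normalize) {
                result = handler.normalize(rawTag);
            }
        }
        return result;
    }
}
```

With both flags off, a tag such as en_GB passes through untouched, which is
the behaviour the users mentioned above were asking for.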
[1]
https://bitbucket.org/openrdf/sesame/src/8457ab71c179ba4573d60922a7d26cf0dd0809ec/core/rio/languages/src/main/java/org/openrdf/rio/languages/RFC3066LanguageHandler.java?at=master
> RDF1.1 Language Tags
> --------------------
>
> Key: MARMOTTA-43
> URL: https://issues.apache.org/jira/browse/MARMOTTA-43
> Project: Marmotta
> Issue Type: Bug
> Components: Triple Store
> Affects Versions: 2.6
> Reporter: Jakob Frank
> Assignee: Sebastian Schaffert
> Priority: Minor
> Labels: kiwi, rdf
> Fix For: 3.1-incubating
>
> Original Estimate: 3h
> Remaining Estimate: 3h
>
> In RDF 1.1, language tags MUST conform to the BCP47 standard.
> We are currently using the format: xx_YY_zzz, but BCP47 requires xx-YY-zzz.
> Furthermore, RDF 1.1 requires lang tags to be all lowercase.
> Also in the DB-model, the lang-tag column is fixed to length 5, which might
> not be sufficient for some special lang-tags.
> http://www.w3.org/TR/2013/WD-rdf11-concepts-20130115/#section-Graph-Literal
> http://tools.ietf.org/html/bcp47#section-2.2.9