[
https://issues.apache.org/jira/browse/STANBOL-1417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497642#comment-14497642
]
Rupert Westenthaler commented on STANBOL-1417:
----------------------------------------------
First, a request without a Content-Language header to a chain that contains
only the language detection engine:
{code}
curl -X POST -H "Accept: text/turtle" -H "Content-type: text/plain" \
> --data "The Stanbol enhancer can detect famous cities such as \
> Paris and people such as Bob Marley." \
> http://localhost:8088/enhancer/chain/STANBOL-1417-test
<urn:enhancement-785913f9-6716-9530-2098-5a3fb8ec8463>
a <http://fise.iks-project.eu/ontology/TextAnnotation> ,
<http://fise.iks-project.eu/ontology/Enhancement> ;
<http://fise.iks-project.eu/ontology/confidence>
"0.9999978087054073"^^<http://www.w3.org/2001/XMLSchema#double> ;
<http://fise.iks-project.eu/ontology/extracted-from>
<urn:content-item-sha1-7c31e64955afb9f4a09e72075ac48125de156c94> ;
<http://purl.org/dc/terms/created>
"2015-04-16T06:45:29.185Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
<http://purl.org/dc/terms/creator>
"org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine"^^<http://www.w3.org/2001/XMLSchema#string> ;
<http://purl.org/dc/terms/language>
"en" ;
<http://purl.org/dc/terms/type>
<http://purl.org/dc/terms/LinguisticSystem> .
{code}
The response contains a single Language Annotation for English with a
confidence of {{0.9999978}}.
To demonstrate this feature, the next request explicitly passes the
{{Content-Language: de}} header (for German).
{code}
curl -X POST -H "Accept: text/turtle" -H "Content-type: text/plain" \
> -H "Content-Language: de" \
> --data "The Stanbol enhancer can detect famous cities such as \
> Paris and people such as Bob Marley." \
> http://localhost:8088/enhancer/chain/STANBOL-1417-test
<urn:enhancement-29accc13-832f-e14c-73e5-a17484787a4d>
a <http://fise.iks-project.eu/ontology/TextAnnotation> ,
<http://fise.iks-project.eu/ontology/Enhancement> ;
<http://fise.iks-project.eu/ontology/confidence>
"0.999995337655101"^^<http://www.w3.org/2001/XMLSchema#double> ;
<http://fise.iks-project.eu/ontology/extracted-from>
<urn:content-item-sha1-7c31e64955afb9f4a09e72075ac48125de156c94> ;
<http://purl.org/dc/terms/created>
"2015-04-16T06:45:52.291Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
<http://purl.org/dc/terms/creator>
"org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine"^^<http://www.w3.org/2001/XMLSchema#string> ;
<http://purl.org/dc/terms/language>
"en" ;
<http://purl.org/dc/terms/type>
<http://purl.org/dc/terms/LinguisticSystem> .
<urn:enhancement-571f3031-badc-6d5a-753f-376c5b334c95>
a <http://fise.iks-project.eu/ontology/TextAnnotation> ,
<http://fise.iks-project.eu/ontology/Enhancement> ;
<http://fise.iks-project.eu/ontology/confidence>
"1.0"^^<http://www.w3.org/2001/XMLSchema#float> ;
<http://fise.iks-project.eu/ontology/extracted-from>
<urn:content-item-sha1-7c31e64955afb9f4a09e72075ac48125de156c94> ;
<http://purl.org/dc/terms/created>
"2015-04-16T06:45:52.288Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
<http://purl.org/dc/terms/creator>
"Content-Language Header of the request"^^<http://www.w3.org/2001/XMLSchema#string> ;
<http://purl.org/dc/terms/language>
"de" ;
<http://purl.org/dc/terms/type>
<http://purl.org/dc/terms/LinguisticSystem> .
{code}
Now the response contains two Language Annotations: the one for English that
was also present in the first response (from the language detection engine)
and a second one for German.
The Language Annotation for German uses "Content-Language Header of the
request" as dc:creator and has a confidence of 1.0.
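The effect of the second annotation can be illustrated with a minimal Python sketch. It assumes that a consumer picks the Language Annotation with the highest confidence and falls back to the {{dc:language}} of the ContentItem only when no annotation exists. The names {{LanguageAnnotation}} and {{select_language}} are hypothetical and are not part of the Stanbol API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LanguageAnnotation:
    """Hypothetical stand-in for one language annotation in the response."""
    language: str      # dc:language value, e.g. "en"
    confidence: float  # fise:confidence value
    creator: str       # dc:creator value


def select_language(annotations: List[LanguageAnnotation],
                    content_item_language: Optional[str] = None) -> Optional[str]:
    """Return the language of the most confident annotation.

    Assumption: the ContentItem's dc:language is used only as a fallback
    when no language annotation is present at all.
    """
    if not annotations:
        return content_item_language
    return max(annotations, key=lambda a: a.confidence).language


# Values taken from the second response above.
detected = LanguageAnnotation(
    "en", 0.999995337655101, "LanguageDetectionEnhancementEngine")
header = LanguageAnnotation(
    "de", 1.0, "Content-Language Header of the request")

# Only dc:language set on the ContentItem: the detected language wins.
print(select_language([detected], content_item_language="de"))

# With the additional confidence-1.0 annotation: the header language wins.
print(select_language([detected, header], content_item_language="de"))
```

Under these assumptions the first call yields "en" and the second "de", which is why the header value needs its own annotation rather than only the {{dc:language}} property.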
> Create Language Annotation for parsed "Content-Language" header
> ---------------------------------------------------------------
>
> Key: STANBOL-1417
> URL: https://issues.apache.org/jira/browse/STANBOL-1417
> Project: Stanbol
> Issue Type: Improvement
> Components: Enhancement Engines
> Affects Versions: 0.12.0
> Reporter: Rupert Westenthaler
> Assignee: Rupert Westenthaler
> Priority: Minor
> Fix For: 1.0.0, 0.12.1
>
>
> Stanbol has supported passing the language of the content via the
> "Content-Language" header since STANBOL-660. However, currently only the
> `dc:language` property is set for the ContentItem.
> Based on the specification of STANBOL-613, this information is only
> used as a fallback if no language annotation is present in the ContentItem. So
> as soon as any Language Identification Engine is present in the Chain, the
> "Content-Language" passed by the user gets ignored. This is not the
> intention of a user explicitly passing the language.
> To force Stanbol to use the passed language, a Language Annotation with the
> confidence 1.0 needs to be added to the metadata of the ContentItem instead.