Hi Rupert,
There are different use cases for NIF, which is why there are different
NIF profiles.
The "NIF Simple" profile mixes selection and annotation, but allows only
one value per annotation property. So
<Alcoholism.txt#char=37028,37043>
nif:sentimentValue "-0.80"^^xsd:decimal ;
nif:sentimentValue "0.20"^^xsd:decimal ;
would be incorrect usage.
What you are describing already exists as the "NIF Stanbol" profile,
which separates *selection* and *annotation*. (Note that OA separates
even further: two types of selection, the Annotation, and the Body of
the Annotation.)
Think of "NIF Simple" as a filter that only keeps the best estimate from
the "NIF Stanbol" profile and simplifies the structure in terms of
triple and URN count.
It was inspired by the primary and secondary NLP graphs here:
http://de.slideshare.net/laroyo/querydriven-hypothesis-generation-for-answering-queries-over-nlp-graphs
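As a rough sketch of the filter idea above (not code from NIF or Stanbol: the Opinion record and the to_nif_simple helper are made-up names, and "best estimate" is assumed here to mean "highest confidence"):

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Opinion:
    """One engine's annotation of a selection (hypothetical record type)."""
    selection: str       # URI of the selected string
    value: Decimal       # sentiment value
    confidence: float    # engine's confidence in that value
    creator: str         # engine that produced the opinion


def to_nif_simple(opinions):
    """Keep one opinion per selection: the one with the highest confidence.

    Assumption: "best estimate" means highest confidence.
    """
    best = {}
    for op in opinions:
        current = best.get(op.selection)
        if current is None or op.confidence > current.confidence:
            best[op.selection] = op
    return best


sel = "Alcoholism.txt#char=37028,37043"
opinions = [
    Opinion(sel, Decimal("-0.80"), 0.9999978209631343,
            "SentimentDetectionEnhancementEngine1"),
    Opinion(sel, Decimal("0.20"), 0.9798786889989078,
            "SentimentDetectionEnhancementEngine2"),
]

simple = to_nif_simple(opinions)
print(simple[sel].value)  # the single value that would be kept
```

The "NIF Stanbol" side keeps every Opinion as its own resource; only the filtered result would be written as a single nif:sentimentValue on the selection.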
all the best,
Sebastian
On 03.06.2013 09:27, Rupert Westenthaler wrote:
Hi Sebastian, all
On Fri, May 31, 2013 at 2:56 AM, Sebastian Hellmann
<[email protected]> wrote:
<Alcoholism.txt#char=37028,37043>
a nif:RFC5147String ;
nif:anchorOf "Benzodiazepines, while useful in the management of acute alcohol withdrawal, if used long-term can cause a worse outcome in alcoholism." ;
nif:beginIndex "37028" ;
nif:endIndex "37165" ;
# NIF Simple profile, just two properties
nif:sentimentValue "-0.80"^^xsd:decimal ;
nif:sentimentValueConfidence "0.9999978209631343" ;
# NIF Stanbol profile
nif:opinion <http://uri_or_urn_for_the_marl_opinion> ;
nif:referenceContext <Alcoholism.txt#char=0,91429> .
<http://uri_or_urn_for_the_marl_opinion>
#some properties omitted
marl:extractedFrom <Alcoholism.txt#char=37028,37043> ;
<http://fise.iks-project.eu/ontology/confidence>
"0.9999978209631343"^^<http://www.w3.org/2001/XMLSchema#double> ;
<http://fise.iks-project.eu/ontology/extracted-from>
<urn:content-item-sha1-b744059cdc5f802db787e9c40a7c3df53c5b6e68> ;
<http://purl.org/dc/terms/created>
"2013-05-31T00:45:56.555Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
<http://purl.org/dc/terms/creator>
"org.apache.stanbol.enhancer.engines.sentdetect.SentimentDetectionEnhancementEngine"^^<http://www.w3.org/2001/XMLSchema#string>
;
rdf:type <http://purl.org/marl/ns#Opinion> .
IMHO we need to separate the properties that define the selected part of
the content from those that annotate the content! Based on the above
example, these are:
Selection
<Alcoholism.txt#char=37028,37043>
a nif:RFC5147String ;
nif:anchorOf "Benzodiazepines, while useful in the management of acute alcohol withdrawal, if used long-term can cause a worse outcome in alcoholism." ;
nif:beginIndex "37028" ;
nif:endIndex "37165" .
Annotation
<Alcoholism.txt#char=37028,37043>
# NIF Simple profile, just two properties
nif:sentimentValue "-0.80"^^xsd:decimal ;
nif:sentimentValueConfidence "0.9999978209631343" ;
# NIF Stanbol profile
nif:opinion <http://uri_or_urn_for_the_marl_opinion> ;
nif:referenceContext <Alcoholism.txt#char=0,91429> .
<http://uri_or_urn_for_the_marl_opinion>
#some properties omitted
marl:extractedFrom <Alcoholism.txt#char=37028,37043> ;
<http://fise.iks-project.eu/ontology/confidence>
"0.9999978209631343"^^<http://www.w3.org/2001/XMLSchema#double> ;
<http://fise.iks-project.eu/ontology/extracted-from>
<urn:content-item-sha1-b744059cdc5f802db787e9c40a7c3df53c5b6e68> ;
<http://purl.org/dc/terms/created>
"2013-05-31T00:45:56.555Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
<http://purl.org/dc/terms/creator>
"org.apache.stanbol.enhancer.engines.sentdetect.SentimentDetectionEnhancementEngine"^^<http://www.w3.org/2001/XMLSchema#string>
;
rdf:type <http://purl.org/marl/ns#Opinion> .
The reason for that is:
* All Selections have a unique URI (defined by the Web-Fragment).
So we can only have a single resource for a given selection in the
document.
* All Annotations are subject to the opinion of an EnhancementEngine.
So we might have different annotations about the same selection.
There might be a SentimentDetectionEnhancementEngine1 that wants to
assign nif:sentimentValue "-0.80"^^xsd:decimal and also a
SentimentDetectionEnhancementEngine2 that assigns nif:sentimentValue
"0.20"^^xsd:decimal to the same sentence.
Putting this kind of Annotation directly on the selection would result
in unwanted side effects such as
<Alcoholism.txt#char=37028,37043>
nif:sentimentValue "-0.80"^^xsd:decimal ;
nif:sentimentValue "0.20"^^xsd:decimal ;
nif:sentimentValueConfidence "0.9999978209631343" ;
nif:sentimentValueConfidence "0.9798786889989078" ;
nif:opinion <http://uri_or_urn_for_the_marl_opinion1> ;
nif:opinion <http://uri_or_urn_for_the_marl_opinion2> .
<http://uri_or_urn_for_the_marl_opinion1>
<http://purl.org/dc/terms/creator>
"org.apache.stanbol.enhancer.engines.sentdetect.SentimentDetectionEnhancementEngine1"^^<http://www.w3.org/2001/XMLSchema#string>
.
<http://uri_or_urn_for_the_marl_opinion2>
<http://purl.org/dc/terms/creator>
"org.apache.stanbol.enhancer.engines.sentdetect.SentimentDetectionEnhancementEngine2"^^<http://www.w3.org/2001/XMLSchema#string>
.
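To make the side effect concrete, here is a small sketch (plain Python tuples standing in for triples, no RDF library; the triples_for helper is made up for illustration). Once both engines write onto the same selection URI, nothing in the merged graph says which confidence belongs to which value:

```python
SEL = "Alcoholism.txt#char=37028,37043"


def triples_for(value, confidence):
    """Triples one engine would attach directly to the selection (hypothetical helper)."""
    return {
        (SEL, "nif:sentimentValue", value),
        (SEL, "nif:sentimentValueConfidence", confidence),
    }


engine1 = triples_for("-0.80", "0.9999978209631343")
engine2 = triples_for("0.20", "0.9798786889989078")

# An RDF graph is a set of triples, so merging two graphs is a set union.
merged = engine1 | engine2

values = sorted(o for s, p, o in merged if p == "nif:sentimentValue")
confidences = sorted(
    o for s, p, o in merged if p == "nif:sentimentValueConfidence"
)

# Every value/confidence combination is equally consistent with the merged
# graph, so the original per-engine pairing can no longer be recovered.
candidate_pairings = [(v, c) for v in values for c in confidences]
print(len(merged), len(candidate_pairings))
```

With a separate opinion resource per engine, the value, confidence and creator stay attached to the same subject, so the pairing is not lost.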
best
Rupert
--
| Rupert Westenthaler [email protected]
| Bodenlehenstraße 11 ++43-699-11108907
| A-5500 Bischofshofen
--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Events: NLP & DBpedia 2013 (http://nlp-dbpedia2013.blogs.aksw.org,
Deadline: *July 8th*)
Come to Germany as a PhD: http://bis.informatik.uni-leipzig.de/csf
Projects: http://nlp2rdf.org , http://linguistics.okfn.org ,
http://dbpedia.org/Wiktionary , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org