Hi Rupert,
There are different use cases for NIF, which is why there are different NIF profiles. The "NIF Simple" profile mixes selection and annotation, but allows only one value per annotation property. So
 <Alcoholism.txt#char=37028,37043>
     nif:sentimentValue "-0.80"^^xsd:decimal ;
     nif:sentimentValue "0.20"^^xsd:decimal ;

would be incorrect usage.
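
As a rough illustration of that restriction (not part of any NIF tooling), the check below flags subjects that carry more than one value for a given property. The triples are plain Python tuples mirroring the snippet above, and the helper name `multi_valued` is made up for this sketch.

```python
from collections import defaultdict

# (subject, predicate, object) tuples mirroring the snippet above.
TRIPLES = [
    ("<Alcoholism.txt#char=37028,37043>", "nif:sentimentValue", "-0.80"),
    ("<Alcoholism.txt#char=37028,37043>", "nif:sentimentValue", "0.20"),
]


def multi_valued(triples, predicate):
    """Return {subject: sorted values} for subjects with more than one
    distinct value of `predicate` (hypothetical helper for this sketch)."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            values[s].add(o)
    return {s: sorted(v) for s, v in values.items() if len(v) > 1}


# A non-empty result means the data does not fit the "NIF Simple" profile.
violations = multi_valued(TRIPLES, "nif:sentimentValue")
```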

What you are describing is already called the "NIF Stanbol" profile, which separates *selection* and *annotation*. (Note that OA separates even further, into two types of selection, the Annotation, and the Body of the Annotation.) Think of "NIF Simple" as a filter that keeps only the best estimate from the "NIF Stanbol" profile and simplifies the structure in terms of triple and URN count.

It was inspired by the primary and secondary NLP graphs here:
http://de.slideshare.net/laroyo/querydriven-hypothesis-generation-for-answering-queries-over-nlp-graphs
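
To make the "filter" idea concrete, here is a minimal sketch in Python. It assumes that "best estimate" means the opinion with the highest confidence, which is only one possible reading. The `Opinion` class and `to_nif_simple` function are hypothetical names, and pairing the value 0.20 with the second confidence figure from this thread is also an assumption.

```python
from dataclasses import dataclass

SEL = "<Alcoholism.txt#char=37028,37043>"


@dataclass(frozen=True)
class Opinion:
    """One per-engine opinion, as it might exist in the Stanbol-style graph."""
    selection: str     # URI of the annotated string
    sentiment: str     # lexical form of the sentiment value
    confidence: float  # confidence reported by the engine


def to_nif_simple(opinions):
    """Keep only the highest-confidence opinion per selection and flatten it
    into two triples attached directly to the selection."""
    best = {}
    for op in opinions:
        kept = best.get(op.selection)
        if kept is None or op.confidence > kept.confidence:
            best[op.selection] = op
    triples = []
    for selection, op in best.items():
        triples.append((selection, "nif:sentimentValue", op.sentiment))
        triples.append((selection, "nif:sentimentValueConfidence", op.confidence))
    return triples


stanbol_opinions = [
    Opinion(SEL, "-0.80", 0.9999978209631343),
    Opinion(SEL, "0.20", 0.9798786889989078),  # assumed pairing
]
simple = to_nif_simple(stanbol_opinions)
```

With these inputs only the -0.80 opinion survives, so the selection ends up with a single sentiment value and a single confidence.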

all the best,
Sebastian


On 03.06.2013 09:27, Rupert Westenthaler wrote:
Hi Sebastian, all

On Fri, May 31, 2013 at 2:56 AM, Sebastian Hellmann
<[email protected]> wrote:

<Alcoholism.txt#char=37028,37043>
     a  nif:RFC5147String ;
     nif:anchorOf "Benzodiazepines, while useful in the management of acute alcohol withdrawal, if used long-term can cause a worse outcome in alcoholism." ;
     nif:beginIndex "37028" ;
     nif:endIndex "37165" ;
# nif simple profile, just two properties
     nif:sentimentValue "-0.80"^^xsd:decimal ;
     nif:sentimentValueConfidence "0.9999978209631343" ;
#nif stanbol profile
     nif:opinion <http://uri_or_urn_for_the_marl_opinion> ;
     nif:referenceContext <Alcoholism.txt#char=0,91429>  .

<http://uri_or_urn_for_the_marl_opinion>
#some properties omitted
     marl:extractedFrom <Alcoholism.txt#char=37028,37043> ;
     <http://fise.iks-project.eu/ontology/confidence>
"0.9999978209631343"^^<http://www.w3.org/2001/XMLSchema#double> ;
       <http://fise.iks-project.eu/ontology/extracted-from>
<urn:content-item-sha1-b744059cdc5f802db787e9c40a7c3df53c5b6e68> ;
       <http://purl.org/dc/terms/created>
"2013-05-31T00:45:56.555Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
       <http://purl.org/dc/terms/creator>
"org.apache.stanbol.enhancer.engines.sentdetect.SentimentDetectionEnhancementEngine"^^<http://www.w3.org/2001/XMLSchema#string>
;
         rdf:type <http://purl.org/marl/ns#Opinion> .

IMHO we need to separate the properties that define the selected part of
the content from those that annotate the content. Based on the above
example, these are:

Selection

<Alcoholism.txt#char=37028,37043>
     a  nif:RFC5147String ;
     nif:anchorOf "Benzodiazepines, while useful in the management of acute alcohol withdrawal, if used long-term can cause a worse outcome in alcoholism." ;
     nif:beginIndex "37028" ;
     nif:endIndex "37165" .

Annotation

<Alcoholism.txt#char=37028,37043>
# nif simple profile, just two properties
     nif:sentimentValue "-0.80"^^xsd:decimal ;
     nif:sentimentValueConfidence "0.9999978209631343" ;
#nif stanbol profile
     nif:opinion <http://uri_or_urn_for_the_marl_opinion> ;
     nif:referenceContext <Alcoholism.txt#char=0,91429>  .

<http://uri_or_urn_for_the_marl_opinion>
#some properties omitted
     marl:extractedFrom <Alcoholism.txt#char=37028,37043> ;
     <http://fise.iks-project.eu/ontology/confidence>
"0.9999978209631343"^^<http://www.w3.org/2001/XMLSchema#double> ;
       <http://fise.iks-project.eu/ontology/extracted-from>
<urn:content-item-sha1-b744059cdc5f802db787e9c40a7c3df53c5b6e68> ;
       <http://purl.org/dc/terms/created>
"2013-05-31T00:45:56.555Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
       <http://purl.org/dc/terms/creator>
"org.apache.stanbol.enhancer.engines.sentdetect.SentimentDetectionEnhancementEngine"^^<http://www.w3.org/2001/XMLSchema#string>
;
         rdf:type <http://purl.org/marl/ns#Opinion> .

The reasons for that are:

* All Selections have a unique URI (defined by the Web-Fragment).
So we can only have a single resource for a given selection in the
document.
* All Annotations are subject to the opinion of an EnhancementEngine.
So we might have different annotations about the same
selection. There might be a SentimentDetectionEnhancementEngine1 that
wants to assign "nif:sentimentValue "-0.80"^^xsd:decimal" and also a
SentimentDetectionEnhancementEngine2 that assigns "nif:sentimentValue
"0.20"^^xsd:decimal" to the same sentence.

Representing such Annotations on the selection resource itself would
result in unwanted side effects such as:

  <Alcoholism.txt#char=37028,37043>
      nif:sentimentValue "-0.80"^^xsd:decimal ;
      nif:sentimentValue "0.20"^^xsd:decimal ;
      nif:sentimentValueConfidence "0.9999978209631343" ;
      nif:sentimentValueConfidence "0.9798786889989078" ;
      nif:opinion <http://uri_or_urn_for_the_marl_opinion1> ;
      nif:opinion <http://uri_or_urn_for_the_marl_opinion2> ;

  <http://uri_or_urn_for_the_marl_opinion1>
        <http://purl.org/dc/terms/creator>
"org.apache.stanbol.enhancer.engines.sentdetect.SentimentDetectionEnhancementEngine1"^^<http://www.w3.org/2001/XMLSchema#string>
;

  <http://uri_or_urn_for_the_marl_opinion2>
        <http://purl.org/dc/terms/creator>
"org.apache.stanbol.enhancer.engines.sentdetect.SentimentDetectionEnhancementEngine2"^^<http://www.w3.org/2001/XMLSchema#string>
;
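
The side effect can be sketched in a few lines of Python, using plain tuples as stand-ins for triples. Because an RDF graph is an unordered set of triples, the flat form no longer says which confidence belongs to which value. The `ex:value` and `ex:confidence` properties and the pairing of values to opinions are placeholders assumed for this illustration, not taken from NIF or Marl.

```python
SEL = "<Alcoholism.txt#char=37028,37043>"
OP1 = "<http://uri_or_urn_for_the_marl_opinion1>"
OP2 = "<http://uri_or_urn_for_the_marl_opinion2>"

# Flat form: everything attached directly to the selection.
flat = {
    (SEL, "nif:sentimentValue", "-0.80"),
    (SEL, "nif:sentimentValue", "0.20"),
    (SEL, "nif:sentimentValueConfidence", "0.9999978209631343"),
    (SEL, "nif:sentimentValueConfidence", "0.9798786889989078"),
}

# Separated form: one resource per engine opinion (placeholder properties).
separated = {
    (OP1, "ex:value", "-0.80"),
    (OP1, "ex:confidence", "0.9999978209631343"),
    (OP2, "ex:value", "0.20"),
    (OP2, "ex:confidence", "0.9798786889989078"),
}


def objects(triples, subject, predicate):
    """All objects for a subject/predicate pair, sorted for stable output."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


# Flat form: every value/confidence combination is equally plausible.
candidate_pairings = [
    (v, c)
    for v in objects(flat, SEL, "nif:sentimentValue")
    for c in objects(flat, SEL, "nif:sentimentValueConfidence")
]

# Separated form: each opinion keeps its own value and confidence together.
paired = {
    op: (objects(separated, op, "ex:value")[0],
         objects(separated, op, "ex:confidence")[0])
    for op in (OP1, OP2)
}
```

The flat form yields four candidate pairings for two engines, while the separated form recovers exactly one value/confidence pair per opinion.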


best
Rupert

--
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen



--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Events: NLP & DBpedia 2013 (http://nlp-dbpedia2013.blogs.aksw.org, Deadline: *July 8th*)
Come to Germany as a PhD: http://bis.informatik.uni-leipzig.de/csf
Projects: http://nlp2rdf.org , http://linguistics.okfn.org , http://dbpedia.org/Wiktionary , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org
