On 19 July 2011 11:14, Olivier Grisel <[email protected]> wrote:
> 2011/7/18 Rupert Westenthaler <[email protected]>:
>> Hi Florent
>>
>> On Mon, Jul 18, 2011 at 11:09 PM, florent andré
>> <[email protected]> wrote:
>>> Hi!
>>>
>>> I worked on the UIMA engine to make it more generic.
>>> It's now easy to add a UIMA annotator, and I tried the RegexAnnotator [1].
>>>
>>> Depending on the configured regex, this annotator can output email, ISBN, ...
>>>
>>> AFAIK there are for now just the TextAnnotation and EntityAnnotation types,
>>> and I don't know if they are suitable for things like email, telephone
>>> number, ...
>>>
>> I see several possibilities:
>>
>> 1) use a TextAnnotation with a custom value for dc:type
>>
>> urn:123 rdf:type TextAnnotation
>> urn:123 dc:type <http://www.w3.org/2006/vcard/ns#Cell>
>> urn:123 selected-text "+43 655 290989"
>> urn:123 start "123"^^xsd:int
>> urn:123 end "137"^^xsd:int
>>
>> Here I used the concept defined for cell phones by the vCard ontology
>>
>> 2) use a TextAnnotation with an additional type
>>
>> urn:123 rdf:type TextAnnotation
>> urn:123 rdf:type http://schema.org/ContactPoint
>> urn:123 selected-text "+43 655 290989"
>> urn:123 start "123"^^xsd:int
>> urn:123 end "137"^^xsd:int
>> urn:123 http://schema.org/telephone "+43655290989"
>>
>> Here I used the ContactPoint as defined by schema.org
>>
>> As I am writing this I have a preference for variant (2). Any other
>> opinions, suggestions?
>
> urn:123 should not have both types at the same time: it's not true
> that the annotation is the ContactPoint.
+1

>
> I would rather use:
>
> urn:123 rdf:type TextAnnotation
> urn:123 selected-text "+43 655 290989"
> urn:123 start "123"^^xsd:int
> urn:123 end "137"^^xsd:int
> urn:123 dc:related urn:456
>
> urn:456 rdf:type http://schema.org/ContactPoint
> urn:456 http://schema.org/telephone "+43655290989"
>
>
> In the future, we should define a dedicated property to replace
> dc:related so as to express:
>
> urn:456 "is a suggested semantic interpretation of the 'stuff'
> referenced by the text annotation" urn:123.
>
> There might be several possible interpretations, with various
> confidence scores, in case of ambiguity.

+1

>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
>
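
To make the linked-resource variant concrete, here is a minimal Python sketch. It assumes plain tuples instead of a real RDF library, and the helper names (text_annotation, phone_interpretation) are hypothetical, not part of the engine. dc:related is only the placeholder property from Olivier's mail, and stripping whitespace is my assumption about how the normalized number is obtained.

```python
from typing import List, Tuple

# A triple is (subject, predicate, object), all kept as plain strings.
Triple = Tuple[str, str, str]

CONTACT_POINT = "http://schema.org/ContactPoint"
TELEPHONE = "http://schema.org/telephone"
RELATED = "dc:related"  # placeholder from the thread, to be replaced later


def text_annotation(uri: str, text: str, start: int) -> List[Triple]:
    """Describe only the text span: its type, selected text and offsets."""
    end = start + len(text)
    return [
        (uri, "rdf:type", "TextAnnotation"),
        (uri, "selected-text", text),
        (uri, "start", str(start)),
        (uri, "end", str(end)),
    ]


def phone_interpretation(annotation_uri: str, uri: str, text: str) -> List[Triple]:
    """Describe the suggested interpretation as a separate resource."""
    # Assumption: the normalized number is the selected text without spaces.
    normalized = "".join(text.split())
    return [
        (annotation_uri, RELATED, uri),
        (uri, "rdf:type", CONTACT_POINT),
        (uri, TELEPHONE, normalized),
    ]


selected = "+43 655 290989"
graph = text_annotation("urn:123", selected, 123)
graph += phone_interpretation("urn:123", "urn:456", selected)

for triple in graph:
    print(" ".join(triple))
```

The point of the split is that urn:123 never gets the ContactPoint type itself, so further interpretations of the same span could be attached by calling phone_interpretation (or a similar helper) again with another URI.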
--
Enrico Daga

--
http://www.enridaga.net
skype: enri-pan
