Re: Stanbol community: we need to hear you more!

Olivier Grisel Wed, 06 Jul 2011 01:36:41 -0700

2011/7/6 Tommaso Teofili <[email protected]>:
> As far as I know there have been some discussions about Portuguese models on
> OpenNLP mailing list [1] so Alex could find help about this topic there.
> My 2 cents,
> Tommaso
>
> [1] : http://markmail.org/thread/tjypzqrxe4r2cdnw



The current Stanbol enhancer needs:

- Sentence segmentation (available for OpenNLP 1.5)
- Tokenizer (available for OpenNLP 1.5, or SimpleTokenizer is
probably OK for any European language)
- NameFinder model for generic entities (People, Place, Organization):
missing and not trivial to train
- POS taggers for domain-specific taxonomy annotations (already available)
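
To illustrate why a model-free tokenizer is probably good enough for
European languages, here is a minimal sketch of the idea of splitting
on character-class changes (whitespace, letters, digits, anything
else). It is not the OpenNLP implementation: the class name
CharClassTokenizer and its tokenize method are hypothetical and use
only the Java standard library.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; this is not OpenNLP code. */
class CharClassTokenizer {

    private enum CharClass { WHITESPACE, LETTER, DIGIT, OTHER }

    private static CharClass classify(char c) {
        if (Character.isWhitespace(c)) return CharClass.WHITESPACE;
        if (Character.isLetter(c)) return CharClass.LETTER; // covers accented letters too
        if (Character.isDigit(c)) return CharClass.DIGIT;
        return CharClass.OTHER;
    }

    /** Emits one token per maximal run of non-whitespace characters of the same class. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int start = -1; // start index of the current run
        CharClass current = CharClass.WHITESPACE;
        for (int i = 0; i < text.length(); i++) {
            CharClass next = classify(text.charAt(i));
            if (next != current) {
                // The class changed: close the previous run unless it was whitespace.
                if (current != CharClass.WHITESPACE) {
                    tokens.add(text.substring(start, i));
                }
                start = i;
                current = next;
            }
        }
        // Close the last run if the text does not end in whitespace.
        if (current != CharClass.WHITESPACE) {
            tokens.add(text.substring(start));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // \u00e3 is "a with tilde", escaped to stay independent of source encoding.
        System.out.println(tokenize("S\u00e3o Paulo, 2011."));
    }
}
```

Because the split depends only on Unicode character classes, no
language-specific model is required, which is the reason such a
tokenizer carries over to Portuguese. It will not handle clitics or
abbreviations the way a trained tokenizer model could.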

We neither need nor want to use a parser model for entity / concept
extraction. It might be possible, but it would be too slow and not
scalable. It might be useful later for relation extraction, though.

So the main issues here are the missing NameFinder model and the lack
of DBpedia Portuguese labels / abstracts for the EntityHub.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
