Re: Stanbol community: we need to hear you more!

Alex Lopez Wed, 06 Jul 2011 06:49:10 -0700

Tommaso, Olivier,

thanks for the pointers!

At this stage of the project we are not yet 100% focused on this; however, these trails will provide valuable help when we do start to dig into all these topics.


On 06-07-2011 09:44, Tommaso Teofili wrote:

2011/7/6 Olivier Grisel<[email protected]>

2011/7/6 Tommaso Teofili<[email protected]>:

As far as I know there have been some discussions about Portuguese models on
OpenNLP mailing list [1] so Alex could find help about this topic there.
My 2 cents,
Tommaso

[1] : http://markmail.org/thread/tjypzqrxe4r2cdnw



The current Stanbol enhancer needs:

- Sentence segmentation (available for OpenNLP version 1.5)
- Tokenizer (available for OpenNLP version 1.5, or SimpleTokenizer is
probably OK for any European language)
- NameFinder model for generic entities (People, Place, Organization):
missing and not trivial to train
- POS taggers for domain specific taxonomy annotations (already available)

We neither need nor want to use a parser model for entity / concept
extraction. It might be possible, but it would be too slow and not
scalable. It might be useful later for relation extraction, though.


I agree; it's more a pointer to eventually help Alex find people interested
in training Portuguese models.
Tommaso

So the main issue here is the missing NameFinder and the lack of
DBpedia Portuguese labels / abstracts for the EntityHub.

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
