Hi Daniel,

We now have a Chunker page on the wiki:
https://sourceforge.net/apps/mediawiki/opennlp/index.php?title=Chunker

Next we need to work out how to extract chunk information from the Bosque corpus. We have a small sample of Amazonia.AD here:
http://svn.apache.org/viewvc/incubator/opennlp/trunk/opennlp-tools/src/test/resources/opennlp/tools/formats/ad.sample?view=markup

The AD format has phrases, but not chunks. I'm thinking about using the heuristic proposed in "A Machine Learning Approach to Portuguese Clause Identification" <http://webscience.org.br/wiki/images/f/f9/Clause-propor2010.pdf> to derive chunks from the phrases.

Thanks,
William

On Mon, Jan 3, 2011 at 10:18 PM, daniel gatis <[email protected]> wrote:

> Yeah! This year has begun very well ;)
>
> Thank you.
>
> On Mon, Jan 3, 2011 at 3:52 PM, William Colen <[email protected]> wrote:
>
> > Hi, Daniel,
> >
> > Sorry for the late reply.
> > I'll work on a draft for that page, and will check how to train a
> > Portuguese chunker using Bosque. It will take some days to finish, but
> > I'll come back as soon as I have something.
> >
> > Regards
> > William
> >
> > On Tue, Dec 28, 2010 at 4:39 AM, daniel gatis <[email protected]> wrote:
> >
> > > Hi everyone,
> > > I want to train a Portuguese chunker with the Bosque corpus, but the
> > > wiki topic about this is blank.
> > > So how can I train a Portuguese chunker?
