On Wed, Jun 27, 2012 at 3:50 PM, Jörn Kottmann <[email protected]> wrote:
> On 06/27/2012 03:34 PM, Nicolas Hernandez wrote:
>>
>> Since I like to use the maltparser [1], I need now to adapt the ftb to
>> a tag set called ftb+ as described by [2].
>> (only multi-word expressions which can be recognized by regular
>> expression are considered, some pos tags result in the concatenation
>> of the cat and subcat attributes...)
>>
>> I plan to do it by processing the MarkupAnnotations provided by the
>> Tika MarkupAnnotator [3].
>
>
> Do you plan to use the UIMA POS Tagger Trainer to produce a model for you?
Actually... I use the POS Tagger Trainer CLI because I use some tools
which detect and correct annotation errors but work on this format...

>
> We could add support for ftb+ tag sets to the code that produces training
> data out of the FTB.
>
> With the separator support in the training format that should work out fine
> in the end.
> The CLI tools also give you easy access to the built-in evaluation.
>
> Jörn

--
Dr. Nicolas Hernandez
Associate Professor (Maître de Conférences)
Université de Nantes - LINA CNRS UMR 6241
http://enicolashernandez.blogspot.com
http://www.univ-nantes.fr/hernandez-n
+33 (0)2 51 12 53 94
+33 (0)2 40 30 60 67
