So I found a simple example in the sources: WordTagSampleStreamTest.java, which parses the string "This_x1 is_x2 a_x3 test_x4 sentence_x5 ._x6" using POSSample.
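
For reference, here is a minimal sketch of how that word_tag format can be split into tokens and tags. It is plain Java and does not call OpenNLP; the class WordTagDemo and its parse method are hypothetical names used only for illustration, and splitting on the last underscore is an assumption about the format:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenNLP: splits "token_tag" items by hand.
class WordTagDemo {

    // Returns one {token, tag} pair per whitespace-separated item.
    static List<String[]> parse(String sentence) {
        List<String[]> pairs = new ArrayList<>();
        for (String item : sentence.trim().split("\\s+")) {
            // Assumption: the last underscore separates the token from its tag,
            // so a token that itself contains '_' stays intact.
            int split = item.lastIndexOf('_');
            if (split <= 0 || split == item.length() - 1) {
                throw new IllegalArgumentException("Missing token or tag in: " + item);
            }
            pairs.add(new String[] { item.substring(0, split), item.substring(split + 1) });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Print each token next to the custom tag attached to it.
        for (String[] pair : parse("This_x1 is_x2 a_x3 test_x4 sentence_x5 ._x6")) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

The tags x1 to x6 are arbitrary labels, so the same splitting would apply to custom tags such as "animal" or "action".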

As I understand it, the normal approach has a few steps for each language:

1. Collect data for the model.
2. Create a POS dictionary like this:

   <dictionary>
     <entry tags="x1">
       <token>This</token>
     </entry>
     <entry tags="x2">
       <token>is</token>
     </entry>
     <entry tags="x3">
       <token>a</token>
     </entry>
     ...

3. Train the model with this dictionary.

Is this the right approach? Is the POS Tagger appropriate for this task?

Thanks in advance,
Yakov

On Tue, Aug 27, 2013 at 6:31 PM, Yakov Keranchuk <[email protected]> wrote:

> Hi
>
> Is it possible to tag tokens with our own rules?
> Example: The quick brown fox_animal jumps_action over the lazy dog_animal
>
> Do we need to create a custom dictionary for the POS tagger?
> If so, can there be only one dictionary for several languages?
>
> Best regards,
> Yakov
