--- On Mon, 4/15/13, Jörn Kottmann <[email protected]> wrote:

> Yes, the NER should be capable of detecting the terms, but
> you could also try to use a dictionary.

Are you referring to a POS dictionary? I would have just two parts of speech: the terms and the other words, correct? What's the advantage of using NER over POS?

> Your training data is too small, especially when you train
> with a cutoff of 5 and the maxent model,
> the perceptron will work better.

So the perceptron is good for a small training set? Is maxent even necessary when words are not composed of other words?

> Label more data until you have a few thousand sentences.

Yes, this is my problem. I don't have thousands of sentences, and I'm afraid of taking the time to label the 100 or so I have only for the model to fail. Is there a (dis)advantage to training on 1000 long sentences versus, say, 2500 short ones?

Thanks!
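For reference, switching from maxent to the perceptron in OpenNLP is just a matter of passing different training parameters. A minimal sketch of a params file the trainer can load (the key names follow the opennlp.tools.util.TrainingParameters conventions; the iteration count and cutoff values here are illustrative assumptions, not recommendations from this thread):

```
# Hypothetical params file for e.g.:
#   opennlp TokenNameFinderTrainer -params perceptron.params ...
# Use the perceptron trainer instead of the default maxent (GIS) trainer.
Algorithm=PERCEPTRON
# Number of training iterations (assumed value).
Iterations=300
# Cutoff=0 keeps features that occur rarely, which can matter
# when the training set is small (assumed value).
Cutoff=0
```

With a cutoff of 5 on only ~100 sentences, most features would be discarded before training, which is one reason the maxent model may perform poorly on a small corpus.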
