Hi,

What is the best way to instantiate a POSDictionary with a custom tag dictionary and with the case-sensitive flag set to false?

I need this for full parse tree output, as I have no control over the input. I've managed to create a parse model, but the only thing holding me back is defining the case-insensitive tag dictionary.

I know a fix in this area was made recently, and I have a copy of 1.5.2rc, but I just can't see how to go about it.

The majority of the methods within POSDictionary are deprecated and it is recommended to use POSDictionary.create() instead, but that offers no way to set the case sensitivity flag, which is true by default.

I can use the deprecated POSDictionary(String file, boolean caseSensitive) constructor, but this leads to an NPE when calling getTags(String word): the lookup does not find a word that was loaded into the dictionary, because the two differ in case, e.g. ('Italy', 'italy').
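To make the mismatch concrete, here is a minimal sketch of the behaviour I am after. It is not OpenNLP code: TagDictionarySketch and its methods are hypothetical names used only for illustration, and I am assuming that lowercasing the key on both insert and lookup is what "case insensitive" should mean.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a tag dictionary; this is NOT the OpenNLP POSDictionary.
final class TagDictionarySketch {
    private final Map<String, String[]> entries = new HashMap<>();
    private final boolean caseSensitive;

    TagDictionarySketch(boolean caseSensitive) {
        this.caseSensitive = caseSensitive;
    }

    // Normalize the key the same way on insert and on lookup,
    // so 'Italy', 'italy' and 'ITALY' all map to one entry.
    private String key(String word) {
        return caseSensitive ? word : word.toLowerCase(Locale.ROOT);
    }

    void put(String word, String... tags) {
        entries.put(key(word), tags);
    }

    // Returns null for unknown words, so callers must check before dereferencing.
    String[] getTags(String word) {
        return entries.get(key(word));
    }

    public static void main(String[] args) {
        TagDictionarySketch dict = new TagDictionarySketch(false);
        dict.put("Italy", "NNP");
        dict.put("school", "NN");

        System.out.println(String.join(",", dict.getTags("ITALY")));
        System.out.println(String.join(",", dict.getTags("SCHOOL")));
        System.out.println(dict.getTags("unknown") == null);
    }
}
```

If only one side is normalized (for example the lookup key is lowercased but the stored key is not), getTags returns null for a word that is in the dictionary, which would explain the kind of NPE I am seeing.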

The reason I need this is that the following input:

'I INTEND TO COMPLETE SCHOOL AND GET A CERTIFICATE THIS SUMMER.'

results in

(TOP (S (NP (PRP I)) (VP (VBD INTEND) (VP (TO TO) (VP (VB COMPLETE) (NP (NNP SCHOOL) (CC AND) (NNP GET) (NNP A) (NNP CERTIFICATE) (NNP THIS) (NNP SUMMER))))) (. .)))

with all nouns being identified as NNP(s). The only way to avoid this is to set case sensitivity to 'false', which was possible with 1.3.1 and gave the correct output:

(TOP (S (NP (PRP I)) (VP (VBP INTEND) (VP (TO TO) (VP (VP (VB COMPLETE) (NP (NN SCHOOL))) (CC AND) (VP (VB GET) (NP (DT A) (NN CERTIFICATE) (NN THIS) (NNP SUMMER.))))))))

Maybe I'm being blind to the solution.

Your help is greatly appreciated.
