Hi all,

I have tried to use the postagger command to train models for various morphological features. Even though I know it is not designed for that, I also tried to build a model for lemma tagging.
Below you will see the error I got [1]. The problem is that java.io.DataOutputStream.writeUTF cannot serialize a string whose encoded form exceeds 64 KB (65535 bytes). [2] describes the problem and gives some workarounds. What do you think?

/Nicolas

[1]
Writing pos tagger model ... failed
Error during writing model file '/tmp/train-lemma.model'
encoded string too long: 153687 bytes
java.io.UTFDataFormatException: encoded string too long: 153687 bytes
    at java.io.DataOutputStream.writeUTF(DataOutputStream.java:364)
    at java.io.DataOutputStream.writeUTF(DataOutputStream.java:323)
    at opennlp.maxent.io.BinaryGISModelWriter.writeUTF(BinaryGISModelWriter.java:73)

[2] http://www.drillio.com/en/software-development/java/encoded-string-too-long-64kb-limit/
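
To make the kind of workaround I have in mind concrete, here is a minimal sketch. It assumes we are free to change the on-disk format: instead of writeUTF (which stores the length in two bytes), the string is written as a 4-byte length followed by its UTF-8 bytes. LongStringIo and its methods are hypothetical names for illustration and are not part of OpenNLP; the matching model reader would have to be changed in the same way, and existing model files would no longer be readable without a version check.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of OpenNLP.
final class LongStringIo {

    // Writes a 4-byte length prefix followed by the standard UTF-8 bytes,
    // so the 65535-byte limit of DataOutputStream.writeUTF does not apply.
    static void writeLongString(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Reads back a string written by writeLongString.
    static String readLongString(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Convenience round trip through an in-memory buffer, for quick checks.
    static String roundTrip(String s) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            writeLongString(out, s);
            out.flush();
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
            return readLongString(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this, a string of 153687 bytes like the one in [1] goes through LongStringIo.roundTrip unchanged, whereas writeUTF throws UTFDataFormatException on it.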
