Hi all,

I'm new to this list and hope it is the appropriate place for help questions. At Trendrr we have had a lot of success with OpenNLP-tools 1.4.3 and Maxent 2.5.2 for sentiment classification, and we were hoping to move to 1.5.0/3.0.0 as we explored more features. However, I have run into a roadblock: some things that affect the serialization of models appear to have changed. Here is the problem:
In the past, to train a model, we called DocumentCategorizerME.train, which returned a GISModel. We could then serialize it with GISModelWriter (from maxent) or something similar. Having updated opennlp-tools and maxent, I now find that my earlier usage of DocumentCategorizerME is deprecated, and I am instead urged to use a call that returns a DoccatModel. I can no longer use GISModelWriter, because DoccatModel is not a subclass of AbstractModel.

So my first question is this: what is the recommended way to serialize a DoccatModel? I've come across GenericModelSerializer, but it appears not to do much of the legwork that GISModelWriter did.

There is another way to train a GIS model, through the GIS class in maxent. However, this is not suitable for us, because we need to specify our own feature generators, and that does not appear to be possible in the GIS class. So my second "question" is this: can anybody suggest a way to train a GIS model in which I can specify my own feature generator(s)? That would solve my problems.

If you're reading this, thanks for bearing with me. I appreciate any input you have.

Cheers,
Dan
