[ https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15428454#comment-15428454 ]
Tristan Nixon commented on OPENNLP-776:
---------------------------------------

Good point. I thought the only way to provide custom serialization was to use Externalizable, which does require a no-arg constructor, but now I see one can put the readObject and writeObject methods into a Serializable and get the same effect (leaving me wondering what the point of Externalizable is...).

One slight complication with this is that if we rely on Object's no-arg constructor, the implicit initialization of fields like artifactMap and artifactSerializers does not happen, so I need to do this explicitly in the readObject method, meaning they cannot be final anymore (nor can isLoadedFromSerialized).

Otherwise, it seems to be working fine! See the attached patch.

> Model Objects should be Serializable
> ------------------------------------
>
>                 Key: OPENNLP-776
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-776
>             Project: OpenNLP
>          Issue Type: Improvement
>    Affects Versions: tools-1.5.3
>            Reporter: Tristan Nixon
>            Assignee: Joern Kottmann
>            Priority: Minor
>              Labels: features, patch
>             Fix For: 1.6.1
>
>         Attachments: externalizable.patch, serialization_proxy.patch
>
>
> Marking model objects (ParserModel, SentenceModel, etc.) as Serializable can
> enable a number of features offered by other Java frameworks (my own use case
> is described below). You've already got a good mechanism for
> (de-)serialization, but it cannot be leveraged by other frameworks without
> implementing the Serializable interface. I'm attaching a patch to BaseModel
> that implements the methods in the java.io.Externalizable interface as
> wrappers to the existing (de-)serialization methods. This simple change can
> open up a number of useful opportunities for integrating OpenNLP with other
> frameworks.
> My use case is that I am incorporating OpenNLP into a Spark application. This
> requires that components of the system be distributed between the driver and
> worker nodes within the cluster. In order to do this, Spark uses Java
> serialization API to transmit objects between nodes. This is far more
> efficient than instantiating models on each node independently.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
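
For reference, the readObject/writeObject approach described in the comment can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the attached patch: SketchModel, its string-valued artifact map, and the roundTrip helper are all hypothetical stand-ins, and a real BaseModel would presumably delegate to its existing (de-)serialization methods instead of writing key/value pairs by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a model class; this is NOT OpenNLP's BaseModel.
final class SketchModel implements Serializable {
    private static final long serialVersionUID = 1L;

    // Transient and non-final: Java serialization skips this class's
    // constructor and field initializers when deserializing, so
    // readObject has to assign these fields itself.
    private transient Map<String, String> artifactMap = new HashMap<>();
    private transient boolean isLoadedFromSerialized = false;

    public SketchModel(Map<String, String> artifacts) {
        artifactMap.putAll(artifacts);
    }

    public Map<String, String> getArtifacts() {
        return artifactMap;
    }

    public boolean isLoadedFromSerialized() {
        return isLoadedFromSerialized;
    }

    // Custom write hook: stands in for "wrap the existing serialization".
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // no non-transient fields, kept as good practice
        out.writeInt(artifactMap.size());
        for (Map.Entry<String, String> entry : artifactMap.entrySet()) {
            out.writeUTF(entry.getKey());
            out.writeUTF(entry.getValue());
        }
    }

    // Custom read hook: explicitly re-creates the state that field
    // initializers would normally have set up.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        artifactMap = new HashMap<>();
        isLoadedFromSerialized = true;
        int size = in.readInt();
        for (int i = 0; i < size; i++) {
            String key = in.readUTF();
            String value = in.readUTF();
            artifactMap.put(key, value);
        }
    }

    // Hypothetical helper: serializes to a byte array and reads it back,
    // roughly what a framework does when shipping an object between nodes.
    public static SketchModel roundTrip(SketchModel model) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(model);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SketchModel) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling SketchModel.roundTrip(model) returns a copy whose artifact map was rebuilt inside readObject; had the map been left as a final field with an initializer, it would be null after deserialization, which is the complication the comment mentions.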