[jira] [Updated] (OPENNLP-776) Model Objects should be Serializable
[ https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joern Kottmann updated OPENNLP-776: --- Attachment: serializable-basemodel-joern.patch > Model Objects should be Serializable > > > Key: OPENNLP-776 > URL: https://issues.apache.org/jira/browse/OPENNLP-776 > Project: OpenNLP > Issue Type: Improvement >Affects Versions: tools-1.5.3 >Reporter: Tristan Nixon >Assignee: Joern Kottmann >Priority: Minor > Labels: features, patch > Fix For: 1.6.1 > > Attachments: externalizable.patch, > serializable-basemodel-joern.patch, serializable-basemodel.patch, > serialization_proxy.patch > > > Marking model objects (ParserModel, SentenceModel, etc.) as Serializable can > enable a number of features offered by other Java frameworks (my own use case > is described below). You've already got a good mechanism for > (de-)serialization, but it cannot be leveraged by other frameworks without > implementing the Serializable interface. I'm attaching a patch to BaseModel > that implements the methods in the java.io.Externalizable interface as > wrappers to the existing (de-)serialization methods. This simple change can > open up a number of useful opportunities for integrating OpenNLP with other > frameworks. > My use case is that I am incorporating OpenNLP into a Spark application. This > requires that components of the system be distributed between the driver and > worker nodes within the cluster. In order to do this, Spark uses Java > serialization API to transmit objects between nodes. This is far more > efficient than instantiating models on each node independently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OPENNLP-776) Model Objects should be Serializable
[ https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tristan Nixon updated OPENNLP-776: -- Attachment: serializable-basemodel.patch Patch to make BaseModel serializable > Model Objects should be Serializable > > > Key: OPENNLP-776 > URL: https://issues.apache.org/jira/browse/OPENNLP-776 > Project: OpenNLP > Issue Type: Improvement >Affects Versions: tools-1.5.3 >Reporter: Tristan Nixon >Assignee: Joern Kottmann >Priority: Minor > Labels: features, patch > Fix For: 1.6.1 > > Attachments: externalizable.patch, serializable-basemodel.patch, > serialization_proxy.patch > > > Marking model objects (ParserModel, SentenceModel, etc.) as Serializable can > enable a number of features offered by other Java frameworks (my own use case > is described below). You've already got a good mechanism for > (de-)serialization, but it cannot be leveraged by other frameworks without > implementing the Serializable interface. I'm attaching a patch to BaseModel > that implements the methods in the java.io.Externalizable interface as > wrappers to the existing (de-)serialization methods. This simple change can > open up a number of useful opportunities for integrating OpenNLP with other > frameworks. > My use case is that I am incorporating OpenNLP into a Spark application. This > requires that components of the system be distributed between the driver and > worker nodes within the cluster. In order to do this, Spark uses Java > serialization API to transmit objects between nodes. This is far more > efficient than instantiating models on each node independently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OPENNLP-776) Model Objects should be Serializable
[ https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tristan Nixon updated OPENNLP-776: -- Attachment: serialization_proxy.patch Patch containing modifications to model classes to provide serialization proxies. > Model Objects should be Serializable > > > Key: OPENNLP-776 > URL: https://issues.apache.org/jira/browse/OPENNLP-776 > Project: OpenNLP > Issue Type: Improvement >Affects Versions: tools-1.5.3 >Reporter: Tristan Nixon >Priority: Minor > Labels: features, patch > Fix For: 1.6.1 > > Attachments: externalizable.patch, serialization_proxy.patch > > > Marking model objects (ParserModel, SentenceModel, etc.) as Serializable can > enable a number of features offered by other Java frameworks (my own use case > is described below). You've already got a good mechanism for > (de-)serialization, but it cannot be leveraged by other frameworks without > implementing the Serializable interface. I'm attaching a patch to BaseModel > that implements the methods in the java.io.Externalizable interface as > wrappers to the existing (de-)serialization methods. This simple change can > open up a number of useful opportunities for integrating OpenNLP with other > frameworks. > My use case is that I am incorporating OpenNLP into a Spark application. This > requires that components of the system be distributed between the driver and > worker nodes within the cluster. In order to do this, Spark uses Java > serialization API to transmit objects between nodes. This is far more > efficient than instantiating models on each node independently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OPENNLP-776) Model Objects should be Serializable
[ https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joern Kottmann updated OPENNLP-776: --- Fix Version/s: 1.6.1 > Model Objects should be Serializable > > > Key: OPENNLP-776 > URL: https://issues.apache.org/jira/browse/OPENNLP-776 > Project: OpenNLP > Issue Type: Improvement >Affects Versions: tools-1.5.3 >Reporter: Tristan Nixon >Priority: Minor > Labels: features, patch > Fix For: 1.6.1 > > Attachments: externalizable.patch > > > Marking model objects (ParserModel, SentenceModel, etc.) as Serializable can > enable a number of features offered by other Java frameworks (my own use case > is described below). You've already got a good mechanism for > (de-)serialization, but it cannot be leveraged by other frameworks without > implementing the Serializable interface. I'm attaching a patch to BaseModel > that implements the methods in the java.io.Externalizable interface as > wrappers to the existing (de-)serialization methods. This simple change can > open up a number of useful opportunities for integrating OpenNLP with other > frameworks. > My use case is that I am incorporating OpenNLP into a Spark application. This > requires that components of the system be distributed between the driver and > worker nodes within the cluster. In order to do this, Spark uses Java > serialization API to transmit objects between nodes. This is far more > efficient than instantiating models on each node independently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OPENNLP-776) Model Objects should be Serializable
[ https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tristan Nixon updated OPENNLP-776: -- Attachment: (was: externalizable.patch) > Model Objects should be Serializable > > > Key: OPENNLP-776 > URL: https://issues.apache.org/jira/browse/OPENNLP-776 > Project: OpenNLP > Issue Type: Improvement >Affects Versions: tools-1.5.3 >Reporter: Tristan Nixon >Priority: Minor > Labels: features, patch > Attachments: externalizable.patch > > > Marking model objects (ParserModel, SentenceModel, etc.) as Serializable can > enable a number of features offered by other Java frameworks (my own use case > is described below). You've already got a good mechanism for > (de-)serialization, but it cannot be leveraged by other frameworks without > implementing the Serializable interface. I'm attaching a patch to BaseModel > that implements the methods in the java.io.Externalizable interface as > wrappers to the existing (de-)serialization methods. This simple change can > open up a number of useful opportunities for integrating OpenNLP with other > frameworks. > My use case is that I am incorporating OpenNLP into a Spark application. This > requires that components of the system be distributed between the driver and > worker nodes within the cluster. In order to do this, Spark uses Java > serialization API to transmit objects between nodes. This is far more > efficient than instantiating models on each node independently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OPENNLP-776) Model Objects should be Serializable
[ https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tristan Nixon updated OPENNLP-776: -- Attachment: externalizable.patch Also model classes can't be final if we're going to inherit from them (TokenizerModel is). Patch revised. > Model Objects should be Serializable > > > Key: OPENNLP-776 > URL: https://issues.apache.org/jira/browse/OPENNLP-776 > Project: OpenNLP > Issue Type: Improvement >Affects Versions: tools-1.5.3 >Reporter: Tristan Nixon >Priority: Minor > Labels: features, patch > Attachments: externalizable.patch > > > Marking model objects (ParserModel, SentenceModel, etc.) as Serializable can > enable a number of features offered by other Java frameworks (my own use case > is described below). You've already got a good mechanism for > (de-)serialization, but it cannot be leveraged by other frameworks without > implementing the Serializable interface. I'm attaching a patch to BaseModel > that implements the methods in the java.io.Externalizable interface as > wrappers to the existing (de-)serialization methods. This simple change can > open up a number of useful opportunities for integrating OpenNLP with other > frameworks. > My use case is that I am incorporating OpenNLP into a Spark application. This > requires that components of the system be distributed between the driver and > worker nodes within the cluster. In order to do this, Spark uses Java > serialization API to transmit objects between nodes. This is far more > efficient than instantiating models on each node independently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OPENNLP-776) Model Objects should be Serializable
[ https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tristan Nixon updated OPENNLP-776: -- Attachment: externalizable.patch Actually, there is one more thing that must happen for this to be viable. Externalizable sub-classes must provide a public no-arg constructor, and there must be some constructor on some parent class that they can call, which should probably in-turn call BaseModel( COMPONENT_NAME, true ). It would be most convenient if each model type provided at least a protected no-arg constructor (similar to the public ones in my previous patch), as this encapsulates the functionality nicely. I'm attaching a revised patch of the necessary changes. > Model Objects should be Serializable > > > Key: OPENNLP-776 > URL: https://issues.apache.org/jira/browse/OPENNLP-776 > Project: OpenNLP > Issue Type: Improvement >Affects Versions: tools-1.5.3 >Reporter: Tristan Nixon >Priority: Minor > Labels: features, patch > Attachments: externalizable.patch > > > Marking model objects (ParserModel, SentenceModel, etc.) as Serializable can > enable a number of features offered by other Java frameworks (my own use case > is described below). You've already got a good mechanism for > (de-)serialization, but it cannot be leveraged by other frameworks without > implementing the Serializable interface. I'm attaching a patch to BaseModel > that implements the methods in the java.io.Externalizable interface as > wrappers to the existing (de-)serialization methods. This simple change can > open up a number of useful opportunities for integrating OpenNLP with other > frameworks. > My use case is that I am incorporating OpenNLP into a Spark application. This > requires that components of the system be distributed between the driver and > worker nodes within the cluster. In order to do this, Spark uses Java > serialization API to transmit objects between nodes. This is far more > efficient than instantiating models on each node independently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OPENNLP-776) Model Objects should be Serializable
[ https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tristan Nixon updated OPENNLP-776: -- Attachment: (was: model-constructors.patch) > Model Objects should be Serializable > > > Key: OPENNLP-776 > URL: https://issues.apache.org/jira/browse/OPENNLP-776 > Project: OpenNLP > Issue Type: Improvement >Affects Versions: tools-1.5.3 >Reporter: Tristan Nixon >Priority: Minor > Labels: features, patch > Attachments: externalizable.patch > > > Marking model objects (ParserModel, SentenceModel, etc.) as Serializable can > enable a number of features offered by other Java frameworks (my own use case > is described below). You've already got a good mechanism for > (de-)serialization, but it cannot be leveraged by other frameworks without > implementing the Serializable interface. I'm attaching a patch to BaseModel > that implements the methods in the java.io.Externalizable interface as > wrappers to the existing (de-)serialization methods. This simple change can > open up a number of useful opportunities for integrating OpenNLP with other > frameworks. > My use case is that I am incorporating OpenNLP into a Spark application. This > requires that components of the system be distributed between the driver and > worker nodes within the cluster. In order to do this, Spark uses Java > serialization API to transmit objects between nodes. This is far more > efficient than instantiating models on each node independently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OPENNLP-776) Model Objects should be Serializable
[ https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tristan Nixon updated OPENNLP-776: -- Attachment: (was: BaseModel-serialization.patch) > Model Objects should be Serializable > > > Key: OPENNLP-776 > URL: https://issues.apache.org/jira/browse/OPENNLP-776 > Project: OpenNLP > Issue Type: Improvement >Affects Versions: tools-1.5.3 >Reporter: Tristan Nixon >Priority: Minor > Labels: features, patch > Attachments: externalizable.patch > > > Marking model objects (ParserModel, SentenceModel, etc.) as Serializable can > enable a number of features offered by other Java frameworks (my own use case > is described below). You've already got a good mechanism for > (de-)serialization, but it cannot be leveraged by other frameworks without > implementing the Serializable interface. I'm attaching a patch to BaseModel > that implements the methods in the java.io.Externalizable interface as > wrappers to the existing (de-)serialization methods. This simple change can > open up a number of useful opportunities for integrating OpenNLP with other > frameworks. > My use case is that I am incorporating OpenNLP into a Spark application. This > requires that components of the system be distributed between the driver and > worker nodes within the cluster. In order to do this, Spark uses Java > serialization API to transmit objects between nodes. This is far more > efficient than instantiating models on each node independently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OPENNLP-776) Model Objects should be Serializable
[ https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joern Kottmann updated OPENNLP-776: --- Component/s: (was: Formats) Removed Formats component. This issue applies to all the components which have models. > Model Objects should be Serializable > > > Key: OPENNLP-776 > URL: https://issues.apache.org/jira/browse/OPENNLP-776 > Project: OpenNLP > Issue Type: Improvement >Affects Versions: tools-1.5.3 >Reporter: Tristan Nixon >Priority: Minor > Labels: features, patch > Attachments: BaseModel-serialization.patch, model-constructors.patch > > > Marking model objects (ParserModel, SentenceModel, etc.) as Serializable can > enable a number of features offered by other Java frameworks (my own use case > is described below). You've already got a good mechanism for > (de-)serialization, but it cannot be leveraged by other frameworks without > implementing the Serializable interface. I'm attaching a patch to BaseModel > that implements the methods in the java.io.Externalizable interface as > wrappers to the existing (de-)serialization methods. This simple change can > open up a number of useful opportunities for integrating OpenNLP with other > frameworks. > My use case is that I am incorporating OpenNLP into a Spark application. This > requires that components of the system be distributed between the driver and > worker nodes within the cluster. In order to do this, Spark uses Java > serialization API to transmit objects between nodes. This is far more > efficient than instantiating models on each node independently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OPENNLP-776) Model Objects should be Serializable
[ https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tristan Nixon updated OPENNLP-776: -- Attachment: model-constructors.patch I realized that for automatic de-serialization, all models need No-Op constructors. See attached. > Model Objects should be Serializable > > > Key: OPENNLP-776 > URL: https://issues.apache.org/jira/browse/OPENNLP-776 > Project: OpenNLP > Issue Type: Improvement > Components: Formats >Affects Versions: tools-1.5.3 >Reporter: Tristan Nixon >Priority: Minor > Labels: features, patch > Attachments: BaseModel-serialization.patch, model-constructors.patch > > > Marking model objects (ParserModel, SentenceModel, etc.) as Serializable can > enable a number of features offered by other Java frameworks (my own use case > is described below). You've already got a good mechanism for > (de-)serialization, but it cannot be leveraged by other frameworks without > implementing the Serializable interface. I'm attaching a patch to BaseModel > that implements the methods in the java.io.Externalizable interface as > wrappers to the existing (de-)serialization methods. This simple change can > open up a number of useful opportunities for integrating OpenNLP with other > frameworks. > My use case is that I am incorporating OpenNLP into a Spark application. This > requires that components of the system be distributed between the driver and > worker nodes within the cluster. In order to do this, Spark uses Java > serialization API to transmit objects between nodes. This is far more > efficient than instantiating models on each node independently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OPENNLP-776) Model Objects should be Serializable
[ https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tristan Nixon updated OPENNLP-776: -- Attachment: BaseModel-serialization.patch My patch > Model Objects should be Serializable > > > Key: OPENNLP-776 > URL: https://issues.apache.org/jira/browse/OPENNLP-776 > Project: OpenNLP > Issue Type: Improvement > Components: Formats >Affects Versions: tools-1.5.3 >Reporter: Tristan Nixon >Priority: Minor > Labels: features, patch > Attachments: BaseModel-serialization.patch > > > Marking model objects (ParserModel, SentenceModel, etc.) as Serializable can > enable a number of features offered by other Java frameworks (my own use case > is described below). You've already got a good mechanism for > (de-)serialization, but it cannot be leveraged by other frameworks without > implementing the Serializable interface. I'm attaching a patch to BaseModel > that implements the methods in the java.io.Externalizable interface as > wrappers to the existing (de-)serialization methods. This simple change can > open up a number of useful opportunities for integrating OpenNLP with other > frameworks. > My use case is that I am incorporating OpenNLP into a Spark application. This > requires that components of the system be distributed between the driver and > worker nodes within the cluster. In order to do this, Spark uses Java > serialization API to transmit objects between nodes. This is far more > efficient than instantiating models on each node independently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)