Re: [jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2016-11-12 Thread Joern Kottmann
Yes it is in the 776 branch.

On Nov 12, 2016 9:51 PM, "Tristan Nixon (JIRA)"  wrote:

>
> [ https://issues.apache.org/jira/browse/OPENNLP-776?page=
> com.atlassian.jira.plugin.system.issuetabpanels:comment-
> tabpanel&focusedCommentId=15660270#comment-15660270 ]
>
> Tristan Nixon commented on OPENNLP-776:
> ---
>
> I'm not seeing any changes on trunk, is there a branch or tag I should be
> looking at?
>
> > Model Objects should be Serializable
> > 
> >
> > Key: OPENNLP-776
> > URL: https://issues.apache.org/jira/browse/OPENNLP-776
> > Project: OpenNLP
> >  Issue Type: Improvement
> >Affects Versions: tools-1.5.3
> >Reporter: Tristan Nixon
> >Assignee: Joern Kottmann
> >Priority: Minor
> >  Labels: features, patch
> > Fix For: 1.7.0
> >
> > Attachments: externalizable.patch,
> > serializable-basemodel-joern.patch, serializable-basemodel.patch,
> > serialization_proxy.patch
> >
> >
> > Marking model objects (ParserModel, SentenceModel, etc.) as Serializable
> > can enable a number of features offered by other Java frameworks (my own
> > use case is described below). You've already got a good mechanism for
> > (de-)serialization, but it cannot be leveraged by other frameworks without
> > implementing the Serializable interface. I'm attaching a patch to BaseModel
> > that implements the methods in the java.io.Externalizable interface as
> > wrappers to the existing (de-)serialization methods. This simple change can
> > open up a number of useful opportunities for integrating OpenNLP with other
> > frameworks.
> > My use case is that I am incorporating OpenNLP into a Spark application.
> > This requires that components of the system be distributed between the
> > driver and worker nodes within the cluster. In order to do this, Spark uses
> > Java serialization API to transmit objects between nodes. This is far more
> > efficient than instantiating models on each node independently.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>


[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2016-11-07 Thread Tristan Nixon (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15644259#comment-15644259
 ] 

Tristan Nixon commented on OPENNLP-776:
---

I've been swamped with other work, but I should be able to look at this 
tomorrow.

> Model Objects should be Serializable
> 
>
> Key: OPENNLP-776
> URL: https://issues.apache.org/jira/browse/OPENNLP-776
> Project: OpenNLP
>  Issue Type: Improvement
>Affects Versions: tools-1.5.3
>Reporter: Tristan Nixon
>Assignee: Joern Kottmann
>Priority: Minor
>  Labels: features, patch
> Fix For: 1.6.1
>
> Attachments: externalizable.patch, 
> serializable-basemodel-joern.patch, serializable-basemodel.patch, 
> serialization_proxy.patch
>
>


[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2016-10-28 Thread Joern Kottmann (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615803#comment-15615803
 ] 

Joern Kottmann commented on OPENNLP-776:


I pushed this to branch 776. It would be really nice if you or someone else 
could review these changes. I tested this with Spark and the Doccat Model and 
it worked without any issues. 

> Model Objects should be Serializable
> 
>
> Key: OPENNLP-776
> URL: https://issues.apache.org/jira/browse/OPENNLP-776
> Project: OpenNLP
>  Issue Type: Improvement
>Affects Versions: tools-1.5.3
>Reporter: Tristan Nixon
>Assignee: Joern Kottmann
>Priority: Minor
>  Labels: features, patch
> Fix For: 1.6.1
>
> Attachments: externalizable.patch, 
> serializable-basemodel-joern.patch, serializable-basemodel.patch, 
> serialization_proxy.patch
>
>


[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2016-10-04 Thread Tristan Nixon (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15545918#comment-15545918
 ] 

Tristan Nixon commented on OPENNLP-776:
---

Sorry, I probably should have removed that older patch.

> Model Objects should be Serializable
> 
>
> Key: OPENNLP-776
> URL: https://issues.apache.org/jira/browse/OPENNLP-776
> Project: OpenNLP
>  Issue Type: Improvement
>Affects Versions: tools-1.5.3
>Reporter: Tristan Nixon
>Assignee: Joern Kottmann
>Priority: Minor
>  Labels: features, patch
> Fix For: 1.6.1
>
> Attachments: externalizable.patch, serializable-basemodel.patch, 
> serialization_proxy.patch
>
>


[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2016-10-04 Thread Joern Kottmann (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15545887#comment-15545887
 ] 

Joern Kottmann commented on OPENNLP-776:


Sorry for the confusion, I was talking about serializable-basemodel.patch the
whole time, not serialization_proxy.patch.

> Model Objects should be Serializable
> 
>
> Key: OPENNLP-776
> URL: https://issues.apache.org/jira/browse/OPENNLP-776
> Project: OpenNLP
>  Issue Type: Improvement
>Affects Versions: tools-1.5.3
>Reporter: Tristan Nixon
>Assignee: Joern Kottmann
>Priority: Minor
>  Labels: features, patch
> Fix For: 1.6.1
>
> Attachments: externalizable.patch, serializable-basemodel.patch, 
> serialization_proxy.patch
>
>


[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2016-10-04 Thread Tristan Nixon (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15545791#comment-15545791
 ] 

Tristan Nixon commented on OPENNLP-776:
---

Well, it's a bit of a messy type hierarchy, since the write(int) method is
defined on both the abstract class OutputStream AND on the interface
DataOutput, which is inherited by interface ObjectOutput. The
ObjectOutputStream class inherits from BOTH OutputStream AND ObjectOutput.
However, the Externalizable interface defines the method
writeExternal(ObjectOutput), which implies that there could be other
implementations of this interface that are not necessarily subtypes of
OutputStream. This is in fact what some other frameworks do: they provide an
alternative implementation.
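
The type-hierarchy point can be made concrete with a short sketch. It does not reproduce any of the attached patches: ObjectOutputAdapter and StreamSelection are hypothetical names introduced here for illustration. Since writeExternal receives an ObjectOutput (an interface), code that needs an OutputStream can check the runtime type and fall back to a forwarding adapter when the check fails.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical adapter: exposes any ObjectOutput through the OutputStream API. */
final class ObjectOutputAdapter extends OutputStream {
    private final ObjectOutput out;

    ObjectOutputAdapter(ObjectOutput out) {
        this.out = out;
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
    }

    @Override
    public void flush() throws IOException {
        out.flush();
    }
    // close() is deliberately not forwarded: the caller owns the ObjectOutput.
}

/** Hypothetical helper showing the instanceof check with a fallback branch. */
final class StreamSelection {
    private StreamSelection() { }

    /** Uses the ObjectOutput directly when it is a stream; otherwise wraps it. */
    static OutputStream asOutputStream(ObjectOutput out) {
        return (out instanceof OutputStream)
                ? (OutputStream) out
                : new ObjectOutputAdapter(out);
    }

    /** True when the JDK's ObjectOutputStream is passed through unwrapped. */
    static boolean standardStreamIsPassedThrough() {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            return asOutputStream(oos) == oos;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the JDK's ObjectOutputStream the first branch is taken; the adapter branch only matters for an ObjectOutput implementation that does not extend OutputStream.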

> Model Objects should be Serializable
> 
>
> Key: OPENNLP-776
> URL: https://issues.apache.org/jira/browse/OPENNLP-776
> Project: OpenNLP
>  Issue Type: Improvement
>Affects Versions: tools-1.5.3
>Reporter: Tristan Nixon
>Assignee: Joern Kottmann
>Priority: Minor
>  Labels: features, patch
> Fix For: 1.6.1
>
> Attachments: externalizable.patch, serializable-basemodel.patch, 
> serialization_proxy.patch
>
>


[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2016-10-04 Thread Joern Kottmann (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15545655#comment-15545655
 ] 

Joern Kottmann commented on OPENNLP-776:


Hmm, then I don't understand it. The write method can only be called with an
object of type java.io.ObjectOutputStream, and that must extend OutputStream,
so it should be safe to assume that, no?

ObjectOutputStream is a class and not an interface. It is possible to pass in
an object of it, or to define a new class which extends it; in both cases the
object also has the type OutputStream, right?


> Model Objects should be Serializable
> 
>
> Key: OPENNLP-776
> URL: https://issues.apache.org/jira/browse/OPENNLP-776
> Project: OpenNLP
>  Issue Type: Improvement
>Affects Versions: tools-1.5.3
>Reporter: Tristan Nixon
>Assignee: Joern Kottmann
>Priority: Minor
>  Labels: features, patch
> Fix For: 1.6.1
>
> Attachments: externalizable.patch, serializable-basemodel.patch, 
> serialization_proxy.patch
>
>


[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2016-10-04 Thread Joern Kottmann (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15545560#comment-15545560
 ] 

Joern Kottmann commented on OPENNLP-776:


Can you give me an example? OpenNLP today only runs on Java 7 and is not tested
on any other JVMs, so you will probably run into issues here and there.
Do you run it on Android? I think it is safe to simply hand over the stream and
assume the type is InputStream / OutputStream.

> Model Objects should be Serializable
> 
>
> Key: OPENNLP-776
> URL: https://issues.apache.org/jira/browse/OPENNLP-776
> Project: OpenNLP
>  Issue Type: Improvement
>Affects Versions: tools-1.5.3
>Reporter: Tristan Nixon
>Assignee: Joern Kottmann
>Priority: Minor
>  Labels: features, patch
> Fix For: 1.6.1
>
> Attachments: externalizable.patch, serializable-basemodel.patch, 
> serialization_proxy.patch
>
>


[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2016-10-04 Thread Joern Kottmann (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15545212#comment-15545212
 ] 

Joern Kottmann commented on OPENNLP-776:


Thanks, looks good. I think we can more or less merge it like that for the
1.6.1 release. One question: in which case can the else block of the
if (in instanceof InputStream) be entered in the read and write methods? As far
as I understand, this will always be true, since the type is defined as part of
the Java API and won't change. I suggest we drop the else block.

I will test this on my cluster in the next few days and then report back here.


> Model Objects should be Serializable
> 
>
> Key: OPENNLP-776
> URL: https://issues.apache.org/jira/browse/OPENNLP-776
> Project: OpenNLP
>  Issue Type: Improvement
>Affects Versions: tools-1.5.3
>Reporter: Tristan Nixon
>Assignee: Joern Kottmann
>Priority: Minor
>  Labels: features, patch
> Fix For: 1.6.1
>
> Attachments: externalizable.patch, serializable-basemodel.patch, 
> serialization_proxy.patch
>
>


[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2016-08-19 Thread Tristan Nixon (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15428454#comment-15428454
 ] 

Tristan Nixon commented on OPENNLP-776:
---

Good point. I thought the only way to provide custom serialization was to use 
Externalizable, which does require a no-arg constructor, but now I see one can 
put the readObject and writeObject methods into a Serializable and get the same 
effect (leaving me wondering what the point of Externalizable is...).

One slight complication with this is that if we rely on Object's no-arg 
constructor, the implicit initialization of fields like artifactMap and 
artifactSerializers does not happen, so I need to do this explicitly in the 
readObject method, meaning they cannot be final anymore (nor can 
isLoadedFromSerialized). 

Otherwise, it seems to be working fine! See the attached patch.
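
A minimal sketch of the readObject/writeObject approach described above, assuming a stand-in class rather than the real BaseModel: SketchModel and its members are hypothetical names, and the attached patch may well differ. It shows why the fields lose their final modifier: during deserialization no constructor or field initializer of the Serializable class runs, so readObject has to assign them itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a model class; this is not OpenNLP's BaseModel. */
class SketchModel implements Serializable {
    private static final long serialVersionUID = 1L;

    // Not final: readObject must assign these, because this initializer is
    // skipped when the object is created by deserialization.
    private transient Map<String, String> artifactMap = new HashMap<>();
    private transient boolean loadedFromSerialized;

    SketchModel(Map<String, String> artifacts) {
        artifactMap.putAll(artifacts);
    }

    Map<String, String> artifacts() {
        return artifactMap;
    }

    boolean isLoadedFromSerialized() {
        return loadedFromSerialized;
    }

    // Custom wire format: entry count followed by key/value pairs.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(artifactMap.size());
        for (Map.Entry<String, String> entry : artifactMap.entrySet()) {
            out.writeUTF(entry.getKey());
            out.writeUTF(entry.getValue());
        }
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        artifactMap = new HashMap<>();   // explicit initialization
        loadedFromSerialized = true;
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            String key = in.readUTF();
            String value = in.readUTF();
            artifactMap.put(key, value);
        }
    }

    /** Serializes and deserializes a model in memory. */
    static SketchModel roundTrip(SketchModel model) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(model);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SketchModel) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling SketchModel.roundTrip on a populated instance yields a copy with the same entries and with isLoadedFromSerialized() set, even though no SketchModel constructor ran for the copy.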

> Model Objects should be Serializable
> 
>
> Key: OPENNLP-776
> URL: https://issues.apache.org/jira/browse/OPENNLP-776
> Project: OpenNLP
>  Issue Type: Improvement
>Affects Versions: tools-1.5.3
>Reporter: Tristan Nixon
>Assignee: Joern Kottmann
>Priority: Minor
>  Labels: features, patch
> Fix For: 1.6.1
>
> Attachments: externalizable.patch, serialization_proxy.patch
>
>


[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2016-08-08 Thread Tristan Nixon (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412332#comment-15412332
 ] 

Tristan Nixon commented on OPENNLP-776:
---

This pattern is quite common in frameworks that manage object state for you. 
Classes are instantiated via a no-arg constructor, and then state is set via 
setters and/or some specialized de-serialization method.

Many different serialization frameworks work this way, such as JAXB, Jackson, 
etc. Also ORM frameworks (hibernate, JPA), IOC frameworks (Spring, CDI), and 
many others.
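
The pattern can be illustrated with java.io.Externalizable itself. This is a sketch with a hypothetical class, NamedThing, that is unrelated to OpenNLP: the stream first creates the object through the public no-arg constructor and only then restores its state in readExternal.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

/** Hypothetical class for illustration; it is not part of OpenNLP. */
class NamedThing implements Externalizable {
    // Assigned after construction, so it cannot be final.
    private String name;

    /** Public no-arg constructor: invoked by the stream before readExternal. */
    public NamedThing() { }

    NamedThing(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(name);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        name = in.readUTF();   // state is set after instantiation
    }

    /** Serializes and deserializes an instance in memory. */
    static NamedThing roundTrip(NamedThing original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (NamedThing) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

If the no-arg constructor is removed or made non-public, deserialization fails with an InvalidClassException, which is why the constructor is a hard requirement of this approach.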

> Model Objects should be Serializable
> 
>
> Key: OPENNLP-776
> URL: https://issues.apache.org/jira/browse/OPENNLP-776
> Project: OpenNLP
>  Issue Type: Improvement
>Affects Versions: tools-1.5.3
>Reporter: Tristan Nixon
>Priority: Minor
>  Labels: features, patch
> Fix For: 1.6.1
>
> Attachments: externalizable.patch
>
>


[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2016-07-18 Thread Tristan Nixon (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383308#comment-15383308
 ] 

Tristan Nixon commented on OPENNLP-776:
---

Finally returning to this after more than a year. I'm not sure I really 
understand the objection to no-arg constructors. Nevertheless, creating 
Externalizable model sub-classes is an acceptable solution for my purposes.

However, in order for this to work, loadModel(InputStream in) must be made 
protected (currently it is private) so that it can be called from the 
readExternal method in the sub-classes. That change should be sufficient for a 
resolution to my issue. Thanks!
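
A sketch of the sub-class approach, under stated assumptions: SketchBaseModel is a hypothetical stand-in for the base class (its serialize and loadModel methods only mimic the shape described in this thread), and ExternalizableSketchModel is an invented user-side sub-class. The length prefix is a choice made for this sketch so that readExternal works with any ObjectInput.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for a model base class; not OpenNLP's BaseModel. */
class SketchBaseModel {
    private byte[] payload = new byte[0];

    /** Writes the model to a plain stream. */
    public final void serialize(OutputStream out) throws IOException {
        out.write(payload);
    }

    /** Protected rather than private, so that sub-classes can call it. */
    protected void loadModel(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        for (int n = in.read(chunk); n != -1; n = in.read(chunk)) {
            buffer.write(chunk, 0, n);
        }
        payload = buffer.toByteArray();
    }

    byte[] payload() {
        return payload.clone();
    }
}

/** User-side sub-class that adds Externalizable on top of the base class. */
class ExternalizableSketchModel extends SketchBaseModel implements Externalizable {
    public ExternalizableSketchModel() { }   // required by Externalizable

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        serialize(bytes);
        out.writeInt(bytes.size());          // length prefix
        out.write(bytes.toByteArray());
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        byte[] bytes = new byte[in.readInt()];
        in.readFully(bytes);
        loadModel(new ByteArrayInputStream(bytes));   // needs protected access
    }

    static ExternalizableSketchModel fromBytes(byte[] data) {
        try {
            ExternalizableSketchModel model = new ExternalizableSketchModel();
            model.loadModel(new ByteArrayInputStream(data));
            return model;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static ExternalizableSketchModel roundTrip(ExternalizableSketchModel model) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(model);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ExternalizableSketchModel) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With loadModel private, the call in readExternal would not compile, which is the reason for the requested visibility change.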

> Model Objects should be Serializable
> 
>
> Key: OPENNLP-776
> URL: https://issues.apache.org/jira/browse/OPENNLP-776
> Project: OpenNLP
>  Issue Type: Improvement
>Affects Versions: tools-1.5.3
>Reporter: Tristan Nixon
>Priority: Minor
>  Labels: features, patch
> Attachments: BaseModel-serialization.patch, model-constructors.patch
>
>


[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2015-05-19 Thread Joern Kottmann (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550576#comment-14550576
 ] 

Joern Kottmann commented on OPENNLP-776:


Having no-arg constructors on all those models is not nice.

Can you please elaborate on this:
> This is far more efficient than instantiating models on each node
> independently.

How can the proposed patch make that more efficient? The models still need to
be created, and that is actually done using the existing serialization support.

 Model Objects should be Serializable
 

 Key: OPENNLP-776
 URL: https://issues.apache.org/jira/browse/OPENNLP-776
 Project: OpenNLP
  Issue Type: Improvement
  Components: Formats
Affects Versions: tools-1.5.3
Reporter: Tristan Nixon
Priority: Minor
  Labels: features, patch
 Attachments: BaseModel-serialization.patch, model-constructors.patch




[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2015-05-19 Thread Joern Kottmann (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550584#comment-14550584
 ] 

Joern Kottmann commented on OPENNLP-776:


The models could be sub-classed in a user project to implement the 
java.io.Externalizable interface.

 Model Objects should be Serializable
 

 Key: OPENNLP-776
 URL: https://issues.apache.org/jira/browse/OPENNLP-776
 Project: OpenNLP
  Issue Type: Improvement
  Components: Formats
Affects Versions: tools-1.5.3
Reporter: Tristan Nixon
Priority: Minor
  Labels: features, patch
 Attachments: BaseModel-serialization.patch, model-constructors.patch




[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2015-05-19 Thread Tristan Nixon (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550561#comment-14550561
 ] 

Tristan Nixon commented on OPENNLP-776:
---

You're totally welcome! Let me know when this gets merged into a release, so I 
can update my project and get rid of my custom build.

 Model Objects should be Serializable
 

 Key: OPENNLP-776
 URL: https://issues.apache.org/jira/browse/OPENNLP-776
 Project: OpenNLP
  Issue Type: Improvement
  Components: Formats
Affects Versions: tools-1.5.3
Reporter: Tristan Nixon
Priority: Minor
  Labels: features, patch
 Attachments: BaseModel-serialization.patch




[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2015-05-19 Thread Tristan Nixon (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550604#comment-14550604
 ] 

Tristan Nixon commented on OPENNLP-776:
---

It does not make the (de-)serialization process more efficient. It allows me to 
use a model as a broadcast variable which means it is de-serialized once on 
each worker node, and can then be re-used for all work on that node. Otherwise, 
it may need to be de-serialized multiple times, adding quite a bit of overhead 
to the application.
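
The difference can be imitated in plain Java. The sketch below does not use Spark's API: CountingModel and BroadcastSketch are hypothetical names, and the loops merely stand in for tasks running on one worker node. It counts how often the model is de-serialized when every task reads its own copy versus when the node reads it once and shares the instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model that counts how often it is de-serialized. */
class CountingModel implements Serializable {
    private static final long serialVersionUID = 1L;
    static final AtomicInteger DESERIALIZATIONS = new AtomicInteger();

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        DESERIALIZATIONS.incrementAndGet();
    }
}

/** Plain-Java imitation of the idea; no Spark classes are involved. */
final class BroadcastSketch {
    private BroadcastSketch() { }

    static byte[] serialize(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Every task de-serializes its own copy of the model. */
    static int deserializationsPerTask(int tasks) {
        CountingModel.DESERIALIZATIONS.set(0);
        byte[] data = serialize(new CountingModel());
        for (int task = 0; task < tasks; task++) {
            Object model = deserialize(data);   // a task would use its copy here
        }
        return CountingModel.DESERIALIZATIONS.get();
    }

    /** The node de-serializes once and all tasks share that instance. */
    static int deserializationsOncePerNode(int tasks) {
        CountingModel.DESERIALIZATIONS.set(0);
        byte[] data = serialize(new CountingModel());
        Object shared = deserialize(data);
        for (int task = 0; task < tasks; task++) {
            Object model = shared;              // a task would reuse the instance here
        }
        return CountingModel.DESERIALIZATIONS.get();
    }
}
```

For five tasks the first variant de-serializes five times and the second once; with a large model that difference is the overhead mentioned above.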

 Model Objects should be Serializable
 

 Key: OPENNLP-776
 URL: https://issues.apache.org/jira/browse/OPENNLP-776
 Project: OpenNLP
  Issue Type: Improvement
  Components: Formats
Affects Versions: tools-1.5.3
Reporter: Tristan Nixon
Priority: Minor
  Labels: features, patch
 Attachments: BaseModel-serialization.patch, model-constructors.patch




[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2015-05-15 Thread Tommaso Teofili (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545075#comment-14545075
 ] 

Tommaso Teofili commented on OPENNLP-776:
-

I think this would be quite a valuable improvement; thanks, Tristan, for it.

 Model Objects should be Serializable
 

 Key: OPENNLP-776
 URL: https://issues.apache.org/jira/browse/OPENNLP-776
 Project: OpenNLP
  Issue Type: Improvement
  Components: Formats
Affects Versions: tools-1.5.3
Reporter: Tristan Nixon
Priority: Minor
  Labels: features, patch
 Attachments: BaseModel-serialization.patch

