[jira] [Assigned] (OPENNLP-856) Refactor AdaptiveFeatureGenerator and FeatureGeneratorAdapter

2016-11-07 Thread Joern Kottmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/OPENNLP-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joern Kottmann reassigned OPENNLP-856:
--

Assignee: Joern Kottmann

> Refactor AdaptiveFeatureGenerator and FeatureGeneratorAdapter
> -
>
> Key: OPENNLP-856
> URL: https://issues.apache.org/jira/browse/OPENNLP-856
> Project: OpenNLP
>  Issue Type: Improvement
>  Components: Name Finder
>Affects Versions: 1.6.1
>Reporter: Jeff Zemerick
>Assignee: Joern Kottmann
>Priority: Minor
>  Labels: feature, features, generator
>
> Under the package opennlp.tools.util.featuregen there is an interface 
> AdaptiveFeatureGenerator and an abstract class FeatureGeneratorAdapter. The 
> interface defines the createFeatures(), updateAdaptiveData(), and 
> clearAdaptiveData() methods. The abstract class implements this interface to 
> provide default implementations of the updateAdaptiveData() and 
> clearAdaptiveData() functions. Feature generators then either implement the 
> interface or extend the abstract class.
> The purpose of this task is to refactor these classes to remove confusion 
> caused by the similarity between the interface and the abstract class. This 
> task deprecates the AdaptiveFeatureGenerator interface in favor of the 
> abstract class FeatureGeneratorAdapter.
> Default methods will be added to the AdaptiveFeatureGenerator interface to 
> maintain backward compatibility. To support the default methods the version 
> of the Java compiler will be set to 1.8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OPENNLP-856) Refactor AdaptiveFeatureGenerator and FeatureGeneratorAdapter

2016-11-07 Thread Joern Kottmann (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15643576#comment-15643576
 ] 

Joern Kottmann commented on OPENNLP-856:


We need to keep AdaptiveFeatureGenerator around; otherwise the change will 
not be backward compatible. Java 8 added default methods, and with them we can 
provide empty default implementations in the AdaptiveFeatureGenerator 
interface. Our users can then migrate their code to implement this interface 
directly and stop extending FeatureGeneratorAdapter, which we would then 
deprecate with a note pointing to the interface.

It would be nice if you could send me the patch again. I will create a Java 8 
branch in git so we can start collecting all Java 8 improvements until we 
switch to it.
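
For illustration, a minimal sketch of the default-method approach described in 
this comment. The interface and class names are hypothetical stand-ins, and the 
method signatures are simplified assumptions rather than the actual OpenNLP 
API:

```java
import java.util.List;

// Hypothetical stand-in for AdaptiveFeatureGenerator; signatures are simplified assumptions.
interface SketchFeatureGenerator {

  // The only method an implementation is required to provide.
  void createFeatures(List<String> features, String[] tokens, int index);

  // Empty defaults: generators without adaptive state do not need to override these.
  default void updateAdaptiveData(String[] tokens, String[] outcomes) {
  }

  default void clearAdaptiveData() {
  }
}

// A stateless generator implements the interface directly, with no adapter base class.
class TokenSketchGenerator implements SketchFeatureGenerator {

  @Override
  public void createFeatures(List<String> features, String[] tokens, int index) {
    features.add("w=" + tokens[index]);
  }
}
```

Existing implementations that already override all three methods would keep 
compiling unchanged, which is the backward-compatibility point made above.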

> Refactor AdaptiveFeatureGenerator and FeatureGeneratorAdapter
> -
>
> Key: OPENNLP-856
> URL: https://issues.apache.org/jira/browse/OPENNLP-856
> Project: OpenNLP
>  Issue Type: Improvement
>  Components: Name Finder
>Affects Versions: 1.6.1
>Reporter: Jeff Zemerick
>Priority: Minor
>  Labels: feature, features, generator





[jira] [Comment Edited] (OPENNLP-856) Refactor AdaptiveFeatureGenerator and FeatureGeneratorAdapter

2016-11-07 Thread Joern Kottmann (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15643576#comment-15643576
 ] 

Joern Kottmann edited comment on OPENNLP-856 at 11/7/16 9:10 AM:
-

We need to keep AdaptiveFeatureGenerator around; otherwise the change will 
not be backward compatible. Java 8 added default methods, and with them we can 
provide empty default implementations in the AdaptiveFeatureGenerator 
interface. Our users can then migrate their code to implement this interface 
directly and stop extending FeatureGeneratorAdapter, which we would then 
deprecate with a note pointing to the interface.

It would be nice if you could send me the patch again. I will create a Java 8 
branch in git so we can start collecting all Java 8 improvements until we 
switch to it.

Here is a link about default methods:
https://docs.oracle.com/javase/tutorial/java/IandI/defaultmethods.html


> Refactor AdaptiveFeatureGenerator and FeatureGeneratorAdapter
> -
>
> Key: OPENNLP-856
> URL: https://issues.apache.org/jira/browse/OPENNLP-856
> Project: OpenNLP
>  Issue Type: Improvement
>  Components: Name Finder
>Affects Versions: 1.6.1
>Reporter: Jeff Zemerick
>Assignee: Joern Kottmann
>Priority: Minor
>  Labels: feature, features, generator, java1.8





[jira] [Updated] (OPENNLP-856) Refactor AdaptiveFeatureGenerator and FeatureGeneratorAdapter

2016-11-07 Thread Joern Kottmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/OPENNLP-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joern Kottmann updated OPENNLP-856:
---
Labels: feature features generator java1.8  (was: feature features 
generator)

> Refactor AdaptiveFeatureGenerator and FeatureGeneratorAdapter
> -
>
> Key: OPENNLP-856
> URL: https://issues.apache.org/jira/browse/OPENNLP-856
> Project: OpenNLP
>  Issue Type: Improvement
>  Components: Name Finder
>Affects Versions: 1.6.1
>Reporter: Jeff Zemerick
>Assignee: Joern Kottmann
>Priority: Minor
>  Labels: feature, features, generator, java1.8





[jira] [Created] (OPENNLP-880) Refactor the GIS trainer integration

2016-11-07 Thread Joern Kottmann (JIRA)
Joern Kottmann created OPENNLP-880:
--

 Summary: Refactor the GIS trainer integration
 Key: OPENNLP-880
 URL: https://issues.apache.org/jira/browse/OPENNLP-880
 Project: OpenNLP
  Issue Type: Improvement
Reporter: Joern Kottmann
Priority: Minor


The GIS code was never reshaped to fit properly into the new Training API. 
There are a couple of issues that should be fixed, e.g. parameters are not used.

TODO: Update this description and list the changes 





[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2016-11-07 Thread Joern Kottmann (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15643588#comment-15643588
 ] 

Joern Kottmann commented on OPENNLP-776:


Do you have any updates? For us it works great with Spark.

> Model Objects should be Serializable
> 
>
> Key: OPENNLP-776
> URL: https://issues.apache.org/jira/browse/OPENNLP-776
> Project: OpenNLP
>  Issue Type: Improvement
>Affects Versions: tools-1.5.3
>Reporter: Tristan Nixon
>Assignee: Joern Kottmann
>Priority: Minor
>  Labels: features, patch
> Fix For: 1.6.1
>
> Attachments: externalizable.patch, 
> serializable-basemodel-joern.patch, serializable-basemodel.patch, 
> serialization_proxy.patch
>
>
> Marking model objects (ParserModel, SentenceModel, etc.) as Serializable can 
> enable a number of features offered by other Java frameworks (my own use case 
> is described below). You've already got a good mechanism for 
> (de-)serialization, but it cannot be leveraged by other frameworks without 
> implementing the Serializable interface. I'm attaching a patch to BaseModel 
> that implements the methods in the java.io.Externalizable interface as 
> wrappers to the existing (de-)serialization methods. This simple change can 
> open up a number of useful opportunities for integrating OpenNLP with other 
> frameworks.
> My use case is that I am incorporating OpenNLP into a Spark application. This 
> requires that components of the system be distributed between the driver and 
> worker nodes within the cluster. In order to do this, Spark uses Java 
> serialization API to transmit objects between nodes. This is far more 
> efficient than instantiating models on each node independently.
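
For illustration, a minimal sketch of the Externalizable-wrapper idea from the 
description above. SketchModel, its serialize() and load() methods, and the 
roundTrip() helper are hypothetical stand-ins; this is not OpenNLP's BaseModel 
and not the attached patch:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.OutputStream;

// Hypothetical model class used only for this sketch.
class SketchModel implements Externalizable {

  private String languageCode;

  // Externalizable requires a public no-argument constructor.
  public SketchModel() {
  }

  SketchModel(String languageCode) {
    this.languageCode = languageCode;
  }

  String getLanguageCode() {
    return languageCode;
  }

  // Stand-ins for a model's existing custom (de-)serialization methods.
  void serialize(OutputStream out) throws IOException {
    DataOutputStream data = new DataOutputStream(out);
    data.writeUTF(languageCode);
    data.flush();
  }

  void load(InputStream in) throws IOException {
    languageCode = new DataInputStream(in).readUTF();
  }

  // Wrapper: write the custom format as a length-prefixed byte array.
  @Override
  public void writeExternal(ObjectOutput out) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    serialize(buffer);
    byte[] bytes = buffer.toByteArray();
    out.writeInt(bytes.length);
    out.write(bytes);
  }

  // Wrapper: read the byte array back and hand it to the existing loader.
  @Override
  public void readExternal(ObjectInput in) throws IOException {
    byte[] bytes = new byte[in.readInt()];
    in.readFully(bytes);
    load(new ByteArrayInputStream(bytes));
  }

  // Sends a model through standard Java serialization and back.
  static SketchModel roundTrip(SketchModel model) throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(model);
    }
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (SketchModel) in.readObject();
    }
  }
}
```

Frameworks that rely on Java serialization only see the standard 
writeObject/readObject calls shown in roundTrip(); the custom format stays 
untouched behind the two wrapper methods.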





[jira] [Commented] (OPENNLP-776) Model Objects should be Serializable

2016-11-07 Thread Tristan Nixon (JIRA)

[ 
https://issues.apache.org/jira/browse/OPENNLP-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15644259#comment-15644259
 ] 

Tristan Nixon commented on OPENNLP-776:
---

I've been swamped with other work, but I should be able to look at this 
tomorrow.

> Model Objects should be Serializable
> 
>
> Key: OPENNLP-776
> URL: https://issues.apache.org/jira/browse/OPENNLP-776
> Project: OpenNLP
>  Issue Type: Improvement
>Affects Versions: tools-1.5.3
>Reporter: Tristan Nixon
>Assignee: Joern Kottmann
>Priority: Minor
>  Labels: features, patch
> Fix For: 1.6.1
>
> Attachments: externalizable.patch, 
> serializable-basemodel-joern.patch, serializable-basemodel.patch, 
> serialization_proxy.patch
>





[jira] [Created] (OPENNLP-881) Improve error handling in Brat format parser

2016-11-07 Thread Joern Kottmann (JIRA)
Joern Kottmann created OPENNLP-881:
--

 Summary: Improve error handling in Brat format parser
 Key: OPENNLP-881
 URL: https://issues.apache.org/jira/browse/OPENNLP-881
 Project: OpenNLP
  Issue Type: Improvement
  Components: Formats
Reporter: Joern Kottmann
Assignee: Joern Kottmann
Priority: Minor


The Brat format parser usually processes many different files. If an error 
occurs, the exception should always include the file name so the user has a 
chance to track down the error easily (e.g. without modifying OpenNLP code).
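
For illustration, a minimal sketch of an error message that names the file. 
The class, the method, and the tab-separated line format are assumptions made 
for this example; this is not OpenNLP's Brat parser:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical parser used only for this sketch.
class SketchAnnotationParser {

  // Counts annotation lines; any failure names the file and the line number.
  static int countAnnotations(String fileName, String content) throws IOException {
    BufferedReader reader = new BufferedReader(new StringReader(content));
    int lineNumber = 0;
    int count = 0;
    String line;
    while ((line = reader.readLine()) != null) {
      lineNumber++;
      if (line.isEmpty()) {
        continue;
      }
      // Assumption: a valid line has at least an id and a type, separated by a tab.
      if (line.split("\t").length < 2) {
        throw new IOException("Invalid annotation in " + fileName
            + ", line " + lineNumber + ": " + line);
      }
      count++;
    }
    return count;
  }
}
```

With the file name in the message, a user processing a whole directory can go 
straight to the offending file.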





[jira] [Updated] (OPENNLP-881) Improve error reporting in Brat format parser

2016-11-07 Thread Joern Kottmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/OPENNLP-881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joern Kottmann updated OPENNLP-881:
---
Summary: Improve error reporting in Brat format parser  (was: Improve 
error handling in Brat format parser)

> Improve error reporting in Brat format parser
> --
>
> Key: OPENNLP-881
> URL: https://issues.apache.org/jira/browse/OPENNLP-881
> Project: OpenNLP
>  Issue Type: Improvement
>  Components: Formats
>Reporter: Joern Kottmann
>Assignee: Joern Kottmann
>Priority: Minor





[jira] [Reopened] (OPENNLP-879) Use PriorityQueue instead of Heap in BeamSearch

2016-11-07 Thread Joern Kottmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/OPENNLP-879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joern Kottmann reopened OPENNLP-879:


It should use the remove method instead of poll. The remove method has exactly 
the same behaviour as we had before: if the queue runs empty and an element is 
removed from it, an exception is thrown.
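
For illustration, a minimal sketch of the difference between the two 
java.util.PriorityQueue methods on an empty queue. The class and method names 
are hypothetical and not taken from BeamSearch:

```java
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Hypothetical helper class used only for this sketch.
class QueueRemovalSketch {

  // remove() on an empty queue throws NoSuchElementException.
  static boolean removeThrowsOnEmpty() {
    PriorityQueue<Integer> queue = new PriorityQueue<>();
    try {
      queue.remove();
      return false;
    } catch (NoSuchElementException e) {
      return true;
    }
  }

  // poll() on an empty queue returns null instead of throwing.
  static Integer pollOnEmpty() {
    return new PriorityQueue<Integer>().poll();
  }
}
```

On a non-empty queue both methods return and remove the head, so they differ 
only in how an empty queue is reported.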

> Use PriorityQueue instead of Heap in BeamSearch
> ---
>
> Key: OPENNLP-879
> URL: https://issues.apache.org/jira/browse/OPENNLP-879
> Project: OpenNLP
>  Issue Type: Improvement
>  Components: Machine Learning
>Reporter: Joern Kottmann
>Assignee: Joern Kottmann
>Priority: Trivial
> Fix For: 1.6.1
>
>
> It was pointed out in OPENNLP-830 that we can just use PriorityQueue in 
> BeamSearch instead of the custom Heap implementation. This class is slightly 
> faster, by around 2-3% with the Name Finder, with no speed increase with the 
> POSTagger.
> In the end this will allow us to remove the custom Heap implementations, and 
> the Java version will be maintained for us.





[jira] [Closed] (OPENNLP-879) Use PriorityQueue instead of Heap in BeamSearch

2016-11-07 Thread Joern Kottmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/OPENNLP-879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joern Kottmann closed OPENNLP-879.
--
Resolution: Fixed

> Use PriorityQueue instead of Heap in BeamSearch
> ---
>
> Key: OPENNLP-879
> URL: https://issues.apache.org/jira/browse/OPENNLP-879
> Project: OpenNLP
>  Issue Type: Improvement
>  Components: Machine Learning
>Reporter: Joern Kottmann
>Assignee: Joern Kottmann
>Priority: Trivial
> Fix For: 1.6.1
>


