[GitHub] incubator-joshua pull request #42: Fix various issues related to resources, ...

2016-08-30 Thread maxthomas
Github user maxthomas commented on a diff in the pull request:

https://github.com/apache/incubator-joshua/pull/42#discussion_r76825646
  
--- Diff: 
src/main/java/org/apache/joshua/decoder/ff/lm/buildin_lm/TrieLM.java ---
@@ -264,66 +215,65 @@ public static void main(String[] args) throws 
IOException {
 int n = Integer.valueOf(args[2]);
 LOG.info("N-gram order will be {}", n);
 
-Scanner scanner = new Scanner(new File(args[1]));
+try (Scanner scanner = new Scanner(new File(args[1]));) {
--- End diff --

I personally prefer the semicolon:

* it's necessary if other statements/resources are added to the autoclose 
block
* it's easier to determine the actual method call when there are many 
parens towards the end

but I don't mind either way. 

What would be nice is a commit that applies the format across the entire 
project. A ton of lines have trailing whitespace or tabs or whatever. I'n going 
to file an issue on this. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #42: Fix various issues related to resources, ...

2016-08-30 Thread KellenSunderland
Github user KellenSunderland commented on a diff in the pull request:

https://github.com/apache/incubator-joshua/pull/42#discussion_r76822903
  
--- Diff: 
src/main/java/org/apache/joshua/decoder/ff/lm/buildin_lm/TrieLM.java ---
@@ -264,66 +215,65 @@ public static void main(String[] args) throws 
IOException {
 int n = Integer.valueOf(args[2]);
 LOG.info("N-gram order will be {}", n);
 
-Scanner scanner = new Scanner(new File(args[1]));
+try (Scanner scanner = new Scanner(new File(args[1]));) {
--- End diff --

Nitpick: would it be ok to remove the semicolon from the autocloseable 
lines?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #42: Fix various issues related to resources, ...

2016-08-30 Thread KellenSunderland
Github user KellenSunderland commented on a diff in the pull request:

https://github.com/apache/incubator-joshua/pull/42#discussion_r76822076
  
--- Diff: 
src/main/java/org/apache/joshua/decoder/ff/lm/buildin_lm/TrieLM.java ---
@@ -188,62 +188,13 @@ public TrieLM(ArpaFile arpaFile) throws 
FileNotFoundException {
 
 
   @Override
-  protected double logProbabilityOfBackoffState_helper(
-  int[] ngram, int order, int qtyAdditionalBackoffWeight
-  ) {
+  protected double logProbabilityOfBackoffState_helper(int[] ngram, int 
order, int qtyAdditionalBackoffWeight) {
 throw new 
UnsupportedOperationException("probabilityOfBackoffState_helper undefined for 
TrieLM");
   }
 
   @Override
   protected float ngramLogProbability_helper(int[] ngram, int order) {
-
-//float logProb = (float) 
-JoshuaConfiguration.lm_ceiling_cost;//Float.NEGATIVE_INFINITY; // log(0.0f)
-float backoff = 0.0f; // log(1.0f)
-
-int i = ngram.length - 1;
-int word = ngram[i];
-i -= 1;
-
-int nodeID = ROOT_NODE_ID;
-
-while (true) {
-
-  {
-long key = Bits.encodeAsLong(nodeID, word);
-if (logProbs.containsKey(key)) {
-//  logProb = logProbs.get(key);
-  backoff = 0.0f; // log(0.0f)
-}
-  }
-
-  if (i < 0) {
-break;
-  }
-
-  {
-long key = Bits.encodeAsLong(nodeID, ngram[i]);
-
-if (children.containsKey(key)) {
-  nodeID = children.get(key);
-
-  backoff += backoffs.get(nodeID);
-
-  i -= 1;
-
-} else {
-  break;
-}
-  }
-
-}
-
-//double result = logProb + backoff;
-//if (result < -JoshuaConfiguration.lm_ceiling_cost) {
-//  result = -JoshuaConfiguration.lm_ceiling_cost;
-//}
-//
-//return result;
-return (Float) null;
+throw new UnsupportedOperationException();
--- End diff --

Good change, but I'd just like to note this is an example of 
https://en.wikipedia.org/wiki/Interface_segregation_principle that we should 
think about cleaning up one day.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #42: Fix various issues related to resources, ...

2016-08-18 Thread maxthomas
GitHub user maxthomas opened a pull request:

https://github.com/apache/incubator-joshua/pull/42

Fix various issues related to resources, warnings, and copy-pasted code.

I didn't make an issue for this, because it hits on a lot of topics that 
would be tough to summarize in one issue. But, let me know if you'd like me to 
do so. 

The gist of this commit is code cleanup. Resource leaks are closed, 
warnings are resolved, and the beginning of refactoring the 10K lines of 
copy/pasted code throughout MERT, MIRA, etc. was started. 

I plan on working on this further on the 7 branch. But, because many 
resources weren't being closed explicitly (I found the finalize override in 
LineReader __extremely__ questionable), I figured I'd submit this on master 
right away. Feel free to ignore/postpone/whatever

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maxthomas/incubator-joshua 
fix-warnings-resources

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-joshua/pull/42.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #42


commit 980f735853a9403823501a66ff857e0c5763ae52
Author: max thomas 
Date:   2016-08-18T19:04:52Z

Fix a number of issues:
- Reader now implements autocloseable
- Close various leaks from LineReader
- LineReader no longer implements custom finalize(). Resources should be
  explicitly closed when no longer needed. The compiler helps with this.
- Start refactoring copy/pasted code into a new type: 
ExistingUTF8EncodedTextFile.
  There is so much use of text files that this should really have
  its own type.
- Fix warnings about unused fields, unused methods
- Delete some old/legacy/unused classes




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---