[GitHub] incubator-joshua issue #24: Maven multi-module project layout proposal

2016-08-30 Thread mjpost
Github user mjpost commented on the issue:

https://github.com/apache/incubator-joshua/pull/24
  
No big deal, but if you update this pull request to be against the "7" 
branch instead of master, it will be marked as resolved.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #49: 7 with master

2016-08-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-joshua/pull/49


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua issue #49: 7 with master

2016-08-30 Thread mjpost
Github user mjpost commented on the issue:

https://github.com/apache/incubator-joshua/pull/49
  
Pushed!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...

2016-08-30 Thread maxthomas
Github user maxthomas commented on the issue:

https://github.com/apache/incubator-joshua/pull/42
  
see #49 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...

2016-08-30 Thread Matt Post
Rebasing changes the history, so I think you can't do that with repos that have 
been pushed, right? In which case merge...

matt (from my phone)

> On Aug 30, 2016, at 3:41 PM, maxthomas  wrote:
> 
> Github user maxthomas commented on the issue:
> 
>https://github.com/apache/incubator-joshua/pull/42
> 
>do you want me to rebase off master, or merge? 
> 
> 
> ---
> If your project is set up for it, you can reply to this email and have your
> reply appear on GitHub as well. If your project does not have this feature
> enabled and wishes so, or if the feature is enabled but not working, please
> contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
> with INFRA.
> ---



[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...

2016-08-30 Thread maxthomas
Github user maxthomas commented on the issue:

https://github.com/apache/incubator-joshua/pull/42
  
do you want me to rebase off master, or merge? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...

2016-08-30 Thread mjpost
Github user mjpost commented on the issue:

https://github.com/apache/incubator-joshua/pull/42
  
Yes, please merge master into 7!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...

2016-08-30 Thread maxthomas
Github user maxthomas commented on the issue:

https://github.com/apache/incubator-joshua/pull/42
  
yep, this is ready if you are ready!

i'm happy to merge this into 7. would you like me to do that after this is 
in master? just let me know. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (JOSHUA-308) Apply consistent formatting to project and remove trailing whitespace

2016-08-30 Thread Matt Post (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15449700#comment-15449700
 ] 

Matt Post commented on JOSHUA-308:
--

I'd prefer not to do this for the 6.1 branch, but once 6.1 is released, we 
could apply this to the 7 branch after merging it back into master.

> Apply consistent formatting to project and remove trailing whitespace
> -
>
> Key: JOSHUA-308
> URL: https://issues.apache.org/jira/browse/JOSHUA-308
> Project: Joshua
>  Issue Type: Improvement
>Reporter: Max Thomas
>Priority: Minor
>
> I suggest that the checked in code format be applied to all files, with the 
> following addition: remove trailing whitespace. Trailing whitespace makes it 
> unnecessarily more difficult to work with the code base.
> I thought that this was part of the format file, but I think it must be a 
> setting I have enabled outside of this in Eclipse.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...

2016-08-30 Thread mjpost
Github user mjpost commented on the issue:

https://github.com/apache/incubator-joshua/pull/42
  
So am I correct that this is ready to merge into master?

And @maxthomas, you think you can merge this into 7?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...

2016-08-30 Thread maxthomas
Github user maxthomas commented on the issue:

https://github.com/apache/incubator-joshua/pull/42
  
sure, that conflict comment should be fixed. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (JOSHUA-308) Apply consistent formatting to project and remove trailing whitespace

2016-08-30 Thread Max Thomas (JIRA)
Max Thomas created JOSHUA-308:
-

 Summary: Apply consistent formatting to project and remove 
trailing whitespace
 Key: JOSHUA-308
 URL: https://issues.apache.org/jira/browse/JOSHUA-308
 Project: Joshua
  Issue Type: Improvement
Reporter: Max Thomas
Priority: Minor


I suggest that the checked in code format be applied to all files, with the 
following addition: remove trailing whitespace. Trailing whitespace makes it 
unnecessarily more difficult to work with the code base.

I thought that this was part of the format file, but I think it must be a 
setting I have enabled outside of this in Eclipse.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...

2016-08-30 Thread KellenSunderland
Github user KellenSunderland commented on the issue:

https://github.com/apache/incubator-joshua/pull/42
  
Wow Max, great update.  Many thanks!  I total agree on the questionable 
finalizes.  The code cleanup in general looks great, lots of dead code removed 
there.

As mentioned above there are some conflicts in one commented file, if you 
could fix those it'd be great.  Otherwise looks good to me 👍 .



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #42: Fix various issues related to resources, ...

2016-08-30 Thread maxthomas
Github user maxthomas commented on a diff in the pull request:

https://github.com/apache/incubator-joshua/pull/42#discussion_r76825646
  
--- Diff: 
src/main/java/org/apache/joshua/decoder/ff/lm/buildin_lm/TrieLM.java ---
@@ -264,66 +215,65 @@ public static void main(String[] args) throws 
IOException {
 int n = Integer.valueOf(args[2]);
 LOG.info("N-gram order will be {}", n);
 
-Scanner scanner = new Scanner(new File(args[1]));
+try (Scanner scanner = new Scanner(new File(args[1]));) {
--- End diff --

I personally prefer the semicolon:

* it's necessary if other statements/resources are added to the autoclose 
block
* it's easier to determine the actual method call when there are many 
parens towards the end

but I don't mind either way. 

What would be nice is a commit that applies the format across the entire 
project. A ton of lines have trailing whitespace or tabs or whatever. I'n going 
to file an issue on this. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #42: Fix various issues related to resources, ...

2016-08-30 Thread KellenSunderland
Github user KellenSunderland commented on a diff in the pull request:

https://github.com/apache/incubator-joshua/pull/42#discussion_r76822903
  
--- Diff: 
src/main/java/org/apache/joshua/decoder/ff/lm/buildin_lm/TrieLM.java ---
@@ -264,66 +215,65 @@ public static void main(String[] args) throws 
IOException {
 int n = Integer.valueOf(args[2]);
 LOG.info("N-gram order will be {}", n);
 
-Scanner scanner = new Scanner(new File(args[1]));
+try (Scanner scanner = new Scanner(new File(args[1]));) {
--- End diff --

Nitpick: would it be ok to remove the semicolon from the autocloseable 
lines?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #42: Fix various issues related to resources, ...

2016-08-30 Thread KellenSunderland
Github user KellenSunderland commented on a diff in the pull request:

https://github.com/apache/incubator-joshua/pull/42#discussion_r76822076
  
--- Diff: 
src/main/java/org/apache/joshua/decoder/ff/lm/buildin_lm/TrieLM.java ---
@@ -188,62 +188,13 @@ public TrieLM(ArpaFile arpaFile) throws 
FileNotFoundException {
 
 
   @Override
-  protected double logProbabilityOfBackoffState_helper(
-  int[] ngram, int order, int qtyAdditionalBackoffWeight
-  ) {
+  protected double logProbabilityOfBackoffState_helper(int[] ngram, int 
order, int qtyAdditionalBackoffWeight) {
 throw new 
UnsupportedOperationException("probabilityOfBackoffState_helper undefined for 
TrieLM");
   }
 
   @Override
   protected float ngramLogProbability_helper(int[] ngram, int order) {
-
-//float logProb = (float) 
-JoshuaConfiguration.lm_ceiling_cost;//Float.NEGATIVE_INFINITY; // log(0.0f)
-float backoff = 0.0f; // log(1.0f)
-
-int i = ngram.length - 1;
-int word = ngram[i];
-i -= 1;
-
-int nodeID = ROOT_NODE_ID;
-
-while (true) {
-
-  {
-long key = Bits.encodeAsLong(nodeID, word);
-if (logProbs.containsKey(key)) {
-//  logProb = logProbs.get(key);
-  backoff = 0.0f; // log(0.0f)
-}
-  }
-
-  if (i < 0) {
-break;
-  }
-
-  {
-long key = Bits.encodeAsLong(nodeID, ngram[i]);
-
-if (children.containsKey(key)) {
-  nodeID = children.get(key);
-
-  backoff += backoffs.get(nodeID);
-
-  i -= 1;
-
-} else {
-  break;
-}
-  }
-
-}
-
-//double result = logProb + backoff;
-//if (result < -JoshuaConfiguration.lm_ceiling_cost) {
-//  result = -JoshuaConfiguration.lm_ceiling_cost;
-//}
-//
-//return result;
-return (Float) null;
+throw new UnsupportedOperationException();
--- End diff --

Good change, but I'd just like to note this is an example of 
https://en.wikipedia.org/wiki/Interface_segregation_principle that we should 
think about cleaning up one day.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...

2016-08-30 Thread maxthomas
Github user maxthomas commented on the issue:

https://github.com/apache/incubator-joshua/pull/42
  
just rebased off latest master


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua issue #48: Fixed crashing when using Trie based KenLM model...

2016-08-30 Thread KellenSunderland
Github user KellenSunderland commented on the issue:

https://github.com/apache/incubator-joshua/pull/48
  
Hey Matt, exactly right for the summary.

I am actually not sure how often collisions happen, it may be worth 
measuring.  Collisions causing crashing in this scenario happen once in about 
100k-250k translation requests.  This is a less likely scenario than just a 
state collision though, as you have to get a collision that gives you an 
out-of-range word id as a unigram.  

We've tested turning off state sharing between KenLMs, and indeed that also 
solves the crashing issue.  I don't think I know the details of KenLM and 
Joshua well enough to properly judge if this would be a better solution than 
the one I provided.  If someone with more knowledge then I provides a new PR 
I'll happily +1 it.  

One downside to just turning off state sharing is that we will still get 
collisions, we just won't get crashing.  I think if have collisions (even with 
a single model) we usually get an incorrect result (not a crash). 

There are some other implementation approaches that could also be 
considered to fix the issue too (I'd like to hear what Kenneth thinks would be 
the best approach).  One idea would be to define a standard hash function for 
the State struct, and then we could use the State itself as a key for a normal 
unordered_map.  Then we wouldn't need this multimap.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua issue #47: Fix Travis-CI by adding a step to install ant on...

2016-08-30 Thread mjpost
Github user mjpost commented on the issue:

https://github.com/apache/incubator-joshua/pull/47
  
A better option here would be to upgrade the [Berkeley 
Aligner](https://github.com/joshua-decoder/berkeleyaligner/tree/master/src/edu/berkeley/nlp)
 (Joshua's version) to Maven, and distribute it as a maven resource. This 
reflects the fact that the Berkeley Aligner code — already almost a decade 
old — is unlikely to change much.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua issue #48: Fixed crashing when using Trie based KenLM model...

2016-08-30 Thread mjpost
Github user mjpost commented on the issue:

https://github.com/apache/incubator-joshua/pull/48
  
Holy smokes, thanks for tracking this down. So if I understand correctly, 
this only occurs under the following circumstances:

- decoding with multiple KenLM language models
- built with different vocabularies (the usual case)
- a hash collision occurs and returns a state containing an ID that is 
invalid in the calling KenLM

Do you have any idea how often hash collisions actually occur?

I wonder if turning off sharing of KenLM states across LMs would also have 
worked, with little to no effect on performance.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...

2016-08-30 Thread mjpost
Github user mjpost commented on the issue:

https://github.com/apache/incubator-joshua/pull/42
  
So, @maxthomas — what do you think about either of the following:

- You (a) remerge master into this, and then (b) merge master into 7
- Skip (a) and just rework this against the 7 branch?

I think this is good to have in the codebase, but I'm not sure that it's 
critical to have for the 6.1 release. But if it's not too much work to do both, 
I'm not against trying.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #47: Fix Travis-CI by adding a step to install...

2016-08-30 Thread KellenSunderland
GitHub user KellenSunderland opened a pull request:

https://github.com/apache/incubator-joshua/pull/47

Fix Travis-CI by adding a step to install ant on OSX

It looks like the download-deps.sh now requires ant to build BerkleyLM?  If 
so feel free to merge this, it should make sure the OSX image has ant installed 
before building.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/KellenSunderland/incubator-joshua master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-joshua/pull/47.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #47


commit 0b447595ae51ab3d2956d60748962a092e192286
Author: Kellen Sunderland 
Date:   2016-08-30T08:38:35Z

Fix Travis-CI by adding a step to install ant on OSX




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---