[GitHub] incubator-joshua issue #24: Maven multi-module project layout proposal
Github user mjpost commented on the issue: https://github.com/apache/incubator-joshua/pull/24 No big deal, but if you update this pull request to be against the "7" branch instead of master, it will be marked as resolved. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-joshua pull request #49: 7 with master
Github user asfgit closed the pull request at: https://github.com/apache/incubator-joshua/pull/49 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-joshua issue #49: 7 with master
Github user mjpost commented on the issue: https://github.com/apache/incubator-joshua/pull/49 Pushed! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...
Github user maxthomas commented on the issue: https://github.com/apache/incubator-joshua/pull/42 see #49 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
Re: [GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...
Rebasing changes the history, so I think you can't do that with repos that have been pushed, right? In which case merge... matt (from my phone) > On Aug 30, 2016, at 3:41 PM, maxthomaswrote: > > Github user maxthomas commented on the issue: > >https://github.com/apache/incubator-joshua/pull/42 > >do you want me to rebase off master, or merge? > > > --- > If your project is set up for it, you can reply to this email and have your > reply appear on GitHub as well. If your project does not have this feature > enabled and wishes so, or if the feature is enabled but not working, please > contact infrastructure at infrastruct...@apache.org or file a JIRA ticket > with INFRA. > ---
[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...
Github user maxthomas commented on the issue: https://github.com/apache/incubator-joshua/pull/42 do you want me to rebase off master, or merge? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...
Github user mjpost commented on the issue: https://github.com/apache/incubator-joshua/pull/42 Yes, please merge master into 7! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...
Github user maxthomas commented on the issue: https://github.com/apache/incubator-joshua/pull/42 yep, this is ready if you are ready! i'm happy to merge this into 7. would you like me to do that after this is in master? just let me know. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (JOSHUA-308) Apply consistent formatting to project and remove trailing whitespace
[ https://issues.apache.org/jira/browse/JOSHUA-308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15449700#comment-15449700 ] Matt Post commented on JOSHUA-308: -- I'd prefer not to do this for the 6.1 branch, but once 6.1 is released, we could apply this to the 7 branch after merging it back into master. > Apply consistent formatting to project and remove trailing whitespace > - > > Key: JOSHUA-308 > URL: https://issues.apache.org/jira/browse/JOSHUA-308 > Project: Joshua > Issue Type: Improvement >Reporter: Max Thomas >Priority: Minor > > I suggest that the checked in code format be applied to all files, with the > following addition: remove trailing whitespace. Trailing whitespace makes it > unnecessarily more difficult to work with the code base. > I thought that this was part of the format file, but I think it must be a > setting I have enabled outside of this in Eclipse. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...
Github user mjpost commented on the issue: https://github.com/apache/incubator-joshua/pull/42 So am I correct that this is ready to merge into master? And @maxthomas, you think you can merge this into 7? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...
Github user maxthomas commented on the issue: https://github.com/apache/incubator-joshua/pull/42 sure, that conflict comment should be fixed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Created] (JOSHUA-308) Apply consistent formatting to project and remove trailing whitespace
Max Thomas created JOSHUA-308: - Summary: Apply consistent formatting to project and remove trailing whitespace Key: JOSHUA-308 URL: https://issues.apache.org/jira/browse/JOSHUA-308 Project: Joshua Issue Type: Improvement Reporter: Max Thomas Priority: Minor I suggest that the checked in code format be applied to all files, with the following addition: remove trailing whitespace. Trailing whitespace makes it unnecessarily more difficult to work with the code base. I thought that this was part of the format file, but I think it must be a setting I have enabled outside of this in Eclipse. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...
Github user KellenSunderland commented on the issue: https://github.com/apache/incubator-joshua/pull/42 Wow Max, great update. Many thanks! I total agree on the questionable finalizes. The code cleanup in general looks great, lots of dead code removed there. As mentioned above there are some conflicts in one commented file, if you could fix those it'd be great. Otherwise looks good to me ð . --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-joshua pull request #42: Fix various issues related to resources, ...
Github user maxthomas commented on a diff in the pull request: https://github.com/apache/incubator-joshua/pull/42#discussion_r76825646 --- Diff: src/main/java/org/apache/joshua/decoder/ff/lm/buildin_lm/TrieLM.java --- @@ -264,66 +215,65 @@ public static void main(String[] args) throws IOException { int n = Integer.valueOf(args[2]); LOG.info("N-gram order will be {}", n); -Scanner scanner = new Scanner(new File(args[1])); +try (Scanner scanner = new Scanner(new File(args[1]));) { --- End diff -- I personally prefer the semicolon: * it's necessary if other statements/resources are added to the autoclose block * it's easier to determine the actual method call when there are many parens towards the end but I don't mind either way. What would be nice is a commit that applies the format across the entire project. A ton of lines have trailing whitespace or tabs or whatever. I'n going to file an issue on this. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-joshua pull request #42: Fix various issues related to resources, ...
Github user KellenSunderland commented on a diff in the pull request: https://github.com/apache/incubator-joshua/pull/42#discussion_r76822903 --- Diff: src/main/java/org/apache/joshua/decoder/ff/lm/buildin_lm/TrieLM.java --- @@ -264,66 +215,65 @@ public static void main(String[] args) throws IOException { int n = Integer.valueOf(args[2]); LOG.info("N-gram order will be {}", n); -Scanner scanner = new Scanner(new File(args[1])); +try (Scanner scanner = new Scanner(new File(args[1]));) { --- End diff -- Nitpick: would it be ok to remove the semicolon from the autocloseable lines? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-joshua pull request #42: Fix various issues related to resources, ...
Github user KellenSunderland commented on a diff in the pull request: https://github.com/apache/incubator-joshua/pull/42#discussion_r76822076 --- Diff: src/main/java/org/apache/joshua/decoder/ff/lm/buildin_lm/TrieLM.java --- @@ -188,62 +188,13 @@ public TrieLM(ArpaFile arpaFile) throws FileNotFoundException { @Override - protected double logProbabilityOfBackoffState_helper( - int[] ngram, int order, int qtyAdditionalBackoffWeight - ) { + protected double logProbabilityOfBackoffState_helper(int[] ngram, int order, int qtyAdditionalBackoffWeight) { throw new UnsupportedOperationException("probabilityOfBackoffState_helper undefined for TrieLM"); } @Override protected float ngramLogProbability_helper(int[] ngram, int order) { - -//float logProb = (float) -JoshuaConfiguration.lm_ceiling_cost;//Float.NEGATIVE_INFINITY; // log(0.0f) -float backoff = 0.0f; // log(1.0f) - -int i = ngram.length - 1; -int word = ngram[i]; -i -= 1; - -int nodeID = ROOT_NODE_ID; - -while (true) { - - { -long key = Bits.encodeAsLong(nodeID, word); -if (logProbs.containsKey(key)) { -// logProb = logProbs.get(key); - backoff = 0.0f; // log(0.0f) -} - } - - if (i < 0) { -break; - } - - { -long key = Bits.encodeAsLong(nodeID, ngram[i]); - -if (children.containsKey(key)) { - nodeID = children.get(key); - - backoff += backoffs.get(nodeID); - - i -= 1; - -} else { - break; -} - } - -} - -//double result = logProb + backoff; -//if (result < -JoshuaConfiguration.lm_ceiling_cost) { -// result = -JoshuaConfiguration.lm_ceiling_cost; -//} -// -//return result; -return (Float) null; +throw new UnsupportedOperationException(); --- End diff -- Good change, but I'd just like to note this is an example of https://en.wikipedia.org/wiki/Interface_segregation_principle that we should think about cleaning up one day. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...
Github user maxthomas commented on the issue: https://github.com/apache/incubator-joshua/pull/42 just rebased off latest master --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-joshua issue #48: Fixed crashing when using Trie based KenLM model...
Github user KellenSunderland commented on the issue: https://github.com/apache/incubator-joshua/pull/48 Hey Matt, exactly right for the summary. I am actually not sure how often collisions happen, it may be worth measuring. Collisions causing crashing in this scenario happen once in about 100k-250k translation requests. This is a less likely scenario than just a state collision though, as you have to get a collision that gives you an out-of-range word id as a unigram. We've tested turning off state sharing between KenLMs, and indeed that also solves the crashing issue. I don't think I know the details of KenLM and Joshua well enough to properly judge if this would be a better solution than the one I provided. If someone with more knowledge then I provides a new PR I'll happily +1 it. One downside to just turning off state sharing is that we will still get collisions, we just won't get crashing. I think if have collisions (even with a single model) we usually get an incorrect result (not a crash). There are some other implementation approaches that could also be considered to fix the issue too (I'd like to hear what Kenneth thinks would be the best approach). One idea would be to define a standard hash function for the State struct, and then we could use the State itself as a key for a normal unordered_map. Then we wouldn't need this multimap. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-joshua issue #47: Fix Travis-CI by adding a step to install ant on...
Github user mjpost commented on the issue: https://github.com/apache/incubator-joshua/pull/47 A better option here would be to upgrade the [Berkeley Aligner](https://github.com/joshua-decoder/berkeleyaligner/tree/master/src/edu/berkeley/nlp) (Joshua's version) to Maven, and distribute it as a maven resource. This reflects the fact that the Berkeley Aligner code â already almost a decade old â is unlikely to change much. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-joshua issue #48: Fixed crashing when using Trie based KenLM model...
Github user mjpost commented on the issue: https://github.com/apache/incubator-joshua/pull/48 Holy smokes, thanks for tracking this down. So if I understand correctly, this only occurs under the following circumstances: - decoding with multiple KenLM language models - built with different vocabularies (the usual case) - a hash collision occurs and returns a state containing an ID that is invalid in the calling KenLM Do you have any idea how often hash collisions actually occur? I wonder if turning off sharing of KenLM states across LMs would also have worked, with little to no effect on performance. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-joshua issue #42: Fix various issues related to resources, warning...
Github user mjpost commented on the issue: https://github.com/apache/incubator-joshua/pull/42 So, @maxthomas â what do you think about either of the following: - You (a) remerge master into this, and then (b) merge master into 7 - Skip (a) and just rework this against the 7 branch? I think this is good to have in the codebase, but I'm not sure that it's critical to have for the 6.1 release. But if it's not too much work to do both, I'm not against trying. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-joshua pull request #47: Fix Travis-CI by adding a step to install...
GitHub user KellenSunderland opened a pull request: https://github.com/apache/incubator-joshua/pull/47 Fix Travis-CI by adding a step to install ant on OSX It looks like the download-deps.sh now requires ant to build BerkleyLM? If so feel free to merge this, it should make sure the OSX image has ant installed before building. You can merge this pull request into a Git repository by running: $ git pull https://github.com/KellenSunderland/incubator-joshua master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-joshua/pull/47.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #47 commit 0b447595ae51ab3d2956d60748962a092e192286 Author: Kellen SunderlandDate: 2016-08-30T08:38:35Z Fix Travis-CI by adding a step to install ant on OSX --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---