[jira] [Resolved] (JOSHUA-253) Enable execution of Unit tests

2016-06-02 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved JOSHUA-253.
-
Resolution: Fixed

Yes, we fixed this as part of the Maven work.

> Enable execution of Unit tests
> --
>
> Key: JOSHUA-253
> URL: https://issues.apache.org/jira/browse/JOSHUA-253
> Project: Joshua
>  Issue Type: Test
>Affects Versions: 6.0
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
> Attachments: JOSHUA-253.patch
>
>
> As per our [discussion on this 
> topic|http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg00270.html],
>  [~teofili] correctly identified that unit-level tests are not executed.
> We need to fix this so that they are.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (JOSHUA-253) Enable execution of Unit tests

2016-06-02 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reassigned JOSHUA-253:
---

Assignee: Lewis John McGibbney

> Enable execution of Unit tests
> --
>
> Key: JOSHUA-253
> URL: https://issues.apache.org/jira/browse/JOSHUA-253
> Project: Joshua
>  Issue Type: Test
>Affects Versions: 6.0
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
> Attachments: JOSHUA-253.patch
>
>
> As per our [discussion on this 
> topic|http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg00270.html],
>  [~teofili] correctly identified that unit-level tests are not executed.
> We need to fix this so that they are.





[jira] [Created] (JOSHUA-276) Trivial fixes to 1.8 Javadoc

2016-05-31 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-276:
---

 Summary: Trivial fixes to 1.8 Javadoc
 Key: JOSHUA-276
 URL: https://issues.apache.org/jira/browse/JOSHUA-276
 Project: Joshua
  Issue Type: Bug
  Components: core
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Priority: Trivial
 Fix For: 6.1


There are some trivial Javadoc issues to be fixed in what is now the master branch:
{code}
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 37.358s
[INFO] Finished at: Wed Jun 01 03:28:40 UTC 2016

[INFO] Final Memory: 40M/861M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-javadoc-plugin:2.8:aggregate (default-cli) on 
project joshua: An error has occurred in JavaDocs report generation:
[ERROR] Exit code: 1 - 
/home/jenkins/jenkins-slave/workspace/joshua_master/src/main/java/org/apache/joshua/decoder/StructuredTranslationFactory.java:47:
 warning: no description for @param
[ERROR] * @param sourceSentence
[ERROR] ^
[ERROR] 
/home/jenkins/jenkins-slave/workspace/joshua_master/src/main/java/org/apache/joshua/decoder/StructuredTranslationFactory.java:48:
 warning: no description for @param
[ERROR] * @param hypergraph
[ERROR] ^
[ERROR] 
/home/jenkins/jenkins-slave/workspace/joshua_master/src/main/java/org/apache/joshua/decoder/StructuredTranslationFactory.java:49:
 warning: no description for @param
[ERROR] * @param featureFunctions
[ERROR] ^
[ERROR] 
/home/jenkins/jenkins-slave/workspace/joshua_master/src/main/java/org/apache/joshua/decoder/ff/FeatureVector.java:80:
 error: reference not found
[ERROR] * features) and in {@link 
org.apache.joshua.decoder.ff.tm.BilingualRule#estimateRuleCost(java.util.List)}
[ERROR] ^
[ERROR] 
[ERROR] Command line was: 
/home/jenkins/jenkins-slave/tools/hudson.model.JDK/latest1.8/jre/../bin/javadoc 
@options @packages
[ERROR] 
[ERROR] Refer to the generated Javadoc files in 
'/home/jenkins/jenkins-slave/workspace/joshua_master/target/site/apidocs' dir.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
Build step 'Invoke top-level Maven targets' marked build as failure
Publishing Javadoc

{code}
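
The log above reports two kinds of problems: {{@param}} tags with no description (warnings) and a {{@link}} whose target cannot be resolved (an error). As a minimal sketch of Javadoc that avoids both, assuming the JDK 1.8 javadoc tool with its default doclint checks, something like the following should pass. The class {{JavadocSketch}} and its method are hypothetical and are not part of Joshua.

```java
import java.util.List;

/**
 * Hypothetical class used only to illustrate Javadoc that doclint accepts.
 */
class JavadocSketch {

  /**
   * Adds the sentence length to the sum of the feature values.
   *
   * <p>Every {@code @param} tag below carries a description, and the
   * {@link java.util.List} reference resolves to a real type.
   *
   * @param sourceSentence the input sentence, used here only for its length
   * @param featureValues the feature values to add up
   * @return the sentence length plus the sum of the feature values
   */
  static double score(String sourceSentence, List<Double> featureValues) {
    double total = sourceSentence.length();
    for (double value : featureValues) {
      total += value;
    }
    return total;
  }
}
```

The same pattern applies to the files named in the log: give each {{@param}} a short description and point each {{@link}} at a type or member that actually exists.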





[jira] [Commented] (JOSHUA-252) Make it possible to use Maven to build Joshua

2016-05-31 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15309166#comment-15309166
 ] 

Lewis John McGibbney commented on JOSHUA-252:
-

ACK done
https://builds.apache.org/view/H-L/view/Joshua/job/joshua_master/
There is a transient build slave error which I'll try and sort out.
[~post] NICE WORK :) 

> Make it possible to use Maven to build Joshua
> -
>
> Key: JOSHUA-252
> URL: https://issues.apache.org/jira/browse/JOSHUA-252
> Project: Joshua
>  Issue Type: Improvement
>  Components: build
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 6.1
>
>
> As per the discussion on the dev@ list, Ant is for now the official build tool
> for Joshua; however, we would like to switch to Maven if / when someone is
> able to do so.
> Assigning to me for now, as I may be able to look into this.





[jira] [Updated] (JOSHUA-275) Revamp the Configuration System

2016-06-20 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-275:

Fix Version/s: 6,2

> Revamp the Configuration System
> ---
>
> Key: JOSHUA-275
> URL: https://issues.apache.org/jira/browse/JOSHUA-275
> Project: Joshua
>  Issue Type: Improvement
>Affects Versions: 6.1, 6.2, 7
>Reporter: Kellen Sunderland
> Fix For: 6,2
>
>
> I'd like to propose that we centralize Joshua's configuration system on
> typesafe/config (https://github.com/typesafehub/config).  Its config format
> looks like JSON but allows comments, so it's easy to read.  Because it's
> JSON-like, it supports hierarchies of configurations, lists of
> configurations, etc. quite easily.  It has some nice features, such as
> parsing time values automatically.  The main advantage, though, is that we
> get a standard config system that doesn't have to be parsed manually.
> Here's a quick example of how we can use it:
> {code:java}
> @Inject
> public PackedGrammar(@TypesafeConfig("PackedGrammar.grammar_dir") String grammar_dir,
>                      @TypesafeConfig("PackedGrammar.span_limit") int span_limit,
>                      String owner,
>                      String type) throws FileNotFoundException, IOException ...
> {code}
> and then a config similar to
> \# Joshua configuration file
> {code:javascript}
> config = {
>   default-non-terminal = X
>   goal-symbol = GOAL
>   ...
>
>   PackedGrammar: {
>     type: thrax,
>     grammar_dir: /local/grammars/...
>     span_limit: 50
>   }
>   ...
> }
> {code}
> Version: TBD, but it's a breaking change so we may consider putting it in 
> Joshua 7.
> Totally open to other config / injection systems if others want to suggest 
> any of their favorites.





[jira] [Updated] (JOSHUA-268) Phrase-based model error (NullPointerException)

2016-06-20 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-268:

Fix Version/s: 6.2

> Phrase-based model error (NullPointerException)
> ---
>
> Key: JOSHUA-268
> URL: https://issues.apache.org/jira/browse/JOSHUA-268
> Project: Joshua
>  Issue Type: Bug
>  Components: decoders
>Affects Versions: 6.0.5
> Environment: fedora 23
>Reporter: Kyle Richardson
>Priority: Minor
> Fix For: 6.2
>
>
> I'm trying to run the phrase.sh example script (the only modification I made 
> was to take out the --optimizer-runs option, because the system says that 
> this is an "Unknown option"). 
> The error comes at the tuning stage (specifically, it fails at some point in 
> the tuning then complains that it cannot find the "joshua.config.final" 
> file). 
> Looking into the log file (tune/joshua.log), it seems to translate and tune a 
> number of sentences, then it raises the following NullPointerException: 
> Memory used after sentence 7 is 42.5 MB
> Translation 7: -30.617 good how is fine
> Input 2: Collecting options took 0.000 seconds
> Input 8: Collecting options took 0.000 seconds
> Input 2: FATAL UNCAUGHT EXCEPTION: null
> java.lang.NullPointerException
> at joshua.decoder.phrase.Candidate.score(Candidate.java:214)
> at joshua.decoder.phrase.Candidate.compareTo(Candidate.java:136)
> at joshua.decoder.phrase.Candidate.compareTo(Candidate.java:19)
> at java.util.HashMap.compareComparables(HashMap.java:371)
> at java.util.HashMap$TreeNode.treeify(HashMap.java:1920)
> at java.util.HashMap.treeifyBin(HashMap.java:771)
> at java.util.HashMap.putVal(HashMap.java:643)
> at java.util.HashMap.put(HashMap.java:611)
> at java.util.HashSet.add(HashSet.java:219)
> at joshua.decoder.phrase.Stack.addCandidate(Stack.java:125)
> at joshua.decoder.phrase.Stacks.search(Stacks.java:166)
> at joshua.decoder.DecoderThread.translate(DecoderThread.java:113)
> at joshua.decoder.Decoder$DecoderThreadRunner.run(Decoder.java:218)
> There's nothing informative in the tune/mert.log, it just says that it exited 
> prematurely. The other processes seem to work as expected (although in the 
> giza.log, there are a number of "Sentence mismatch error! Line " warnings). 
> I'm running this on Fedora 23  with Moses.  I had no problems training the 
> hiero model.
> note---
> There appears to be an open ticket for more or less the same problem
> (JOSHUA-267). The difference, however, is that in that ticket the tuner
> appears to fail on the first input, whereas here it already
> decodes/tunes several inputs before failing (see above).





[jira] [Updated] (JOSHUA-275) Revamp the Configuration System

2016-06-20 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-275:

Fix Version/s: (was: 6,2)
   6.2

> Revamp the Configuration System
> ---
>
> Key: JOSHUA-275
> URL: https://issues.apache.org/jira/browse/JOSHUA-275
> Project: Joshua
>  Issue Type: Improvement
>Affects Versions: 6.1, 6.2, 7
>Reporter: Kellen Sunderland
> Fix For: 6.2
>
>
> I'd like to propose that we centralize Joshua's configuration system on
> typesafe/config (https://github.com/typesafehub/config).  Its config format
> looks like JSON but allows comments, so it's easy to read.  Because it's
> JSON-like, it supports hierarchies of configurations, lists of
> configurations, etc. quite easily.  It has some nice features, such as
> parsing time values automatically.  The main advantage, though, is that we
> get a standard config system that doesn't have to be parsed manually.
> Here's a quick example of how we can use it:
> {code:java}
> @Inject
> public PackedGrammar(@TypesafeConfig("PackedGrammar.grammar_dir") String grammar_dir,
>                      @TypesafeConfig("PackedGrammar.span_limit") int span_limit,
>                      String owner,
>                      String type) throws FileNotFoundException, IOException ...
> {code}
> and then a config similar to
> \# Joshua configuration file
> {code:javascript}
> config = {
>   default-non-terminal = X
>   goal-symbol = GOAL
>   ...
>
>   PackedGrammar: {
>     type: thrax,
>     grammar_dir: /local/grammars/...
>     span_limit: 50
>   }
>   ...
> }
> {code}
> Version: TBD, but it's a breaking change so we may consider putting it in 
> Joshua 7.
> Totally open to other config / injection systems if others want to suggest 
> any of their favorites.





[jira] [Updated] (JOSHUA-265) Refactor key interfaces and core code for a future release.

2016-06-20 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-265:

Fix Version/s: 6.2

> Refactor key interfaces and core code for a future release. 
> 
>
> Key: JOSHUA-265
> URL: https://issues.apache.org/jira/browse/JOSHUA-265
> Project: Joshua
>  Issue Type: Improvement
>Reporter: Kellen Sunderland
>Priority: Minor
> Fix For: 6.2
>
>
> We've discussed making some modifications to the key interfaces.  This ticket
> can focus on making large changes to the codebase for a future release.  This
> work will likely take some time and some collaboration.  I'd suggest that the
> code for this live in a separate release branch.
> Some issues we can work on:
> *  I'd propose we conform to the SOLID principles for our major interfaces.  
> https://en.wikipedia.org/wiki/SOLID_(object-oriented_design)  . 
> *  We can look at Sparse / Dense feature vectors and how to handle them 
> naturally in Joshua.
> *  Refactor objects that may now be used more broadly than was originally 
> intended (for example Vocabulary class).
> *  We should have a general discussion around what parts of the codebase are 
> responsible for what functions.  We should clearly define what logic should 
> be a part of the Grammar versus the Feature Functions for example, and make 
> sure logic doesn't leak from one of these objects to the others.
>  





[jira] [Resolved] (JOSHUA-269) Fix Javadoc in JOSHUA-252 branch to comply with JDK1.8 Spec

2016-06-20 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved JOSHUA-269.
-
Resolution: Fixed

> Fix Javadoc in JOSHUA-252 branch to comply with JDK1.8 Spec
> ---
>
> Key: JOSHUA-269
> URL: https://issues.apache.org/jira/browse/JOSHUA-269
> Project: Joshua
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> When we build the JOSHUA-252 codebase on Jenkins, we get the following:
> {code}
> [INFO] 
> 
> [ERROR] BUILD ERROR
> [INFO] 
> 
> [INFO] An error has occurred in JavaDocs report generation: 
> Exit code: 1 - 
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/OracleExtractionHG.java:629:
>  warning: no @param for tbl
>   public void get_ngrams(HashMap tbl, int order, 
> ArrayList wrds,
>   ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/OracleExtractionHG.java:629:
>  warning: no @param for order
>   public void get_ngrams(HashMap tbl, int order, 
> ArrayList wrds,
>   ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/OracleExtractionHG.java:629:
>  warning: no @param for wrds
>   public void get_ngrams(HashMap tbl, int order, 
> ArrayList wrds,
>   ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/OracleExtractionHG.java:629:
>  warning: no @param for ignore_null_equiv_symbol
>   public void get_ngrams(HashMap tbl, int order, 
> ArrayList wrds,
>   ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/OracleExtractionHG.java:45:
>  error: malformed HTML
>  * @author Zhifei Li,  (Johns Hopkins University)
>   ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/OracleExtractionHG.java:45:
>  error: bad use of '>'
>  * @author Zhifei Li,  (Johns Hopkins University)
> ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/OracleExtractionHG.java:91:
>  warning: no description for @param
>* @param lm_feat_id_
>  ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/SplitHg.java:33:
>  error: malformed HTML
>  * @author Zhifei Li,  (Johns Hopkins University)
>   ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/oracle/SplitHg.java:33:
>  error: bad use of '>'
>  * @author Zhifei Li,  (Johns Hopkins University)
> ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/ui/tree_visualizer/browser/Browser.java:77:
>  error: @param name not found
>* @param args the paths to the source, reference, and n-best files
> ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/ui/tree_visualizer/browser/Browser.java:79:
>  warning: no @param for argv
>   public static void main(String[] argv) throws IOException {
>  ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/ui/tree_visualizer/browser/Browser.java:79:
>  warning: no @throws for java.io.IOException
>   public static void main(String[] argv) throws IOException {
>  ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/ui/tree_visualizer/tree/Tree.java:165:
>  warning: no @return
>   public int size() {
>  ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/ui/tree_visualizer/tree/Tree.java:172:
>  warning: no @return
>   public Node root() {
>   ^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/ui/tree_visualizer/tree/Tree.java:51:
>  error: malformed HTML
>  * @author Jonny Weese 
>^
> /home/jenkins/jenkins-slave/workspace/joshua_maven/src/main/java/org/apache/joshua/ui/tree_visualizer/tree/Tree.java:51:
>  error: bad use of '>'
>  * @author Jonny Weese 
> ^
> 

[jira] [Resolved] (JOSHUA-239) Dependency addition to Joshua-Decoder/Joshua/pom.xml

2016-04-11 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved JOSHUA-239.
-
Resolution: Won't Fix

> Dependency addition to Joshua-Decoder/Joshua/pom.xml
> 
>
> Key: JOSHUA-239
> URL: https://issues.apache.org/jira/browse/JOSHUA-239
> Project: Joshua
>  Issue Type: Bug
>Reporter: Chris A. Mattmann
> Fix For: 6.1
>
>
> <dependency>
>   <groupId>args4j</groupId>
>   <artifactId>args4j</artifactId>
>   <version>2.32</version>
> </dependency>
> Joshua-Decoder/Joshua committer please add this dependency to pom.xml
> thank you/
> Martin





[jira] [Updated] (JOSHUA-247) Feature request: confidence scores

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-247:

Fix Version/s: 6.1

> Feature request: confidence scores
> --
>
> Key: JOSHUA-247
> URL: https://issues.apache.org/jira/browse/JOSHUA-247
> Project: Joshua
>  Issue Type: New Feature
>Reporter: Matt Post
> Fix For: 6.1
>
>
> There is a lot of work on sentence-level and word-level quality estimation 
> for MT. It's pretty hard to do, but it should be possible to provide 
> coarse-level confidence scores at the word level, perhaps normalized by the 
> overall translation score.





[jira] [Updated] (JOSHUA-239) Dependency addition to Joshua-Decoder/Joshua/pom.xml

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-239:

Fix Version/s: 6.1

> Dependency addition to Joshua-Decoder/Joshua/pom.xml
> 
>
> Key: JOSHUA-239
> URL: https://issues.apache.org/jira/browse/JOSHUA-239
> Project: Joshua
>  Issue Type: Bug
>Reporter: Chris A. Mattmann
> Fix For: 6.1
>
>
> <dependency>
>   <groupId>args4j</groupId>
>   <artifactId>args4j</artifactId>
>   <version>2.32</version>
> </dependency>
> Joshua-Decoder/Joshua committer please add this dependency to pom.xml
> thank you/
> Martin





[jira] [Updated] (JOSHUA-238) pipeline.pl should print help

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-238:

Fix Version/s: 6.1

> pipeline.pl should print help
> -
>
> Key: JOSHUA-238
> URL: https://issues.apache.org/jira/browse/JOSHUA-238
> Project: Joshua
>  Issue Type: Bug
>Reporter: Lewis John McGibbney
> Fix For: 6.1
>
>
> It would be very handy if pipeline.pl printed all command-line options when
> invoked without parameters or with --help.
> This issue will address that.





[jira] [Reopened] (JOSHUA-95) Vocabulary locking

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-95?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened JOSHUA-95:


> Vocabulary locking
> --
>
> Key: JOSHUA-95
> URL: https://issues.apache.org/jira/browse/JOSHUA-95
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Juri Ganitkevitch
> Fix For: 6.1
>
>
> Vocabulary::id() is still synchronized and a potential point of contention. 
> It would be nice to resolve this.
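
The contention described above arises because every caller of a synchronized method queues on the same lock. A minimal sketch of one alternative follows, assuming ids only need to be unique and stable. It is not Joshua's actual {{Vocabulary}} code; {{VocabularySketch}} and its methods are hypothetical names used for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class VocabularySketch {
  // Word-to-id table that supports concurrent readers and writers.
  private final ConcurrentHashMap<String, Integer> ids = new ConcurrentHashMap<>();
  // Source of fresh ids; starts at 1 (an arbitrary choice for this sketch).
  private final AtomicInteger nextId = new AtomicInteger(1);

  /** Returns the id for a word, assigning a new one on first sight. */
  int id(String word) {
    // Fast path: a plain get() does not lock.
    Integer existing = ids.get(word);
    if (existing != null) {
      return existing;
    }
    // Slow path: computeIfAbsent applies the function at most once per absent key.
    return ids.computeIfAbsent(word, w -> nextId.getAndIncrement());
  }

  /** Number of distinct words seen so far. */
  int size() {
    return ids.size();
  }
}
```

Lookups of known words take the non-locking {{get}} path, so only first-time insertions contend with each other. A reverse id-to-word table, which this sketch omits, would need its own thread-safe handling.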





[jira] [Updated] (JOSHUA-71) OS X installation depends on coreutils to run thrax test

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-71?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-71:
---
Fix Version/s: 6.1

> OS X installation depends on coreutils to run thrax test
> 
>
> Key: JOSHUA-71
> URL: https://issues.apache.org/jira/browse/JOSHUA-71
> Project: Joshua
>  Issue Type: Bug
>Reporter: Luke Orland
> Fix For: 6.1
>
>
> The {{gstat}} command from coreutils is not installed in Darwin by default. 
> One must resolve that dependency via Homebrew, MacPorts, etc.
> The {{test/thrax/test.sh}} test will fail on an OS X system that does not 
> have coreutils installed. We should either change the test so that it does 
> not require coreutils in Darwin or make it clear in the (developer) 
> installation/setup instructions that coreutils are required for this test, 
> check for coreutils when running the thrax test, and output a helpful message 
> instructing the developer to go install coreutils if {{gstat}} is not found.





[jira] [Reopened] (JOSHUA-93) Clean up examples

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened JOSHUA-93:


> Clean up examples
> -
>
> Key: JOSHUA-93
> URL: https://issues.apache.org/jira/browse/JOSHUA-93
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Juri Ganitkevitch
> Fix For: 6.1
>
>






[jira] [Reopened] (JOSHUA-75) MERT hangs with very large development sets

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-75?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened JOSHUA-75:


> MERT hangs with very large development sets
> ---
>
> Key: JOSHUA-75
> URL: https://issues.apache.org/jira/browse/JOSHUA-75
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Matt Post
> Fix For: 6.1
>
>
> From an email from Phu Le:
> > If you want to have a look, here it is:
> > http://ltvp.net/zmert.tar.gz  (zmert folder when it hangs)
> > To reproduce, you can just rerun zmert:
> > java -Xms1G -Xmx3G -cp meteor.jar:zmert.jar joshua.zmert.ZMERT -maxMem 1000 
> > zmert_config.txt
> > 3 input files (dev.matched, dev.reference, decoder_config_base) I fed to 
> > MEMT:
> > http://ltvp.net/data.tar.gz
> > Thanks in advance. Give me a starting point if you have any idea. I'll try 
> > to fix it myself if possible.
> > Regards,
> > LTVP





[jira] [Updated] (JOSHUA-95) Vocabulary locking

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-95?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-95:
---
Fix Version/s: (was: 5.0)
   6.1

> Vocabulary locking
> --
>
> Key: JOSHUA-95
> URL: https://issues.apache.org/jira/browse/JOSHUA-95
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Juri Ganitkevitch
> Fix For: 6.1
>
>
> Vocabulary::id() is still synchronized and a potential point of contention. 
> It would be nice to resolve this.





[jira] [Updated] (JOSHUA-93) Clean up examples

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-93:
---
Fix Version/s: (was: 5.0)
   6.1

> Clean up examples
> -
>
> Key: JOSHUA-93
> URL: https://issues.apache.org/jira/browse/JOSHUA-93
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Juri Ganitkevitch
> Fix For: 6.1
>
>






[jira] [Updated] (JOSHUA-75) MERT hangs with very large development sets

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-75?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-75:
---
Fix Version/s: 6.1

> MERT hangs with very large development sets
> ---
>
> Key: JOSHUA-75
> URL: https://issues.apache.org/jira/browse/JOSHUA-75
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Matt Post
> Fix For: 6.1
>
>
> From an email from Phu Le:
> > If you want to have a look, here it is:
> > http://ltvp.net/zmert.tar.gz  (zmert folder when it hangs)
> > To reproduce, you can just rerun zmert:
> > java -Xms1G -Xmx3G -cp meteor.jar:zmert.jar joshua.zmert.ZMERT -maxMem 1000 
> > zmert_config.txt
> > 3 input files (dev.matched, dev.reference, decoder_config_base) I fed to 
> > MEMT:
> > http://ltvp.net/data.tar.gz
> > Thanks in advance. Give me a starting point if you have any idea. I'll try 
> > to fix it myself if possible.
> > Regards,
> > LTVP





[jira] [Updated] (JOSHUA-14) GIZA should support parallelization

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-14?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-14:
---
Fix Version/s: 6.1

> GIZA should support parallelization
> ---
>
> Key: JOSHUA-14
> URL: https://issues.apache.org/jira/browse/JOSHUA-14
> Project: Joshua
>  Issue Type: Bug
>Reporter: Joshua Decoder
> Fix For: 6.1
>
>
> https://groups.google.com/forum/#!topic/joshua_support/bFXaCmHOPAg





[jira] [Reopened] (JOSHUA-71) OS X installation depends on coreutils to run thrax test

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-71?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened JOSHUA-71:


> OS X installation depends on coreutils to run thrax test
> 
>
> Key: JOSHUA-71
> URL: https://issues.apache.org/jira/browse/JOSHUA-71
> Project: Joshua
>  Issue Type: Bug
>Reporter: Luke Orland
> Fix For: 6.1
>
>
> The {{gstat}} command from coreutils is not installed in Darwin by default. 
> One must resolve that dependency via Homebrew, MacPorts, etc.
> The {{test/thrax/test.sh}} test will fail on an OS X system that does not 
> have coreutils installed. We should either change the test so that it does 
> not require coreutils in Darwin or make it clear in the (developer) 
> installation/setup instructions that coreutils are required for this test, 
> check for coreutils when running the thrax test, and output a helpful message 
> instructing the developer to go install coreutils if {{gstat}} is not found.





[jira] [Reopened] (JOSHUA-21) ZMERT shouldn't have to stop and restart the decoder

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-21?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened JOSHUA-21:


> ZMERT shouldn't have to stop and restart the decoder
> 
>
> Key: JOSHUA-21
> URL: https://issues.apache.org/jira/browse/JOSHUA-21
> Project: Joshua
>  Issue Type: Bug
>Reporter: Joshua Decoder
> Fix For: 6.1
>
>
> Loading the models is an expensive step, and there's no reason that MERT runs 
> have to load them multiple times.  It should just load the decoder once and 
> reuse the running decoder across iterations.  This could be accomplished with 
> a special input command to Joshua that changes the model weights and resets 
> the sentence counter.
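
The idea above can be sketched as a long-running input loop that treats some lines as control commands rather than sentences. This is only an illustration under stated assumptions: the ticket does not define the command syntax, so the {{@set-weights}} prefix, the {{name=value}} format, and the class {{ReusableDecoderSketch}} are all hypothetical, and decoding itself is replaced by a stand-in.

```java
import java.util.HashMap;
import java.util.Map;

class ReusableDecoderSketch {
  // Hypothetical control prefix; the ticket does not specify one.
  static final String SET_WEIGHTS = "@set-weights";

  // Current model weights, keyed by feature name.
  final Map<String, Double> weights = new HashMap<>();
  // Id of the next sentence to decode.
  int sentenceCounter = 0;

  /** Handles one input line: either a control command or a sentence. */
  String handle(String line) {
    if (line.startsWith(SET_WEIGHTS)) {
      String rest = line.substring(SET_WEIGHTS.length()).trim();
      for (String pair : rest.split("\\s+")) {
        String[] kv = pair.split("=", 2);
        if (kv.length != 2) {
          continue; // skip empty or malformed entries
        }
        weights.put(kv[0], Double.parseDouble(kv[1]));
      }
      // A new tuning iteration starts numbering sentences from zero again.
      sentenceCounter = 0;
      return "weights updated";
    }
    // Stand-in for real decoding: echo the sentence with its id.
    return "sentence " + (sentenceCounter++) + ": " + line;
  }
}
```

With a loop like this, the expensive model loading would happen once when the process starts, and each tuning iteration would only send a weights command followed by the development sentences.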





[jira] [Updated] (JOSHUA-21) ZMERT shouldn't have to stop and restart the decoder

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-21?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-21:
---
Fix Version/s: 6.1

> ZMERT shouldn't have to stop and restart the decoder
> 
>
> Key: JOSHUA-21
> URL: https://issues.apache.org/jira/browse/JOSHUA-21
> Project: Joshua
>  Issue Type: Bug
>Reporter: Joshua Decoder
> Fix For: 6.1
>
>
> Loading the models is an expensive step, and there's no reason that MERT runs 
> have to load them multiple times.  It should just load the decoder once and 
> reuse the running decoder across iterations.  This could be accomplished with 
> a special input command to Joshua that changes the model weights and resets 
> the sentence counter.





[jira] [Reopened] (JOSHUA-51) add jhclark/bigfatlm

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-51?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened JOSHUA-51:


> add jhclark/bigfatlm
> 
>
> Key: JOSHUA-51
> URL: https://issues.apache.org/jira/browse/JOSHUA-51
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Matt Post
> Fix For: 6.1
>
>
> It would be nice to leverage more Hadoop tools in the pipeline.





[jira] [Reopened] (JOSHUA-22) Parallelize MBR computation

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-22?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened JOSHUA-22:


> Parallelize MBR computation
> ---
>
> Key: JOSHUA-22
> URL: https://issues.apache.org/jira/browse/JOSHUA-22
> Project: Joshua
>  Issue Type: Bug
>Reporter: Joshua Decoder
> Fix For: 6.1
>
>
> MBR should be multithreaded.  This would be easy to add following the model 
> used in the InputManager.
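
MBR picks the hypothesis with the highest expected gain under the posterior over the n-best list. Each candidate's expected gain is independent of the others, so the candidates can be scored on separate threads. The following is a minimal sketch of that structure, not Joshua's MBR code and not modeled on the InputManager: the class {{MbrSketch}}, the toy gain function (count of shared distinct words), and the assumption that the posterior is already normalized are all illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class MbrSketch {

  /** Toy gain: number of distinct words the two hypotheses share. */
  static double gain(String a, String b) {
    Set<String> shared = new HashSet<>(Arrays.asList(a.split(" ")));
    shared.retainAll(new HashSet<>(Arrays.asList(b.split(" "))));
    return shared.size();
  }

  /** Returns the index of the hypothesis with the highest expected gain. */
  static int select(List<String> nbest, double[] posterior, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // One task per candidate; each sums the gain against every hypothesis.
      List<Future<Double>> futures = new ArrayList<>();
      for (String candidate : nbest) {
        Callable<Double> task = () -> {
          double expected = 0.0;
          for (int j = 0; j < nbest.size(); j++) {
            expected += posterior[j] * gain(candidate, nbest.get(j));
          }
          return expected;
        };
        futures.add(pool.submit(task));
      }
      // Collect results in submission order; ties keep the earlier index.
      int best = 0;
      double bestGain = Double.NEGATIVE_INFINITY;
      for (int i = 0; i < futures.size(); i++) {
        double g = futures.get(i).get();
        if (g > bestGain) {
          bestGain = g;
          best = i;
        }
      }
      return best;
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException("MBR scoring failed", e);
    } finally {
      pool.shutdown();
    }
  }
}
```

The scoring work is quadratic in the size of the n-best list, so splitting it by candidate gives each thread a similar amount of work.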





[jira] [Updated] (JOSHUA-22) Parallelize MBR computation

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-22?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-22:
---
Fix Version/s: 6.1

> Parallelize MBR computation
> ---
>
> Key: JOSHUA-22
> URL: https://issues.apache.org/jira/browse/JOSHUA-22
> Project: Joshua
>  Issue Type: Bug
>Reporter: Joshua Decoder
> Fix For: 6.1
>
>
> MBR should be multithreaded.  This would be easy to add following the model 
> used in the InputManager.





[jira] [Updated] (JOSHUA-51) add jhclark/bigfatlm

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-51?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-51:
---
Fix Version/s: 6.1

> add jhclark/bigfatlm
> 
>
> Key: JOSHUA-51
> URL: https://issues.apache.org/jira/browse/JOSHUA-51
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Matt Post
> Fix For: 6.1
>
>
> It would be nice to leverage more Hadoop tools in the pipeline.





[jira] [Reopened] (JOSHUA-148) Process Substitution Error in run-giza.pl?

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened JOSHUA-148:
-

> Process Substitution Error in run-giza.pl?
> --
>
> Key: JOSHUA-148
> URL: https://issues.apache.org/jira/browse/JOSHUA-148
> Project: Joshua
>  Issue Type: Bug
>Reporter: Tony Fader
>
> This line creates a command to run and stores the value in 
> {{$__ALIGNMENT_CMD}}:
> https://github.com/joshua-decoder/joshua/blob/master/scripts/training/run-giza.pl#L232
> Then the variable is passed as a command line option in this line: 
> https://github.com/joshua-decoder/joshua/blob/master/scripts/training/run-giza.pl#L265
> This breaks on my system (x86_64 linux running bash as a shell). I modified 
> the code to (I think) work on bash. 
> Here's the changes I made. They probably won't work in other shells that 
> don't have bash's {{<(blah)}} subprocess substitution. Also I found that I 
> had to replace the stdout redirection of the command ({{> output}}) to a 
> command line option ({{-o=output}}) for some reason. I'm not sure what was 
> going on there, either.
> Changes here: 
> https://github.com/afader/joshua/commit/a7540443f59856f363b7101ab4db23c4818504e3
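
The shell dependence described above can be reproduced outside of Perl. The sketch below is in Python rather than the Perl of run-giza.pl, and `run_with_bash` is a hypothetical helper that is not part of Joshua; it assumes a `bash` executable is available on the PATH. It illustrates that a command string containing `<(...)` has to be handed to bash explicitly, since a plain `sh -c` invocation (which is what `subprocess` uses with `shell=True` on POSIX) may not support process substitution.

```python
import subprocess


def run_with_bash(command):
    """Run a command string under bash and return its standard output.

    Hypothetical helper for illustration only. Passing the string to
    `bash -c` makes bash-only syntax such as `<(...)` available no matter
    which shell /bin/sh happens to be on the system.
    """
    result = subprocess.run(
        ["bash", "-c", command],
        capture_output=True,  # collect stdout/stderr instead of printing
        text=True,            # decode output as str
        check=True,           # raise CalledProcessError on non-zero exit
    )
    return result.stdout


if __name__ == "__main__":
    # `cat` reads from the file descriptor that bash creates for `<(...)`.
    print(run_with_bash("cat <(echo aligned)"), end="")
```

Naming the interpreter explicitly trades portability for predictability: the command fails loudly on systems without bash instead of breaking in shell-specific ways.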





[jira] [Reopened] (JOSHUA-14) GIZA should support parallelization

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-14?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened JOSHUA-14:


> GIZA should support parallelization
> ---
>
> Key: JOSHUA-14
> URL: https://issues.apache.org/jira/browse/JOSHUA-14
> Project: Joshua
>  Issue Type: Bug
>Reporter: Joshua Decoder
> Fix For: 6.1
>
>
> https://groups.google.com/forum/#!topic/joshua_support/bFXaCmHOPAg





[jira] [Closed] (JOSHUA-176) Packing phrase tables

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney closed JOSHUA-176.
---

> Packing phrase tables
> -
>
> Key: JOSHUA-176
> URL: https://issues.apache.org/jira/browse/JOSHUA-176
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>
> The packer can pack Moses phrase tables, but the decoder can't use them.





[jira] [Reopened] (JOSHUA-210) Fix nonterminalmatcher logic

2016-04-04 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reopened JOSHUA-210:
-

> Fix nonterminalmatcher logic
> 
>
> Key: JOSHUA-210
> URL: https://issues.apache.org/jira/browse/JOSHUA-210
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>
> It's slow and overly general, and in particular, 
> Vocabulary::getNonterminalIndices() may not be thread safe.





[jira] [Created] (JOSHUA-258) Add back penn-treebank-(de)tokenizer perl scripts

2016-04-28 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-258:
---

 Summary: Add back penn-treebank-(de)tokenizer perl scripts
 Key: JOSHUA-258
 URL: https://issues.apache.org/jira/browse/JOSHUA-258
 Project: Joshua
  Issue Type: Task
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: 6.1


I've been working with the 
[joshua_translation_engine|https://github.com/joshua-decoder/joshua_translation_engine]
 (which is friggin excellent, we will definitely be standing this up on 
something more heavyweight in the near future) and recently reported [issue 
15|https://github.com/joshua-decoder/joshua_translation_engine/issues/15]

This issue therefore proposes to add back in penn-treebank-(de)tokenizer perl 
scripts which were removed between 6.0.4 and 6.0.5 





[jira] [Commented] (JOSHUA-258) Add back penn-treebank-(de)tokenizer perl scripts

2016-04-28 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263260#comment-15263260
 ] 

Lewis John McGibbney commented on JOSHUA-258:
-

Cool

> Add back penn-treebank-(de)tokenizer perl scripts
> -
>
> Key: JOSHUA-258
> URL: https://issues.apache.org/jira/browse/JOSHUA-258
> Project: Joshua
>  Issue Type: Task
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
>
> I've been working with the 
> [joshua_translation_engine|https://github.com/joshua-decoder/joshua_translation_engine]
>  (which is friggin excellent, we will definitely be standing this up on 
> something more heavyweight in the near future) and recently reported [issue 
> 15|https://github.com/joshua-decoder/joshua_translation_engine/issues/15]
> This issue therefore proposes to add back in penn-treebank-(de)tokenizer perl 
> scripts which were removed between 6.0.4 and 6.0.5 





[jira] [Created] (JOSHUA-254) Update README with correct branding

2016-04-26 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-254:
---

 Summary: Update README with correct branding
 Key: JOSHUA-254
 URL: https://issues.apache.org/jira/browse/JOSHUA-254
 Project: Joshua
  Issue Type: Task
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Trivial
 Fix For: 6.1


This issue is trivial and involves updating the project README to direct all 
links to the correct place as well as address branding.





[jira] [Commented] (JOSHUA-262) Implement all logging as Slf4j over Log4j

2016-05-20 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294656#comment-15294656
 ] 

Lewis John McGibbney commented on JOSHUA-262:
-

I honestly have no idea.

> Implement all logging as Slf4j over Log4j
> -
>
> Key: JOSHUA-262
> URL: https://issues.apache.org/jira/browse/JOSHUA-262
> Project: Joshua
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Assignee: Thamme Gowda N
> Fix For: 6.1
>
>
> [~hsaputra] suggested that we implement all logging as Slf4j over Log4j. If 
> we use [parameterized logging 
> notation|http://www.slf4j.org/faq.html#logging_performance] we can have good 
> logging in place.





[jira] [Commented] (JOSHUA-261) Remove ext directory from source tree

2016-05-09 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276505#comment-15276505
 ] 

Lewis John McGibbney commented on JOSHUA-261:
-

Yes absolutely. No problems. 

> Remove ext directory from source tree
> -
>
> Key: JOSHUA-261
> URL: https://issues.apache.org/jira/browse/JOSHUA-261
> Project: Joshua
>  Issue Type: Task
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> Right now we have a bunch of code bundled into the 
> [ext|https://github.com/apache/incubator-joshua/tree/master/ext] directory. I 
> don't think any of this code can be shipped with an Apache Joshua 
> (Incubating) release so we need to think about a mechanism for removing it 
> and making Joshua work in other ways.





[jira] [Commented] (JOSHUA-261) Remove ext directory from source tree

2016-05-09 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276506#comment-15276506
 ] 

Lewis John McGibbney commented on JOSHUA-261:
-

In all honesty, the code can remain in the source tree in SCM but we just can't 
ship it with a release. 

> Remove ext directory from source tree
> -
>
> Key: JOSHUA-261
> URL: https://issues.apache.org/jira/browse/JOSHUA-261
> Project: Joshua
>  Issue Type: Task
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> Right now we have a bunch of code bundled into the 
> [ext|https://github.com/apache/incubator-joshua/tree/master/ext] directory. I 
> don't think any of this code can be shipped with an Apache Joshua 
> (Incubating) release so we need to think about a mechanism for removing it 
> and making Joshua work in other ways.





[jira] [Commented] (JOSHUA-252) Make it possible to use Maven to build Joshua

2016-05-13 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283123#comment-15283123
 ] 

Lewis John McGibbney commented on JOSHUA-252:
-

[~teofili] I am working on this today and will post a pull request ASAP.

> Make it possible to use Maven to build Joshua
> -
>
> Key: JOSHUA-252
> URL: https://issues.apache.org/jira/browse/JOSHUA-252
> Project: Joshua
>  Issue Type: Improvement
>  Components: build
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 6.1
>
>
> As per discussion on the dev@ list for now Ant is the official build tool for 
> Joshua however we would like to possibly switch to Maven if / when someone is 
> able to do so.
> Assigning to me for now as I could be able to look into this.





[jira] [Updated] (JOSHUA-262) Implement all logging as Slf4j over Log4j

2016-05-13 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-262:

Component/s: core

> Implement all logging as Slf4j over Log4j
> -
>
> Key: JOSHUA-262
> URL: https://issues.apache.org/jira/browse/JOSHUA-262
> Project: Joshua
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
> Fix For: 6.1
>
>
> [~hsaputra] suggested that we implement all logging as Slf4j over Log4j. If 
> we use [parameterized logging 
> notation|http://www.slf4j.org/faq.html#logging_performance] we can have good 
> logging in place.





[jira] [Updated] (JOSHUA-252) Make it possible to use Maven to build Joshua

2016-05-13 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-252:

Component/s: build

> Make it possible to use Maven to build Joshua
> -
>
> Key: JOSHUA-252
> URL: https://issues.apache.org/jira/browse/JOSHUA-252
> Project: Joshua
>  Issue Type: Improvement
>  Components: build
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 6.1
>
>
> As per discussion on the dev@ list for now Ant is the official build tool for 
> Joshua however we would like to possibly switch to Maven if / when someone is 
> able to do so.
> Assigning to me for now as I could be able to look into this.





[jira] [Created] (JOSHUA-262) Implement all logging as Slf4j over Log4j

2016-05-13 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-262:
---

 Summary: Implement all logging as Slf4j over Log4j
 Key: JOSHUA-262
 URL: https://issues.apache.org/jira/browse/JOSHUA-262
 Project: Joshua
  Issue Type: Improvement
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
 Fix For: 6.1


[~hsaputra] suggested that we implement all logging as Slf4j over Log4j. If we 
use [parameterized logging 
notation|http://www.slf4j.org/faq.html#logging_performance] we can have good 
logging in place.
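
To illustrate why parameterized notation helps, here is a minimal sketch of the same deferred-formatting idea. It uses Python's standard `logging` module as a stand-in, which is an assumption made purely for illustration: Joshua itself would use SLF4J's `{}` placeholders in Java. The `Expensive` class and the logger name are hypothetical.

```python
import logging


class Expensive:
    """Hypothetical object whose string form is costly to build."""

    def __init__(self):
        self.renders = 0  # counts how often __str__ was actually called

    def __str__(self):
        self.renders += 1
        return "hypergraph-summary"


logger = logging.getLogger("joshua.sketch")  # hypothetical logger name
logger.setLevel(logging.INFO)                # DEBUG records are disabled
logger.addHandler(logging.NullHandler())

item = Expensive()

# Eager: the message is concatenated before the logger checks the level,
# so the costly string conversion happens even though nothing is logged.
logger.debug("state: " + str(item))
eager_renders = item.renders

# Parameterized: the argument is passed separately and is only formatted
# if the record is going to be emitted, which it is not at INFO level.
logger.debug("state: %s", item)
lazy_renders = item.renders - eager_renders
```

With the level set to INFO, the eager call pays for one string conversion and the parameterized call pays for none, which is the saving the linked SLF4J FAQ entry describes.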





[jira] [Updated] (JOSHUA-259) Integration tests are failing

2016-05-13 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-259:

Fix Version/s: 6.1

> Integration tests are failing
> -
>
> Key: JOSHUA-259
> URL: https://issues.apache.org/jira/browse/JOSHUA-259
> Project: Joshua
>  Issue Type: Bug
>Reporter: Kellen Sunderland
> Fix For: 6.1
>
>
> Several integration tests are currently failing with Joshua.  I have a quick 
> fix coming for one of the tests but just in case we need more discussion 
> around the failures I'll open a bug.
> The currently failing tests for me:
> test/decoder/too-long
> test/server/http
> test/server/tcp-text
> test/thrax/extraction
> and 
> test/decoder/moses-compat (but this is easy to fix, simple extra space in the 
> expected file)
> These are failing under OS X 10.11.  If working under other environments feel 
> free to post a 'works for me'.





[jira] [Updated] (JOSHUA-260) Integrate IoC (Inversion of Control) into Joshua

2016-05-13 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-260:

Fix Version/s: 6.1

> Integrate IoC (Inversion of Control) into Joshua
> 
>
> Key: JOSHUA-260
> URL: https://issues.apache.org/jira/browse/JOSHUA-260
> Project: Joshua
>  Issue Type: Improvement
>Reporter: Kellen Sunderland
>Assignee: Kellen Sunderland
> Fix For: 6.1
>
>
> I'd like to propose we investigate looking into using guice 
> (https://github.com/google/guice) in conjunction with joshua's configuration 
> system.  I believe it would give us a nice way to map what is in the 
> configuration to the code paths, and implementations used within Joshua.  It 
> also would go a long way to allowing us to integrate unit tests throughout 
> all the important classes in Joshua.  What does everyone think?  Would IoC be 
> a good pattern to adopt?  Is everyone ok with using guice (versus say some 
> other IoC library).





[jira] [Created] (JOSHUA-261) Remove ext directory from source tree

2016-05-06 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-261:
---

 Summary: Remove ext directory from source tree
 Key: JOSHUA-261
 URL: https://issues.apache.org/jira/browse/JOSHUA-261
 Project: Joshua
  Issue Type: Task
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Priority: Blocker
 Fix For: 6.1


Right now we have a bunch of code bundled into the 
[ext|https://github.com/apache/incubator-joshua/tree/master/ext] directory. I 
don't think any of this code can be shipped with an Apache Joshua (Incubating) 
release so we need to think about a mechanism for removing it and making Joshua 
work in other ways.





[jira] [Commented] (JOSHUA-258) Add back penn-treebank-(de)tokenizer perl scripts

2016-05-05 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273071#comment-15273071
 ] 

Lewis John McGibbney commented on JOSHUA-258:
-

Will keep this issue open for the time being then. Please feel free to close 
it off if you have a solution in mind. Thanks

> Add back penn-treebank-(de)tokenizer perl scripts
> -
>
> Key: JOSHUA-258
> URL: https://issues.apache.org/jira/browse/JOSHUA-258
> Project: Joshua
>  Issue Type: Task
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
>
> I've been working with the 
> [joshua_translation_engine|https://github.com/joshua-decoder/joshua_translation_engine]
>  (which is friggin excellent, we will definitely be standing this up on 
> something more heavyweight in the near future) and recently reported [issue 
> 15|https://github.com/joshua-decoder/joshua_translation_engine/issues/15]
> This issue therefore proposes to add back in penn-treebank-(de)tokenizer perl 
> scripts which were removed between 6.0.4 and 6.0.5 





[jira] [Created] (JOSHUA-253) Enable execution of Unit tests

2016-04-14 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-253:
---

 Summary: Enable execution of Unit tests
 Key: JOSHUA-253
 URL: https://issues.apache.org/jira/browse/JOSHUA-253
 Project: Joshua
  Issue Type: Test
Affects Versions: 6.0
Reporter: Lewis John McGibbney
 Fix For: 6.1


As per our [discussion on this 
topic|http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg00270.html],
 [~teofili] correctly identified that unit level tests are not executed.
We need to fix this such that they are.





[jira] [Commented] (JOSHUA-253) Enable execution of Unit tests

2016-04-27 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15260231#comment-15260231
 ] 

Lewis John McGibbney commented on JOSHUA-253:
-

[~teofili] [~post] where are the Unit tests we have to run? I've undertaken 
some analysis of the $JOSHUA_HOME/test directory. As far as I can see they are 
invoked... so I am definitely missing something here. 

> Enable execution of Unit tests
> --
>
> Key: JOSHUA-253
> URL: https://issues.apache.org/jira/browse/JOSHUA-253
> Project: Joshua
>  Issue Type: Test
>Affects Versions: 6.0
>Reporter: Lewis John McGibbney
> Fix For: 6.1
>
>
> As per our [discussion on this 
> topic|http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg00270.html],
>  [~teofili] correctly identified that unit level tests are not executed.
> We need to fix this such that they are.





[jira] [Commented] (JOSHUA-253) Enable execution of Unit tests

2016-04-27 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15260269#comment-15260269
 ] 

Lewis John McGibbney commented on JOSHUA-253:
-

bq. so no Java unit test is executed as far as I know

where are the unit tests? I can fix this right now if you can point them out. 
Thanks

> Enable execution of Unit tests
> --
>
> Key: JOSHUA-253
> URL: https://issues.apache.org/jira/browse/JOSHUA-253
> Project: Joshua
>  Issue Type: Test
>Affects Versions: 6.0
>Reporter: Lewis John McGibbney
> Fix For: 6.1
>
>
> As per our [discussion on this 
> topic|http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg00270.html],
>  [~teofili] correctly identified that unit level tests are not executed.
> We need to fix this such that they are.





[jira] [Created] (JOSHUA-257) Add license headers to all Python scripts

2016-04-28 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-257:
---

 Summary: Add license headers to all Python scripts
 Key: JOSHUA-257
 URL: https://issues.apache.org/jira/browse/JOSHUA-257
 Project: Joshua
  Issue Type: Task
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Critical
 Fix For: 6.1


Simple case of adding license headers to all Python scripts as below
{code}
./ext/kenlm/python/example.py
./ext/kenlm/setup.py
./scripts/features/addSparseFeatures.py
./scripts/lm/compile_berkeley.py
./scripts/samt/filterGrammar.py
./scripts/samt/lexprob2samt.py
./scripts/samt/selectFeatures.py
./scripts/support/merge_lms.py
./scripts/support/run_bundler.py
./scripts/toolkit/chunki.py
./scripts/toolkit/extract_references.py
./scripts/toolkit/joini.py
./scripts/toolkit/shorti.py
./scripts/training/class-lm/replaceTokensWithClasses.py
./scripts/training/run_thrax.py
./scripts/training/run_tuner.py
./test/prune-equivalent-translations.py
./test/scripts/merge_lms_test.py
./test/scripts/run_bundler_test.py
./thrax/scripts/berant_to_reference.py
{code}
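
The task above could be automated along the following lines. This is only a sketch under stated assumptions: `add_header` and `add_headers` are hypothetical helper names, the header is the standard ASF source header written as Python comments, and a file counts as already licensed if it contains the header's first phrase. Any shebang line is kept as the first line of the file.

```python
from pathlib import Path

# Standard ASF source header, written as Python comments.
LICENSE_HEADER = """\
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""


def add_header(source):
    """Return `source` with the header added, keeping any shebang first."""
    if "Licensed to the Apache Software Foundation" in source:
        return source  # already licensed; leave untouched
    if source.startswith("#!"):
        shebang, _, rest = source.partition("\n")
        return shebang + "\n" + LICENSE_HEADER + "\n" + rest
    return LICENSE_HEADER + "\n" + source


def add_headers(root):
    """Rewrite each .py file under `root` that lacks the header.

    Returns the list of paths that were changed.
    """
    changed = []
    for path in sorted(Path(root).rglob("*.py")):
        original = path.read_text(encoding="utf-8")
        updated = add_header(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(str(path))
    return changed
```

Because `add_header` returns already-licensed sources unchanged, running `add_headers` a second time over the same tree should modify nothing.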





[jira] [Created] (JOSHUA-270) pipeline.pl needs major refactoring

2016-05-24 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-270:
---

 Summary: pipeline.pl needs major refactoring
 Key: JOSHUA-270
 URL: https://issues.apache.org/jira/browse/JOSHUA-270
 Project: Joshua
  Issue Type: Bug
  Components: pipeline
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
 Fix For: 6.1


Right now 
[pipeline.pl|https://github.com/apache/incubator-joshua/blob/master/scripts/training/pipeline.pl]
 is well over 2000 lines long and extremely difficult to navigate. 
I propose the following
 * All ENV is refactored into a pipeline_environment file
 * All Command line parsing and definitions are refactored into a pipeline_cli 
file
 * Sanity checking is refactored into a pipeline_sanity_check file
 * Dependent Variable Checking is refactored into 
pipeline_dependent_variable_setting file
 * filter and preprocess corpora is refactored into 
pipeline_filter_preprocess_corpora
 * pipeline_subsampling becomes a file
 * pipeline_alignment becomes a file
 * pipeline_parsing becomes a file
 * pipeline_thrax becomes a file
 * pipeline_tuning becomes a file
 * pipeline_testing becomes a file
 * pipeline_subroutines becomes a file





[jira] [Created] (JOSHUA-271) Thrax invocation should not rely upon $HADOOP being set

2016-05-24 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-271:
---

 Summary: Thrax invocation should not rely upon $HADOOP being set
 Key: JOSHUA-271
 URL: https://issues.apache.org/jira/browse/JOSHUA-271
 Project: Joshua
  Issue Type: Bug
  Components: pipeline, thrax
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
 Fix For: 6.1


Right now one cannot run Thrax unless the $HADOOP env variable is defined, 
because every invocation of the hadoop script is hard-coded as 
$HADOOP/bin/hadoop. What happens if you are using a VM (Vagrant) to connect 
to a cluster for which no $HADOOP env variable is defined? 
The hadoop script should be on the path and available to use from there. The 
only check that should be made is whether it is available from the path. If 
it is not, the start_hadoop_cluster subroutine can be called. This reduces 
code and makes more sense.
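
The proposed check can be sketched as follows. The sketch is in Python rather than the Perl of pipeline.pl, and `find_hadoop` and `ensure_hadoop` are hypothetical names. The `start_cluster` callback stands in for the pipeline's start_hadoop_cluster subroutine, and it is an assumption here that it returns the path of a usable hadoop executable.

```python
import shutil


def find_hadoop(path=None):
    """Return the full path of the `hadoop` script found on PATH, or None.

    `path` optionally overrides the search path (useful for testing);
    by default the PATH environment variable is consulted.
    """
    return shutil.which("hadoop", path=path)


def ensure_hadoop(start_cluster, path=None):
    """Prefer hadoop from PATH; otherwise fall back to `start_cluster`.

    `start_cluster` is a stand-in for start_hadoop_cluster and is only
    called when no hadoop executable can be found on the search path.
    """
    found = find_hadoop(path)
    return found if found is not None else start_cluster()
```

With this shape no code needs to read $HADOOP at all: callers run whatever executable `ensure_hadoop` returns.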





[jira] [Updated] (JOSHUA-287) KenLM.java catches UnsatisfiedLinkError when attempting to load libken.so (libken.dylib on OSX)

2016-07-27 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-287:

Issue Type: Bug  (was: Improvement)

> KenLM.java catches UnsatisfiedLinkError when attempting to load libken.so 
> (libken.dylib on OSX)
> ---
>
> Key: JOSHUA-287
> URL: https://issues.apache.org/jira/browse/JOSHUA-287
> Project: Joshua
>  Issue Type: Bug
>  Components: core, kenlm
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
> Fix For: 6.1
>
>
> As explained in 
> http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg01189.html 
> currently we have an issue, where, when checked out from master the following 
> RuntimeException is thrown.
> {code}
> ---
>  T E S T S
> ---
> Running TestSuite
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> tm_pt_0=-2.000 tm_glue_0=3.000 lm_0=-206.718 lm_0_oov=2.000 
> OOVPenalty=-200.000 | -198.000
> ERROR - * FATAL: Can't find libken.so (libken.dylib on OS X) in $JOSHUA/lib
> ERROR - *This probably means that the KenLM library didn't compile.
> ERROR - *Make sure that BOOST_ROOT is set to the root of your boost
> ERROR - *installation (it's not /opt/local/, the default), change to
> ERROR - *$JOSHUA, and type 'ant kenlm'. If problems persist, see the
> ERROR - *website (joshua-decoder.org).
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> {code}
> We need to fix this such that we can run static source code analysis via 
> sonar and have our results available on analysis.apache.org.





[jira] [Commented] (JOSHUA-287) KenLM.java catches UnsatisfiedLinkError when attempting to load libken.so (libken.dylib on OSX)

2016-08-13 Thread lewis john mcgibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420151#comment-15420151
 ] 

lewis john mcgibbney commented on JOSHUA-287:
-

Brilliant Kellen thank you




> KenLM.java catches UnsatisfiedLinkError when attempting to load libken.so 
> (libken.dylib on OSX)
> ---
>
> Key: JOSHUA-287
> URL: https://issues.apache.org/jira/browse/JOSHUA-287
> Project: Joshua
>  Issue Type: Bug
>  Components: core, kenlm
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Assignee: Kellen Sunderland
> Fix For: 6.1
>
>
> As explained in 
> http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg01189.html 
> currently we have an issue, where, when checked out from master the following 
> RuntimeException is thrown.
> {code}
> ---
>  T E S T S
> ---
> Running TestSuite
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> WARN - sentence 0 too long 401, truncating to length 200
> tm_pt_0=-2.000 tm_glue_0=3.000 lm_0=-206.718 lm_0_oov=2.000 
> OOVPenalty=-200.000 | -198.000
> ERROR - * FATAL: Can't find libken.so (libken.dylib on OS X) in $JOSHUA/lib
> ERROR - *This probably means that the KenLM library didn't compile.
> ERROR - *Make sure that BOOST_ROOT is set to the root of your boost
> ERROR - *installation (it's not /opt/local/, the default), change to
> ERROR - *$JOSHUA, and type 'ant kenlm'. If problems persist, see the
> ERROR - *website (joshua-decoder.org).
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> WARN - no grammars supplied!  Supplying dummy glue grammar.
> {code}
> We need to fix this such that we can run static source code analysis via 
> sonar and have our results available on analysis.apache.org.





[jira] [Commented] (JOSHUA-249) Joshua Logo

2016-08-13 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420149#comment-15420149
 ] 

Lewis John McGibbney commented on JOSHUA-249:
-

Cool, can you please resolve this issue?




> Joshua Logo
> ---
>
> Key: JOSHUA-249
> URL: https://issues.apache.org/jira/browse/JOSHUA-249
> Project: Joshua
>  Issue Type: Task
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Minor
> Fix For: 6.1
>
> Attachments: apache_joshua_logo.png, apache_joshua_logo.xcf
>
>
> As we discussed on the mailing lists, this issue should gather all proposed 
> Joshua logos so we can VOTE on one or more of them.





[jira] [Created] (JOSHUA-283) Implement fast_align as one of the available alignment options

2016-07-20 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-283:
---

 Summary: Implement fast_align as one of the available alignment 
options
 Key: JOSHUA-283
 URL: https://issues.apache.org/jira/browse/JOSHUA-283
 Project: Joshua
  Issue Type: Bug
  Components: alignment, pipeline
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: 6.1


For some time now, I've been having issues using GIZA++ for alignment whilst 
running a Joshua pipeline.
Whilst looking for an alternative [~post] and [~kellen.sunderland] mentioned 
the berkeley aligner and fast_align respectively.
Due to the fact that 1) berkeley aligner has not been touched in ~9 years, and 
2) no artifact currently exists on Maven Central, I am taking the advice and 
attempting to use fast_align.
This issue will augment the alignment code in Joshua to permit use of 
fast_align which is ALv2.0 licensed.

https://github.com/clab/fast_align 





[jira] [Closed] (JOSHUA-281) split2files.pl support script no longer exists hence pipeline fails

2016-07-15 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney closed JOSHUA-281.
---
Resolution: Invalid

This is not a bug at all, my input parameters for the pipeline.pl invocation 
were incorrect.

> split2files.pl support script no longer exists hence pipeline fails
> ---
>
> Key: JOSHUA-281
> URL: https://issues.apache.org/jira/browse/JOSHUA-281
> Project: Joshua
>  Issue Type: Bug
>  Components: pipeline
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> When I attempt to run a pipeline, I get the following
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua(master) $ ../bin/pipeline.pl  
> --rundir . --type hiero --corpus 
> /usr/local/jpl/xdata/joshua_experiments/russian_model/commoncrawl.ru-en 
> --tune 
> /usr/local/jpl/xdata/joshua_experiments/russian_model/commoncrawl.ru-en.tune 
> --test 
> /usr/local/jpl/xdata/joshua_experiments/russian_model/commoncrawl.ru-en.test 
> --source en --target ru --rundir experiment_1/1 --readme "Russian model 
> generation experiment 1 run 1" --mbr
> [train-copy-and-filter] rebuilding...
>   
> dep=/usr/local/jpl/xdata/joshua_experiments/russian_model/commoncrawl.ru-en.en
>  [CHANGED]
>   
> dep=/usr/local/jpl/xdata/joshua_experiments/russian_model/commoncrawl.ru-en.ru
>  [CHANGED]
>   dep=/usr/local/incubator-joshua/experiment_1/1/data/train/train.en [NOT 
> FOUND]
>   dep=/usr/local/incubator-joshua/experiment_1/1/data/train/train.ru [NOT 
> FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/training/paste 
> /usr/local/jpl/xdata/joshua_experiments/russian_model/commoncrawl.ru-en.en 
> /usr/local/jpl/xdata/joshua_experiments/russian_model/commoncrawl.ru-en.ru | 
> /usr/local/incubator-joshua/scripts/training/filter-empty-lines.pl | 
> /usr/local/incubator-joshua/scripts/training/split2files.pl 
> /usr/local/incubator-joshua/experiment_1/1/data/train/train.en 
> /usr/local/incubator-joshua/experiment_1/1/data/train/train.ru
>   JOB FAILED (return code 127)
> /bin/bash: /usr/local/incubator-joshua/scripts/training/split2files.pl: No 
> such file or directory
> {code}
> The following commit changed the name of the file
> {code}
> Repository: incubator-joshua
> Updated Branches:
>   refs/heads/master 09fb6a2d3 -> f02bd279e
> combined split2files implementations
> Project: http://git-wip-us.apache.org/repos/asf/incubator-joshua/repo
> Commit: 
> http://git-wip-us.apache.org/repos/asf/incubator-joshua/commit/f02bd279
> Tree: http://git-wip-us.apache.org/repos/asf/incubator-joshua/tree/f02bd279
> Diff: http://git-wip-us.apache.org/repos/asf/incubator-joshua/diff/f02bd279
> Branch: refs/heads/master
> Commit: f02bd279e892408c9eca2a2a241f21f59cb105e9
> Parents: 09fb6a2
> Author: Matt Post 
> Authored: Wed May 18 09:12:07 2016 -0400
> Committer: Matt Post 
> Committed: Wed May 18 09:12:07 2016 -0400
> --
>  scripts/support/split2files  | 44 +++
>  scripts/support/splittabs.pl | 42 -
>  scripts/training/pipeline.pl |  8 ++---
>  scripts/training/split2files.pl  | 38 ---
>  scripts/training/trim_parallel_corpus.pl |  2 +-
>  5 files changed, 49 insertions(+), 85 deletions(-)
> --
> {code}
> I'll submit a PR to do the simple string replace... which is hopefully all 
> that is wrong here.





[jira] [Commented] (JOSHUA-280) Existing Spanish --> English Language pack not compatible with Joshua master

2016-07-01 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359690#comment-15359690
 ] 

Lewis John McGibbney commented on JOSHUA-280:
-

[~post] any idea what's up here? Thanks

> Existing Spanish --> English Language pack not compatible with Joshua master
> 
>
> Key: JOSHUA-280
> URL: https://issues.apache.org/jira/browse/JOSHUA-280
> Project: Joshua
>  Issue Type: Bug
>  Components: language packs
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Critical
> Fix For: 6.1
>
>
> When I work with the existing Spanish --> English language pack at 
> http://cs.jhu.edu/~post/language-packs/language-pack-es-en-phrase-2015-03-06.tgz,
>  I get the following error
> {code}
> lmcgibbn@LMC-032857 
> /usr/local/Cellar/joshua/HEAD/libexec/language-pack-es-en-phrase-2015-03-06(NUTCH-2089)
>  $ ./run-joshua-server.sh
> INFO - Parameters read from configuration file: joshua.config
> INFO - tm = 'moses -owner pt -maxspan 0 -path phrase-table.packed 
> -max-source-len 5'
> INFO - defaultnonterminal = 'X'
> INFO - goalsymbol = 'GOAL'
> INFO - featurefunction = 'StateMinimizingLanguageModel -lm_type kenlm 
> -lm_order 5 -lm_file lm.kenlm'
> INFO - markoovs = 'false'
> INFO - search = 'stack'
> INFO - pop-limit: 100
> INFO - poplimit = '100'
> INFO - topn = '0'
> INFO - useuniquenbest = 'true'
> INFO - outputformat = '%s'
> INFO - includealignindex = 'false'
> INFO - featurefunction = 'OOVPenalty'
> INFO - featurefunction = 'WordPenalty'
> INFO - featurefunction = 'Distortion'
> INFO - featurefunction = 'PhrasePenalty'
> INFO - c = 'joshua.config'
> INFO - server-port: 5674
> INFO - serverport = '5674'
> INFO - Read 9 weights (0 of them dense)
> INFO - Reading vocabulary: phrase-table.packed/vocabulary
> INFO - Read 191983 entries from the vocabulary
> INFO - Reading packed config: phrase-table.packed/config
> 102030405060708090.100%
> Exception in thread "main" java.lang.RuntimeException: The grammar at 
> phrase-table.packed was packed with packer version 0, but the earliest 
> supported version is 3
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar.readConfig(PackedGrammar.java:1061)
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar.<init>(PackedGrammar.java:143)
>   at 
> org.apache.joshua.decoder.phrase.PhraseTable.<init>(PhraseTable.java:65)
>   at 
> org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:603)
>   at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:514)
>   at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:126)
>   at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
> {code}
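
The exception above reads like a simple minimum-version guard on the packed grammar. A minimal sketch of that kind of check follows, assuming the version has already been read from the grammar's config. This is not the actual PackedGrammar.readConfig code; the function name and signature are hypothetical, and only the message text and the threshold of 3 come from the log.

```python
EARLIEST_SUPPORTED_VERSION = 3  # value quoted in the exception message


def check_packer_version(grammar_dir: str, version: int) -> None:
    """Raise if the grammar was packed with a version older than supported."""
    if version < EARLIEST_SUPPORTED_VERSION:
        raise RuntimeError(
            f"The grammar at {grammar_dir} was packed with packer version "
            f"{version}, but the earliest supported version is "
            f"{EARLIEST_SUPPORTED_VERSION}"
        )


if __name__ == "__main__":
    try:
        check_packer_version("phrase-table.packed", 0)
    except RuntimeError as err:
        print(err)
```

If the real check works along these lines, the language pack's grammar would presumably need to be re-packed with a current packer before master can load it.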





[jira] [Commented] (JOSHUA-280) Existing Language packs not compatible with Joshua master

2016-07-01 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359696#comment-15359696
 ] 

Lewis John McGibbney commented on JOSHUA-280:
-

The existing Chinese language pack works just fine:
{code}
lmcgibbn@LMC-032857 
/usr/local/Cellar/joshua/HEAD/libexec/zh-en-hiero-pack-2016-01(NUTCH-2089) $ 
./run-joshua-server.sh
Parameters read from configuration file:
tm = 'thrax -path grammar.packed -maxspan 20 -owner pt'
tm = 'thrax -path grammar.glue -maxspan -1 -owner glue'
defaultnonterminal = 'X'
goalsymbol = 'GOAL'
featurefunction = 'LanguageModel -lm_order 5 -lm_type berkeleylm -lm_file 
lm.berkeleylm'
markoovs = 'false'
search = 'cky'
poplimit = '100'
topn = '0'
useuniquenbest = 'true'
outputformat = '%S'
includealignindex = 'false'
featurefunction = 'OOVPenalty'
featurefunction = 'WordPenalty'
Parameters overridden from the command line:
server-port: 5674
serverport = '5674'
c = 'joshua.config'
Read 10 weights (0 of them dense)
Reading vocabulary: grammar.packed/vocabulary
Read 300317 entries from the vocabulary
Reading packed config: grammar.packed/config
102030405060708090.100%
Reading encoder configuration: grammar.packed/encoding
Loaded 62685418 rules
Reading grammar from file grammar.glue...
MemoryBasedBatchGrammar: Read 4 rules with 4 distinct source sides from 
'grammar.glue'
Memory used 3447.1 MB
Grammar loading took: 39 seconds.
Stateful object with state index 0
Loading Berkeley LM from binary lm.berkeleylm
FEATURE: tm_pt (weight 0.000)
FEATURE: tm_glue (weight 0.000)
FEATURE: lm_0, order 5 (weight 0.194)
FEATURE: OOVPenalty (weight 0.015)
FEATURE: WordPenalty (weight -0.460)
Grammar sorting happening lazily on-demand.
Model loading took 42 seconds
Memory used 4355.5 MB
** TCP Server running and listening on port 5674.
{code}

> Existing Language packs not compatible with Joshua master
> -
>
> Key: JOSHUA-280
> URL: https://issues.apache.org/jira/browse/JOSHUA-280
> Project: Joshua
>  Issue Type: Bug
>  Components: language packs
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Critical
> Fix For: 6.1





[jira] [Created] (JOSHUA-279) Cannot build Joshua master branch

2016-07-01 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-279:
---

 Summary: Cannot build Joshua master branch
 Key: JOSHUA-279
 URL: https://issues.apache.org/jira/browse/JOSHUA-279
 Project: Joshua
  Issue Type: Bug
  Components: tests, build, documentation
Reporter: Lewis John McGibbney
Priority: Blocker
 Fix For: 6.1


Hi Folks,
We need to be cautious about whatever is committed to the master branch... the 
build has been broken for quite some time, and there are constant Javadoc issues 
which make the build unstable as well.
For example, when I attempt to build the master branch, we have failing tests
{code}
lmcgibbn@LMC-032857 /usr/local/incubator-joshua(master) $ mvn clean install
...
---
 T E S T S
---
Running TestSuite
tm_pt_0=-2.000 tm_glue_0=3.000 lm_0=-206.718 lm_0_oov=2.000 OOVPenalty=-200.000 
| -198.000
ERROR - * FATAL: Can't find libken.so (libken.dylib on OS X) in $JOSHUA/lib
ERROR - *This probably means that the KenLM library didn't compile.
ERROR - *Make sure that BOOST_ROOT is set to the root of your boost
ERROR - *installation (it's not /opt/local/, the default), change to
ERROR - *$JOSHUA, and type 'ant kenlm'. If problems persist, see the
ERROR - *website (joshua-decoder.org).
WARN - sentence 0 too long 401, truncating to length 200
WARN - sentence 0 too long 401, truncating to length 200
WARN - sentence 0 too long 401, truncating to length 200
WARN - sentence 0 too long 401, truncating to length 200
WARN - no grammars supplied!  Supplying dummy glue grammar.
WARN - no grammars supplied!  Supplying dummy glue grammar.
WARN - no grammars supplied!  Supplying dummy glue grammar.
WARN - no grammars supplied!  Supplying dummy glue grammar.
WARN - no grammars supplied!  Supplying dummy glue grammar.
WARN - no grammars supplied!  Supplying dummy glue grammar.
WARN - no grammars supplied!  Supplying dummy glue grammar.
WARN - no grammars supplied!  Supplying dummy glue grammar.
%
%
%
%
%
%
%
%
%
Tests run: 126, Failures: 1, Errors: 0, Skipped: 6, Time elapsed: 1.818 sec <<< 
FAILURE! - in TestSuite
setUp(org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest)  
Time elapsed: 0.075 sec  <<< FAILURE!
java.lang.ExceptionInInitializerError
at 
org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest.setUp(ClassBasedLanguageModelTest.java:52)
Caused by: java.lang.RuntimeException: java.lang.UnsatisfiedLinkError: no ken 
in java.library.path
at 
org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest.setUp(ClassBasedLanguageModelTest.java:52)
Caused by: java.lang.UnsatisfiedLinkError: no ken in java.library.path
at 
org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest.setUp(ClassBasedLanguageModelTest.java:52)


Results :

Failed tests:
org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest.setUp(org.apache.joshua.decoder.ff.lm.class_lm.ClassBasedLanguageModelTest)
  Run 1: ClassBasedLanguageModelTest.setUp:52 » ExceptionInInitializer
  Run 2: PASS


Tests run: 124, Failures: 1, Errors: 0, Skipped: 4

[INFO] 
[INFO] BUILD FAILURE
{code}
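
The single test failure above traces back to the missing KenLM native library. A quick pre-flight check along the following lines can confirm whether the library is where the error message expects it. This is only a sketch: the function name is hypothetical, and the file names and the lib directory are taken from the log message rather than from the build scripts.

```python
import os
from pathlib import Path

# File names taken from the error message: libken.so, or libken.dylib on OS X.
KENLM_LIBRARY_NAMES = ("libken.so", "libken.dylib")


def find_kenlm_library(joshua_home: str):
    """Return the path of the KenLM native library under <joshua_home>/lib, or None."""
    lib_dir = Path(joshua_home) / "lib"
    for name in KENLM_LIBRARY_NAMES:
        candidate = lib_dir / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # $JOSHUA is the environment variable named in the error message.
    home = os.environ.get("JOSHUA", ".")
    found = find_kenlm_library(home)
    print(found if found else f"no KenLM library under {home}/lib")
```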

As a workaround, I tried to build the project without running the test suite; 
however, Javadoc issues now prevent me from doing so!

{code}
lmcgibbn@LMC-032857 /usr/local/incubator-joshua(master) $ mvn clean install 
-DskipTests
...
1 error
14 warnings
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 28.144 s
[INFO] Finished at: 2016-07-01T14:11:42-07:00
[INFO] Final Memory: 37M/303M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-javadoc-plugin:2.8:jar (attach-javadocs) on 
project joshua: MavenReportException: Error while creating archive:
[ERROR] Exit code: 1 - 
/usr/local/incubator-joshua/src/main/java/org/apache/joshua/decoder/ff/lm/LanguageModelFF.java:217:
 warning: no @param for rule
[ERROR] public int[] getRuleIds(final Rule rule) {
[ERROR] ^
[ERROR] 
/usr/local/incubator-joshua/src/main/java/org/apache/joshua/decoder/ff/lm/LanguageModelFF.java:217:
 warning: no @return
[ERROR] public int[] getRuleIds(final Rule rule) {
[ERROR] ^
[ERROR] 
/usr/local/incubator-joshua/src/main/java/org/apache/joshua/decoder/ff/lm/LanguageModelFF.java:231:
 warning: no @param for words
[ERROR] public int getOovs(final int[] words) {
[ERROR] ^
[ERROR] 
/usr/local/incubator-joshua/src/main/java/org/apache/joshua/decoder/ff/lm/LanguageModelFF.java:231:
 warning: no @return
[ERROR] public int 

[jira] [Assigned] (JOSHUA-279) Cannot build Joshua master branch

2016-07-01 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reassigned JOSHUA-279:
---

Assignee: Lewis John McGibbney

> Cannot build Joshua master branch
> -
>
> Key: JOSHUA-279
> URL: https://issues.apache.org/jira/browse/JOSHUA-279
> Project: Joshua
>  Issue Type: Bug
>  Components: build, documentation, tests
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1

[jira] [Commented] (JOSHUA-279) Cannot build Joshua master branch

2016-07-01 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359676#comment-15359676
 ] 

Lewis John McGibbney commented on JOSHUA-279:
-

commit 342312e309ec1bb9b1074688c1fbd3897783bc49
Author: Lewis John McGibbney 
Date:   Fri Jul 1 14:40:44 2016 -0700

JOSHUA-279 Cannot build Joshua master branch

The above commit fixes the Javadoc, and I can now build. The test suite is still 
failing, so I am still building with the -DskipTests flag.

> Cannot build Joshua master branch
> -
>
> Key: JOSHUA-279
> URL: https://issues.apache.org/jira/browse/JOSHUA-279
> Project: Joshua
>  Issue Type: Bug
>  Components: build, documentation, tests
>Reporter: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1

[jira] [Commented] (JOSHUA-324) Address Apache Joshua 6.1 RC#2 Issues

2017-01-25 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838087#comment-15838087
 ] 

Lewis John McGibbney commented on JOSHUA-324:
-

[~post] the only pending issue is the mvn assembly issue I described at 
http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg02023.html
I'll have a crack today and try to resolve it.

> Address Apache Joshua 6.1 RC#2 Issues
> -
>
> Key: JOSHUA-324
> URL: https://issues.apache.org/jira/browse/JOSHUA-324
> Project: Joshua
>  Issue Type: Task
>Affects Versions: 6.1
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> Feedback from [~jmclean] (thank you Justin) on our RC#2 is as follows
> {code}
> ==
> - You're missing incubating in the release artifact name. [1]
> - There are a number of binary files in the source release that look to be
> compiled source code.
> I checked:
> - name doesn’t include incubating
> - signatures and hashes correct
> - DISCLAIMER exists
> - LICENSE is missing a few things (see below)
> - a source file is missing an Apache header [7]
> - Several unexpected binary files are contained in the source release
> [8][9][10][11]
> - Can compile from source
> License is missing:
> - MIT licensed normalize.css v3.0.3 bundled in [5]
> - glyph icon fonts [6]
> Not an issue but it's a little odd to have LICENSE and NOTICE.txt - usually
> both are bare or both have .txt extension.
> Also while looking at your site I noticed that the download links of your
> incubating site [2] point to GitHub; please change them to point to the official
> release area.
> Also the 6.1 release has already been tagged and is available for public
> download on GitHub [4] before this vote is finished. This is IMO against
> Apache release policy [3]; please remove it.
> I also notice you recently released the language packs (18th Nov) but there
> doesn’t seem to have been a vote for that? Any reason for this?
> ===
> [1] http://incubator.apache.org/incubation/Incubation_Policy.html#Releases
> [2] 
> https://cwiki.apache.org/confluence/display/JOSHUA/Apache+Joshua+%28Incubating%29+Home
> [3] http://www.apache.org/dev/release.html#what
> [4] https://github.com/apache/incubator-joshua/releases
> [5] ./demo/bootstrap/css/bootstrap.min.css
> [6] apache-joshua-6.1/demo/bootstrap/fonts/*
> [7] ./src/test/java/org/apache/joshua/decoder/ff/tm/OwnerMapTest.java
> [8] ./bin/GIZA++
> [9] ./bin/mkcls
> [10] ./bin/snt2cooc.out
> [11] ./src/test/resources/berkeley_lm/lm.berkeleylm.gz
> [12] http://www.mail-archive.com/general%40incubator.apache.org/msg57543.html
> [13] http://www.mail-archive.com/general%40incubator.apache.org/msg57551.html
> {code}
> This is a blocking issue and until addressed we cannot release 6.1-incubating
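
Two of the review items above (artifact naming and hashes) are easy to script before cutting the next RC. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the artifact name `apache-joshua-6.1-incubating` is illustrative only, and SHA-512 is assumed as the digest since the feedback does not say which hash was checked.

```python
import hashlib


def has_incubating_in_name(artifact_name: str) -> bool:
    """Check that a release artifact name carries the 'incubating' marker."""
    return "incubating" in artifact_name.lower()


def sha512_hex(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Illustrative names only; the real artifact names come from the release build.
    for name in ("apache-joshua-6.1", "apache-joshua-6.1-incubating"):
        print(name, has_incubating_in_name(name))
```

The digest printed by `sha512_hex` would be compared by hand against the published checksum file; signature verification is a separate step not covered here.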





[jira] [Commented] (JOSHUA-324) Address Apache Joshua 6.1 RC#2 Issues

2017-02-21 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876442#comment-15876442
 ] 

Lewis John McGibbney commented on JOSHUA-324:
-

[~teofili] yes thank you very much, please do.

> Address Apache Joshua 6.1 RC#2 Issues
> -
>
> Key: JOSHUA-324
> URL: https://issues.apache.org/jira/browse/JOSHUA-324
> Project: Joshua
>  Issue Type: Task
>Affects Versions: 6.1
>Reporter: Lewis John McGibbney
>Assignee: Tommaso Teofili
>Priority: Blocker
> Fix For: 6.1



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (JOSHUA-324) Address Apache Joshua 6.1 RC#2 Issues

2017-01-17 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827360#comment-15827360
 ] 

Lewis John McGibbney commented on JOSHUA-324:
-

I'll be finishing my QA and producing an RC#3 tomorrow folks. Thanks.
I've just committed 
{code}
commit ae755a8bc0b1de9475285fcc8d35d8a8b5f00a6f
Author: Lewis John McGibbney 
Date:   Tue Jan 17 19:12:10 2017 -0800

JOSHUA-324 Address Apache Joshua 6.1 RC#2 Issues
{code}

> Address Apache Joshua 6.1 RC#2 Issues
> -
>
> Key: JOSHUA-324
> URL: https://issues.apache.org/jira/browse/JOSHUA-324
> Project: Joshua
>  Issue Type: Task
>Affects Versions: 6.1
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1





[jira] [Commented] (JOSHUA-297) List supported versions of Hadoop

2016-08-24 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436177#comment-15436177
 ] 

Lewis John McGibbney commented on JOSHUA-297:
-

The supported Hadoop version is 2.5.2; see
https://github.com/joshua-decoder/thrax/blob/master/.classpath#L8


> List supported versions of Hadoop
> -
>
> Key: JOSHUA-297
> URL: https://issues.apache.org/jira/browse/JOSHUA-297
> Project: Joshua
>  Issue Type: Task
>Reporter: Bob Paulin
>Assignee: Matt Post
>Priority: Minor
> Fix For: 6.1
>
> Attachments: thrax-hadoop0.20.2.log, thrax-hadoop2.6.4.log
>
>
> When working through the training tutorial I noticed that no version of 
> Hadoop was listed so I tried the latest Hadoop 2.6.4.  The Thrax Job failed 
> on this version.  It worked however with 0.20.2 .  I found this on 
> http://joshua.incubator.apache.org/6.0/pipeline.html by hovering over a link 
> on the Hadoop section.  





[jira] [Commented] (JOSHUA-304) word-align.conf alignment template file not compatible with berkeley aligner

2016-08-29 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15446643#comment-15446643
 ] 

Lewis John McGibbney commented on JOSHUA-304:
-

Hi [~post]
What new steps did you actually add?
I've wiped everything that was generated by Joshua and rebuilt the JOSHUA-304 
branch. I'm getting the following:

{code}
$JOSHUA/bin/pipeline.pl --type hiero \
  --rundir /usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0 \
  --readme "Baseline Hiero run 0 --lm-gen berkeleylm --lm berkeleylm --aligner berkeley JOSHUA-304" \
  --source es --target en --lm-gen berkeleylm --lm berkeleylm --aligner berkeley \
  --corpus $SPANISH/corpus/asr/callhome_train \
  --corpus $SPANISH/corpus/asr/fisher_train \
  --tune $SPANISH/corpus/asr/fisher_dev \
  --test $SPANISH/corpus/asr/callhome_devtest
...
snip
...
[test-vocab-es] rebuilding...
  dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/corpus.es [CHANGED]
  dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/vocab.es [NOT FOUND]
  cmd=cat /usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/corpus.es | /usr/local/incubator-joshua/scripts/training/build-vocab.pl > /usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/vocab.es
  took 0 seconds (0s)
[test-vocab-en] rebuilding...
  dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/corpus.en [CHANGED]
  dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/vocab.en [NOT FOUND]
  cmd=cat /usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/corpus.en | /usr/local/incubator-joshua/scripts/training/build-vocab.pl > /usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/vocab.en
  took 0 seconds (0s)
[source-numlines] rebuilding...
  dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/train/corpus.es [CHANGED]
  cmd=cat /usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/train/corpus.es | wc -l
  took 0 seconds (0s)
[source-numlines] retrieved cached result =>   151810
[berkeley-aligner-chunk-0] rebuilding...
  dep=alignments/0/word-align.conf [CHANGED]
  dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/train/splits/corpus.es.0 [NOT FOUND]
  dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/train/splits/corpus.en.0 [NOT FOUND]
  dep=alignments/0/training.align [NOT FOUND]
  cmd=java -d64 -Xmx10g -jar /usr/local/incubator-joshua/ext/berkeleyaligner/distribution/berkeleyaligner.jar ++alignments/0/word-align.conf
  JOB FAILED (return code 1)
[aligner-combine] rebuilding...
  dep=alignments/0/training.en-es.align [NOT FOUND]
  dep=alignments/training.align [NOT FOUND]
  cmd=cat alignments/0/training.en-es.align > alignments/training.align
  JOB FAILED (return code 1)
cat: alignments/0/training.en-es.align: No such file or directory
{code}
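
For triage, the failing steps in output like the above can be picked out mechanically. What follows is a minimal sketch in Python, assuming only the log shape visible here: a "[step-name] rebuilding..." header followed by indented lines, with "JOB FAILED (return code N)" marking a failure. The function name failed_steps is hypothetical and not part of Joshua.

```python
import re

# Assumed log shape, taken from the pipeline output quoted in this comment:
#   [step-name] rebuilding...
#     JOB FAILED (return code N)
STEP_RE = re.compile(r"^\[(?P<step>[^\]]+)\] rebuilding\.\.\.")
FAIL_RE = re.compile(r"^\s*JOB FAILED \(return code (?P<code>\d+)\)")


def failed_steps(log_text):
    """Return (step, return_code) pairs for every step that reports JOB FAILED."""
    failures = []
    current = None  # name of the step whose lines we are currently reading
    for line in log_text.splitlines():
        step = STEP_RE.match(line)
        if step:
            current = step.group("step")
            continue
        fail = FAIL_RE.match(line)
        if fail and current is not None:
            failures.append((current, int(fail.group("code"))))
    return failures


if __name__ == "__main__":
    sample = (
        "[source-numlines] rebuilding...\n"
        "  took 0 seconds (0s)\n"
        "[berkeley-aligner-chunk-0] rebuilding...\n"
        "  JOB FAILED (return code 1)\n"
        "[aligner-combine] rebuilding...\n"
        "  JOB FAILED (return code 1)\n"
    )
    for step, code in failed_steps(sample):
        print(step, code)
```

In the run above, the first failure reported this way is berkeley-aligner-chunk-0; the aligner-combine failure looks like a consequence of it, since its input file was never produced.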


[jira] [Commented] (JOSHUA-304) word-align.conf alignment template file not compatible with berkeley aligner

2016-08-29 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15446876#comment-15446876
 ] 

Lewis John McGibbney commented on JOSHUA-304:
-

[~post] no problem at all, no need to apologize.
I just tested after a clean download of the third-party deps, and this works 
like a charm. Thanks for looking into it, I really appreciate it.
I am +1 for merging into master and resolving this as fixed [~post]

> word-align.conf alignment template file not compatible with berkeley aligner
> 
>
> Key: JOSHUA-304
> URL: https://issues.apache.org/jira/browse/JOSHUA-304
> Project: Joshua
>  Issue Type: Bug
>  Components: alignment, berkeley, templates
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> It took me quite some time to debug what was going on and why pipelines 
> were failing when using the Berkeley aligner.
> It turns out that the word-align.conf template provided at
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf
> is not compatible with the Berkeley aligner. 
> In particular, the following lines are not compatible:
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf#L12-L15
> Evidence of this is provided below
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1, HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'JOINT JOINT'; valid choices: FORWARD|REVERSE|BOTH_INDEP|JOINT
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Exception in thread "main" java.lang.NumberFormatException: For input string: "5 5"
>   at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>   at java.lang.Integer.parseInt(Integer.java:580)
>   at java.lang.Integer.parseInt(Integer.java:615)
>   at edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:143)
>   at edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:240)
>   at edu.berkeley.nlp.fig.basic.OptInfo.set(OptionsParser.java:294)
>   at edu.berkeley.nlp.fig.basic.OptionsParser.readOptionsFile(OptionsParser.java:555)
>   at edu.berkeley.nlp.fig.basic.OptionsParser.doParse(OptionsParser.java:604)
>   at edu.berkeley.nlp.fig.exec.Execution.init(Execution.java:293)
>   at edu.berkeley.nlp.wordAlignment.Main.main(Main.java:149)
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Cannot create directory: alignments/0
> {code}





[jira] [Commented] (JOSHUA-299) Move regression tests to proper unit tests

2016-09-07 Thread lewis john mcgibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15471850#comment-15471850
 ] 

lewis john mcgibbney commented on JOSHUA-299:
-

Nope, I did not, sorry. Please proceed!




-- 
http://home.apache.org/~lewismc/
@hectorMcSpector
http://www.linkedin.com/in/lmcgibbney


> Move regression tests to proper unit tests
> --
>
> Key: JOSHUA-299
> URL: https://issues.apache.org/jira/browse/JOSHUA-299
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
>
> Many of the regression tests (test*.sh under src/test/resources) have been 
> moved to proper unit tests, but this move should be completed, and the 
> regression tests should be deleted. This should be done for 6.1





[jira] [Commented] (JOSHUA-299) Move regression tests to proper unit tests

2016-09-09 Thread lewis john mcgibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15477498#comment-15477498
 ] 

lewis john mcgibbney commented on JOSHUA-299:
-

Running mvn clean test is the way to go.











[jira] [Updated] (JOSHUA-312) Even though alignment is cached, it is always re-done in pipeline re-execution

2016-09-22 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-312:

Summary: Even though alignment is cached, it is always re-done in pipeline 
re-execution  (was: Alignment is never cached)

> Even though alignment is cached, it is always re-done in pipeline re-execution
> --
>
> Key: JOSHUA-312
> URL: https://issues.apache.org/jira/browse/JOSHUA-312
> Project: Joshua
>  Issue Type: Improvement
>  Components: alignment
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Critical
> Fix For: 6.2
>
>
> Suppose a pipeline fails after alignment. The alignment result is never 
> cached, so alignment has to be run all over again.
> We should investigate how to cache alignments, as it would really speed up 
> rerunning end-to-end pipelines on large input datasets.





[jira] [Created] (JOSHUA-312) Alignment is never cached

2016-09-21 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-312:
---

 Summary: Alignment is never cached
 Key: JOSHUA-312
 URL: https://issues.apache.org/jira/browse/JOSHUA-312
 Project: Joshua
  Issue Type: Improvement
  Components: alignment
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Priority: Critical
 Fix For: 6.2


Suppose a pipeline fails after alignment. The alignment result is never cached, 
so alignment has to be run all over again.
We should investigate how to cache alignments, as it would really speed up 
rerunning end-to-end pipelines on large input datasets.
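
One way to frame the investigation: a step can be skipped on re-execution when its outputs exist and neither its command nor the content of its dependencies has changed since the last successful run. The following is a minimal sketch of that idea in Python. It is not how pipeline.pl decides today, and the names step_fingerprint, needs_rerun and the store dictionary are hypothetical.

```python
import hashlib
import os


def step_fingerprint(cmd, deps):
    """Hash the command string plus the bytes of every dependency file.

    A missing dependency contributes a marker, so a file appearing or
    disappearing also changes the fingerprint.
    """
    h = hashlib.sha1()
    h.update(cmd.encode("utf-8"))
    for path in sorted(deps):
        h.update(path.encode("utf-8"))
        if os.path.exists(path):
            with open(path, "rb") as f:
                # Read in 1 MiB chunks so large corpora are not loaded at once.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
        else:
            h.update(b"<missing>")
    return h.hexdigest()


def needs_rerun(name, cmd, deps, outputs, store):
    """Return True if the step must run.

    `store` maps step name -> fingerprint recorded after the last good run.
    """
    if not all(os.path.exists(p) for p in outputs):
        return True
    return store.get(name) != step_fingerprint(cmd, deps)
```

After a step succeeds, the caller would record store[name] = step_fingerprint(cmd, deps) and persist the store alongside the run directory, so that a later re-execution can skip alignment when nothing it depends on has changed.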





[jira] [Commented] (JOSHUA-299) Move regression tests to proper unit tests

2016-08-22 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432062#comment-15432062
 ] 

Lewis John McGibbney commented on JOSHUA-299:
-

I'll scope this issue tomorrow [~post] and see if I can get a PR together.






[jira] [Assigned] (JOSHUA-299) Move regression tests to proper unit tests

2016-08-22 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney reassigned JOSHUA-299:
---

Assignee: Lewis John McGibbney






[jira] [Commented] (JOSHUA-304) word-align.conf alignment template file not compatible with berkeley aligner

2016-08-23 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434164#comment-15434164
 ] 

Lewis John McGibbney commented on JOSHUA-304:
-

Note that, in order to get past the exceptions thrown above, the template 
ended up looking like the following:
{code}
## word-align.conf
## --
## This is an example training script for the Berkeley
## word aligner.  In this configuration it uses two HMM
## alignment models trained jointly and then decoded 
## using the competitive thresholding heuristic.

##
# Training: Defines the training regimen 
##

forwardModels   HMM
reverseModels   HMM
mode    JOINT
iters   5

###
# Execution: Controls output and program flow 
###

execDir alignments/0
create
saveParams  false
numThreads  1
msPerLine   1
alignTraining

#
# Language/Data 
#

foreignSuffix   es.0
englishSuffix   en.0

# Choose the training sources, which can either be directories or files that list files/directories
trainSources /usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/data/train/splits/corpus
sentences   MAX
testSources /dev/null
overwriteExecDir true

#
# 1-best output 
#

competitiveThresholding

{code}
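
The errors reported in this issue all come from options that carried two values where the aligner accepted one. Below is a minimal sketch of a pre-flight check in Python, assuming the whitespace-separated "key value" layout of the template above and taking the valid choices from the aligner's own "Invalid enum" messages in this issue. The helper check_align_conf is hypothetical and not part of Joshua or the Berkeley aligner.

```python
# Valid choices copied from the aligner's "Invalid enum" messages in this issue.
MODEL_CHOICES = {"MODEL1", "MODEL2", "HMM", "SYNTACTIC", "NONE"}
MODE_CHOICES = {"FORWARD", "REVERSE", "BOTH_INDEP", "JOINT"}


def check_align_conf(text):
    """Return a list of problems found in a word-align.conf style string."""
    problems = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if key in ("forwardModels", "reverseModels"):
            if value not in MODEL_CHOICES:
                problems.append(f"{key}: invalid value {value!r}")
        elif key == "mode":
            if value not in MODE_CHOICES:
                problems.append(f"{key}: invalid value {value!r}")
        elif key == "iters":
            # Mirrors the NumberFormatException reported for "5 5".
            if not value.isdigit():
                problems.append(f"{key}: not a single integer: {value!r}")
    return problems
```

Run against the template shown above, this reports nothing. Run against a file with "forwardModels MODEL1 HMM" or "iters 5 5", it flags the same lines the aligner rejected. It only reflects the behaviour observed with this particular jar.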


[jira] [Commented] (JOSHUA-304) word-align.conf alignment template file not compatible with berkeley aligner

2016-08-24 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435133#comment-15435133
 ] 

Lewis John McGibbney commented on JOSHUA-304:
-

It may help to post the options available in the current Berkeley aligner jar, 
which was built when I installed Joshua:
{code}
lmcgibbn@LMC-032857 /usr/local/incubator-joshua(master) $ java -jar ./lib/berkeleyaligner.jar -help
Usage:
  log.maxIndLevel <  int> : Maximum indent level. [10]
  log.msPerLine <  int> : Maximum number of milliseconds between consecutive lines of output. [1000]
  log.file <  str> : File to write log. []
  log.stdout < bool> : Whether to output to the console. [true]
  log.note <  str> : Dummy placeholder for a comment []
  log.forcePrint < bool> : Force printing from logs* [false]
  log.maxPrintErrors <  int> : Maximum number of errors (via error()) to print [1]
  EMWordAligner.nullProb <  dbl> : How to assign null-word probabilities (=1 means 1/n) [1.0E-6]
  EMWordAligner.usePosteriorDecoding < bool> : Use posterior decoding (recommended for best performance). [true]
  EMWordAligner.posteriorDecodingThreshold <  dbl> : Threshold in [0,1] for deciding whether an alignment should exist. [0.5]
  EMWordAligner.mergeConsiderNull < bool> : When merging expected sufficient statistics, take into account the NULL (fix). [false]
  EMWordAligner.handleUnknownWords < bool> : Don't crash with unknown words (better to train on test set). [false]
  EMWordAligner.priorFraction <  dbl> : Fraction of a count to add for links in dictionary prior (1 works well). [0.0]
  EMWordAligner.numThreads <  int> : Number of concurrent threads to use during E-step (set to number of processors). [1]
  EMWordAligner.safeConcurrency < bool> : Safe concurrency (gets rid of concurrency warnings at the expense of speed) [false]
  EMWordAligner.evaluateDuringTraining < bool> : Whether to evaluate the model after each training iteration (slower, more memory). [false]
  TreeWalkModel.usePushProbabilities < bool> : Separate parameters for moving and pushing. [true]
  TreeWalkModel.conditionOnTag < bool> : Whether to condition distortion on the tag types. [true]
  TreeWalkModel.cacheTreePaths < bool> : Whether to cache paths through trees (uses lots of memory; faster). [false]
  Evaluator.searchForThreshold < bool> : Evaluate using line search [false]
  Evaluator.thresholdIntervals <  int> : Sets the number of intervals for posterior threshold line search [20]
  Evaluator.saveAlignmentObjects < bool> : Save object files for proposed alignments (large files) [false]
  Main.trainSources < str*> : Directories or files containing training files. [example/train]
  Main.testSources < str*> : Directory or file containing testing files. [example/test]
  Main.sentences <  int> : Maximum number of the training sentences to use [2147483647]
  Main.offsetTrainingSentences <  int> : Skip this number of the first training sentences [0]
  Main.maxTestSentences <  int> : Maximum number of the test sentences to use [2147483647]
  Main.offsetTestSentences <  int> : Skip this number of the first test sentences [0]
  Main.foreignSuffix <  str> : Foreign language file suffix [f]
  Main.englishSuffix <  str> : English language file suffix [e]
  Main.itgTrainTestSplitPoint <  int> : When writing test (ITG) posteriors, where to divide train/test data? [0]
  Main.itgInputDir <  str> : What directory should we dump ITG test data to? []
  Main.reverseAlignments < bool> : Reverse test set alignments (i.e., foreign to english) [false]
  Main.oneIndexed < bool> : Are alignments one-indexed (default == no, 0-indexed) [false]
  Main.lowercaseWords < bool> : Convert all words to lowercase [false]
  Main.leaveTrainingOnDisk < bool> : Don't load and store the training set upfront (slower, but less memory) [false]
  Main.saveRejects < bool> : Save rejected sentence pairs [false]
  Main.forwardModels : Which word alignment model to use in the forward direction. [MODEL1 HMM]
  Main.reverseModels : Which word alignment model to use in the backward direction. [MODEL1 HMM]
  Main.iters < int*> : Number of iterations to run the model. [5 5]
  Main.mode : Whether to train the two models jointly or independently. [JOINT JOINT]
  Main.trainingCacheMaxSize <  int> : Max sentence length for caching the HMM trellis (efficiency only). [100]
  Main.loadParamsDir <  str> : Directory to load parameters from. []
  Main.loadLexicalModelOnly < bool> : When true, the 
[jira] [Commented] (JOSHUA-304) word-align.conf alignment template file not compatible with berkeley aligner

2016-08-24 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435615#comment-15435615
 ] 

Lewis John McGibbney commented on JOSHUA-304:
-

ACK will do.






[jira] [Created] (JOSHUA-305) joshua-6.1-SNAPSHOT-source-release.zip takes ages to build

2016-08-24 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-305:
---

 Summary: joshua-6.1-SNAPSHOT-source-release.zip takes ages to build
 Key: JOSHUA-305
 URL: https://issues.apache.org/jira/browse/JOSHUA-305
 Project: Joshua
  Issue Type: Bug
  Components: build, core
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Priority: Blocker
 Fix For: 6.1


When someone runs mvn clean install, the joshua-6.1-SNAPSHOT-source-release.zip 
step takes absolutely ages to build. We should investigate why this is the case.





[jira] [Resolved] (JOSHUA-305) joshua-6.1-SNAPSHOT-source-release.zip takes ages to build

2016-08-24 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved JOSHUA-305.
-
Resolution: Not A Bug

This was due to a large language model being present within the joshua 
directory. This is not an issue.
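
To check for stray large files such as a language model before running the build, the largest files under the checkout can be listed. This is a minimal sketch in Python that assumes nothing about Joshua's layout; largest_files is a hypothetical helper.

```python
import os


def largest_files(root, top=10):
    """Return up to `top` (size_in_bytes, path) pairs under root, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    sizes.sort(reverse=True)
    return sizes[:top]


if __name__ == "__main__":
    for size, path in largest_files("."):
        print(f"{size:>14,d}  {path}")
```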






[jira] [Commented] (JOSHUA-319) test-decode decoder_command results in java.lang.NumberFormatException: For input string: "MAXSPAN"

2016-10-26 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15610743#comment-15610743
 ] 

Lewis John McGibbney commented on JOSHUA-319:
-

Some supplementary reading, folks:
http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg01769.html


> test-decode decoder_command results in java.lang.NumberFormatException: For 
> input string: "MAXSPAN"
> ---
>
> Key: JOSHUA-319
> URL: https://issues.apache.org/jira/browse/JOSHUA-319
> Project: Joshua
>  Issue Type: Bug
>  Components: decoders
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
>
> When I run the following command
> {code}
> /usr/local/incubator-joshua/bin/pipeline.pl --rundir . --type hiero \
>   --corpus /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en \
>   --tune /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.tune \
>   --test /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.test \
>   --source en --target ru \
>   --readme "Experiment 3 Run 1 of ru --> en model training" \
>   --aligner berkeley --hadoop-mem 10g \
>   --tmp /usr/local/hadoop-2.5.2/hadoop_tmp_dir \
>   --first-step test \
>   --grammar /usr/local/joshua_resources/russian_experiments/exp3/grammar.gz \
>   --joshua-mem 10g
> {code}
> I end up with the following message.
> {code}
> INFO - Parameters read from configuration file: joshua.config
> INFO - tm = 'TYPE -maxspan MAXSPAN -owner OWNER -path 
> /usr/local/joshua_resources/russian_experiments/exp3/test/1/model/grammar.gz.packed'
> INFO - tm = 'thrax -maxspan -1 -owner glue -path 
> /usr/local/joshua_resources/russian_experiments/exp3/test/1/model/grammar.glue'
> INFO - defaultnonterminal = 'X'
> INFO - goalsymbol = 'GOAL'
> INFO - markoovs = 'false'
> INFO - search = 'cky'
> INFO - pop-limit: 5000
> INFO - poplimit = '5000'
> INFO - topn = '300'
> INFO - useuniquenbest = 'true'
> INFO - outputformat = '%i ||| %s ||| %f ||| %c'
> INFO - includealignindex = 'false'
> INFO - featurefunction = 'OOVPenalty'
> INFO - featurefunction = 'WordPenalty'
> INFO - c = 'joshua.config'
> INFO - threads = '1'
> INFO - topn = '0'
> INFO - outputformat = '%s'
> INFO - Read 3 weights (0 of them dense)
> Exception in thread "main" java.lang.NumberFormatException: For input string: 
> "MAXSPAN"
>   at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>   at java.lang.Integer.parseInt(Integer.java:580)
>   at java.lang.Integer.parseInt(Integer.java:615)
>   at 
> org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:451)
>   at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:389)
>   at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:128)
>   at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
> {code}





[jira] [Created] (JOSHUA-317) SyntaxError: invalid syntax scripts/training/run_tuner.py", line 391

2016-10-26 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-317:
---

 Summary: SyntaxError: invalid syntax 
scripts/training/run_tuner.py", line 391
 Key: JOSHUA-317
 URL: https://issues.apache.org/jira/browse/JOSHUA-317
 Project: Joshua
  Issue Type: Bug
  Components: er
Affects Versions: 6.0.5
 Environment: Python 3.5
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: 6.1


{code}
[tune-bundle] rebuilding...
  dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
[CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp3/grammar.packed/slice_0.source
 [CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/model/run-joshua.sh
 [NOT FOUND]
  cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
--symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
/usr/local/joshua_resources/russian_experiments/exp3/tune/model 
--copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
-mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
"StateMinimizingLanguageModel -lm_order 5 -lm_file 
/usr/local/joshua_resources/russian_experiments/exp3/lm.kenlm"  -tm0/type hiero 
-tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
/usr/local/joshua_resources/russian_experiments/exp3/grammar.packed --tm 
/usr/local/joshua_resources/russian_experiments/exp3/data/tune/grammar.glue
  took 0 seconds (0s)
[mert-1] rebuilding...
  dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
[CHANGED]
  dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
[CHANGED]
  dep=tune/model/grammar.packed/slice_0.source [CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final
 [NOT FOUND]
  cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py 
/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru 
--tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner 
mert --decoder 
/usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command 
--decoder-config 
/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
--decoder-output-file 
/usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest 
--decoder-log-file 
/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log 
--iterations 10 --metric 'BLEU 4 closest'
  JOB FAILED (return code 1)
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 391
'ITERATIONS': `iterations`,
  ^
SyntaxError: invalid syntax
{code}
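
For context: the backtick form on line 391 is Python 2 shorthand for repr(), and Python 3 removed it, which is why the script does not even parse under Python 3.5. A minimal sketch of a portable spelling follows. Only the 'ITERATIONS' entry comes from the traceback above; the dictionary name and the sample value are hypothetical.

```python
# Hypothetical stand-in for the value run_tuner.py receives via --iterations.
iterations = 10

# Python 2 only:  'ITERATIONS': `iterations`,
# Portable form:  repr() parses and behaves the same on Python 2 and Python 3.
params = {
    'ITERATIONS': repr(iterations),
}

print(params['ITERATIONS'])
```

Any other backtick expressions in the script would need the same treatment before it can be imported under Python 3.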





[jira] [Updated] (JOSHUA-317) SyntaxError: invalid syntax scripts/training/run_tuner.py", line 391

2016-10-26 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-317:

Component/s: (was: er)
 tuner

> SyntaxError: invalid syntax scripts/training/run_tuner.py", line 391
> 
>
> Key: JOSHUA-317
> URL: https://issues.apache.org/jira/browse/JOSHUA-317
> Project: Joshua
>  Issue Type: Bug
>  Components: tuner
>Affects Versions: 6.0.5
> Environment: Python 3.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 6.1
>
>
> {code}
> [tune-bundle] rebuilding...
>   
> dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/model/run-joshua.sh
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
> --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
> /usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/model 
> --copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
> -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
> tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
> "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp3/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
> /usr/local/joshua_resources/russian_experiments/exp3/grammar.packed --tm 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/grammar.glue
>   took 0 seconds (0s)
> [mert-1] rebuilding...
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
> [CHANGED]
>   dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> [CHANGED]
>   dep=tune/model/grammar.packed/slice_0.source [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru 
> --tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner 
> mert --decoder 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command 
> --decoder-config 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> --decoder-output-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest 
> --decoder-log-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log 
> --iterations 10 --metric 'BLEU 4 closest'
>   JOB FAILED (return code 1)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 391
> 'ITERATIONS': `iterations`,
>   ^
> SyntaxError: invalid syntax
> {code}





[jira] [Created] (JOSHUA-319) test-decode decoder_command results in java.lang.NumberFormatException: For input string: "MAXSPAN"

2016-10-26 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-319:
---

 Summary: test-decode decoder_command results in 
java.lang.NumberFormatException: For input string: "MAXSPAN"
 Key: JOSHUA-319
 URL: https://issues.apache.org/jira/browse/JOSHUA-319
 Project: Joshua
  Issue Type: Bug
  Components: decoders
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: 6.1


When I run the following command
{code}
/usr/local/incubator-joshua/bin/pipeline.pl  --rundir . --type hiero --corpus 
/usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en --tune 
/usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.tune 
--test 
/usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.test 
--source en --target ru --readme "Experiment 3 Run 1 of ru --> en model 
training" --aligner berkeley --hadoop-mem 10g --tmp 
/usr/local/hadoop-2.5.2/hadoop_tmp_dir --first-step test --grammar 
/usr/local/joshua_resources/russian_experiments/exp3/grammar.gz --joshua-mem 10g
{code}
I end up with the following message.
{code}
INFO - Parameters read from configuration file: joshua.config
INFO - tm = 'TYPE -maxspan MAXSPAN -owner OWNER -path 
/usr/local/joshua_resources/russian_experiments/exp3/test/1/model/grammar.gz.packed'
INFO - tm = 'thrax -maxspan -1 -owner glue -path 
/usr/local/joshua_resources/russian_experiments/exp3/test/1/model/grammar.glue'
INFO - defaultnonterminal = 'X'
INFO - goalsymbol = 'GOAL'
INFO - markoovs = 'false'
INFO - search = 'cky'
INFO - pop-limit: 5000
INFO - poplimit = '5000'
INFO - topn = '300'
INFO - useuniquenbest = 'true'
INFO - outputformat = '%i ||| %s ||| %f ||| %c'
INFO - includealignindex = 'false'
INFO - featurefunction = 'OOVPenalty'
INFO - featurefunction = 'WordPenalty'
INFO - c = 'joshua.config'
INFO - threads = '1'
INFO - topn = '0'
INFO - outputformat = '%s'
INFO - Read 3 weights (0 of them dense)
Exception in thread "main" java.lang.NumberFormatException: For input string: 
"MAXSPAN"
at 
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at 
org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:451)
at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:389)
at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:128)
at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
{code}
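
The log above suggests that the placeholders TYPE, MAXSPAN and OWNER in the first tm line were never substituted, so Integer.parseInt receives the literal string "MAXSPAN". As a rough illustration, a pre-flight check along these lines could flag such a line before the decoder starts. The helper name check_tm_line is hypothetical and not part of Joshua, and the "/x" paths in the usage lines are shortened stand-ins.

```python
def check_tm_line(value):
    """Return a list of problems found in the value of one 'tm = ...' line."""
    tokens = value.split()
    problems = []
    # Walk over consecutive (flag, argument) pairs.
    for flag, arg in zip(tokens, tokens[1:]):
        if flag == '-maxspan':
            try:
                int(arg)
            except ValueError:
                problems.append('-maxspan is not an integer: %r' % arg)
    return problems


# The unsubstituted template line from the log is flagged ...
bad = check_tm_line('TYPE -maxspan MAXSPAN -owner OWNER -path /x')
# ... while the glue grammar line passes.
good = check_tm_line('thrax -maxspan -1 -owner glue -path /x')
print(bad, good)
```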





[jira] [Updated] (JOSHUA-316) run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a bytes-like object is required, not 'str'

2016-10-25 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-316:

Fix Version/s: (was: 6.2)
   6.1

> run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a 
> bytes-like object is required, not 'str'
> -
>
> Key: JOSHUA-316
> URL: https://issues.apache.org/jira/browse/JOSHUA-316
> Project: Joshua
>  Issue Type: Bug
>  Components: bundler
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Critical
> Fix For: 6.1
>
>
> {code}
> [glue-tune] rebuilding...
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/create_glue_grammar.sh 
> /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed > 
> /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>   took 1 seconds (1s)
> [tune-bundle] rebuilding...
>   
> dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/tune/model/run-joshua.sh
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
> --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
> /usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> /usr/local/joshua_resources/russian_experiments/exp2/tune/model 
> --copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
> -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
> tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
> "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
> /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed --tm 
> /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>   JOB FAILED (return code 1)
> * Running the copy-config.pl script with the command: 
> /usr/local/incubator-joshua/scripts/copy-config.pl -top-n 300 -output-format 
> "%i ||| %s ||| %f ||| %c" -mark-oovs false -search cky -weights "lm_0 1 
> tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " 
> -feature-function "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 748, in main
> operations = collect_operations(opts)
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 637, in collect_operations
> opts.copy_config_options
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 202, in filter_through_copy_config_script
> result, err = p.communicate(config_text)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1072, 
> in communicate
> stdout, stderr = self._communicate(input, endtime, timeout)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1700, 
> in _communicate
> input_view = memoryview(self._input)
> TypeError: memoryview: a bytes-like object is required, not 'str'
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 760, in <module>
> main(sys.argv)
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 751, in main
> error_quit(e.message)
> AttributeError: 'TypeError' object has no attribute 'message'
> * WARNING: no key 'outputformat' found in config file (appending to end)
> * WARNING: no key 'search' found in config file (appending to end)
> * WARNING: no key 'topn' found in config file (appending to end)
> * WARNING: no key 'markoovs' found in config file (appending to end)
> {code}
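
Both exceptions in the traceback above look like Python 2 to Python 3 differences: on Python 3, Popen.communicate() expects bytes when the pipes are opened in binary mode, and exception objects no longer carry a .message attribute. A minimal sketch of the portable pattern follows. The function names and the stand-in child process are hypothetical; this is not the run_bundler.py code itself.

```python
import subprocess
import sys


def filter_through(cmd, text):
    """Pipe text through cmd and return (stdout, stderr) as str."""
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # Binary pipes carry bytes: encode on the way in, decode on the way out.
    out, err = p.communicate(text.encode('utf-8'))
    return out.decode('utf-8'), err.decode('utf-8')


def describe(exc):
    """Return a printable message; str(exc) works where exc.message does not."""
    return str(exc)


# Stand-in child process (not copy-config.pl): upper-cases whatever it reads.
child = [sys.executable, '-c',
         'import sys; sys.stdout.write(sys.stdin.read().upper())']
result, err = filter_through(child, 'top-n = 300\n')
print(result.strip())
```

Opening the pipes in text mode (universal_newlines=True) would be an alternative to encoding by hand.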





[jira] [Created] (JOSHUA-318) scripts/training/run_tuner.py should enable configurable memory usage when invoking joshua-decoder

2016-10-26 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-318:
---

 Summary: scripts/training/run_tuner.py should enable configurable 
memory usage when invoking joshua-decoder
 Key: JOSHUA-318
 URL: https://issues.apache.org/jira/browse/JOSHUA-318
 Project: Joshua
  Issue Type: Improvement
  Components: tuner
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
 Fix For: 6.2


When I run the run_tuner.py script I can easily run into the following
{code}
[mert-1] rebuilding...
  dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en
  dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
[CHANGED]
  dep=tune/model/grammar.gz.packed/slice_0.source [CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final
 [NOT FOUND]
  cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py 
/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru 
--tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner 
mert --decoder 
/usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command 
--decoder-config 
/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
--decoder-output-file 
/usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest 
--decoder-log-file 
/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log 
--iterations 10 --metric 'BLEU 4 closest'
  JOB FAILED (return code 1)
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at 
org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.initializeFeatureStructures(PackedGrammar.java:385)
at 
org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.<init>(PackedGrammar.java:368)
at 
org.apache.joshua.decoder.ff.tm.packed.PackedGrammar.<init>(PackedGrammar.java:153)
at 
org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:458)
at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:389)
at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:128)
at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
Traceback (most recent call last):
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 553, 
in <module>
main(sys.argv)
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 536, 
in main
run_zmert(opts.tunedir, opts.source, opts.target, opts.decoder, 
opts.decoder_config, opts.decoder_output_file, opts)
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 417, 
in run_zmert
opts.metric, opts.iterations or 10)
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 399, 
in setup_configs
for feature,weight in get_features(config):
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 351, 
in get_features
output = check_output("%s/bin/joshua-decoder -c %s -show-weights -v 0" % 
(JOSHUA, config_file), shell=True)
  File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 626, in 
check_output
**kwargs).stdout
  File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 708, in 
run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 
'/usr/local/incubator-joshua/bin/joshua-decoder -c 
/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
-show-weights -v 0' returned non-zero exit status 1
{code}
This is because, by default, the joshua-decoder script runs with 4g of memory. 
The run_tuner.py script should be flexible enough to reuse the memory 
allocation provided when the pipeline was initially invoked. This value should then 
be passed to the joshua-decoder script.





[jira] [Commented] (JOSHUA-318) scripts/training/run_tuner.py should enable configurable memory usage when invoking joshua-decoder

2016-10-26 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15609503#comment-15609503
 ] 

Lewis John McGibbney commented on JOSHUA-318:
-

The following code is where the sh*t hits the fan
{code}
def get_features(config_file):
    """Queries the decoder for all dense features that will be fired by the
    feature functions activated in the config file"""

    output = check_output("%s/bin/joshua-decoder -c %s -show-weights -v 0" %
                          (JOSHUA, config_file), shell=True)
    features = []
    for index, item in enumerate(output.split('\n')):
        if item != "":
            features.append(tuple(item.split()))
    return features
{code}
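
One possible shape for the change, sketched under assumptions: the command is built as an argument list so that a memory setting can be threaded through, and the output parsing is split out so it also copes with the bytes that check_output() returns on Python 3. The "-m" flag is an assumption about the joshua-decoder wrapper and should be checked against the script. The helper names and the sample output in the usage lines are hypothetical.

```python
def decoder_command(joshua_home, config_file, mem=None):
    """Build the -show-weights command as an argument list."""
    cmd = ['%s/bin/joshua-decoder' % joshua_home]
    if mem:
        # Assumption: the wrapper script takes the heap size as "-m <size>".
        cmd += ['-m', mem]
    cmd += ['-c', config_file, '-show-weights', '-v', '0']
    return cmd


def parse_features(output):
    """Split decoder output into one tuple of fields per non-empty line."""
    if isinstance(output, bytes):
        # check_output() returns bytes on Python 3 unless told otherwise.
        output = output.decode('utf-8')
    return [tuple(line.split()) for line in output.split('\n') if line != '']


# Made-up sample output, for illustration only.
cmd = decoder_command('/usr/local/incubator-joshua', 'joshua.config', mem='10g')
features = parse_features(b'lm_0 1.0\n\ntm_pt_0 0.5\n')
print(cmd, features)
```

The list returned by decoder_command() could be passed to check_output() without shell=True, which also avoids quoting problems in paths.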

> scripts/training/run_tuner.py should enable configurable memory usage when 
> invoking joshua-decoder
> ---
>
> Key: JOSHUA-318
> URL: https://issues.apache.org/jira/browse/JOSHUA-318
> Project: Joshua
>  Issue Type: Improvement
>  Components: tuner
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
> Fix For: 6.2
>
>
> When I run the run_tuner.py script I can easily run into the following
> {code}
> [mert-1] rebuilding...
>   dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en
>   dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> [CHANGED]
>   dep=tune/model/grammar.gz.packed/slice_0.source [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en 
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru 
> --tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner 
> mert --decoder 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command 
> --decoder-config 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> --decoder-output-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest 
> --decoder-log-file 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log 
> --iterations 10 --metric 'BLEU 4 closest'
>   JOB FAILED (return code 1)
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.initializeFeatureStructures(PackedGrammar.java:385)
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.<init>(PackedGrammar.java:368)
>   at 
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar.<init>(PackedGrammar.java:153)
>   at 
> org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:458)
>   at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:389)
>   at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:128)
>   at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 553, 
> in <module>
> main(sys.argv)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 536, 
> in main
> run_zmert(opts.tunedir, opts.source, opts.target, opts.decoder, 
> opts.decoder_config, opts.decoder_output_file, opts)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 417, 
> in run_zmert
> opts.metric, opts.iterations or 10)
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 399, 
> in setup_configs
> for feature,weight in get_features(config):
>   File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 351, 
> in get_features
> output = check_output("%s/bin/joshua-decoder -c %s -show-weights -v 0" % 
> (JOSHUA, config_file), shell=True)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 626, in 
> check_output
> **kwargs).stdout
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 708, in 
> run
> output=stdout, stderr=stderr)
> subprocess.CalledProcessError: Command 
> '/usr/local/incubator-joshua/bin/joshua-decoder -c 
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config 
> -show-weights -v 0' returned non-zero exit status 1
> {code}
> This is because, by default, the joshua-decoder script runs with 4g of memory. 
> The run_tuner.py script should be flexible enough to reuse the memory 
> allocation provided when the pipeline was initially invoked. This value should then 
> be passed to the joshua-decoder script.





[jira] [Created] (JOSHUA-320) --joshua-mem pipeline parameter is not populated to mert processes

2016-10-27 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-320:
---

 Summary: --joshua-mem pipeline parameter is not populated to mert 
processes
 Key: JOSHUA-320
 URL: https://issues.apache.org/jira/browse/JOSHUA-320
 Project: Joshua
  Issue Type: Bug
  Components: mert, pipeline
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: 6.2


As we've discussed on the Joshua mailing list at 
http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg01765.html
it is not realistic to reserve only 4g for several of the tasks that are executed as 
part of a typical pipeline run.
In particular, MERT runs with 4g, which is not enough. We should increase this 
to something like 8g or more.





[jira] [Created] (JOSHUA-321) Add JOSHUA env to ./bin/bleu and ./bin/extract-1best bash scripts

2016-11-09 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-321:
---

 Summary: Add JOSHUA env to ./bin/bleu and ./bin/extract-1best bash 
scripts
 Key: JOSHUA-321
 URL: https://issues.apache.org/jira/browse/JOSHUA-321
 Project: Joshua
  Issue Type: Bug
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Trivial
 Fix For: 6.1


Right now neither bleu nor extract-1best sets the required $JOSHUA env 
variable, which results in an error if it is not set within the user's 
environment. This currently breaks the Homebrew install, amongst other things, so 
we should add it.





[jira] [Created] (JOSHUA-323) Joshua 6.1 Release Management

2016-11-10 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-323:
---

 Summary: Joshua 6.1 Release Management
 Key: JOSHUA-323
 URL: https://issues.apache.org/jira/browse/JOSHUA-323
 Project: Joshua
  Issue Type: Task
  Components: release, build
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Blocker
 Fix For: 6.1


This is a governing ticket, for reference more than anything else. We need to 
add all release-specific build additions to the parent pom.xml that enable us to 
roll a release candidate.
The process is also being documented over at 
https://cwiki.apache.org/confluence/display/JOSHUA/Joshua+Release+Management+Procedure





[jira] [Updated] (JOSHUA-317) SyntaxError: invalid syntax scripts/training/run_tuner.py", line 391

2016-11-10 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-317:

Fix Version/s: (was: 6.1)
   6.2

> SyntaxError: invalid syntax scripts/training/run_tuner.py", line 391
> 
>
> Key: JOSHUA-317
> URL: https://issues.apache.org/jira/browse/JOSHUA-317
> Project: Joshua
>  Issue Type: Bug
>  Components: tuner
>Affects Versions: 6.0.5
> Environment: Python 3.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 6.2





[jira] [Commented] (JOSHUA-323) Joshua 6.1 Release Management

2016-11-10 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656193#comment-15656193
 ] 

Lewis John McGibbney commented on JOSHUA-323:
-

Progress is going well. The RAT license headers are taking a wee while, but I will have 
them cracked by tomorrow. The following files are outstanding.
Progress can be tracked over on 
https://github.com/apache/incubator-joshua/pull/76
{code}
Files with unapproved licenses:

  scripts/analysis/sentence-by-sentence.pl
  scripts/analysis/tree_visualizer
  scripts/copy-config.pl
  scripts/distributedLM/config.template
  scripts/distributedLM/create_remote_sym_tbl.pl
  scripts/distributedLM/filter_lm.pl
  scripts/distributedLM/get_grammar_eng_voc.pl
  scripts/distributedLM/get_grammar_eng_voc_from_cn_voc.pl
  scripts/distributedLM/global_symol_list
  scripts/distributedLM/lm.list.withweights
  scripts/ems/config.ghkm
  scripts/ems/config.hiero
  scripts/ems/config.phrase
  scripts/ems/experiment.meta
  scripts/language-pack/build_lp.sh
  scripts/language-pack/README.template
  scripts/misc/canonical_path
  scripts/misc/iso639
  scripts/preparation/detokenize.pl
  scripts/preparation/lowercase.pl
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.ca
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.cs
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.de
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.el
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.en
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.es
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.fr
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.hu
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.is
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.it
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.lv
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.nl
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.pl
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.pt
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.ro
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.ru
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.sk
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.sl
  scripts/preparation/nonbreaking_prefixes/nonbreaking_prefix.sv
  scripts/preparation/normalize.pl
  scripts/preparation/tokenize.pl
  scripts/support/bbn2plf.pl
  scripts/support/extract-1best
  scripts/support/grammar-packer.pl
  scripts/support/moses2joshua.pl
  scripts/support/moses2joshua_grammar.pl
  scripts/support/phrase2hiero.py
  scripts/support/score-hypothesis.pl
  scripts/support/split2files
  scripts/training/add-OOVs.pl
  scripts/training/build-vocab.pl
  scripts/training/cachepipe/bashrc
  scripts/training/cachepipe/CachePipe.pm
  scripts/training/filter-empty-lines.pl
  scripts/training/filter-rules.pl
  scripts/training/get_grammar_features.pl
  scripts/training/lowercase-leaves.pl
  scripts/training/mira/feature_label_munger.pl
  scripts/training/mira/run-mira.pl
  scripts/training/paralign.pl
  scripts/training/parallelize/LocalConfig.pm
  scripts/training/parallelize/Makefile
  scripts/training/parallelize/parallelize.pl
  scripts/training/parallelize/sentclient.c
  scripts/training/parallelize/sentserver.c
  scripts/training/parallelize/sentserver.h
  scripts/training/paste
  scripts/training/run-giza.pl
  scripts/training/scat
  scripts/training/summarize.pl
  scripts/training/templates/alignment/jacana/resources/model/tagdict
  scripts/training/templates/alignment/word-align.conf
  scripts/training/templates/glue-grammar
  scripts/training/templates/glue-grammar.itg
  scripts/training/templates/hadoop/core-site.xml
  scripts/training/templates/hadoop/hdfs-site.xml
  scripts/training/templates/hadoop/mapred-site.xml
  scripts/training/templates/hadoop/masters
  scripts/training/templates/hadoop/slaves
  scripts/training/templates/thrax-hiero.conf
  scripts/training/templates/thrax-phrasal.conf
  scripts/training/templates/thrax-phrase-gt.conf
  scripts/training/templates/thrax-phrase.conf
  scripts/training/templates/thrax-samt.conf
  scripts/training/templates/tune/decoder_command
  scripts/training/templates/tune/decoder_command.qsub
  scripts/training/templates/tune/joshua.config
  scripts/training/TODO
  scripts/training/trim_parallel_corpus.pl
  scripts/training/unmap-html.pl
{code}

> Joshua 6.1 Release Management
> -
>
> Key: JOSHUA-323
> URL: https://issues.apache.org/jira/browse/JOSHUA-323
> Project: Joshua
>  Issue Type: Task
>  Components: build, release
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> This is a 

[jira] [Commented] (JOSHUA-323) Joshua 6.1 Release Management

2016-11-11 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656783#comment-15656783
 ] 

Lewis John McGibbney commented on JOSHUA-323:
-

All licensing issues are now addressed and merged into master. I still have 
some work to do on release packaging, which is not quite up to scratch, but I 
will work on that tomorrow.

> Joshua 6.1 Release Management
> -
>
> Key: JOSHUA-323
> URL: https://issues.apache.org/jira/browse/JOSHUA-323
> Project: Joshua
>  Issue Type: Task
>  Components: build, release
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> This is a governing ticket for reference more than anything else. We need to 
> add all release specific build additions to parent pom.xml which enable us to 
> roll a release candidate.
> The process is also being documented over at 
> https://cwiki.apache.org/confluence/display/JOSHUA/Joshua+Release+Management+Procedure



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (JOSHUA-312) Even though alignment is cached, it is always re-done in pipeline re-execution

2016-10-18 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586111#comment-15586111
 ] 

Lewis John McGibbney commented on JOSHUA-312:
-

boom goes the dynamite :)
Thanks [~post]

> Even though alignment is cached, it is always re-done in pipeline re-execution
> --
>
> Key: JOSHUA-312
> URL: https://issues.apache.org/jira/browse/JOSHUA-312
> Project: Joshua
>  Issue Type: Improvement
>  Components: alignment
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Critical
> Fix For: 6.1
>
>
> Say a pipeline fails after alignment. The alignment result is never cached 
> and it becomes necessary to undertake alignment... again!
> We should investigate the process for caching alignments as it would really 
> speed up rerunning end-to-end pipelines for large input datasets.





[jira] [Created] (JOSHUA-316) run_bundler.py returning JOB FAILED (return code 1)

2016-10-21 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-316:
---

 Summary: run_bundler.py returning JOB FAILED (return code 1)
 Key: JOSHUA-316
 URL: https://issues.apache.org/jira/browse/JOSHUA-316
 Project: Joshua
  Issue Type: Bug
  Components: bundler
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Priority: Critical
 Fix For: 6.2


{code}
[glue-tune] rebuilding...
  
dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
 [CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue 
[NOT FOUND]
  cmd=/usr/local/incubator-joshua/scripts/support/create_glue_grammar.sh 
/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed > 
/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
  took 1 seconds (1s)
[tune-bundle] rebuilding...
  dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
[CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
 [CHANGED]
  
dep=/usr/local/joshua_resources/russian_experiments/exp2/tune/model/run-joshua.sh
 [NOT FOUND]
  cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
--symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
/usr/local/joshua_resources/russian_experiments/exp2/tune/model 
--copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
-mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
"StateMinimizingLanguageModel -lm_order 5 -lm_file 
/usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type hiero 
-tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed --tm 
/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
  JOB FAILED (return code 1)
* Running the copy-config.pl script with the command: 
/usr/local/incubator-joshua/scripts/copy-config.pl -top-n 300 -output-format 
"%i ||| %s ||| %f ||| %c" -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 
1 tm_pt_1 1 tm_pt_2 1 tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " 
-feature-function "StateMinimizingLanguageModel -lm_order 5 -lm_file 
/usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type hiero 
-tm0/owner pt -tm0/maxspan 20 -tm1/owner glue
Traceback (most recent call last):
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 748, 
in main
operations = collect_operations(opts)
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 637, 
in collect_operations
opts.copy_config_options
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 202, 
in filter_through_copy_config_script
result, err = p.communicate(config_text)
  File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1072, in 
communicate
stdout, stderr = self._communicate(input, endtime, timeout)
  File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1700, in 
_communicate
input_view = memoryview(self._input)
TypeError: memoryview: a bytes-like object is required, not 'str'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 760, 
in 
main(sys.argv)
  File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 751, 
in main
error_quit(e.message)
AttributeError: 'TypeError' object has no attribute 'message'
* WARNING: no key 'outputformat' found in config file (appending to end)
* WARNING: no key 'search' found in config file (appending to end)
* WARNING: no key 'topn' found in config file (appending to end)
* WARNING: no key 'markoovs' found in config file (appending to end)
{code}
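
The two tracebacks above look like Python 2 idioms running under Python 3.5: communicate() is handed a str while the pipes are binary, and the error handler then reads e.message, which Python 3 exceptions do not have. Below is a minimal sketch of one possible fix, not the actual patch. The function names filter_through_command and describe_error, and the stand-in child command, are hypothetical and only illustrate the pattern; the real code in run_bundler.py may differ.

```python
import subprocess
import sys


def filter_through_command(cmd, config_text):
    """Pipe config_text through cmd and return the child's stdout as text.

    Hypothetical stand-in for filter_through_copy_config_script; the real
    function in run_bundler.py may be structured differently.
    """
    p = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # On Python 3 these pipes are binary by default, so the input must be
    # bytes. Encode on the way in and decode on the way out.
    result, err = p.communicate(config_text.encode("utf-8"))
    if p.returncode != 0:
        raise RuntimeError(err.decode("utf-8", errors="replace"))
    return result.decode("utf-8")


def describe_error(e):
    # Exceptions have no .message attribute on Python 3; str(e) works on
    # both Python 2 and Python 3.
    return str(e)


# Assumption for illustration only: a child process that upper-cases its
# stdin, used here in place of copy-config.pl.
UPPERCASE_CMD = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

if __name__ == "__main__":
    print(filter_through_command(UPPERCASE_CMD, "top-n = 300\n"), end="")
    print(describe_error(TypeError("a bytes-like object is required")))
```

An alternative would be to pass universal_newlines=True to Popen so the pipes operate in text mode, at the cost of relying on the locale's default encoding.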



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (JOSHUA-316) run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a bytes-like object is required, not 'str'

2016-10-21 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-316:

Summary: run_bundler.py returning JOB FAILED (return code 1) TypeError: 
memoryview: a bytes-like object is required, not 'str'  (was: run_bundler.py 
returning JOB FAILED (return code 1))

> run_bundler.py returning JOB FAILED (return code 1) TypeError: memoryview: a 
> bytes-like object is required, not 'str'
> -
>
> Key: JOSHUA-316
> URL: https://issues.apache.org/jira/browse/JOSHUA-316
> Project: Joshua
>  Issue Type: Bug
>  Components: bundler
>Affects Versions: 6.0.5
>Reporter: Lewis John McGibbney
>Priority: Critical
> Fix For: 6.2
>
>
> {code}
> [glue-tune] rebuilding...
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/create_glue_grammar.sh 
> /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed > 
> /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>   took 1 seconds (1s)
> [tune-bundle] rebuilding...
>   
> dep=/usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/grammar.packed/slice_0.source
>  [CHANGED]
>   
> dep=/usr/local/joshua_resources/russian_experiments/exp2/tune/model/run-joshua.sh
>  [NOT FOUND]
>   cmd=/usr/local/incubator-joshua/scripts/support/run_bundler.py --force 
> --symlink --absolute --verbose -T /usr/local/hadoop-2.5.2/hadoop_tmp_dir 
> /usr/local/incubator-joshua/scripts/training/templates/tune/joshua.config 
> /usr/local/joshua_resources/russian_experiments/exp2/tune/model 
> --copy-config-options '-top-n 300 -output-format "%i ||| %s ||| %f ||| %c" 
> -mark-oovs false -search cky -weights "lm_0 1 tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 
> tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " -feature-function 
> "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue' --pack-tm 
> /usr/local/joshua_resources/russian_experiments/exp2/grammar.packed --tm 
> /usr/local/joshua_resources/russian_experiments/exp2/data/tune/grammar.glue
>   JOB FAILED (return code 1)
> * Running the copy-config.pl script with the command: 
> /usr/local/incubator-joshua/scripts/copy-config.pl -top-n 300 -output-format 
> "%i ||| %s ||| %f ||| %c" -mark-oovs false -search cky -weights "lm_0 1 
> tm_pt_0 1 tm_pt_1 1 tm_pt_2 1 tm_pt_3 1 tm_pt_4 1 tm_pt_5 1 tm_glue_0 1 " 
> -feature-function "StateMinimizingLanguageModel -lm_order 5 -lm_file 
> /usr/local/joshua_resources/russian_experiments/exp2/lm.kenlm"  -tm0/type 
> hiero -tm0/owner pt -tm0/maxspan 20 -tm1/owner glue
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 748, in main
> operations = collect_operations(opts)
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 637, in collect_operations
> opts.copy_config_options
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 202, in filter_through_copy_config_script
> result, err = p.communicate(config_text)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1072, 
> in communicate
> stdout, stderr = self._communicate(input, endtime, timeout)
>   File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 1700, 
> in _communicate
> input_view = memoryview(self._input)
> TypeError: memoryview: a bytes-like object is required, not 'str'
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 760, in 
> main(sys.argv)
>   File "/usr/local/incubator-joshua/scripts/support/run_bundler.py", line 
> 751, in main
> error_quit(e.message)
> AttributeError: 'TypeError' object has no attribute 'message'
> * WARNING: no key 'outputformat' found in config file (appending to end)
> * WARNING: no key 'search' found in config file (appending to end)
> * WARNING: no key 'topn' found in config file (appending to end)
> * WARNING: no key 'markoovs' found in config file (appending to end)
> {code}





[jira] [Updated] (JOSHUA-290) Provide Joshua artifact as a bundle

2016-11-14 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-290:

Fix Version/s: 6.2

> Provide Joshua artifact as a bundle
> ---
>
> Key: JOSHUA-290
> URL: https://issues.apache.org/jira/browse/JOSHUA-290
> Project: Joshua
>  Issue Type: Task
>  Components: build
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 6.2
>
>
> I think it'd be good if we could make the Joshua artifact an OSGi _bundle_.
> This would have no impact on plain java applications but would give the 
> following benefits:
> - make it possible to install it in OSGi environments
> - optionally introduce semantic versioning (together with the baseline 
> plugin), which would help track e.g. whether changes in APIs break backward 
> compatibility





[jira] [Updated] (JOSHUA-314) Enable set structured-output from config file

2016-11-14 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-314:

Fix Version/s: 6.2

> Enable set structured-output from config file
> -
>
> Key: JOSHUA-314
> URL: https://issues.apache.org/jira/browse/JOSHUA-314
> Project: Joshua
>  Issue Type: Improvement
>  Components: core
>Reporter: Tommaso Teofili
> Fix For: 6.2
>
>
> Currently, setting _use-structured-output = true_ in joshua.config results in 
> an error when parsing the config, as it is not explicitly handled by 
> {{JoshuaConfiguration#readConfig}} (it can only be set programmatically). I 
> think it'd be nice to be able to configure it from the config file too.





[jira] [Updated] (JOSHUA-51) add jhclark/bigfatlm

2016-11-14 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-51?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-51:
---
Fix Version/s: (was: 6.1)
   6.2

> add jhclark/bigfatlm
> 
>
> Key: JOSHUA-51
> URL: https://issues.apache.org/jira/browse/JOSHUA-51
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Matt Post
> Fix For: 6.2
>
>
> It would be nice to leverage more Hadoop tools in the pipeline.





[jira] [Updated] (JOSHUA-51) add jhclark/bigfatlm

2016-11-14 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-51?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated JOSHUA-51:
---
Fix Version/s: 6.1

> add jhclark/bigfatlm
> 
>
> Key: JOSHUA-51
> URL: https://issues.apache.org/jira/browse/JOSHUA-51
> Project: Joshua
>  Issue Type: Bug
>Reporter: Matt Post
>Assignee: Matt Post
> Fix For: 6.2
>
>
> It would be nice to leverage more Hadoop tools in the pipeline.





[jira] [Resolved] (JOSHUA-323) Joshua 6.1 Release Management

2016-11-14 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved JOSHUA-323.
-
Resolution: Fixed

> Joshua 6.1 Release Management
> -
>
> Key: JOSHUA-323
> URL: https://issues.apache.org/jira/browse/JOSHUA-323
> Project: Joshua
>  Issue Type: Task
>  Components: build, release
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
>Priority: Blocker
> Fix For: 6.1
>
>
> This is a governing ticket for reference more than anything else. We need to 
> add all release specific build additions to parent pom.xml which enable us to 
> roll a release candidate.
> The process is also being documented over at 
> https://cwiki.apache.org/confluence/display/JOSHUA/Joshua+Release+Management+Procedure




