[GitHub] incubator-joshua pull request #64: Fixed Rescoring test

2016-09-15 Thread fhieber
GitHub user fhieber opened a pull request:

https://github.com/apache/incubator-joshua/pull/64

Fixed Rescoring test



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhieber/incubator-joshua 7

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-joshua/pull/64.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #64


commit 695feb33293423ab6fca1d8f03b2eb0f4fe9393a
Author: Hieber, Felix <fhie...@amazon.de>
Date:   2016-09-15T15:16:56Z

Fixed Rescoring test




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #63: Fixed more tests in 7

2016-09-15 Thread fhieber
GitHub user fhieber opened a pull request:

https://github.com/apache/incubator-joshua/pull/63

Fixed more tests in 7



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhieber/incubator-joshua 7_new

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-joshua/pull/63.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #63


commit 1091b9e8d49165a2ec9f20b62f3a15b37ffb4dda
Author: Hieber, Felix <fhie...@amazon.de>
Date:   2016-09-15T14:57:01Z

Fixed SourceAnnotations Test for 7

commit 0d99e4d47f868d83afcb34379fbf984a5452dca1
Author: Hieber, Felix <fhie...@amazon.de>
Date:   2016-09-15T14:58:52Z

Fixed TooLongTest for 7

commit c6cd09748312f0338c1002f2adb19fd8308fcb7d
Author: Hieber, Felix <fhie...@amazon.de>
Date:   2016-09-15T15:05:10Z

Merge remote-tracking branch 'upstream/7' into 7_new




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #62: Fixed NumTranslationOptions test and adde...

2016-09-15 Thread fhieber
GitHub user fhieber opened a pull request:

https://github.com/apache/incubator-joshua/pull/62

Fixed NumTranslationOptions test and added deterministic output order…

… of feature vectors

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhieber/incubator-joshua 7

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-joshua/pull/62.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #62


commit bc050e3b5c99f71c4d5945dc9852a13e825958f7
Author: Hieber, Felix <fhie...@amazon.de>
Date:   2016-09-15T14:48:58Z

Fixed NumTranslationOptions test and added deterministic output order of 
feature vectors




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #61: Fixed NAry test for 7

2016-09-15 Thread fhieber
GitHub user fhieber opened a pull request:

https://github.com/apache/incubator-joshua/pull/61

Fixed NAry test for 7



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhieber/incubator-joshua 7

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-joshua/pull/61.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #61


commit f08573b5f64a37afcc824df233db8a7e8719f7e9
Author: Hieber, Felix <fhie...@amazon.de>
Date:   2016-09-15T12:28:13Z

Fixed NAry test for 7




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #59: Fixed NoGrammarTest for 7

2016-09-15 Thread fhieber
GitHub user fhieber opened a pull request:

https://github.com/apache/incubator-joshua/pull/59

Fixed NoGrammarTest for 7



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhieber/incubator-joshua 7

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-joshua/pull/59.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #59


commit c2747787ae222a36362be2de8cc2e5e26fcc01fb
Author: Hieber, Felix <fhie...@amazon.de>
Date:   2016-09-15T11:43:02Z

Fixed NoGrammarTest for 7




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua issue #58: Moved new unit tests (former regression tests) t...

2016-09-15 Thread fhieber
Github user fhieber commented on the issue:

https://github.com/apache/incubator-joshua/pull/58
  
Depends whether we had any end-to-end tests like this already in 7. For the 
existing unit tests I already made the necessary changes to pass in 7. Quote 
from my commit for the featureFector refactoring:

> This commit also updates all existing (and enabled) Unit tests which also 
pass. Existing regression tests do NOT work in this commit since many of the 
grammars are packed and would need to be re-packed.


[Commit](https://github.com/apache/incubator-joshua/commit/20afddf85263f3def242f721728ac148ef143ad5)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #58: Moved new unit tests (former regression t...

2016-09-14 Thread fhieber
GitHub user fhieber opened a pull request:

https://github.com/apache/incubator-joshua/pull/58

Moved new unit tests (former regression tests) to correct location.

The previous merge from master placed the new regression tests from Michael 
into the wrong location. This moves them to joshua-core/src/ so that they 
actually get run.

Now they are failing. Can someone please investigate?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhieber/incubator-joshua 7

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-joshua/pull/58.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #58


commit ee7398f7d0c921cf1aaf7937a370a41d31431033
Author: Hieber, Felix <fhie...@amazon.de>
Date:   2016-09-14T21:19:00Z

moved new unit tests (former regression tests) to correct location in 
joshua 7 (joshua-core/src)




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #50: fix basedir path for java.library.path

2016-09-12 Thread fhieber
GitHub user fhieber opened a pull request:

https://github.com/apache/incubator-joshua/pull/50

fix basedir path for java.library.path

makes unit tests run with kenlm again

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhieber/incubator-joshua 7

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-joshua/pull/50.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #50


commit f0883156c10a521cc22ee0698b7c15456256b4bb
Author: Felix Hieber <fhie...@amazon.com>
Date:   2016-09-12T14:08:54Z

fix basedir path for java.library.path




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #41: Major refactoring of features/rules/gramm...

2016-08-18 Thread fhieber
GitHub user fhieber opened a pull request:

https://github.com/apache/incubator-joshua/pull/41

Major refactoring of features/rules/grammars in Joshua

Major refactoring of core decoder components (Rule.java, FeatureVector.java 
and grammars). The core idea of this change is to simplify feature handling 
inside Joshua. Please note that this change is NOT backwards compatible. The 
following changes were made:

- No distinction between sparse and dense features inside the decoder 
anymore. Each feature stored at the rule is 'owned' by the grammar that 
contains the rule. An 'owned' feature simply means that its name is prepended 
with the owner string: 0=0.2 becomes _0=0.2. This applies to both dense 
features (features that occur at every rule), as well as sparse features. 
Please note that the old prefix 'tm_' is no longer used.
- Having only one type of feature, a revised version of FeatureVector.java 
was built that is greatly simplified. It is basically a HashMap of FeatureId 
(typed as ints) to feature values. FeatureIds are created/hashed by the new 
global mapping FeatureMap.java, which maintains a bidirectional mapping between 
feature ids and feature names. This also allowed getting rid of storing feature 
names in the vocabulary.
- The simplified FeatureVector cause removal of all 
'reportDenseFeatures'/'getNumDenseFeatures' method in the decoder and the 
grammar interface.
- The tradition but very obscure way of flipping the sign of dense features 
but not sparse features was removed. The feature value in the decoder is now 
just the value as you see it stored at the rule.
- The Rule class was changed to adhere to object-oriented principles. It 
now has only one constructor that requires all of its dependencies and these 
can not be changed later. This forces Rule creators to finalize the 
dependencies (deciding on an owner of the rule and the hashing of the feature 
vector).
- Also the unused concept of the precomputableCost in a rule was removed. 
Rules still 'cache' their estimated cost.
- The various Grammar and MemoryBasedBatchGrammar constructors were unified 
and a lot of old obscure code was removed.
- Due to the change above, the PhraseModel feature function that fires 
feature values for features stored at rules is greatly simplified.
- As featureVectors at Rules are final and have to have an owner, feature 
sharing across multiple grammars would need to be handled by a separate feature 
function implementation which is transparent.

This commit also updates all existing (and enabled) Unit tests which also 
pass. Existing regression tests do NOT work in this commit since many of the 
grammars are packed and would need to be re-packed.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhieber/incubator-joshua 7_features

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-joshua/pull/41.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #41


commit 20afddf85263f3def242f721728ac148ef143ad5
Author: Felix Hieber <fhie...@amazon.com>
Date:   2016-08-17T11:52:39Z

Major refactoring of core decoder components (Rule.java, FeatureVector.java 
and grammars). The core idea of this change is to simplify feature handling 
inside Joshua. Please note that this change is NOT backwards compatible. The 
following changes were made:
- No distinction between sparse and dense features inside the decoder 
anymore. Each feature stored at the rule is 'owned' by the grammar that 
contains the rule. An 'owned' feature simply means that its name is prepended 
with the owner string: 0=0.2 becomes _0=0.2. This applies to both dense 
features (features that occur at every rule), as well as sparse features. 
Please note that the old prefix 'tm_' is no longer used.
- Having only one type of feature, a revised version of FeatureVector.java 
was built that is greatly simplified. It is basically a HashMap of FeatureId 
(typed as ints) to feature values. FeatureIds are created/hashed by the new 
global mapping FeatureMap.java, which maintains a bidirectional mapping between 
feature ids and feature names. This also allowed getting rid of storing feature 
names in the vocabulary.
- The simplified FeatureVector cause removal of all 
'reportDenseFeatures'/'getNumDenseFeatures' method in the decoder and the 
grammar interface.
- The tradition but very obscure way of flipping the sign of dense features 
but not sparse features was removed. The feature value in the decoder is now 
just the value as you see it stored at the rule.
- The Rule class was changed to adhere to object-oriented principles. It 
now has only one constructor that requires all of its dependencies and these 
can not be changed later. This forces Rule cr

[GitHub] incubator-joshua pull request #:

2016-08-17 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/3e2e053db9cafcdc5b4885a0f6fa5e54a63bc468#commitcomment-18672173
  
In src/test/java/org/apache/joshua/system/LmOovFeatureTest.java:
In src/test/java/org/apache/joshua/system/LmOovFeatureTest.java on line 69:
is this main required?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #29: Added option to fire a dense language mod...

2016-06-28 Thread fhieber
GitHub user fhieber opened a pull request:

https://github.com/apache/incubator-joshua/pull/29

Added option to fire a dense language model oov indicator feature 

from LanguageModelFF. By default it is turned off. To activate, specify 
'-oov_feature' in the lm line in joshua.config.
This OOV feature is different from the OOV rules in Joshua grammars/phrase 
tables. It indicates an OOV in the language model which typically has a much 
larger vocabulary.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhieber/incubator-joshua lm_feature

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-joshua/pull/29.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #29


commit b3063a4674222a3419eac17e310b1d2cb693fbc8
Author: Felix Hieber <fhie...@amazon.com>
Date:   2016-06-24T07:13:52Z

Added option to fire a dense language model oov indicator feature from 
LanguageModelFF. By default it is turned off. To activate, specify 
'-oov_feature' in the lm line in joshua.config




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #:

2016-06-23 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/748cf32f956d3fd5772f62e023d0cc8d70a7dcad#commitcomment-17995315
  
No its totally fine. Thanks for the fix!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #:

2016-06-23 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/3fb809c444c7855255b51080c368071bc508be1f#commitcomment-17995297
  
I am fine with testng. Its equally easy to run in eclipse with the plugin


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #:

2016-06-23 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/748cf32f956d3fd5772f62e023d0cc8d70a7dcad#commitcomment-17994947
  
Ah, right! Thanks! Should have looked more closely.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #:

2016-06-23 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/748cf32f956d3fd5772f62e023d0cc8d70a7dcad#commitcomment-17994753
  
Hi Matt, 
Can you comment how this was a bug and what this change changes? Using the 
Rule constructor without an owner will just assign the Unknown owner id to it 
in the constructor itself afaik. So what should the owner be here? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua issue #26: Owner Ids are maintained in separate mapping now

2016-06-22 Thread fhieber
Github user fhieber commented on the issue:

https://github.com/apache/incubator-joshua/pull/26
  
Alright, I can rebase this PR once the ClassLM PR is merged so that we have 
a clean separation of those two commits on Github


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua issue #25: ClassLMs: fixed a bug with class-based lms not m...

2016-06-22 Thread fhieber
Github user fhieber commented on the issue:

https://github.com/apache/incubator-joshua/pull/25
  
Great, thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #26: Owner Ids are maintained in separate mapp...

2016-06-21 Thread fhieber
GitHub user fhieber opened a pull request:

https://github.com/apache/incubator-joshua/pull/26

Owner Ids are maintained in separate mapping now

Removed owner ids from Vocabulary. These are now maintained in their own 
mapping. Fixes a bug with multiple packed grammars that would overwrite each 
others owner Vocab id. Also cleaned up grammar constructors a little bit.
The owner id is now strongly typed to prevent users to accidentally use 
ints that do not represent an actual OwnerId. Further OwnerIds can only be 
returned by the OwnerMap.register() method.

@mjpost I am obviously not a Github PR expert. This PR includes the other 
one about class LMs. Lets coordinate merging if you are ok with both.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhieber/incubator-joshua owner

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-joshua/pull/26.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #26


commit 8fc7544eaaf35f71367b48778eaa1f22772ca390
Author: Felix Hieber <fhie...@amazon.com>
Date:   2016-06-20T09:21:03Z

ClassLMs: fixed a bug with class-based lms not mapping to class ids when 
estimateCost(). Also refactored the code a little bit to have 
StateMinimizingLanguageModels support classes as well. Added some unit tests. 
The existing regression test output was changed to the new output.

commit 1011bbb03b29b57eb2903e4817a4d6a3d553354e
Author: Felix Hieber <fhie...@amazon.com>
Date:   2016-06-20T15:55:23Z

Removed owner ids from Vocabulary. These are now maintained in their own 
mapping. Fixes a bug with multiple packed grammars that would overwrite each 
others owner Vocab id. Also cleaned up grammar constructors a little bit.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #25: ClassLMs: fixed a bug with class-based lm...

2016-06-21 Thread fhieber
GitHub user fhieber opened a pull request:

https://github.com/apache/incubator-joshua/pull/25

ClassLMs: fixed a bug with class-based lms not mapping to class ids for 
estimateCost()

Also refactored the code a little bit to have StateMinimizingLanguageModels 
support classes as well. Added some unit tests. The existing regression test 
output was changed to the new output.

@mjpost It is hard to see whether the new regression output for 
test-classlm.sh is 'more' correct than before. If you could test this change 
with some of your models that use class-based lms, that'd be great.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhieber/incubator-joshua master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-joshua/pull/25.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #25


commit 8fc7544eaaf35f71367b48778eaa1f22772ca390
Author: Felix Hieber <fhie...@amazon.com>
Date:   2016-06-20T09:21:03Z

ClassLMs: fixed a bug with class-based lms not mapping to class ids when 
estimateCost(). Also refactored the code a little bit to have 
StateMinimizingLanguageModels support classes as well. Added some unit tests. 
The existing regression test output was changed to the new output.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #:

2016-06-20 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/6d2213a20b74432fc7cb131c732f7507b74053e9#commitcomment-17933645
  
In src/main/java/org/apache/joshua/decoder/StructuredTranslation.java:
In src/main/java/org/apache/joshua/decoder/StructuredTranslation.java on 
line 57:
Hi Matt,
 
Thanks, StructuredTranslation is an object to be used in a downstream 
codebase and I think it would be good to have minimal dependencies on Joshua 
objects as possible there :)
 
Regarding your question:
I’d go for a container/value object for the reason stated above if 
possible.
 
Cheers,
Felix


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #:

2016-06-19 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/6d2213a20b74432fc7cb131c732f7507b74053e9#commitcomment-17928740
  
In src/main/java/org/apache/joshua/decoder/StructuredTranslation.java:
In src/main/java/org/apache/joshua/decoder/StructuredTranslation.java on 
line 57:
Hi matt,

Do you need the features member of structuredTranslation to be a 
FeatureVector? The problem is that this leaks IDs of features into downstream 
applications. 

Cheers,
Felix


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request #:

2016-06-09 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/3fb809c444c7855255b51080c368071bc508be1f#commitcomment-17817018
  
I would vote for junit, since these tests were created much more recently 
and the older testng tests have not been used for quite some time right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request: Some changes to structured translat...

2016-05-30 Thread fhieber
Github user fhieber commented on the pull request:

https://github.com/apache/incubator-joshua/pull/20#issuecomment-222445352
  
Unit tests should all pass now except the MultithreadedTranslation test 
because of the interface change to decodeAll()


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request: Some changes to structured translat...

2016-05-30 Thread fhieber
GitHub user fhieber opened a pull request:

https://github.com/apache/incubator-joshua/pull/20

Some changes to structured translation interface to provide kbest access.

Hi Matt,

I tried to fix all compile errors, but some of the tests still fail due to 
the changed decodeAll() interface (not returning Translations object, but 
writing to output stream.

Also, I noticed that the StructuredTranslationTest and word alignment unit 
tests fail (e.g. tst/joshua/system/StructuredTranslationTest.java). I can't 
investigate right now, but the expected output for word alignments in the tests 
should be correct. I am getting 
some error messages regarding the MemoryBasedBatchgrammars being loaded 
there:
'Couldn't create a GrammarReader for file null with format phrase'

With your recent changes to the Grammar interface/readers, these tests 
might use an outdated format or so?
It would be great if you could take a look / try run all unit tests under 
tst/.

On a side note: I also had trouble building the project using ant (unable 
to download doxygen and commons-cli dependencies).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhieber/incubator-joshua master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-joshua/pull/20.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20


commit e3673e988d5d27f93e69cf270dd4056547a752b9
Author: Felix Hieber <fhie...@amazon.com>
Date:   2016-03-15T10:26:29Z

StructuredTranslation objects can now be generated from KBest Derivations. 
This gives way to expose k-best lists if Joshua is used as a library.
Also fixed some code issues and tests.

commit b6010671dba008669282a1e7b2b334217443c587
Author: Artem Sokolov <artem...@amazon.com>
Date:   2016-04-22T13:50:38Z

Handled an edge case where a null hypergraph with topn!=0 would not return 
an empty output translation




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request:

2016-05-25 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/f2f82c38af9aebd28f9d27f685a2e99767a2e575#commitcomment-17620411
  
In src/joshua/decoder/Decoder.java:
In src/joshua/decoder/Decoder.java on line 193:
where is the right side of this assignment coming from?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request:

2016-05-25 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/f2f82c38af9aebd28f9d27f685a2e99767a2e575#commitcomment-17620359
  
In src/joshua/decoder/Decoder.java:
In src/joshua/decoder/Decoder.java on line 39:
is this required?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request:

2016-05-25 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/f5adcdefd3e52c58b36713798c0830d1c42099e3#commitcomment-17618810
  
In src/joshua/tools/GrammarPacker.java:
In src/joshua/tools/GrammarPacker.java on line 295:
Completely agreed :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request:

2016-05-25 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/868b340949f324b12d810d03c09c25fd5877d3dc#commitcomment-17618752
  
There is a Vocabulary unit test 
(https://github.com/apache/incubator-joshua/blob/master/tst/joshua/corpus/VocabularyTest.java)
 and a FormatUtil Unit test 
(https://github.com/apache/incubator-joshua/blob/master/tst/joshua/util/FormatUtilsTest.java)

These should be updated accordingly. I guess Unit tests are not run 
currently during build.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request:

2016-05-25 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/f5adcdefd3e52c58b36713798c0830d1c42099e3#commitcomment-17618502
  
In src/joshua/decoder/ff/tm/format/MosesFormatReader.java:
In src/joshua/decoder/ff/tm/format/MosesFormatReader.java on line 95:
this should probably go at some point or changed to logging


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request:

2016-05-25 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/f5adcdefd3e52c58b36713798c0830d1c42099e3#commitcomment-17618536
  
In src/joshua/tools/GrammarPacker.java:
In src/joshua/tools/GrammarPacker.java on line 261:
should probably be removed


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request:

2016-05-25 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/f5adcdefd3e52c58b36713798c0830d1c42099e3#commitcomment-17618628
  
I love this!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request:

2016-05-25 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/366f408672e2d29b69a78531b57056649629e978#commitcomment-17618694
  
In src/joshua/decoder/phrase/PhraseTable.java:
In src/joshua/decoder/phrase/PhraseTable.java on line 99:
"[X]" should be defined as a constant somewhere in the project


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request:

2016-05-25 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/f5adcdefd3e52c58b36713798c0830d1c42099e3#commitcomment-17618489
  
In src/joshua/decoder/ff/tm/format/MosesFormatReader.java:
In src/joshua/decoder/ff/tm/format/MosesFormatReader.java on line 82:
you can initialize the StringBuffer directly with the following line:
new StringBuffer("[X] ||| [X,1] " + fields[0] + " ||| [X,1] " + fields[1] + 
" |||")


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request:

2016-05-25 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/76eb9587a9c321d45a9560481429e2dffd918e4e#commitcomment-17618411
  
In src/joshua/decoder/ff/tm/hash_based/MemoryBasedBatchGrammar.java:
In src/joshua/decoder/ff/tm/hash_based/MemoryBasedBatchGrammar.java on line 
130:
We could think about making these type strings an enum at some point to 
clearly define what they are.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request:

2016-05-25 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/76eb9587a9c321d45a9560481429e2dffd918e4e#commitcomment-17618391
  
In src/joshua/decoder/ff/tm/format/MosesFormatReader.java:
In src/joshua/decoder/ff/tm/format/MosesFormatReader.java on line 45:
this could be made
private static final int lhs = Vocabulary.id("[X]").
However, ideally all 'dependencies' of a class should be instantiated 
before the class is instantiated. We should think at some point about what 
'contracts' we can define for grammars: i.e. when and how do they modify the 
vocabulary? At construction time? At loading time? Later? etc.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request:

2016-05-25 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/76eb9587a9c321d45a9560481429e2dffd918e4e#commitcomment-17618301
  
In src/joshua/decoder/ff/tm/format/HieroFormatReader.java:
In src/joshua/decoder/ff/tm/format/HieroFormatReader.java on line 83:
to be consistent with the above changes (English/French -> target/source) 
the comment could be adopted


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request:

2016-05-25 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/76eb9587a9c321d45a9560481429e2dffd918e4e#commitcomment-17618246
  
In src/joshua/decoder/ff/tm/Rule.java:
In src/joshua/decoder/ff/tm/Rule.java on line 201:
Fantasic commit! Would it make sense to refactor these methods to 
setTarget/setSource as well?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-joshua pull request:

2016-04-20 Thread fhieber
Github user fhieber commented on the pull request:


https://github.com/apache/incubator-joshua/commit/805b643187e07c1d4dcd5047c3ac2dfa0a84e256#commitcomment-17176979
  
I agree, this is kind of a big deal. Thanks for catching this! I have used 
this code before, but did not see clear improvements with class-based lms. Now 
I know why :)

Regarding StateMinimizingLanguageModels: that would be great.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---