Re: [Moses-support] Snt2cooc error - post re-formatted

2016-01-21 Thread Jonathan Chen
Ken Fasano  writes:

> ERROR: parameter '1' does not exist.
> ERROR: Unrecognized attribute :1
> ERROR: parameter '2' does not exist.
> ERROR: Unrecognized attribute :2
> ERROR: parameter '3' does not exist.
> ERROR: Unrecognized attribute :3
> ...
> ERROR: parameter '58527' does not exist.
> ERROR: Unrecognized attribute :58527
> ERROR: parameter '58528' does not exist.
> ERROR: Unrecognized attribute :58528
> ERROR: parameter 'homekfasanoworkingtraincorpusfrvcb' does not exist.
> WARNING: ignoring unrecognized option:  
> /home/kfasano/working/train/corpus/fr.vcb
> Reading vocabulary file from:
> 
> Cannot open vocabulary file fileExit code: 1
> ERROR at /home/kfasano/mosesdecoder/scripts/training/train-model.perl line 1186.
> 
> "parameter 'homekfasanoworkingtraincorpusfrvcb' does not exist." looks 


I got the same error, and the cause turned out to be a corrupt snt2cooc.out.
Recompiling it and recopying it to ~/mosesdecoder/tools fixed everything.

More specifically, I corrupted snt2cooc.out by overwriting it with GIZA++. 
I was in a hurry and copied+pasted the following incomplete command

 cp ~/giza-pp/GIZA++-v2/GIZA++ ~/giza-pp/GIZA++-v2/snt2cooc.out

instead of 

 cp ~/giza-pp/GIZA++-v2/GIZA++ ~/giza-pp/GIZA++-v2/snt2cooc.out \
   ~/giza-pp/mkcls-v2/mkcls tools
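
A quick way to check for this kind of mix-up is to compare the two binaries
byte for byte. The sketch below assumes both files sit in ~/mosesdecoder/tools
as in the Baseline layout; same_binary and the TOOLS_DIR variable are names
made up for illustration, not part of Moses or GIZA++.

```shell
# Hypothetical helper: succeed if the two files have identical contents.
same_binary() {
    cmp -s "$1" "$2"
}

# Assumed location of the external binaries; override with TOOLS_DIR.
tools_dir="${TOOLS_DIR:-$HOME/mosesdecoder/tools}"

# Only compare when both files are actually present.
if [ -f "$tools_dir/GIZA++" ] && [ -f "$tools_dir/snt2cooc.out" ]; then
    if same_binary "$tools_dir/GIZA++" "$tools_dir/snt2cooc.out"; then
        echo "snt2cooc.out is a copy of GIZA++: rebuild and recopy it"
    else
        echo "snt2cooc.out differs from GIZA++"
    fi
fi
```

If the two files turn out to be identical, snt2cooc.out has been overwritten
and needs to be rebuilt and copied again.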

-Jonathan

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Snt2cooc error - post re-formatted

2013-02-27 Thread Hieu Hoang
hi ken

first, try to use absolute paths for ALL files and directories. Avoid 
even ~.

so
   -root-dir train
should be
   -root-dir /whatever/train
and
   ~/mosesdecoder
should be
   /whatever/mosesdecoder
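
a minimal sketch of one way to do this in the shell, assuming the 
directories already exist. abs_dir is a hypothetical helper, ROOT_DIR and 
BIN_DIR are placeholder variables (both default to the current directory), 
and the options are only printed, not passed to train-model.perl:

```shell
# Hypothetical helper: print the absolute path of an existing directory.
abs_dir() {
    (cd "$1" && pwd)
}

# Expand relative paths once, then pass only absolute paths on.
# ROOT_DIR and BIN_DIR are placeholders for your own directories.
root_dir=$(abs_dir "${ROOT_DIR:-.}")
bin_dir=$(abs_dir "${BIN_DIR:-.}")

# Print the options rather than running the training script.
echo "-root-dir $root_dir -external-bin-dir $bin_dir"
```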

second, where did you get your copy of giza/snt2cooc from? the mangled 
output
   homekfasanoworkingtraincorpusfrvcb
looks like an old giza++ bug.

try and get the latest version from
   http://code.google.com/p/giza-pp/source/checkout
or even get the binary version from
   http://www.statmt.org/moses/RELEASE-1.0/binaries/

On 26/02/2013 23:02, Ken Fasano wrote:
 I have built a Moses system, on Ubuntu Linux running on VMWare, according to
 the Get Started and Baseline documents, with Boost, GIZA++, Irstlm, and
 Mosesdecoder all built without error. All software is the latest. I am using
 the news-commentary-v7.fr-en as described in Baseline, and have run the Corpus
 Preparation and Language Model Training verbatim from the Baseline without
 problem. The directory structure is the same as Baseline, with GIZA, etc., in
 ~/mosesdecoder/tools. Running this command:

 nohup nice ~/mosesdecoder/scripts/training/train-model.perl  -root-dir train -
 corpus ~/corpus/news-commentary-v7.fr-en.clean -f fr -e en -alignment
 grow-diag- final-and -reordering msd-bidirectional-fe -lm
 0:3:$HOME/lm/news-commentary-v7.fr-en.blm.en:8 -external-bin-dir
 ~/mosesdecoder/tools 2>&1 | tee -a training.out

 with output redirected to both console and file, I get to sub
 run_single_snt2cooc in train-model.perl, the command at this point being:

 /home/kfasano/mosesdecoder/tools/snt2cooc.out
 /home/kfasano/working/train/corpus/en.vcb
 /home/kfasano/working/train/corpus/fr.vcb
 /home/kfasano/working/train/corpus/fr-en-int-train.snt >
 /home/kfasano/working/train/giza.fr-en/fr-en.cooc

 The en.vcb file is as follows:
 1 UNK 0
 2 the 188122
 3 , 174827
 4 . 130861
 5 of 93676
 6 to 85746
 7 and 80903
 8 in 65894
 9 a 58521
 10 that 41555
 ...
 58525 &apos;Arche 1
 58526 &apos;Amore 1
 58527 &apos;Allemagne 1
 58528 &apos;Administration 1

 fr.vcb is:
 1 UNK 0
 2 de 188499
 3 , 160804
 4 . 130136
 5 la 120085
 ...
 66123 --jargon 1
 66124 --innovations 1
 66125 --expulsés 1
 66126 --des 1
 66127 --actuels 1

 and fr-en-int-train.snt is:
 1
 1660 12 4469 3586 74
 9 255 5710 2 5065 73
 1
 1660 12 4469 3586 74
 9 255 5710 2 5065 73
 1
 ...
 1
 128 12 831 1353 2 15560 2773 5 32 6874 43 4 2 1115 10 1168 27 596 840 352 973 6
 1896 397 3 276 635 3 7 472 877 4 20 46 28 136 9 58 4232 6 2042 2 4633 5 75 451 6
 47 4
 6 991 11 7 598 2 5 608 2903 2 19063 1845 19 2315 5 416 3 676 6 1182 8 524 10 425
 3655 4 32 53 7823 234 12 16 172 4842 19 613 16 650 4032 11 47 7648 4

 The error I get is:

 ERROR: parameter '1' does not exist.
 ERROR: Unrecognized attribute :1
 ERROR: parameter '2' does not exist.
 ERROR: Unrecognized attribute :2
 ERROR: parameter '3' does not exist.
 ERROR: Unrecognized attribute :3
 ...
 ERROR: parameter '58527' does not exist.
 ERROR: Unrecognized attribute :58527
 ERROR: parameter '58528' does not exist.
 ERROR: Unrecognized attribute :58528
 ERROR: parameter 'homekfasanoworkingtraincorpusfrvcb' does not exist.
 WARNING: ignoring unrecognized option:
 /home/kfasano/working/train/corpus/fr.vcb
 Reading vocabulary file from:

 Cannot open vocabulary file fileExit code: 1
 ERROR at /home/kfasano/mosesdecoder/scripts/training/train-model.perl line 1186.

 "parameter 'homekfasanoworkingtraincorpusfrvcb' does not exist." looks
 suspicious: there IS a file ~/working/train/corpus/fr.vcb, but with slashes.

 This is my first attempt at building a LM. I would appreciate any help
 figuring this out.




Re: [Moses-support] Snt2cooc error - post re-formatted

2013-02-27 Thread Tomas Hudik
Hi Ken,

You have a typo in your command:
...-root-dir train - corpus ~/corpus/news-commentary-v7.fr-en.clean...
It should be -corpus (no white space between - and corpus). The same is 
true of -alignment grow-diag- final-and (the space before the word final).
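
Stray spaces like these can be caught before launching a long run. As a rough 
sketch (the function name check_args is made up here, and the pattern only 
flags a lone - or an argument that ends in -), you could scan the argument 
list first:

```shell
# Hypothetical helper: flag arguments that look like an option split by a
# stray space, i.e. a lone "-" or a word that ends in "-".
check_args() {
    for arg in "$@"; do
        case "$arg" in
            *-) echo "suspicious argument: '$arg'" ;;
        esac
    done
}

# The split options from the command above, as the shell would see them.
check_args -root-dir train - corpus -alignment grow-diag- final-and
```

For the arguments shown, this flags '-' and 'grow-diag-' and prints nothing 
for the well-formed options.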

Also, Moses in general prefers absolute paths to relative ones (like 
/home/me/engine/train instead of just train or ~/mosesdecoder/tools); it can 
save you a lot of time.

I'm not sure whether fixing the command will help, but I'd give it a try.


Cheers, Tomas
