[Moses-support] How to create Two-way translator and accelerate.

2009-05-04 Thread Jan Helak
Hello everyone :)

I am trying to build a two-way translator for Polish and English as a
project for one of my courses. So far I have created a one-way translator
(Polish->English) as a beta version, but several problems have come up:

(1) The translator must work in both directions. How can I achieve this?

(2) Translating takes too long (4 min. for one
sentence). How can I speed it up? (A decrease in translation quality
is acceptable.)

In addition, here is the list of commands I used to create the beta version:

cat data/training/beta.en | tools/scripts/tokenizer.perl -l en > \
  work/corpus/beta.tok.en

cat data/training/beta.pl | tools/scripts/tokenizer.perl -l en > \
  work/corpus/beta.tok.pl

tools/moses-scripts/scripts-20090321-2154/training/clean-corpus-n.perl \
  work/corpus/beta.tok pl en work/corpus/beta.clean 1 40

tools/scripts/lowercase.perl < work/corpus/beta.clean.en \
  > work/corpus/beta.lowercased.en

tools/scripts/lowercase.perl < work/corpus/beta.clean.pl \
  > work/corpus/beta.lowercased.pl

tools/scripts/lowercase.perl < work/corpus/beta.tok.en > \
  work/lm/beta.lowercased.en

tools/srilm/bin/i686/ngram-count -order 3 -interpolate -kndiscount -unk \
  -text work/lm/beta.lowercased.en -lm work/lm/beta.lm

tools/moses-scripts/scripts-20090321-2154/ -root-dir work -corpus \
  work/corpus/beta.lowercased -f pl -e en -alignment grow-diag-final-and \
  -reordering msd-bidirectional-fe -lm 0:3:/home/janek/Lik/work/lm/beta.lm \
  >& work/training.out &


Thanks in advance!

Regards,

Jan Helak
Adam Mickiewicz University in Poznan.
www.amu.edu.pl
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] How to create Two-way translator and accelerate.

2009-05-04 Thread Francis Tyers
On Mon, 04-05-2009 at 14:54 +0200, Jan Helak wrote:
> Hello everyone :)
> 
> I am trying to build a two-way translator for Polish and English as a
> project for one of my courses. So far I have created a one-way translator
> (Polish->English) as a beta version, but several problems have come up:
> 
> (1) The translator must work in both directions. How can I achieve this?

Make another directory and train two models.
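
A minimal sketch of what "train two models" could look like, reusing the flags from Jan's command with -f and -e swapped and one root directory per direction. TRAIN is a placeholder (the thread does not name the training script), and the work-pl-en / work-en-pl directories and both LM file names are assumptions. Note that the reverse direction needs a language model built on the Polish side. The script only prints the two commands; it does not run them.

```shell
#!/bin/sh
# Sketch only: prints one training command per direction, does not run them.
# TRAIN is a placeholder; the thread does not name the training script.
TRAIN="${TRAIN:-/path/to/training-script}"
CORPUS="work/corpus/beta.lowercased"   # expects beta.lowercased.pl and .en

train_cmd() {
    # $1 = source language, $2 = target language, $3 = target-side LM file
    echo "$TRAIN -root-dir work-$1-$2 -corpus $CORPUS -f $1 -e $2" \
         "-alignment grow-diag-final-and -reordering msd-bidirectional-fe" \
         "-lm 0:3:$3"
}

# Polish -> English uses the English LM; English -> Polish needs a Polish LM.
# Both LM paths are assumed names.
PL_EN=$(train_cmd pl en "$PWD/work/lm/beta.en.lm")
EN_PL=$(train_cmd en pl "$PWD/work/lm/beta.pl.lm")
echo "$PL_EN"
echo "$EN_PL"
```

Each direction then gets its own model directory and its own moses.ini, and the decoder is started with whichever one matches the input language.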

> (2) Translating takes too long (4 min. for one
> sentence). How can I speed it up? (A decrease in translation quality
> is acceptable.)

You can try filtering the phrase table before translating (see PART V -
Filtering Test Data), or using a binarised phrase table (see Memory-Map
LM and Phrase Table).

http://ufallab2.ms.mff.cuni.cz/~bojar/teaching/NPFL087/export/HEAD/lectures/02-phrase-based-Moses-installation-tutorial.html
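
To illustrate the idea behind filtering (not the Moses filtering script itself), here is a toy sketch that keeps only the phrase-table entries whose source phrase occurs in the text to be translated. The "src ||| tgt ||| scores" layout, the file names, and the sample data are assumptions made for the example.

```shell
#!/bin/sh
# Toy illustration of filtering a phrase table down to the entries whose
# source phrase occurs in the text to be translated. Data below is made up.
filter_table() {
    # $1 = input sentences (one per line), $2 = phrase table
    awk -F ' [|][|][|] ' '
        NR == FNR { sent[NR] = " " $0 " "; n = NR; next }   # first file: input
        {
            # keep the entry if its source phrase is a word sequence
            # of at least one input sentence
            for (i = 1; i <= n; i++)
                if (index(sent[i], " " $1 " ") > 0) { print; break }
        }
    ' "$1" "$2"
}

tmp=$(mktemp -d)
printf '%s\n' 'to jest dom' > "$tmp/input.pl"
printf '%s\n' \
    'dom ||| house ||| 0.8 0.7 0.6 0.5 2.718' \
    'kot ||| cat ||| 0.9 0.8 0.7 0.6 2.718' \
    'to jest ||| this is ||| 0.7 0.6 0.5 0.4 2.718' > "$tmp/phrase-table"

FILTERED=$(filter_table "$tmp/input.pl" "$tmp/phrase-table")
echo "$FILTERED"
rm -rf "$tmp"
```

Filtering of this kind is safe with respect to the input: entries that can never match are dropped, so the translations should not change, but there is less to load.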

Regards,

Fran



Re: [Moses-support] How to create Two-way translator and accelerate.

2009-05-04 Thread Miles Osborne
actually, i think Jan wants a speedup, not a space saving.

your best bet is to reduce the size of the beam:

http://www.statmt.org/moses/?n=Moses.Tutorial#ntoc6
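
A sketch of a decoder call with tighter search limits, assuming the decoder binary is called moses and that training left its configuration in work/model/moses.ini (both assumptions, as are the input and output file names). The values are arbitrary starting points, not recommendations; smaller stacks and a stricter threshold trade quality for speed. The script only prints the command.

```shell
#!/bin/sh
# Sketch: build a decoder command line with tighter search limits.
# The values are arbitrary starting points, not recommendations.
MOSES="${MOSES:-moses}"               # decoder binary, assumed to be on PATH
INI="${INI:-work/model/moses.ini}"    # assumed config location
STACK=50          # -s: maximum number of hypotheses kept per stack
BEAM=0.1          # -b: drop hypotheses scoring below BEAM x best in the stack
TTABLE_LIMIT=10   # -ttable-limit: max translation options per source phrase

DECODE_CMD="$MOSES -f $INI -s $STACK -b $BEAM -ttable-limit $TTABLE_LIMIT"
echo "$DECODE_CMD < input.tok.pl > output.en"
```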

Miles



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



Re: [Moses-support] How to create Two-way translator and accelerate.

2009-05-04 Thread Francis Tyers
On Mon, 04-05-2009 at 14:08 +0100, Miles Osborne wrote:
> actually, i think Jan wants a speedup, not a space saving.

Does filtering the phrase table before translation not decrease the
total time to make a translation (including the time taken to load the
phrase table etc.)?  That was my experience, and it appears to be
something that he hasn't done, but perhaps my set up is unusual...

Fran



Re: [Moses-support] How to create Two-way translator and accelerate.

2009-05-04 Thread Marcin Miłkowski
Miles Osborne wrote:
> filtering etc. might give you a speed-up (e.g. a constant one: less
> stuff to load), but if filtering is safe w.r.t. the source data, then
> you shouldn't see much here.
> 
> (pruning the table should make it faster since there will be fewer
> options to consider, but this is not safe)

Actually, this is contrary to what Johnson et al. say in their paper, 
and my subjective (not measured) experience was definitely in their 
favor. As long as you have really clean data, you don't want to lose any 
of it, but if alignments are lousy, translations ambiguous etc., you 
want to cut it off, and Jan wants to do that (see his post).

I filtered even more aggressively and got better results by heuristically 
discarding improbable phrases from the phrase table (based on an idea 
of Fran's about discarding improbable alignments). Again, this is 
subjective, anecdotal, etc., but before that I was getting complete garbage.

Note: my pairs were English-Polish and Polish-English.
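
The heuristic itself is not described in the thread, so the following is only a toy sketch of one possible form of it: drop phrase pairs whose score in a chosen column falls below a threshold. The column, the threshold, the "src ||| tgt ||| scores" layout, and the sample data are all assumptions, and this is much cruder than the significance test used by Johnson et al.

```shell
#!/bin/sh
# Toy sketch of heuristic pruning: drop phrase pairs whose score in one
# column falls below a threshold. Column choice and threshold are assumptions.
prune_table() {
    # $1 = phrase table, $2 = 1-based score column, $3 = minimum score
    awk -F ' [|][|][|] ' -v col="$2" -v min="$3" '
        { split($3, s, " "); if (s[col] + 0 >= min + 0) print }
    ' "$1"
}

tmp=$(mktemp -d)
printf '%s\n' \
    'dom ||| house ||| 0.8 0.7 0.6 0.5 2.718' \
    'dom ||| the ||| 0.001 0.002 0.003 0.004 2.718' \
    'kot ||| cat ||| 0.9 0.8 0.7 0.6 2.718' > "$tmp/phrase-table"

PRUNED=$(prune_table "$tmp/phrase-table" 1 0.01)
echo "$PRUNED"
rm -rf "$tmp"
```

Unlike filtering against the input, pruning like this can change the translations, which is the sense in which it is "not safe".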

> i guess you might also see fewer page faults and the like with a
> smaller model and that will help matters.

btw, quantising and binarising language models helps as well

Marcin

> but in general, the beam size is the most direct way to make it faster.





Re: [Moses-support] How to create Two-way translator and accelerate.

2009-05-04 Thread Miles Osborne
the original question was about speed of decoding, not potential
quality improvements due to filtering

clearly, if you can identify phrases to prune then you will get a
speed-boost.  but this is not true for the general case and my advice
was for the general case.

Miles



Re: [Moses-support] How to create Two-way translator and accelerate.

2009-05-04 Thread Miles Osborne
filtering etc. might give you a speed-up (e.g. a constant one: less
stuff to load), but if filtering is safe w.r.t. the source data, then
you shouldn't see much here.

(pruning the table should make it faster since there will be fewer
options to consider, but this is not safe)

i guess you might also see fewer page faults and the like with a
smaller model and that will help matters.

but in general, the beam size is the most direct way to make it faster.

Miles



[Moses-support] make binary model

2009-05-04 Thread mm23
Hello, I need help please.
Not long ago I trained an fr-en model on the Europarl
corpus. I found out that there is an option to convert the
model's output to a binary version.
I used the following command to convert the phrase table:
cat phrase-table | sort | mosesdecoder/misc/processPhraseTable \
   -ttable 0 0 - -nscores 5 -out phrase-table

I would like to know what "nscores 5" does.
Does "nscores 5" refer to the number of n-grams used in the
language model?
Thanks
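
As far as I know, -nscores is the number of scores stored with each phrase pair in the phrase table, and is unrelated to the n-gram order of the language model. A quick way to check the right value is to count the score fields on a line of the text phrase table. The sketch below assumes the "src ||| tgt ||| scores" layout with the scores in the third field, and uses a made-up sample line.

```shell
#!/bin/sh
# Count the numeric scores on the first line of a text phrase table.
count_scores() {
    # $1 = phrase table file; assumes "src ||| tgt ||| scores" with the
    # scores in the third field.
    head -n 1 "$1" | awk -F ' [|][|][|] ' '{ print split($3, s, " ") }'
}

tmp=$(mktemp -d)
printf '%s\n' 'maison ||| house ||| 0.8 0.7 0.6 0.5 2.718' > "$tmp/phrase-table"

NSCORES=$(count_scores "$tmp/phrase-table")
echo "$NSCORES"
rm -rf "$tmp"
```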