[Moses-support] tuning problem

2011-11-22 Thread somayeh bakhshaei
Hello all,

Salam,

I am using moses in this way:

train,
for i=1 to 3
tune
end for
decode
evaluate

In the loop above something unexpected happens: in large runs the weights
produced in moses.ini are sometimes wrong. For example, one run produces five
weights while another produces only four; take a look here:

# translation model weights
[weight-t]
0.0106455
0.036391
0.0453815
0.0716856
0.0271838

# translation model weights
[weight-t]
0.0705978
0.0652413
0.100475
0.00356951

In the previous iteration nothing was wrong.
Can anyone tell me what is happening here, please?
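
For completeness, the way I drive this loop is roughly the following (paths,
language codes and script options are simplified placeholders, not my exact
commands):

#!/usr/bin/perl
# rough sketch of the train / tune x3 / decode / evaluate loop;
# all paths and options below are illustrative placeholders
use strict;
use warnings;

my $scripts = "$ENV{HOME}/moses/scripts";
my $work    = "work";

# train once
system("$scripts/training/train-model.perl -root-dir $work/train "
     . "-corpus $work/corpus/train -f fa -e en -alignment grow-diag-final-and "
     . "-lm 0:3:$work/lm/en.lm") == 0 or die "training failed";

# tune repeatedly, feeding the weights of one run into the next
my $ini = "$work/train/model/moses.ini";
for my $i (1 .. 3) {
    system("$scripts/training/mert-moses.pl $work/dev/dev.fa $work/dev/dev.en "
         . "moses $ini --working-dir $work/tune$i") == 0
        or die "tuning run $i failed";
    # reuse-weights.perl copies the tuned weights into a fresh moses.ini
    system("reuse-weights.perl $work/tune$i/moses.ini < $ini "
         . "> $work/train/model/moses.tuned$i.ini") == 0
        or die "reuse-weights failed";
    $ini = "$work/train/model/moses.tuned$i.ini";
}

# decode and evaluate with the final weights
system("moses -f $ini < $work/test/test.fa > $work/test/test.out");
system("$scripts/generic/multi-bleu.perl $work/test/ref.en < $work/test/test.out");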




-
Best Regards,
S.Bakhshaei

After All you will come 
And will spread light on the dark desolate world!
O' Kind Father! We will be waiting for your affectionate hands ...
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] Randomisation by MGIZA and tuning result is worse than no tuning

2011-11-22 Thread Jelita Asian
I'm translating English to Indonesian and vice versa using Moses.
I have discovered that when I run it on different machines, and even on the
same machine, the results can differ, especially with tuning.

So far I've found three places that cause the results to differ:
1. mert-modified.pl: I just need to activate predictable-seed.
2. mkcls: just set the seed for each run.
3. mgiza: I find that even in the first iteration the results already
differ:

In one run:

Model1: Iteration 1
Model1: (1) TRAIN CROSS-ENTROPY 15.8786 PERPLEXITY 60246.2
Model1: (1) VITERBI TRAIN CROSS-ENTROPY 20.5269 PERPLEXITY 1.51077e+06
Model 1 Iteration: 1 took: 1 seconds

 In second run:

Model1: Iteration 1
Model1: (1) TRAIN CROSS-ENTROPY 15.928 PERPLEXITY 62347.7
Model1: (1) VITERBI TRAIN CROSS-ENTROPY 20.5727 PERPLEXITY 1.55952e+06
Model 1 Iteration: 1 took: 1 seconds

I have no idea where the randomization occurs in MGIZA, even after looking
at the code, which is hard to understand.

So my questions are:
1. How do I make the cross-entropy results in MGIZA come out the same? I
think randomisation occurs somewhere, but I can't find it.

2. I read in some threads that we need to run multiple times and average
the results we report. However, how can I find the best combination of
training and tuning parameters if the result of each run is different? For
example, if I want to find the best combination of alignment and reordering
model.

3. Is it possible that tuning gives a worse result? My corpus is around
500,000 words and I use 100 sentences for tuning. Can the sentences used for
tuning also be used for training, or are they supposed to be separate? I
used 100 sentences that are not in the training set. My non-tuning NIST and
BLEU results are around 6.5 and 0.21, while the tuned results are around 6.1
and 0.19.
Isn't that a bit low? I'm not sure how to improve it.

Sorry for the multiple questions in one post. I can separate them into
different posts but I don't want to spam the mailing list. Thanks. Any help
will be appreciated.

Best regards,

Jelita
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] tuning problem

2011-11-22 Thread Wang Pidong
I guess the problem comes from *reuse-weights.perl*, which ignores floats
that have an 'e' in their string representation, e.g. 1.05e-1.

Hope this can help you.

Best wishes!
Pidong




-- 
Wang Pidong

Department of Computer Science
School of Computing
National University of Singapore
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] tuning problem

2011-11-22 Thread somayeh bakhshaei
Thanks a lot.
Do you have a solution for it?

-
Best Regards,
S.Bakhshaei

After All you will come 
And will spread light on the dark desolate world!
O' Kind Father! We will be waiting for your affectionate hands ...
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] tuning problem

2011-11-22 Thread Wang Pidong
I fixed this problem before; I remember I just changed the regular
expression for the weights (it appears twice) from:
/^([\-\d\.]+)\s*$/
to
/^([\-\d\.e]+)\s*$/
in the Perl script.

You can try it to see whether it works.
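
As a rough sketch of the idea (the real reuse-weights.perl is structured
differently, so treat this only as an illustration), the loop that reads the
weight values has to accept scientific notation:

#!/usr/bin/perl
# minimal sketch: read [weight-t] values from a moses.ini, accepting
# scientific notation such as 1.05e-1 or 9.3e-05
use strict;
use warnings;

my (@weights, $in_section);
while (my $line = <>) {
    chomp $line;
    if ($line =~ /^\[weight-t\]/) { $in_section = 1; next; }
    if ($line =~ /^\[/)           { $in_section = 0; next; }
    next unless $in_section;
    # the old pattern /^([\-\d\.]+)\s*$/ silently drops values like 9.3e-05,
    # which is why a weight sometimes goes missing; adding 'e' fixes that
    # (a '+' could be added as well for exponents written like 1e+06)
    push @weights, $1 if $line =~ /^([\-\d\.e]+)\s*$/;
}
print scalar(@weights), " weights read: @weights\n";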

Best wishes!
Pidong




-- 
Wang Pidong

Department of Computer Science
School of Computing
National University of Singapore
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Randomisation by MGIZA and tuning result is worse than no tuning

2011-11-22 Thread Miles Osborne
we seem to have a number of posts all talking about non-determinism in
Moses.  here is a full answer.

--in general, Machine Translation training is non-convex.  this means
that there are multiple solutions and each time you run a full
training job, you will get different results.  in particular, you will
see different results when running Giza++ (any flavour) and MERT.

--the best way to deal with this (and most expensive) would be to run
the full pipe-line, from scratch and multiple times.  this will give
you a feel for variance --differences in results.  in general,
variance arising from Giza++ is less damaging than variance from MERT.

--to reduce variance it is best to use as much data as possible at
each stage.  (100 sentences for tuning is far too low;  you should be
using at least 1000 sentences).  it is possible to reduce this
variability by using better machine learning, but in general it will
always be there.

--another strategy I know about is to fix everything once you have a
set of good weights and never rerun MERT.  should you need to change
say the language model, you will then manually alter the associated
weight.  this will mean stability, but at the obvious cost of
generality.  it is also ugly.

Miles




-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] tuning problem

2011-11-22 Thread Lang Jun
Like Pidong, I fixed the same bug with that Perl regular-expression change.

Best regards,
Lang Jun


___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] moses on Windows

2011-11-22 Thread Alessandro Momi
Hi,
I installed Moses on my Windows PC following your step-by-step guide for
Windows. After that I installed SRILM, but now I'm stuck. How do I start
translating with Moses?
Please help me.

Thank for your reply
Greeting


Alessandro Momi
IT Manager


Nexo Corporation Srl

Via Palmiro Togliatti, 73/A1 - 06073 Corciano [Pg]

Tel . 075 - 6979255 - Fax  075 - 9691073
P. Iva/C.F. : 03201370545 - Rea : Pg - 271348

This email and any file transmitted with it is intended only for the person or 
entity to which is addressed and may contain information that is privileged, 
confidential or otherwise protected from disclosure. Copying, dissemination or 
use of this email or the information herein by anyone other than the intended 
recipient is prohibited. If you have received this email by mistake, please 
notify us immediately by email, telephone or fax.
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] tuning problem

2011-11-22 Thread Jehan Pages
Hi,

On Tue, Nov 22, 2011 at 5:57 PM, somayeh bakhshaei
 wrote:
> Hello all,
>
> Salam,
>
> I am using moses in this way:
>
> train,
> for i=1 to 3
>     tune
> end for

Sorry for not answering your problem (I don't have the solution, though I
saw others did answer with a possible fix). I just noticed that you tune 3
times. Do you mean you re-tune using the exact same data set each of these
3 times? Does tuning several times like this give better results?
Thanks!

Jehan


___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Randomisation by MGIZA and tuning result is worse than no tuning

2011-11-22 Thread Jelita Asian
Hi Miles,

Thanks for your reply.

> --in general, Machine Translation training is non-convex.  this means
> that there are multiple solutions and each time you run a full
> training job, you will get different results.  in particular, you will
> see different results when running Giza++ (any flavour) and MERT.
>
>
Is there no way to stop the variation in Giza++? I looked at the code but
have no idea where it occurs.


> --the best way to deal with this (and most expensive) would be to run
> the full pipe-line, from scratch and multiple times.  this will give
> you a feel for variance --differences in results.  in general,
> variance arising from Giza++ is less damaging than variance from MERT.
>
How many runs are enough for this? As you say, it would be very expensive
to do so.


> --to reduce variance it is best to use as much data as possible at
> each stage.  (100 sentences for tuning is far too low;  you should be
> using at least 1000 sentences).  it is possible to reduce this
> variability by using better machine learning, but in general it will
> always be there.
>
What do you mean by better machine learning? Isn't the 500,000-word corpus
enough? For the 1,000 sentences for tuning, can I use the same sentences as
used in training, or should they be a separate set of sentences?


> --another strategy I know about is to fix everything once you have a
> set of good weights and never rerun MERT.  should you need to change
> say the language model, you will then manually alter the associated
> weight.  this will mean stability, but at the obvious cost of
> generality.  it is also ugly.
>
Could you elaborate a bit on the "fix everything and never rerun MERT"
part? Do you mean that after running n times, we pick the best combination
of variables (there are so many of them) and then don't run MERT again,
which I understand is the tuning step?

Thanks and sorry to answer it with more questions.

Cheers,

Jelita
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Randomisation by MGIZA and tuning result is worse than no tuning

2011-11-22 Thread Miles Osborne
>
>> --in general, Machine Translation training is non-convex.  this means
>> that there are multiple solutions and each time you run a full
>> training job, you will get different results.  in particular, you will
>> see different results when running Giza++ (any flavour) and MERT.
>>
>
> Is there no way to stop the variant in Giza++? I look at the code but has no
> idea where it occurs.

no, this is a property of the task, not the method.  to put it another
way, there is nothing which tells the model how words are translated.
Giza++ makes a guess based upon how well it 'explains' the training
data (log-likelihood / cross entropy).  there are many ways to achieve
the same log-likelihood and each guess amounts to a different
translation model.  on average these alternative models will all be
similar to each other (words are translated in similar ways), but in
general you will find differences.


>>
>> --the best way to deal with this (and most expensive) would be to run
>> the full pipe-line, from scratch and multiple times.  this will give
>> you a feel for variance --differences in results.  in general,
>> variance arising from Giza++ is less damaging than variance from MERT.
>>
> How many run is enough for this? As you say, it would be very expensive to
> do so.

how long is a piece of string?

>
>>
>> --to reduce variance it is best to use as much data as possible at
>> each stage.  (100 sentences for tuning is far too low;  you should be
>> using at least 1000 sentences).  it is possible to reduce this
>> variability by using better machine learning, but in general it will
>> always be there.
>>
> What do you mean by better machine learning? Isn't the 500,000 words corpus
> enough? For the 1,000 sentences for tuning, can I use the same sentences as
> used in the training or they shall be separate sets of sentences?

lattice MERT is an example, or the Berkeley Aligner.

you cannot use the same sentences for training and tuning, as has been
explained earlier on the list


>
>>
>> --another strategy I know about is to fix everything once you have a
>> set of good weights and never rerun MERT.  should you need to change
>> say the language model, you will then manually alter the associated
>> weight.  this will mean stability, but at the obvious cost of
>> generality.  it is also ugly.
>>
> Could you elaborate a bit about the fixing everything and never rerun MERT
> part? Do you mean after running n times, we find the best variation of
> variables (there are so many of them) and don't run MERT which I understand
> is for tuning?

if you have some problem that is fairly stable (uses the same training
set, language models etc) then after running MERT many times and
evaluating it on a disjoint test set, you pick the weights that
produce good results.  afterwards you do not re-run MERT even if you
have changed the model.

as i mentioned, this is ugly and something you do not want to do
unless you are forced to do it

Miles
>
> Thanks and sorry to answer it with more questions.
>
> Cheers,
>
> Jelita
>



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] tuning problem

2011-11-22 Thread somayeh bakhshaei
Hello,

Thanks for all the answers.

Also thanks, Jehan.
As you may have followed in the Moses emails, there is an inconsistency
problem with tuning in MERT (raised by Neda).
To reduce this problem, everyone suggested tuning the system repeatedly and
then choosing the best result.
It is a way of escaping a local maximum, but it does not exactly catch the
global maximum; instead it may just get trapped in another local one :)
So I think a better solution is needed!


-
Best Regards,
S.Bakhshaei

After All you will come 
And will spread light on the dark desolate world!
O' Kind Father! We will be waiting for your affectionate hands ...
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] RandLM compile error on Ubuntu 11.10

2011-11-22 Thread Miles Osborne
this looks like a problem with Ubuntu rather than RandLM:

http://stackoverflow.com/questions/7755668/linking-against-boost-thread-fails-under-ubuntu-11-10

if you post to the RandLM Sourceforge site and raise an error, we may
get around to fixing it

(the Moses list is not really the best place)

Miles

On 19 November 2011 08:02, Tom Hoar
 wrote:
> I can't compile RandLM, 0.20 on Ubuntu 11.10. RandLM was configured with
> boost and multithreading support. The same configuration compiles under
> Ubuntu 10.04, 10.10 and 11.04.
>
> From the error log, it looks like RandLM can't find boost libraries on the
> new distro. Log attached. Any suggestions?
>
> Changes in 11.10 include: Linux kernel 3.0
> gcc (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1
> GNU Make 3.81
>
> Tom
>
> ___
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] tuning problem

2011-11-22 Thread Tom Hoar


Hi Somayeh, 

Moses has a strong developer/researcher community,
but there's always room for more. Maybe you can lead in this area. 

Tom


___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] RandLM compile error on Ubuntu 11.10

2011-11-22 Thread Tom Hoar
 Thanks Miles. I should have looked there myself. It's interesting, 
 however, that both MGIZA++ and moses decoder both rely on Boost and they 
 both compile nicely on 11.10.

 I have an inside contact at Canonical (Ubuntu's parent company) who has 
 helped update Moses dependencies with past changes in gcc. I'll ask him 
 to review the issues between randlm and 11.10 and revert any updates on 
 RandLM.

 Tom


___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] RandLM compile error on Ubuntu 11.10

2011-11-22 Thread Miles Osborne
this seems to be more of a linker problem than anything else.  either that or
RandLM has taken errors to a whole new level

Miles




-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] moses on Windows

2011-11-22 Thread Tom Hoar


Hi Hamza, 

DoMY CE is an open-source package (LGPL3 license) specifically
designed to help novice Linux users install Moses and quickly become
productive. You will, however, need to install Ubuntu Linux 10.04,
10.10, 11.04 (not 11.10). Many of our users first install Ubuntu on a
guest virtual machine with a Windows host. This greatly facilitates the
transition to the new environment, but is not recommended for a
long-term solution. 

Tom 

http://www.precisiontransltiontools.com 

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Randomisation by MGIZA and tuning result is worse than no tuning

2011-11-22 Thread Jelita Asian
Thanks for the answers, Miles.

You mentioned that the variance from Giza++ is OK. In that case, is it OK
if I set the seeds for MERT and mkcls so the only remaining source of
variance is Giza++? Otherwise, the results will differ too much.

> >
> >>
> >> --to reduce variance it is best to use as much data as possible at
> >> each stage.  (100 sentences for tuning is far too low;  you should be
> >> using at least 1000 sentences).  it is possible to reduce this
> >> variability by using better machine learning, but in general it will
> >> always be there.
> >>
> > What do you mean by better machine learning? Isn't the 500,000 words
> corpus
> > enough? For the 1,000 sentences for tuning, can I use the same sentences
> as
> > used in the training or they shall be separate sets of sentences?
>
> lattice MERT is an example, or the Berkeley Aligner.
>
Thanks for the pointers.


> you cannot use the same sentences for training and tuning, as has been
> explained earlier on the list
>
>
Which list? Oh, is it OK if the tuning data is not from the same
domain/source as the training data?

> if you have some problem that is fairly stable (uses the same training
> set, language models etc) then after running MERT many times and
> evaluating it on a disjoint test set, you pick the weights that
> produce good results.  afterwards you do not re-run MERT even if you
> have changed the model.
>

Oh, do you mean the same training data but different sets of test data?


>
> as i mentioned, this is ugly and something you do not want to do
> unless you are forced to do it
>
Yes, I can imagine so. Sorry, I am quite new to this field and my previous
specialisation was in something else.

Cheers,

Jelita
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] Training with n-class model

2011-11-22 Thread Raja Bensalem
Hello,
I'm translating from French to English.
To generate a translation model, I prepared a class-based bilingual corpus.
To do that, I substituted each word in the word-based bilingual corpus with
the class to which it belongs.
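
The substitution is roughly as follows (a simplified sketch; the file names
are placeholders and my real script differs):

#!/usr/bin/perl
# sketch: replace each word by its class, using a class file with lines
# like "access;cluster25" (as in the excerpt further below)
use strict;
use warnings;

my %class;
open(my $cf, '<', 'classes.en') or die $!;
while (<$cf>) {
    chomp;
    next unless /;/;                          # skip header lines like "=english"
    my ($word, $cluster) = split /;/, $_, 2;
    $class{$word} = $cluster;
}
close $cf;

open(my $in,  '<', 'corpus.lowercased.en')         or die $!;
open(my $out, '>', 'corpus_classes.lowercased.en') or die $!;
while (my $line = <$in>) {
    chomp $line;                              # avoid gluing "\n" onto the last token
    my @toks = map { exists $class{$_} ? $class{$_} : $_ } split ' ', $line;
    # GIZA++ aborts on empty lines ("Forbidden zero sentence length 0"),
    # so empty sentences have to be dealt with before training
    warn "empty line at $.\n" unless @toks;
    print $out join(' ', @toks), "\n";
}
close $in;
close $out;
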
The original word-based corpus trains fine, but when I train the
class-based bilingual corpus, I get the following errors:

[bensalemraja@localhost simple_demo]$
/home/bensalemraja/moses-scripts/scripts-20101214-2126/training/train-model.perl
-scripts-root-dir /home/bensalemraja/moses-scripts/scripts-20101214-2126/
-root-dir /media/win_d/simple_demo/travail_manel_classes -corpus
/media/win_d/simple_demo/travail_manel_classes/corpus/corpus_classes.lowercased
-f fr -e en -alignment grow-diag-final-and -reordering msd-bidirectional-fe
-lm 0:3:/media/win_d/simple_demo/travail_manel_classes/lm/corpus_classes.lm
>
/media/win_d/simple_demo/travail_manel_classes/training.out

Using SCRIPTS_ROOTDIR:
/home/bensalemraja/moses-scripts/scripts-20101214-2126/

Using single-thread
GIZA

(1) preparing corpus @ Tue Nov 22 15:49:25 CET 2011
(1.1)..
(1.2)..
(1.3) numberizing corpus
/media/win_d/simple_demo/travail_manel_classes/corpus/fr-en-int-train.snt @
Tue Nov 22 15:49:33 CET
2011

Unknown word 'cluster72
'

Use of uninitialized value in concatenation (.) or string at
/home/bensalemraja/moses-scripts/scripts-20101214-2126/training/train-model.perl
line 782,  line 1112.
()
Use of uninitialized value in concatenation (.) or string at
/home/bensalemraja/moses-scripts/scripts-20101214-2126/training/train-model.perl
line 782,  line
24373.

(2) running giza @ Tue Nov 22 15:49:37 CET
2011

(2.1a) running snt2cooc fr-en @ Tue Nov 22 15:49:37 CET 2011
..
(2.1b) running giza fr-en @ Tue Nov 22 15:49:38 CET 2011
/media/win_d/demo/tools/bin/GIZA++  -CoocurrenceFile
/media/win_d/simple_demo/travail_manel_classes/giza.fr-en/fr-en.cooc -c
/media/win_d/simple_demo/travail_manel_classes/corpus/fr-en-int-train.snt
-m1 5 -m2 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4
-nodumps 1 -nsmooth 4 -o
/media/win_d/simple_demo/travail_manel_classes/giza.fr-en/fr-en
-onlyaldumps 1 -p0 0.999 -s
/media/win_d/simple_demo/travail_manel_classes/corpus/en.vcb -t
/media/win_d/simple_demo/travail_manel_classes/corpus/fr.vcb
Executing: /media/win_d/demo/tools/bin/GIZA++  -CoocurrenceFile
/media/win_d/simple_demo/travail_manel_classes/giza.fr-en/fr-en.cooc -c
/media/win_d/simple_demo/travail_manel_classes/corpus/fr-en-int-train.snt
-m1 5 -m2 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4
-nodumps 1 -nsmooth 4 -o
/media/win_d/simple_demo/travail_manel_classes/giza.fr-en/fr-en
-onlyaldumps 1 -p0 0.999 -s
/media/win_d/simple_demo/travail_manel_classes/corpus/en.vcb -t
/media/win_d/simple_demo/travail_manel_classes/corpus/fr.vcb
Reading vocabulary file
from:/media/win_d/simple_demo/travail_manel_classes/corpus/en.vcb
Reading vocabulary file
from:/media/win_d/simple_demo/travail_manel_classes/corpus/fr.vcb
ERROR: Forbidden zero sentence length 0
ERROR: Forbidden zero sentence length 0
ERROR: Forbidden zero sentence length 0
ERROR: Forbidden zero sentence length 0
ERROR: Execution of: /media/win_d/demo/tools/bin/GIZA++  -CoocurrenceFile
/media/win_d/simple_demo/travail_manel_classes/giza.fr-en/fr-en.cooc -c
/media/win_d/simple_demo/travail_manel_classes/corpus/fr-en-int-train.snt
-m1 5 -m2 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4
-nodumps 1 -nsmooth 4 -o
/media/win_d/simple_demo/travail_manel_classes/giza.fr-en/fr-en
-onlyaldumps 1 -p0 0.999 -s
/media/win_d/simple_demo/travail_manel_classes/corpus/en.vcb -t
/media/win_d/simple_demo/travail_manel_classes/corpus/fr.vcb
  died with signal 11, without coredump



My class model is like this:

=english
access;cluster25
accidental;cluster53
accidentally;cluster53
accompanied;cluster32
accompanying;cluster32
accordance;cluster78
account;cluster64
accrued;cluster99
accumulated;cluster37
accuracy;cluster99
==

french=
absolues;cluster64
absolument;cluster45
absolus;cluster64
accent;cluster90
accents;cluster90
accentuation;cluster51
accentue;cluster51
accentué;cluster51
accentuées;cluster51
acceptables;cluster78
acceptant;cluster78
=
Can you help me?
Thanks in advance.
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] multi-threaded moses and lattices; memscore problem

2011-11-22 Thread Sara Stymne
Hi!

I couldn't get multi-threaded Moses to work with lattice decoding. It
didn't crash, but was extremely slow (several hours per sentence). When
running single-threaded it is much faster (seconds-minutes).

I noticed that it says on the Moses webpage that multi-threaded Moses
for lattices is not tested. Does anyone know why there is this problem,
and have an idea of where to start looking to try and fix this?

Also, I trained my translation model using memscore, and it produced an
old phrasetable format:

! ! ! ) ||| !|SENT !|SENT !|SENT )|) ||| (0) (1) (2) (3) ||| (0) (1) (2)
(3) ||| 0.67 0.532303 0.67 0.000387809 2.718

With the alignments before the scores, which I couldn't get Moses to
accept. Everything worked again, once I removed the columns with
alignments.

/Sara
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] question about phrase-table

2011-11-22 Thread bingyuanmy
Dear all,

I ran Moses on my corpus and got a phrase table containing entries like the
following:

an atmosphere ||| an xxatmosphere ||| 1 0.9 1 0.9 2.718 ||| ||| 1 1

Could you kindly explain what each number means in it ?
Thanks a lot!

Best regards,
Yue
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] question about phrase-table

2011-11-22 Thread Barry Haddow
Hi Yue

This is covered in the faq
http://www.statmt.org/moses/?n=Moses.FAQ#ntoc7

For the last two numbers, see this recent mail on the list
http://www.mail-archive.com/moses-support@mit.edu/msg04812.html
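
In brief, for a line like yours the five numbers are (by default) the
inverse phrase translation probability, the inverse lexical weighting, the
direct phrase translation probability, the direct lexical weighting and the
constant phrase penalty exp(1) = 2.718; the empty field is the word
alignment and the last two numbers are phrase counts. A quick way to pull
such a line apart (just a sketch):

#!/usr/bin/perl
# sketch: split a Moses phrase-table line into its fields
use strict;
use warnings;

my $line = 'an atmosphere ||| an xxatmosphere ||| 1 0.9 1 0.9 2.718 ||| ||| 1 1';
my ($src, $tgt, $scores, $align, $counts) = split /\s*\|\|\|\s*/, $line, 5;
# score order (default): phi(f|e) lex(f|e) phi(e|f) lex(e|f) phrase-penalty
my @s = split /\s+/, $scores;
print "source   : $src\n";
print "target   : $tgt\n";
print "scores   : @s\n";
print "alignment: '$align'  counts: '$counts'\n";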

cheers - Barry

On Tuesday 22 November 2011 15:54:14 bingyuanmy wrote:
> Dear all,
> 
> I run Moses on my corpus and got the phrase-table containing information
> as follows:
> 
> an atmosphere ||| an xxatmosphere ||| 1 0.9 1 0.9 2.718 ||| ||| 1 1
> 
> Could you kindly explain what each number means in it ?
> Thanks a lot!
> 
> Best regards,
> Yue
> ___
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
> 
 
--
Barry Haddow
University of Edinburgh
+44 (0) 131 651 3173

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] multi-threaded moses and lattices; memscore problem

2011-11-22 Thread Barry Haddow
Hi Sara

> I noticed that it says on the Moses webpage that multi-threaded Moses
> for lattices is not tested. Does anyone know why there is this problem,
> and have an idea of where to start looking to try and fix this?

I think I wrote this because there seemed to be some extra caching for 
lattices in the InitializeForInput methods. Actually it looks like it's for 
ConfusionNets.

If it's running really slow with threads, then maybe there's some heavy lock 
contention? In the translation option cache perhaps? You could try switching 
this off and see if it improves things. If you grep for mutex in the code then 
you should see all the places where locks are used.  Profiling may help (see 
valgrind) or just periodically attaching a debugger to see where the threads 
are waiting.

> 
> Also, I trained my translation model using memscore, and it produced an
> old phrasetable format:
> 
> ! ! ! ) ||| !|SENT !|SENT !|SENT )|) ||| (0) (1) (2) (3) ||| (0) (1) (2)
> (3) ||| 0.67 0.532303 0.67 0.000387809 2.718
> 

Feel free to fix this ... it looks as though no-one updated memscore when the 
table format changed,

cheers - Barry

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] moses on Windows

2011-11-22 Thread Philipp Koehn
Hi,

you can find more information about Moses on Windows here:
http://ssli.ee.washington.edu/people/amittai/Moses-on-Win7.pdf

-phi

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] tuning problem

2011-11-22 Thread Tom Hoar
 Jehan,

 There was a thread a few weeks ago about running mert-moses.pl several 
 times on the same translation model, with different tuning sets each 
 time, then averaging the final BLEU scores to have a more accurate 
 tuning.

 It looked to me like this was Somayeh's pseudo-code to do this.
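
 Roughly, that multi-run procedure would look something like this (the dev
 set names, paths and options are only illustrative):

#!/usr/bin/perl
# sketch: tune the same model on several dev sets and average the BLEU
# scores of the resulting configurations (all paths are placeholders)
use strict;
use warnings;

my @devsets = qw(dev1 dev2 dev3);
my @bleu;
for my $d (@devsets) {
    system("mert-moses.pl $d.src $d.ref moses model/moses.ini "
         . "--working-dir tune-$d") == 0 or die "tuning on $d failed";
    # decode a held-out test set with this run's weights and score it
    # (assuming the tuned config ends up as tune-$d/moses.ini)
    system("moses -f tune-$d/moses.ini < test.src > test.$d.out") == 0 or die;
    my $out = `multi-bleu.perl test.ref < test.$d.out`;
    push @bleu, $1 if $out =~ /BLEU = ([\d.]+)/;
}
my $avg = 0;
$avg += $_ for @bleu;
$avg /= @bleu if @bleu;
printf "BLEU per run: %s   average: %.2f\n", join(', ', @bleu), $avg;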

 Tom




___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] tuning problem

2011-11-22 Thread Jehan Pages
Hi,

On Tue, Nov 22, 2011 at 10:18 PM, somayeh bakhshaei
 wrote:
> Hello,
>
> Thanks for all answers.
>
> Also thanks Jehan.
> As you might follow moses emails there is an inconsistency problem about
> tuning in mert (expressed by Neda)
> For reducing this problem everyone offered to tune the system repeatedly
> then choosing the best answer.

Thanks for this explanation. After reading Tom Hoar's email and yours, and
searching for and finding the original discussion, I am not sure I have
understood what the proposed solution is:

- should we average all the weights in the various moses.ini files
generated during these tunings? Would the weights really still make sense
if we did that?

- or should we compare the BLEU values of the various tuning runs and take
as-is (without modifying it) the moses.ini whose BLEU is closest to the
average of all the BLEUs?

> It is a way of getting rid of local maxima but not exactly catching the
> global Maxima but instead trapping in another local one :)
> So I think a better solution is needed!

So if I get it, the logic is that we may get a very good BLEU score (from
what I read, the closer to 1, the better) on some tuning run, but it may
actually be a local maximum (and hence may in fact be terrible on real-life
data). Hence, to counter this, we prefer a tuning run that produced an
average BLEU on our data, because it should be more robust in the long
term?

Also, my mathematics is rusty, but from what I recall, when we want to get
away from local maxima/minima one would prefer the median rather than the
average (especially on small samples like here), since the average is
itself strongly influenced by outlying maxima. Shouldn't that also be the
case here?
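
For instance, picking the run whose BLEU is the median could be as simple
as this (a sketch only; the scores are made-up placeholders):

#!/usr/bin/perl
# sketch: keep the tuning run whose dev BLEU is the median of all runs
use strict;
use warnings;

my %bleu = (
    'tune-run1/moses.ini' => 0.193,   # made-up scores
    'tune-run2/moses.ini' => 0.214,
    'tune-run3/moses.ini' => 0.208,
);
my @by_score = sort { $bleu{$a} <=> $bleu{$b} } keys %bleu;
my $median   = $by_score[ int(@by_score / 2) ];   # middle element (odd count)
print "use weights from $median (BLEU $bleu{$median})\n";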

Regards,

Jehan


___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] 1st CFP: EACL 2012 Joint Workshop on Information Retrieval and Hybrid Machine Translation

2011-11-22 Thread Marta Ruiz
Apologies for multiple postings
Please distribute to colleagues

=

Joint Workshop on
Exploiting Synergies between Information Retrieval and Machine
Translation (ESIRMT)
and
Hybrid Approaches to Machine Translation (HyTra)

Co-located with EACL 2012 (http://eacl2012.org)
Avignon, France
April 23 and 24, 2012

Deadline for paper submissions: January 27, 2012

http://www-lium.univ-lemans.fr/esirmt-hytra

=


This two-day workshop addresses two specific but related research
problems in computational linguistics.

The ESIRMT event (1st day) aims at reducing the gap, both theoretical
and practical, between information retrieval and machine translation
research and applications. Although both fields have been already
contributing to each other instrumentally, there is still too much
work to be done in relation to solidly framing these two fields into a
common ground of knowledge from both the procedural and paradigmatic
perspectives.

The HyTra event (2nd day) aims at sharing ideas among researchers
developing and applying statistical, example-based, or rule-based
machine translation systems and who wish to enhance their systems with
elements from the other approaches.

The joint workshop will provide participants with the opportunity of
discussing research related to technology integration and system
combination strategies at both the general level of cross-language
information access and the specific level of machine translation
technologies.

Contributions are to be organized into two tracks, corresponding to
ESIRMT and HyTra, respectively. Topics of interest include, but are
not limited to:

ESIRMT track:
* machine translation and information retrieval hybridization
* applications of MT and IR hybrid systems
* IR techniques integrated in MT systems
* MT techniques integrated in IR systems
* any kind of innovative approach exploiting synergies between MT and IR
* machine learning techniques for ranking

HyTra track:
* ways and techniques of MT hybridization
* architectures for the rapid development of hybrid MT systems
* hybrid systems dealing with underresourced languages and/or with
morphologically rich languages
* using linguistic information (morphology, syntax, semantics) to
enhance statistical MT
* bootstrapping rule-based systems from corpora
* hybrid methods in spoken language translation
* extraction of dictionaries from parallel and comparable corpora
* machine learning techniques for hybrid MT
* heuristics for limiting the search space in hybrid MT
* alternative methods for the fair evaluation of the output of
different types of MT systems
* system combination approaches
* open source tools and free language resources for hybrid MT

Submissions should follow EACL 2012 format, as specified in
http://eacl2012.org/information-for-authors/index.html (paper length:
up to 9 pages plus references). Reviewing of papers will be
double-blind, so the submissions should not reveal the authors’
identity.


Important Dates

January 27, 2012:  Paper submissions due
February 24, 2012:  Notification of acceptance
March 9, 2012:  Camera ready papers due
April 23-24, 2012:  Workshop in Avignon


Organizers

Contact person: Marta R. Costa-jussà (e-mail: marta.r...@barcelonamedia.org )

       ESIRMT: Marta R. Costa-jussà (Barcelona Media Innovation Center),
Patrik Lambert (University of Le Mans), Rafael E. Banchs (Institute
for Infocomm Research)

HyTra: Reinhard Rapp (Universities of Mainz and Leeds), Bogdan Babych
(University of Leeds), Kurt Eberle (Lingenio GmbH), Tony Hartley
(Toyohashi University of Technology and University of Leeds), Serge
Sharoff (University of Leeds), Martin Thomas (University of Leeds)


Invited Speakers

Philipp Koehn (University of Edinburgh)
TBA

Programme Committee

Jordi Atserias, Yahoo! Research
Bogdan Babych, University of Leeds, CTS
Sivaji Bandyopadhyay, Jadavpur University, Kolkata
Núria Bel, Universitat Pompeu Fabra, Barcelona
Pierrette Bouillon, ISSCO/TIM/ETI, University of Geneva
Chris Callison-Burch, Johns Hopkins University, Baltimore
Michael Carl, Copenhagen Business School
Oliver Culo, ICSI, University of California, Berkeley
Kurt Eberle, Lingenio GmbH, Heidelberg
Andreas Eisele, Directorate-General for Translation, European
Commission, Luxembourg
Marcello Federico, Fondazione Bruno Kessler.
Mikel Forcada, University of Alicante
Alexander Fraser, Institute for Natural Language Processing (IMS), Stuttgart
Johanna Geiß, University of Cambridge and Lingenio GmbH
Mireia Ginesti-Rosell, Lingenio GmbH, Heidelberg
Silvia Hansen-Schirra, FTSK, University of Mainz
Tony Hartley, Toyohashi University of Technology
Gareth Jones, Dublin City University
Min-Yen Kan, National University of Singapore
Udo Kruschwitz, University of Essex
Yanjun Ma, Baidu Inc.
Haizhou Li, Institute for Infocomm Research
Reinhard Rapp, University of Mainz, FTSK
Paul Schm