Re: [Moses-support] Factored instead of Phrase-based Model?

2016-01-06 Thread Shaimaa Marzouk
Dear Ondrej & Moses-Team,

@Ondrej: thanks a lot for your quick feedback.

The phrase "ich habe" does not appear in the phrase table. The word alignment 
file includes only the first 4 sentences of the training data. 

I have put the sentence (ich habe das auto verkauft) into a separate "in"
file, but got the same result. I also tried another sentence (ich verkaufe das
auto); here, too, "ich verkaufe" cannot be translated. I repeated the exact
sentence (ich verkaufe das auto) many times in the training data and still get
the same result.
I attach the word alignment, phrase table, training data and verbose result,
and would be very grateful for any tips.

I would also highly appreciate it if you could let me know where I can find
information about
 1.   how to prepare the training data with additional factors before training
the Factored Model?
 2.   how to train a Language Model that takes the POS into account?

I think that sooner or later the sentences will get more complex and I will
need to work with a Factored Model.
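
On question 1: Moses reads factored text as tokens whose factors are joined by a vertical bar, for example auto|NN. Below is a minimal sketch, assuming the words and their POS tags are already available as two token-parallel lines. The tagger itself is not shown, the tag names are illustrative, and both function names are made up for this example. The second helper pulls one factor back out, which is one possible way to get a tag-only text for training a language model over POS (question 2).

```python
def to_factored(words_line, tags_line, sep="|"):
    """Join each word with its tag: 'das auto' + 'ART NN' -> 'das|ART auto|NN'."""
    words = words_line.split()
    tags = tags_line.split()
    if len(words) != len(tags):
        raise ValueError("word and tag counts differ: %d vs %d" % (len(words), len(tags)))
    for token in words + tags:
        # A literal separator inside a token would be misread as an extra factor.
        if sep in token:
            raise ValueError("token contains the factor separator: %r" % token)
    return " ".join(w + sep + t for w, t in zip(words, tags))


def extract_factor(factored_line, index, sep="|"):
    """Keep only factor number `index` (0 = surface word, 1 = tag in this sketch)."""
    return " ".join(tok.split(sep)[index] for tok in factored_line.split())


# Illustrative tags only; a real tagger may label these words differently.
line = to_factored("ich habe das auto verkauft", "PPER VAFIN ART NN VVPP")
print(line)
print(extract_factor(line, 1))
```

Both sides of the parallel corpus would need the same treatment, line by line, so that sentence alignment between the two files is preserved.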


Many Thanks
Shaimaa

Ondrej Bojar <bo...@ufal.mff.cuni.cz> wrote on Wed, 6 Jan 2016:

 Subject: Re: [Moses-support] Factored instead of Phrase-based Model?
 To: "Shaimaa Marzouk" <marzou...@yahoo.de>, Moses-support@mit.edu
 Date: Wednesday, 6 January 2016, 08:42
 
 Dear Shaimaa,
 
 Adding factors can only increase any out-of-vocabulary issues.
 
 Use -v (perhaps even a higher verbosity level) in moses to see which
 translation options are considered for the problematic sentence. There
 could be some unfortunate weight settings that for some reason prefer the
 identity translation. (The identity translation must, however, appear in
 the data, or the source word must not appear in the data; otherwise Moses
 would not produce an identity translation at all.)
 
 Then go back to the phrase table and manually search for the lines that are
 supposed to cover the missing words. Here you may find the identity entries.
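
 A minimal sketch of that phrase-table search, assuming the usual text layout in which the fields of each entry are separated by "|||" and the first two fields are the source and target phrase. The function names and the sample lines are made up for illustration and are not taken from the attached table.

```python
import gzip


def matching_entries(lines, source_phrase):
    """Yield (source, target) for entries whose source side equals source_phrase."""
    for line in lines:
        fields = [field.strip() for field in line.split("|||")]
        if len(fields) >= 2 and fields[0] == source_phrase:
            yield fields[0], fields[1]


def search_phrase_table(path, source_phrase):
    """Scan a gzipped phrase table; `path` is whatever file your training run wrote."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return list(matching_entries(handle, source_phrase))


# Illustrative entries only.
sample = [
    "das auto ||| the car ||| 1 1 1 1 ||| 0-0 1-1 ||| 1 1 1",
    "ich ||| ich ||| 1 1 1 1 ||| 0-0 ||| 1 1 1",
]
for source, target in matching_entries(sample, "ich"):
    note = "  <- identity entry" if source == target else ""
    print(source, "=>", target, note)
```

 An empty result for a phrase such as "ich habe" means no entry with exactly that source side was extracted, which points back to the alignment.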
 
 Then go back and check the word alignment this (test) sentence got in the
 training data. There are most likely some issues with the alignment that
 prevented proper translations from being extracted.
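
 A small sketch of such an alignment check, assuming the alignment file holds one line per sentence pair with points written as i-j (0-based source index, then target index). The function name and the alignment line below are hypothetical. An unaligned word is only a hint, not proof that extraction failed.

```python
def unaligned_source_words(source_line, alignment_line):
    """Return the source tokens that have no alignment point in 'i-j' notation."""
    tokens = source_line.split()
    aligned = {int(pair.split("-")[0]) for pair in alignment_line.split()}
    return [tok for i, tok in enumerate(tokens) if i not in aligned]


# Hypothetical alignment for the test sentence, for illustration only.
print(unaligned_source_words("ich habe das auto verkauft", "2-2 3-3 4-4"))
```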
 
 Best, Ondrej.
  
 
 
 -- 
 Ondrej Bojar (mailto:o...@cuni.cz / bo...@ufal.mff.cuni.cz)
 http://www.cuni.cz/~obo

aligned.grow-diag-final
Description: Binary data


phrase-table.gz
Description: GNU Zip compressed data


verbose.docx
Description: MS-Word 2007 document


car-ready2016-2.de
Description: Binary data


car-ready2016-2.en
Description: Binary data
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] Factored instead of Phrase-based Model?

2016-01-05 Thread Shaimaa Marzouk
Dear Moses-Team,

I am trying to translate two short sentences included in the same file from 
German into English using a “Phrase-based Model”. The first sentence (das auto 
wurde verkauft) is translated correctly, while the second is partly translated:

I receive as a result for “ich habe das auto verkauft”
Ich|UNK|UNK|UNK habe|UNK|UNK|UNK the car sold  [1]   [total=-203.330]   
core=(-200.000, -5.000, 5.000, 0.000, 0.000, 0.000, 0.000, 0.000, -18.660)
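
In the output above, the tokens that were passed through untranslated carry a |UNK|UNK|UNK suffix. A small sketch that lists them for one line of decoder output, assuming the marker always has this form (the function name is made up for this example):

```python
def unknown_tokens(output_line):
    """Return the surface forms of tokens that carry the |UNK marker."""
    return [tok.split("|")[0] for tok in output_line.split() if "|UNK" in tok]


print(unknown_tokens("Ich|UNK|UNK|UNK habe|UNK|UNK|UNK the car sold"))
```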

I tried to modify the training data in different ways, and finally included the
exact sentence (along with its translation) in the training data (see
attachment). But I still get the same result.

Do I need to use a “Factored Translation Model” instead of the “Phrase-based
Model” to be able to translate this sentence? If so, I found an explanation of
how to train Factored Models here:
http://www.statmt.org/moses/?n=Moses.FactoredTutorial
Could you please tell me where I can find information about
1.  how to prepare the training data with additional factors before
training the Factored Model?
2.  how to train a Language Model that takes the POS into account?

I currently use KenLM and Giza++. 

Thanks a lot for your support.

Kind regards,
Shaimaa

Training data.docx
Description: MS-Word 2007 document


[Moses-support] input format of Giza++

2015-12-14 Thread Shaimaa Marzouk
Dear Support-Team,

I have prepared a small corpus (tokenisation and truecasing are done) and would
like to convert it into the Giza++ format described here:
http://www.statmt.org/moses/?n=FactoredTraining.PrepareData

Could you please tell me how to convert the corpus files (two parallel files
prepared with the editor “gedit”) into the right input format for Giza++?
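
For orientation, here is a rough sketch of the layout that page describes, as far as it can be reconstructed from it: a vocabulary file with one "id word count" row per word (id 1 kept for UNK) and a sentence file with three lines per pair (a weight of 1, then the word ids of each side). All function names are made up, and the separator and ordering details are assumptions. The page presents this conversion as the first step of training, so the training script normally produces these files itself and they do not need to be written by hand.

```python
from collections import Counter


def build_vocab(sentences):
    """Map each word to an id; id 1 is left for UNK, so real words start at 2."""
    counts = Counter(word for sent in sentences for word in sent.split())
    # Most frequent first; ties broken alphabetically to keep the result stable.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    ids = {word: i for i, word in enumerate(ordered, start=2)}
    return ids, counts


def vcb_lines(ids, counts):
    """Rows of a vocabulary file: id, word, frequency (tab-separated here)."""
    rows = ["1\tUNK\t0"]
    rows += ["%d\t%s\t%d" % (ids[w], w, counts[w]) for w in sorted(ids, key=ids.get)]
    return rows


def snt_lines(src_sents, tgt_sents, src_ids, tgt_ids):
    """Three lines per sentence pair: weight 1, ids of one side, ids of the other."""
    rows = []
    for src, tgt in zip(src_sents, tgt_sents):
        rows.append("1")
        rows.append(" ".join(str(src_ids[w]) for w in src.split()))
        rows.append(" ".join(str(tgt_ids[w]) for w in tgt.split()))
    return rows


# Toy corpus for illustration only.
de = ["das auto wurde verkauft", "ich habe das auto verkauft"]
en = ["the car was sold", "i sold the car"]
de_ids, de_counts = build_vocab(de)
en_ids, en_counts = build_vocab(en)
print("\n".join(vcb_lines(de_ids, de_counts)))
print("\n".join(snt_lines(de, en, de_ids, en_ids)))
```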

Thanks a lot
Shaimaa



[Moses-support] Error: The build failed

2015-12-12 Thread Shaimaa Marzouk
Dear Support-Team,

I got the error "The build failed" when I ran the command ./bjam -j4 to set
up Moses.
Please find attached the "build.log.gz".

Could you please help me to fix this error?

Kind regards,
Shaimaa 

build.log.gz
Description: application/gzip


[Moses-support] easy steps for beginners

2015-12-10 Thread Shaimaa Marzouk

Dear support team, 

I would be extremely grateful if you could help me with the following:
I have managed to install Moses (which is not a straightforward task for a
Windows user) and would like to understand how the system determines the
translation output. To simplify the scenario, my plan is to use just 5
sentences as training data and try to translate 1 sentence. Could you please
send me easy instructions / steps for beginners?

BTW, I installed Moses on Ubuntu, use "Fast Align" for word alignment, and
installed “IRSTLM” as a language model, but Moses still uses KENLM by default
(according to the .ini file).
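
A small sketch for checking which language model implementation a given .ini file names, assuming the file lists its components one per line under a [feature] section and that language model features have names ending in "LM" (KENLM, IRSTLM). That naming rule is a heuristic, the function name is made up, and the excerpt below is hypothetical.

```python
def language_model_features(ini_text):
    """Return lines of the [feature] section whose feature name ends in 'LM'."""
    found, in_features = [], False
    for raw in ini_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new section header starts; only [feature] is of interest here.
            in_features = (line == "[feature]")
        elif in_features and line and not line.startswith("#"):
            if line.split()[0].upper().endswith("LM"):
                found.append(line)
    return found


# Hypothetical excerpt; the path and names are placeholders.
example = """[feature]
UnknownWordPenalty
WordPenalty
KENLM name=LM0 factor=0 path=/path/to/lm.arpa order=3

[weight]
LM0= 0.5
"""
print(language_model_features(example))
```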

Thanks a lot :)

Kind regards,
Shaimaa Marzouk
