Dear Ondrej & Moses-Team,
@Ondrej: thanks a lot for your quick feedback.
The phrase "ich habe" does not appear in the phrase table. The word alignment
file includes only the first 4 sentences of the training data.
I moved the sentence (ich habe das auto verkauft) into a separate "in" file,
but got the same result. I also tried another sentence (ich verkaufe das
auto); here too, "ich verkaufe" cannot be translated. I repeated the exact
sentence (ich verkaufe das auto) many times in the training data and still get
the same result.
I attach the word alignment, phrase table, training data and verbose result,
and would be very grateful for any tips.
I would also highly appreciate it if you could let me know where I can find
information about:
1. how to prepare the training data with additional factors before training
the Factored Model
2. how to train a Language Model that considers the POS
I think that sooner or later the sentences will get more complex and I will
need to work with a Factored Model.
Many Thanks
Shaimaa
Ondrej Bojar <bo...@ufal.mff.cuni.cz> wrote on Wed, 6 Jan 2016:
Subject: Re: [Moses-support] Factored instead of Phrase-based Model?
To: "Shaimaa Marzouk" <marzou...@yahoo.de>, Moses-support@mit.edu
Date: Wednesday, 6 January 2016, 08:42
Dear Shaimaa,
Adding factors can only increase any out-of-vocabulary issues.
Use -v (perhaps even a higher verbosity level) in moses to see all the
translation options that are considered for the problematic sentence. There
could be some unfortunate weight settings that for some reason prefer identity
translation. (The identity translation must, however, appear in the data, or
the source word must not appear in the data; otherwise Moses would not produce
an identity translation at all.)
And then go back to the phrase table and manually search for the lines that
are supposed to cover the missing words. Here you may find the identity
entries.
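The manual search through the phrase table could also be scripted. Below is a minimal sketch, assuming the usual Moses text phrase-table layout in which fields are separated by "|||" (source phrase, target phrase, scores, then optional further fields). The function name find_phrase_entries is hypothetical and not part of Moses; matching is exact and case-sensitive on whitespace-separated tokens.

```python
import gzip
from typing import Iterator, Tuple


def find_phrase_entries(table_path: str, source_phrase: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (source, target, scores) for entries whose source side equals source_phrase.

    Assumes a gzip-compressed text phrase table with '|||' as field separator.
    """
    wanted = source_phrase.split()
    with gzip.open(table_path, "rt", encoding="utf-8") as table:
        for line in table:
            fields = [field.strip() for field in line.split("|||")]
            if len(fields) < 3:
                continue  # skip lines that do not look like phrase-table entries
            source, target, scores = fields[0], fields[1], fields[2]
            # Compare token lists so that extra spaces do not matter.
            if source.split() == wanted:
                yield source, target, scores
```

Calling list(find_phrase_entries("phrase-table.gz", "ich habe")) would then return every entry for that source phrase, or an empty list if the phrase was never extracted. An entry whose target side equals its source side would be one of the identity entries mentioned above.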
And then go back and check the word alignment that this (test) sentence got in
the training data. There are most likely some issues with the alignment that
prevented proper translations from being extracted.
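To make the alignment check easier to read, the index pairs can be mapped back to words. This is only a sketch, assuming the symmetrized alignment file holds one line per sentence pair with 0-based "sourceIndex-targetIndex" points (for example "0-0 1-1"). The helper name aligned_pairs and the example sentence pair and alignment in the usage note are made up for illustration; they are not taken from the attached files.

```python
from typing import List, Tuple


def aligned_pairs(source: str, target: str, alignment: str) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Return the aligned (source word, target word) pairs and the unaligned source words.

    `alignment` is one line of 0-based 'i-j' points: source index i, target index j.
    """
    src = source.split()
    tgt = target.split()
    links = []
    for point in alignment.split():
        i, j = point.split("-")
        links.append((int(i), int(j)))
    pairs = [(src[i], tgt[j]) for i, j in links]
    covered = {i for i, _ in links}
    # Source words with no alignment point at all.
    unaligned = [word for k, word in enumerate(src) if k not in covered]
    return pairs, unaligned
```

For a hypothetical pair such as aligned_pairs("ich habe das auto verkauft", "i sold the car", "0-0 2-2 3-3 4-1"), the second return value lists "habe" as the only source word without an alignment point, which is the kind of detail worth inspecting for the problematic sentence.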
Best, Ondrej.
On January 6, 2016 4:48:26 AM CET, Shaimaa Marzouk <marzou...@yahoo.de> wrote:
>Dear Moses-Team,
>
>I am trying to translate two short sentences included in the same file
>from German into English using a “Phrase-based Model”. The first
>sentence (das auto wurde verkauft) is translated correctly, while the
>second is partly translated:
>
>I receive as a result for “ich habe das auto verkauft”
>Ich|UNK|UNK|UNK habe|UNK|UNK|UNK the car sold [1]
>[total=-203.330] core=(-200.000, -5.000, 5.000, 0.000, 0.000, 0.000,
>0.000, 0.000, -18.660)
>
>I tried to modify the training data in different ways, and at last
>included the exact sentence (along with its translation) in the
>training data (see attachment). But, I still get the same result.
>
>Do I need to use a “Factored Translation Model” instead of the
>“Phrase-based Model” to be able to translate this sentence? If yes, I
>find here http://www.statmt.org/moses/?n=Moses.FactoredTutorial
>explanation of how to train Factored Models. Could you please tell me,
>where can I find information about
>1. how to prepare the training data with additional factors, before
>training the Factored Model?
>2. how to train the Language Model that considers the POS?
>
>I currently use KenLM and Giza++.
>
>Thanks a lot for your support.
>
>Kind regards,
>Shaimaa
>
>
>
>___
>Moses-support mailing list
>Moses-support@mit.edu
>http://mailman.mit.edu/mailman/listinfo/moses-support
--
Ondrej Bojar (mailto:o...@cuni.cz / bo...@ufal.mff.cuni.cz)
http://www.cuni.cz/~obo
aligned.grow-diag-final
Description: Binary data
phrase-table.gz
Description: GNU Zip compressed data
verbose.docx
Description: MS-Word 2007 document
car-ready2016-2.de
Description: Binary data
car-ready2016-2.en
Description: Binary data