Hi All!
This is my first post here, and first I want to apologize for my English,
but I would like to ask you some questions. I finished a full phrase-based
Moses training on an EN-PL (English - Polish) corpus (a few million sentences
from free sources + half a million sentences from a commercial TMX).
Hi Tomek,
4.5% definitely indicates that there was an error in your pipeline (or
test data?). However, there are so many places where things could go
wrong that, based on the little information you gave us, I could not even
start guessing. Check if your line numbers match, that you use tokenized
Hi experts,
I have a question about the phrase table theory.
Suppose we take a corpus A to create a translation model TMA and a
language model LMA, and then we consider a second corpus B.
Method 1:
We add corpus B to A => corpus AB => TM-AB and LM-AB
Method 2:
We process corpus B => TMB and LMB
then we combine TMA + TMB and
Hi Tomek
Yes, that's quite a low score. Have a look at the translation output: do
the sentences have lots of English words in them, are they very long,
very short, or scrambled in some other way?
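
A quick way to do that kind of inspection is sketched below in Python. This is only an illustration, not part of Moses: the function name inspect_output is hypothetical, and it assumes the source and output are already tokenized, one sentence per line, in the same order. For each sentence it reports the output/source length ratio and the fraction of output tokens that also occur verbatim in the source, which may hint at untranslated words.

```python
def inspect_output(source_lines, output_lines):
    """Return one (length_ratio, copied_fraction) pair per sentence.

    Hypothetical helper for eyeballing decoder output; assumes
    whitespace-tokenized text, one sentence per line.
    """
    stats = []
    for src, out in zip(source_lines, output_lines):
        src_tokens = src.split()
        out_tokens = out.split()
        # Output length relative to source length; values far from 1
        # point at very long or very short translations.
        ratio = len(out_tokens) / max(len(src_tokens), 1)
        # Share of output tokens that appear verbatim in the source.
        src_set = set(src_tokens)
        copied = sum(1 for tok in out_tokens if tok in src_set)
        copied_fraction = copied / max(len(out_tokens), 1)
        stats.append((ratio, copied_fraction))
    return stats
```

Note that punctuation, numbers and names are also counted as copied, so a non-zero copied fraction is normal; only consistently high values would suggest that source words are passing through untranslated.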
The commonest problem is that something went wrong in corpus
preparation, for example the
Hi,
It makes a difference if A and B differ in how close they are to the
tune/test data.
If you have a small in-domain corpus and a large out-of-domain corpus,
you should see significant improvements by using interpolation and
fill-up methods.
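
To make the two ideas concrete, here is a minimal Python sketch on toy data. It is an assumption-laden illustration, not the Moses implementation: the phrase tables are plain dictionaries mapping a (source phrase, target phrase) pair to a single probability, and the names interpolate, fill_up and lam are hypothetical. Linear interpolation mixes the two probabilities with a weight, while fill-up keeps every in-domain entry and takes an out-of-domain entry only when the pair is missing in-domain, marking where each entry came from.

```python
def interpolate(tm_in, tm_out, lam=0.5):
    """Linear interpolation: lam * p_in + (1 - lam) * p_out.

    A pair missing from one table contributes probability 0.0 there.
    """
    pairs = set(tm_in) | set(tm_out)
    return {
        pair: lam * tm_in.get(pair, 0.0) + (1.0 - lam) * tm_out.get(pair, 0.0)
        for pair in pairs
    }


def fill_up(tm_in, tm_out):
    """Fill-up: prefer in-domain entries, back off to out-of-domain ones.

    Each value is (probability, provenance), where provenance is 1 for
    an in-domain entry and 0 for an entry filled in from out-of-domain.
    """
    merged = {pair: (prob, 1) for pair, prob in tm_in.items()}
    for pair, prob in tm_out.items():
        if pair not in merged:
            merged[pair] = (prob, 0)
    return merged
```

In practice the interpolation weight would be tuned on held-out in-domain data, and real phrase tables carry several scores per pair rather than one.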
If they are basically the same kind of data, then
Hi,
you can find instructions for the different distributions here:
http://www.statmt.org/moses/?n=Development.GetStarted
I found installation on Ubuntu 14 quite straightforward.
-phi
On Mon, Sep 7, 2015 at 11:06 AM, eman ramzy wrote:
> Hello dear,
>
> I would like to