Hi,

GIZA++ can run more memory-efficiently when it knows in advance
which words can possibly align to which other words (because they
occur in the same sentence pair). Hence, snt2cooc collects a list
of word pairs that may co-occur before GIZA++ is run.
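
To make the idea concrete, here is a minimal Python sketch of collecting
co-occurring word pairs. It is only an illustration: the function name
collect_cooc and the toy corpus are made up for this example, and it does
not reproduce snt2cooc's actual implementation or file formats (snt2cooc
works on GIZA++'s numeric vocabulary and sentence files).

```python
from typing import Iterable, Set, Tuple


def collect_cooc(sentence_pairs: Iterable[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Return every (source word, target word) pair seen in the same sentence pair."""
    cooc: Set[Tuple[str, str]] = set()
    for src, tgt in sentence_pairs:
        # Deduplicate words within each sentence before pairing them up.
        src_words = set(src.split())
        tgt_words = set(tgt.split())
        for s in src_words:
            for t in tgt_words:
                cooc.add((s, t))
    return cooc


# Hypothetical toy corpus, not taken from any real data set.
pairs = [("das haus", "the house"), ("das buch", "the book")]
cooc = collect_cooc(pairs)
```

Only pairs in this set ever need an entry in the translation table, so
"haus" and "book", which never share a sentence pair, take up no memory.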

Since snt2cooc can also run into memory problems (2GB limit on
32-bit machines), experiment.perl includes an option to break
up the corpus and run it in parts. This setting is called
"run-giza-in-parts", which is slightly misleading (it's snt2cooc
that's run in parts, not GIZA++).
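
Running in parts can be sketched roughly as follows, again in Python and
again only as an illustration: the helper names (cooc_for_part,
cooc_in_parts) and the toy corpus are hypothetical, and this is not the
code that experiment.perl or the training script actually runs. The point
is that the union of the per-part results equals the result for the whole
corpus.

```python
from typing import List, Set, Tuple

Pair = Tuple[str, str]


def cooc_for_part(part: List[Pair]) -> Set[Pair]:
    """Collect (source word, target word) pairs for one part of the corpus."""
    result: Set[Pair] = set()
    for src, tgt in part:
        for s in set(src.split()):
            for t in set(tgt.split()):
                result.add((s, t))
    return result


def cooc_in_parts(corpus: List[Pair], parts: int) -> Set[Pair]:
    """Split the corpus into roughly equal parts and merge the per-part results."""
    size = max(1, -(-len(corpus) // parts))  # ceiling division
    merged: Set[Pair] = set()
    for start in range(0, len(corpus), size):
        # Assumption: a real setup would write each part's output to disk
        # and merge the files; everything stays in memory here for brevity.
        merged |= cooc_for_part(corpus[start:start + size])
    return merged


# Hypothetical toy corpus, not taken from any real data set.
corpus = [
    ("das haus", "the house"),
    ("das buch", "the book"),
    ("ein haus", "a house"),
]
whole = cooc_for_part(corpus)
in_parts = cooc_in_parts(corpus, parts=2)
```

Here whole and in_parts contain the same pairs, which is why splitting the
corpus does not change what GIZA++ sees afterwards.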

-phi

On Sat, Apr 7, 2012 at 2:50 AM, Fong Po Po <fongpui...@yahoo.com.hk> wrote:

> Dear all:
> What is snt2cooc.out used to do in training of Moses?
> Thanks!
> Best Regards,
>
> Fong Pui Chi
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>