Can snt2cooc be made more memory-efficient?

For a start, it seems to be designed to compute a count for each (source, target) word pair, but the count is never actually used.

It seems a bit silly that the simple initializer for GIZA++ takes up way more memory than GIZA++ itself.
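
Purely as a sketch of what I mean (names are made up, this is not the actual snt2cooc code): if the per-pair count is never read, a plain set of co-occurring word ids would do the same job with a smaller footprint than a count map, e.g.:

// illustrative only -- not the real snt2cooc internals
#include <map>
#include <set>
#include <utility>

std::map<std::pair<int,int>, int> coocCount; // count per (source, target) pair
std::set<std::pair<int,int> >     coocSeen;  // presence only

void observe(int srcId, int tgtId) {
    // coocCount[std::make_pair(srcId, tgtId)]++;  // count that nothing ever reads
    coocSeen.insert(std::make_pair(srcId, tgtId)); // enough if only the pair list is needed
}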

On 08/04/2012 04:21, Philipp Koehn wrote:
Hi,

GIZA++ can run more memory-efficiently when it knows in advance
which words can possibly align to which other words (because they
occur in the same sentence pair). Hence, snt2cooc collects the list
of word pairs that may co-occur prior to running GIZA++.
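
Conceptually, it boils down to something like this (a simplified sketch, not the actual snt2cooc source; the input here is plain alternating source/target lines of word ids, whereas the real .snt format also carries a count line):

// simplified sketch of the co-occurrence collection step
#include <iostream>
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

int main() {
    std::set<std::pair<int,int> > cooc; // which (source, target) ids ever share a sentence pair
    std::string srcLine, tgtLine;
    while (std::getline(std::cin, srcLine) && std::getline(std::cin, tgtLine)) {
        std::istringstream src(srcLine), tgt(tgtLine);
        std::vector<int> srcIds, tgtIds;
        int id;
        while (src >> id) srcIds.push_back(id);
        while (tgt >> id) tgtIds.push_back(id);
        for (size_t i = 0; i < srcIds.size(); ++i)
            for (size_t j = 0; j < tgtIds.size(); ++j)
                cooc.insert(std::make_pair(srcIds[i], tgtIds[j]));
    }
    // GIZA++ only needs to know which pairs exist at all, so each pair is printed once
    for (std::set<std::pair<int,int> >::const_iterator it = cooc.begin(); it != cooc.end(); ++it)
        std::cout << it->first << " " << it->second << "\n";
    return 0;
}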

Since snt2cooc can also run into memory problems (2GB limit on
32-bit machines), experiment.perl includes the option to break
up the corpus and run it in parts. This setting is called "run-giza-in-parts",
which is slightly misleading (it's snt2cooc that is run in parts, not
GIZA++).
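
To make the in-parts idea concrete (again just a sketch, not what experiment.perl actually runs): if each part's co-occurrence list is written out sorted, the per-part lists can be combined with a streaming merge that holds only one line per part in memory:

// sketch: merge sorted per-part cooccurrence files, dropping duplicate pairs
#include <fstream>
#include <functional>
#include <iostream>
#include <queue>
#include <string>
#include <utility>
#include <vector>

int main(int argc, char** argv) {
    // argv[1..] are per-part files, one "srcId tgtId" line each, sorted lexicographically
    typedef std::pair<std::string, size_t> Item; // (line, index of the file it came from)
    std::vector<std::ifstream*> parts;
    std::priority_queue<Item, std::vector<Item>, std::greater<Item> > heap;
    for (int i = 1; i < argc; ++i) {
        parts.push_back(new std::ifstream(argv[i]));
        std::string line;
        if (std::getline(*parts.back(), line))
            heap.push(Item(line, parts.size() - 1));
    }
    std::string last;
    bool first = true;
    while (!heap.empty()) {
        Item top = heap.top();
        heap.pop();
        if (first || top.first != last) { // skip pairs already emitted by another part
            std::cout << top.first << "\n";
            last = top.first;
            first = false;
        }
        std::string line;
        if (std::getline(*parts[top.second], line))
            heap.push(Item(line, top.second));
    }
    for (size_t i = 0; i < parts.size(); ++i)
        delete parts[i];
    return 0;
}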

-phi

On Sat, Apr 7, 2012 at 2:50 AM, Fong Po Po <fongpui...@yahoo.com.hk> wrote:

    Dear all:
        What is snt2cooc.out used for in the training of Moses?
        Thanks!
    Best Regards,
    Fong Pui Chi




_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
