Hi Ergun,

processPhraseTable is no longer supported by Moses. However, Phil Williams has already fixed this problem in the transliteration module by changing

`$MOSES_SRC/scripts/training/filter-model-given-input.pl $TRANSLIT_MODEL/evaluation/$eval_file.filtered $TRANSLIT_MODEL/evaluation/$eval_file.moses.table.ini $TRANSLIT_MODEL/evaluation/$eval_file -Binarizer "$MOSES_SRC/bin/processPhraseTable"`;

to

`$MOSES_SRC/scripts/training/filter-model-given-input.pl $TRANSLIT_MODEL/evaluation/$eval_file.filtered $TRANSLIT_MODEL/evaluation/$eval_file.moses.table.ini $TRANSLIT_MODEL/evaluation/$eval_file -Binarizer "$MOSES_SRC/bin/CreateOnDiskPt 1 1 4 100 2"`;

in path-to-moses/scripts/Transliteration/in-decoding-transliteration.pl.

Here's the commit:
https://github.com/moses-smt/mosesdecoder/commit/7e54e23fe234ac48f44beeee0e473d09a5b4d5f6

Maybe you pulled an in-between version, in which processPhraseTable had already been removed but the transliteration scripts had not yet been fixed.

Cheers,
Nadir

On Mon, May 4, 2015 at 7:46 AM, <moses-support-requ...@mit.edu> wrote:
> Send Moses-support mailing list submissions to
>         moses-support@mit.edu
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         http://mailman.mit.edu/mailman/listinfo/moses-support
> or, via email, send a message with subject or body 'help' to
>         moses-support-requ...@mit.edu
>
> You can reach the person managing the list at
>         moses-support-ow...@mit.edu
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Moses-support digest..."
>
>
> Today's Topics:
>
>    1. Re: 12-gram language model ARPA file for 16GB (liling tan)
>    2. Transliteration model is using processPhraseTable, which is
>       not found in Moses version 3.0 (Ergun Bicici)
>    3. Re: Transliteration model is using processPhraseTable, which
>       is not found in Moses version 3.0 (Hieu Hoang)
>    4.
>       Europarl monolingual corpus (Hieu Hoang)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sun, 3 May 2015 19:44:12 +0200
> From: liling tan <alvati...@gmail.com>
> Subject: Re: [Moses-support] 12-gram language model ARPA file for 16GB
> To: moses-support <moses-support@mit.edu>
> Message-ID:
>         <CAKzPaJJ7fY=9C89POact542vu32d+H3=0i_Dnaj=yfizbfa...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Dear Moses devs/users,
>
> For now, I only know that it takes more than 250GB. I've 250GB of free
> space and KenLM got "poisoned" by insufficient space...
>
> Does anyone have an idea how big would a 12-gram language model ARPA file
> trained on 16GB of text become?
>
> STDERR:
>
> === 1/5 Counting and sorting n-grams ===
> Reading /media/2tb/wmt15/corpus.truecase/train-lm.en
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> tcmalloc: large alloc 7846035456 bytes == 0x10f4000 @
> tcmalloc: large alloc 73229664256 bytes == 0x1d542e000 @
> ****************************************************************************************************
> Unigram tokens 3038737446 types 5924314
> === 2/5 Calculating and sorting adjusted counts ===
> Chain sizes: 1:71091768 2:804524736 3:1508483968 4:2413574144 5:3519795968
> 6:4827148288 7:6335632384 8:8045247488 9:9955993600 10:12067871744
> 11:14380880896 12:16895020032
> tcmalloc: large alloc 16895025152 bytes == 0x1d542e000 @
> tcmalloc: large alloc 2413576192 bytes == 0x8f2a0000 @
> tcmalloc: large alloc 3519799296 bytes == 0x5c4488000 @
> tcmalloc: large alloc 4827152384 bytes == 0x696146000 @
> tcmalloc: large alloc 6335635456 bytes == 0x7b5cce000 @
> tcmalloc: large alloc 8045248512 bytes == 0x92f6f0000 @
> tcmalloc: large alloc 9955999744 bytes == 0xb0ef7c000 @
> tcmalloc: large alloc 12067872768 bytes == 0xd60644000 @
> tcmalloc: large alloc 14380883968 bytes == 0x12f616e000 @
> Last input should have been poison.
> Last input should have been poison.util/file.cc:196 in void
> util::WriteOrThrow(int, const void*, std::size_t) threw FDException because
> `ret < 1'.
> No space left on device in /tmp/PC2o3z (deleted) while writing 5301120368
> bytes
>
> Last input should have been poison.util/file.cc:196 in void
> util::WriteOrThrow(int, const void*, std::size_t) threw FDException because
> `ret < 1'.
> No space left on device in /tmp/PftXeo (deleted) while writing 1941075872
> bytesLast input should have been poison.
>
> util/file.cc:196 in void util::WriteOrThrow(int, const void*, std::size_t)
> threw FDException because `ret < 1'.
> No space left on device in /tmp/CuZcPM (deleted) while writing 2984722272
> bytes
>
> util/file.cc:196 in void util::WriteOrThrow(int, const void*, std::size_t)
> threw FDException because `ret < 1'.
> No space left on device in /tmp/F2bE8A (deleted) while writing 389439488
> bytes
>
> Regards,
> Liling
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> http://mailman.mit.edu/mailman/private/moses-support/attachments/20150503/b56dc8ba/attachment-0001.htm
>
> ------------------------------
>
> Message: 2
> Date: Sun, 3 May 2015 22:42:22 +0100
> From: Ergun Bicici <ergun.bic...@computing.dcu.ie>
> Subject: [Moses-support] Transliteration model is using
>         processPhraseTable, which is not found in Moses version 3.0
> To: moses-support <moses-support@mit.edu>
> Message-ID:
>         <CAB2pGncpvc4roLXwLcFcXytZHKEqSZvzaX2L16Yfo=p-vq1...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> binarizing...gzip -cd
> en-ru_path/model/Transliteration.8/tuning/filtered/phrase-table.0-0.1.1.gz
> | LC_ALL=C sort -T en-ru_path/model/Transliteration.8/tuning/filtered |
> moses_3.0/mosesdecoder/bin/processPhraseTable -ttable 0 0 - -nscores 4 -out
> en-ru_path/model/Transliteration.8/tuning/filtered/phrase-table.0-0.1.1
> sh: moses_3.0/mosesdecoder/bin/processPhraseTable: No such file or directory
> sort: write failed: standard output: Broken pipe
> sort: write error
>
> How can I have processPhraseTable built?
>
> Best Regards,
> Ergun
>
> Ergun Biçici, CNGL, School of Computing, DCU, www.cngl.ie
> http://www.computing.dcu.ie/~ebicici/
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> http://mailman.mit.edu/mailman/private/moses-support/attachments/20150503/dacaa1c9/attachment-0001.htm
>
> ------------------------------
>
> Message: 3
> Date: Mon, 04 May 2015 08:31:18 +0400
> From: Hieu Hoang <hieuho...@gmail.com>
> Subject: Re: [Moses-support] Transliteration model is using
>         processPhraseTable, which is not found in Moses version 3.0
> To: Ergun Bicici <ergun.bic...@computing.dcu.ie>, moses-support
>         <moses-support@mit.edu>
> Message-ID: <5546f616.4000...@gmail.com>
> Content-Type: text/plain; charset="windows-1252"
>
> do you know where the processPhraseTable exec is being called from?
>
> it would be helpful so we can make sure it uses something else.
>
> if you really want processPhraseTable back, uncomment 3 lines in
> misc/Jamfile
>
> +++ b/misc/Jamfile
> @@ -1,8 +1,8 @@
> -#exe processPhraseTable : GenerateTuples.cpp processPhraseTable.cpp
> ..//boost_filesystem ../moses//moses ;
> +exe processPhraseTable : GenerateTuples.cpp processPhraseTable.cpp
> ..//boost_filesystem ../moses//moses ;
>
> exe processLexicalTable : processLexicalTable.cpp ..//boost_filesystem
> ../moses//moses ;
>
> -#exe queryPhraseTable : queryPhraseTable.cpp ..//boost_filesystem
> ../moses//moses ;
> +exe queryPhraseTable : queryPhraseTable.cpp ..//boost_filesystem
> ../moses//moses ;
>
> exe queryLexicalTable : queryLexicalTable.cpp ..//boost_filesystem
> ../moses//moses ;
>
> @@ -46,6 +46,6 @@ $(TOP)//boost_iostreams
> $(TOP)//boost_program_options
> ;
>
> -alias programs : 1-1-Extraction TMining generateSequences
> processLexicalTable queryLexicalTable programsMin programsProbing
> merge-sorted prunePhraseTable ;
> -#processPhraseTable queryPhraseTable
> +alias programs : 1-1-Extraction TMining generateSequences
> processLexicalTable queryLexicalTable programsMin programsProbing
> merge-sorted prunePhraseTable processPhraseTable queryPhraseTable ;
>
> On 04/05/2015 01:42, Ergun Bicici wrote:
>>
>> binarizing...gzip -cd
>> en-ru_path/model/Transliteration.8/tuning/filtered/phrase-table.0-0.1.1.gz
>> | LC_ALL=C sort -T en-ru_path/model/Transliteration.8/tuning/filtered
>> | moses_3.0/mosesdecoder/bin/processPhraseTable -ttable 0 0 - -nscores
>> 4 -out
>> en-ru_path/model/Transliteration.8/tuning/filtered/phrase-table.0-0.1.1
>> sh: moses_3.0/mosesdecoder/bin/processPhraseTable: No such file or
>> directory
>> sort: write failed: standard output: Broken pipe
>> sort: write error
>>
>> How can I have processPhraseTable built?
>>
>> Best Regards,
>> Ergun
>>
>> Ergun Biçici, CNGL, School of Computing, DCU, www.cngl.ie
>> <http://www.cngl.ie>
>> http://www.computing.dcu.ie/~ebicici/
>> <http://www.computing.dcu.ie/%7Eebicici/>
>>
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> --
> Hieu Hoang
> Researcher
> New York University, Abu Dhabi
> http://www.hoang.co.uk/hieu
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> http://mailman.mit.edu/mailman/private/moses-support/attachments/20150504/303023d0/attachment-0001.htm
>
> ------------------------------
>
> Message: 4
> Date: Mon, 4 May 2015 08:46:15 +0400
> From: Hieu Hoang <hieuho...@gmail.com>
> Subject: [Moses-support] Europarl monolingual corpus
> To: moses-support <moses-support@mit.edu>
> Message-ID:
>         <caekmkbio64f_m20rwnxydoj60fhez_oo+by+hzkw3tbfukp...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> What's the easiest way to get the single-language data from the Europarl
> corpus as described in the 1st table in:
> http://statmt.org/europarl/
>
> I tried downloading the xml source
> http://statmt.org/europarl/v7/europarl.tgz
> stripping the xml and running split-sentence.perl, but this takes an
> unfathomably long time
>
> Hieu Hoang
> Researcher
> New York University, Abu Dhabi
> http://www.hoang.co.uk/hieu
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> http://mailman.mit.edu/mailman/private/moses-support/attachments/20150504/ba5b4087/attachment.htm
>
> ------------------------------
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
> End of Moses-support Digest, Vol 103, Issue 5
> *********************************************
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
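
For anyone whose own wrapper scripts still point at processPhraseTable, the binarizer swap described at the top of this message could be sketched roughly as below. This is only an illustration, not part of Moses: the function names `choose_binarizer` and `build_filter_command` are hypothetical, the `CreateOnDiskPt` arguments are simply copied from the fixed script quoted above, and the fallback to processPhraseTable assumes you have rebuilt it yourself.

```python
import os


def choose_binarizer(moses_bin_dir):
    """Return a value for the -Binarizer option (hypothetical helper).

    Prefers CreateOnDiskPt with the arguments quoted in this thread and
    falls back to processPhraseTable only if that binary is present.
    """
    candidates = [
        ("CreateOnDiskPt", "1 1 4 100 2"),  # arguments copied from the fixed script
        ("processPhraseTable", ""),         # legacy binarizer, only if rebuilt locally
    ]
    for name, args in candidates:
        path = os.path.join(moses_bin_dir, name)
        if os.path.isfile(path):
            return f"{path} {args}" if args else path
    raise FileNotFoundError("no phrase-table binarizer found in " + moses_bin_dir)


def build_filter_command(moses_src, out_dir, ini_file, input_file):
    """Assemble the filter-model-given-input.pl call as an argument list."""
    script = os.path.join(
        moses_src, "scripts", "training", "filter-model-given-input.pl"
    )
    binarizer = choose_binarizer(os.path.join(moses_src, "bin"))
    return [script, out_dir, ini_file, input_file, "-Binarizer", binarizer]
```

The returned list can be passed to `subprocess.run` as-is; keeping the binarizer and its arguments in a single list element mirrors the quoted `-Binarizer "..."` string in the script.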