Hi Ergun,

processPhraseTable is no longer supported by Moses. But I see that
Phil Williams has already fixed this problem in the transliteration
module by changing

 `$MOSES_SRC/scripts/training/filter-model-given-input.pl
$TRANSLIT_MODEL/evaluation/$eval_file.filtered
$TRANSLIT_MODEL/evaluation/$eval_file.moses.table.ini
$TRANSLIT_MODEL/evaluation/$eval_file  -Binarizer
"$MOSES_SRC/bin/processPhraseTable"`;

to

`$MOSES_SRC/scripts/training/filter-model-given-input.pl
$TRANSLIT_MODEL/evaluation/$eval_file.filtered
$TRANSLIT_MODEL/evaluation/$eval_file.moses.table.ini
$TRANSLIT_MODEL/evaluation/$eval_file -Binarizer
"$MOSES_SRC/bin/CreateOnDiskPt 1 1 4 100 2"`;

in

path-to-moses/scripts/Transliteration/in-decoding-transliteration.pl

Here's the commit:

https://github.com/moses-smt/mosesdecoder/commit/7e54e23fe234ac48f44beeee0e473d09a5b4d5f6
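
To see whether a local checkout already contains this commit, something along these lines may help. It is only a sketch: the function name contains_commit is made up for illustration, and it assumes the checkout is a git clone with its history available.

```shell
#!/bin/sh
# Report whether a given commit is reachable from the current HEAD.
# Arguments: $1 = path to a git checkout, $2 = commit hash.
# (contains_commit is a hypothetical helper name, not part of Moses.)
contains_commit() {
    repo=$1
    commit=$2
    # "merge-base --is-ancestor A B" exits 0 when A is an ancestor of B.
    if git -C "$repo" merge-base --is-ancestor "$commit" HEAD 2>/dev/null; then
        echo "fix present"
    else
        echo "fix missing (or commit unknown to this checkout)"
    fi
}
```

For example: contains_commit "$MOSES_SRC" 7e54e23fe234ac48f44beeee0e473d09a5b4d5f6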

Maybe you pulled an in-between version, in which processPhraseTable
had been removed but the transliteration scripts were not yet fixed.
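
A quick way to confirm which binarizer your copy of the script refers to is to grep it. This is a rough sketch: which_binarizer is a made-up helper name, and it only looks at the one script mentioned above, so other scripts in your pipeline could still call the old binary.

```shell
#!/bin/sh
# Print which binarizer the transliteration script in a checkout mentions.
# Argument: $1 = root of the Moses checkout (what the script calls $MOSES_SRC).
# (which_binarizer is a hypothetical helper name, not part of Moses.)
which_binarizer() {
    script="$1/scripts/Transliteration/in-decoding-transliteration.pl"
    if [ ! -f "$script" ]; then
        echo "script not found"
    elif grep -q 'processPhraseTable' "$script"; then
        echo "processPhraseTable"
    elif grep -q 'CreateOnDiskPt' "$script"; then
        echo "CreateOnDiskPt"
    else
        echo "unknown"
    fi
}
```

For example: which_binarizer "$MOSES_SRC". If it prints processPhraseTable, updating to a revision that includes the fix is probably easier than rebuilding the old binary.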

Cheers,
Nadir


On Mon, May 4, 2015 at 7:46 AM,  <moses-support-requ...@mit.edu> wrote:
> Today's Topics:
>
>    1. Re: 12-gram language model ARPA file for 16GB (liling tan)
>    2. Transliteration model is using processPhraseTable, which is
>       not found in Moses version 3.0 (Ergun Bicici)
>    3. Re: Transliteration model is using processPhraseTable, which
>       is not found in Moses version 3.0 (Hieu Hoang)
>    4. Europarl monolingual corpus (Hieu Hoang)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sun, 3 May 2015 19:44:12 +0200
> From: liling tan <alvati...@gmail.com>
> Subject: Re: [Moses-support] 12-gram language model ARPA file for 16GB
> To: moses-support <moses-support@mit.edu>
> Message-ID:
>         <CAKzPaJJ7fY=9C89POact542vu32d+H3=0i_Dnaj=yfizbfa...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Dear Moses devs/users,
>
> For now, I only know that it takes more than 250GB. I have 250GB of free
> space, and KenLM got "poisoned" by insufficient space...
>
> Does anyone have an idea how big a 12-gram language model ARPA file
> trained on 16GB of text would become?
>
> STDERR:
>
> === 1/5 Counting and sorting n-grams ===
> Reading /media/2tb/wmt15/corpus.truecase/train-lm.en
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> tcmalloc: large alloc 7846035456 bytes == 0x10f4000 @
> tcmalloc: large alloc 73229664256 bytes == 0x1d542e000 @
> ****************************************************************************************************
> Unigram tokens 3038737446 types 5924314
> === 2/5 Calculating and sorting adjusted counts ===
> Chain sizes: 1:71091768 2:804524736 3:1508483968 4:2413574144 5:3519795968
> 6:4827148288 7:6335632384 8:8045247488 9:9955993600 10:12067871744
> 11:14380880896 12:16895020032
> tcmalloc: large alloc 16895025152 bytes == 0x1d542e000 @
> tcmalloc: large alloc 2413576192 bytes == 0x8f2a0000 @
> tcmalloc: large alloc 3519799296 bytes == 0x5c4488000 @
> tcmalloc: large alloc 4827152384 bytes == 0x696146000 @
> tcmalloc: large alloc 6335635456 bytes == 0x7b5cce000 @
> tcmalloc: large alloc 8045248512 bytes == 0x92f6f0000 @
> tcmalloc: large alloc 9955999744 bytes == 0xb0ef7c000 @
> tcmalloc: large alloc 12067872768 bytes == 0xd60644000 @
> tcmalloc: large alloc 14380883968 bytes == 0x12f616e000 @
> Last input should have been poison.
> Last input should have been poison.util/file.cc:196 in void
> util::WriteOrThrow(int, const void*, std::size_t) threw FDException because
> `ret < 1'.
> No space left on device in /tmp/PC2o3z (deleted) while writing 5301120368
> bytes
>
> Last input should have been poison.util/file.cc:196 in void
> util::WriteOrThrow(int, const void*, std::size_t) threw FDException because
> `ret < 1'.
> No space left on device in /tmp/PftXeo (deleted) while writing 1941075872
> bytesLast input should have been poison.
>
> util/file.cc:196 in void util::WriteOrThrow(int, const void*, std::size_t)
> threw FDException because `ret < 1'.
> No space left on device in /tmp/CuZcPM (deleted) while writing 2984722272
> bytes
>
> util/file.cc:196 in void util::WriteOrThrow(int, const void*, std::size_t)
> threw FDException because `ret < 1'.
> No space left on device in /tmp/F2bE8A (deleted) while writing 389439488
> bytes
>
> Regards,
> Liling
>
> ------------------------------
>
> Message: 2
> Date: Sun, 3 May 2015 22:42:22 +0100
> From: Ergun Bicici <ergun.bic...@computing.dcu.ie>
> Subject: [Moses-support] Transliteration model is using
>         processPhraseTable, which is not found in Moses version 3.0
> To: moses-support <moses-support@mit.edu>
> Message-ID:
>         <CAB2pGncpvc4roLXwLcFcXytZHKEqSZvzaX2L16Yfo=p-vq1...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> binarizing...gzip -cd
> en-ru_path/model/Transliteration.8/tuning/filtered/phrase-table.0-0.1.1.gz
> | LC_ALL=C sort -T en-ru_path/model/Transliteration.8/tuning/filtered |
> moses_3.0/mosesdecoder/bin/processPhraseTable -ttable 0 0 - -nscores 4 -out
> en-ru_path/model/Transliteration.8/tuning/filtered/phrase-table.0-0.1.1
> sh: moses_3.0/mosesdecoder/bin/processPhraseTable: No such file or directory
> sort: write failed: standard output: Broken pipe
> sort: write error
>
> How can I have processPhraseTable built?
>
> Best Regards,
> Ergun
>
> Ergun Biçici, CNGL, School of Computing, DCU, www.cngl.ie
> http://www.computing.dcu.ie/~ebicici/
>
> ------------------------------
>
> Message: 3
> Date: Mon, 04 May 2015 08:31:18 +0400
> From: Hieu Hoang <hieuho...@gmail.com>
> Subject: Re: [Moses-support] Transliteration model is using
>         processPhraseTable, which is not found in Moses version 3.0
> To: Ergun Bicici <ergun.bic...@computing.dcu.ie>,       moses-support
>         <moses-support@mit.edu>
> Message-ID: <5546f616.4000...@gmail.com>
> Content-Type: text/plain; charset="windows-1252"
>
> do you know where the processPhraseTable exec is being called from?
>
> it would be helpful so we can make sure it uses something else.
>
> if you really want processPhraseTable back, uncomment 3 lines in
>     misc/Jamfile
>
> +++ b/misc/Jamfile
> @@ -1,8 +1,8 @@
> -#exe processPhraseTable : GenerateTuples.cpp processPhraseTable.cpp ..//boost_filesystem ../moses//moses ;
> +exe processPhraseTable : GenerateTuples.cpp  processPhraseTable.cpp ..//boost_filesystem ../moses//moses ;
>
>   exe processLexicalTable : processLexicalTable.cpp ..//boost_filesystem ../moses//moses ;
>
> -#exe queryPhraseTable : queryPhraseTable.cpp ..//boost_filesystem ../moses//moses ;
> +exe queryPhraseTable : queryPhraseTable.cpp ..//boost_filesystem ../moses//moses ;
>
>   exe queryLexicalTable : queryLexicalTable.cpp ..//boost_filesystem ../moses//moses ;
>
> @@ -46,6 +46,6 @@ $(TOP)//boost_iostreams
>   $(TOP)//boost_program_options
>   ;
>
> -alias programs : 1-1-Extraction TMining generateSequences processLexicalTable queryLexicalTable programsMin programsProbing merge-sorted prunePhraseTable  ;
> -#processPhraseTable queryPhraseTable
> +alias programs : 1-1-Extraction TMining generateSequences processLexicalTable queryLexicalTable programsMin programsProbing merge-sorted prunePhraseTable  processPhraseTable queryPhraseTable ;
>
> --
> Hieu Hoang
> Researcher
> New York University, Abu Dhabi
> http://www.hoang.co.uk/hieu
>
>
> ------------------------------
>
> Message: 4
> Date: Mon, 4 May 2015 08:46:15 +0400
> From: Hieu Hoang <hieuho...@gmail.com>
> Subject: [Moses-support] Europarl monolingual corpus
> To: moses-support <moses-support@mit.edu>
> Message-ID:
>         <caekmkbio64f_m20rwnxydoj60fhez_oo+by+hzkw3tbfukp...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> What's the easiest way to get the single-language data from the Europarl
> corpus, as described in the 1st table in:
>   http://statmt.org/europarl/
>
> I tried downloading the xml source
>    http://statmt.org/europarl/v7/europarl.tgz
> stripping the xml and running split-sentence.perl, but this takes an
> unfathomably long time.
>
> Hieu Hoang
> Researcher
> New York University, Abu Dhabi
> http://www.hoang.co.uk/hieu
>
> ------------------------------
>
> End of Moses-support Digest, Vol 103, Issue 5
> *********************************************
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
