Re: [Moses-support] In-memory loading of compact phrases
Thanks for the quick response, I will try as you suggest. Nevertheless, my main concern is the time spent collecting options. Is the difference observed with respect to the gzip'ed tables normal? The tables being cached, shouldn't they be closer?

2015-03-11 18:52 GMT+00:00 Marcin Junczys-Dowmunt junc...@amu.edu.pl:

> Hi,
>
> Try measuring the differences again after a full system reboot (a fresh
> reboot before each measurement) or after purging the OS read/write caches.
> Your phrase tables are most likely cached, which means they are in fact in
> memory.
>
> Best,
> Marcin
>
> On 11.03.2015 at 19:31, Jesús González Rubio wrote:
>
>> Hi,
>>
>> I'm obtaining some unintuitive timing results when using compact phrase
>> tables. The average translation time per sentence is much higher than with
>> gzip'ed phrase tables. Particularly important is the difference in the
>> time required to collect the options. This table summarizes the timings
>> (in seconds):
>>
>>                  Compact             Gzip'ed
>>                  on-disk  in-memory
>>  Init:           5.9      6.3        1882.8
>>  Per-sentence:
>>  - Collect:      5.9      5.8        0.2
>>  - Search:       1.6      1.6        3.3
>>
>> The results were computed using Moses v2.1 with a single thread (-th 1),
>> but I've seen similar results with the pre-compiled binary for Moses v3.0.
>> The model comprises two phrase tables (~2G and ~3M), two lexicalized
>> reordering tables (~700M and ~1M) and two language models (~31G and
>> ~38M). You can see the exact configuration in the attached moses.ini file.
>>
>> Interestingly, there is virtually no difference for the compact table
>> between the on-disk and in-memory options. Additionally, timings were
>> higher for the initial sentences in both cases, which I think should not
>> happen with the in-memory option. Could it be that the in-memory option
>> of the compact tables (-minpht-memory -minlexr-memory) is not working
>> properly?
>>
>> Cheers.
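Marcin's suggestion to purge the OS caches before each measurement can be scripted. A minimal sketch for Linux, assuming root access is available for writing to /proc/sys/vm/drop_caches (the fallback branch only prints the command):

```shell
# Flush the page cache so both phrase-table formats are timed from disk
# rather than from RAM (run before each Moses timing run).
sync  # write dirty pages out first, so dropping the cache is safe
if [ -w /proc/sys/vm/drop_caches ]; then
  # 3 = free page cache + dentries + inodes
  echo 3 > /proc/sys/vm/drop_caches
  echo "caches dropped"
else
  echo "not root; run: sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'"
fi
```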
-- Jesús
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
Re: [Moses-support] In-memory loading of compact phrases
2015-03-11 19:21 GMT+00:00 Marcin Junczys-Dowmunt junc...@amu.edu.pl:

> Maybe someone will correct me, but if I am not wrong, the gzip'ed version
> already calculates the future score while loading (i.e. each phrase is
> scored by the language model at load time). The compact phrase table
> cannot do this during loading and has to do it on-line, which will be the
> reason for the slow speed. I suppose your phrase table has not been
> pruned? Then, for instance, function words like "the" can have hundreds of
> thousands of counterparts that need to be scored by the LM during
> collection.

That makes sense.

> You can limit your phrase table using Barry's prunePhraseTable tool. With
> it you can restrict the table to, say, the 20 best phrases per source
> phrase (this corresponds to the ttable limit) and only score these 20
> phrases during collection. That should be orders of magnitude faster.

OK.

> Best,
> Marcin
>
> On 11.03.2015 at 20:12, Jesús González Rubio wrote:
>
>> Thanks for the quick response, I will try as you suggest. Nevertheless,
>> my main concern is the time spent collecting options. Is the difference
>> observed with respect to the gzip'ed tables normal? The tables being
>> cached, shouldn't they be closer?
>>
>> [earlier messages in the thread trimmed]

-- Jesús
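For reference, the decode-time counterpart of the pruning Marcin describes (the ttable limit) is set per phrase table in moses.ini. This is a sketch only: the feature name and other arguments must match your own configuration (path and num-features below are placeholders), and unlike prunePhraseTable it does not shrink the table on disk:

```ini
[feature]
; table-limit caps the translation options collected (and LM-scored)
; per source phrase
PhraseDictionaryCompact name=TranslationModel0 table-limit=20 num-features=4 path=/path/to/phrase-table.minphr
```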
[Moses-support] In-memory loading of compact phrases
Hi,

I'm obtaining some unintuitive timing results when using compact phrase tables. The average translation time per sentence is much higher than with gzip'ed phrase tables. Particularly important is the difference in the time required to collect the options. This table summarizes the timings (in seconds):

                 Compact             Gzip'ed
                 on-disk  in-memory
 Init:           5.9      6.3        1882.8
 Per-sentence:
 - Collect:      5.9      5.8        0.2
 - Search:       1.6      1.6        3.3

The results were computed using Moses v2.1 with a single thread (-th 1), but I've seen similar results with the pre-compiled binary for Moses v3.0. The model comprises two phrase tables (~2G and ~3M), two lexicalized reordering tables (~700M and ~1M) and two language models (~31G and ~38M). You can see the exact configuration in the attached moses.ini file.

Interestingly, there is virtually no difference for the compact table between the on-disk and in-memory options. Additionally, timings were higher for the initial sentences in both cases, which I think should not happen with the in-memory option. Could it be that the in-memory option of the compact tables (-minpht-memory -minlexr-memory) is not working properly?

Cheers.

-- Jesús

Attachment: moses.ini (binary data)
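For anyone reproducing these timings: the in-memory variants are switched on with the two flags mentioned above. A sketch of the invocation, with hypothetical input and output file names:

```shell
# -minpht-memory / -minlexr-memory load the compact phrase table and the
# compact lexicalized reordering table fully into RAM at start-up.
moses -f moses.ini -th 1 -minpht-memory -minlexr-memory < dev.es > dev.hyp
```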
[Moses-support] Constrained decoding
Hi all,

I have some questions about the constrained decoding feature implemented in Moses (ConstrainedDecoding). What is the meaning of the 'max-unknowns' parameter? I understand 'max-unknowns' as something like the maximum edit distance allowed between the final translation and the reference, i.e. the maximum number of words in the final translation that are allowed to differ from the reference. Is this interpretation correct? Also, what is the interpretation of the 'negate' and 'soft' parameters?

Thanks in advance. Cheers.

-- Jesús
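For context, the feature is enabled with a line like the following in moses.ini. This is only a sketch: the parameter names are the ones asked about, the reference path is hypothetical, and the values shown are placeholders whose exact semantics are the subject of the question:

```ini
[feature]
; path points at the reference translations to constrain against;
; the max-unknowns / negate / soft values here are placeholders.
ConstrainedDecoding path=dev.reference.en max-unknowns=-1 negate=0 soft=0
```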
[Moses-support] Failed check on moses_chart
Dear Moses supporters,

I am experiencing some problems using the chart decoder implemented in Moses. Specifically, the decoder exits, without even loading the rule table, and outputs the following message:

$ cat ../data/dev.utf8.es | ~/bin/moses/bin/moses_chart -f tm-chart/model/moses.ini
. . .
Start loading text SCFG phrase table. Moses format : [2.795] seconds
Reading /home/jegonzalez/Escritorio/hierarchicalIMT/data/eu-tt2/es-en.utf8/tm-chart/model/rule-table.gz
5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
Check !fit failed in moses/Word.cpp:109
Aborted

A standard phrase-based model trained on the same corpora works perfectly fine. Any help would be greatly appreciated!

Regards.

-- Jesús
[Moses-support] Question about the format of search graphs generated by moses-chart
Hi,

I'm generating some translations using the -osg option of moses-chart and I am having some difficulty fully understanding the format in which the search hypergraph is output. Is there a description of the osg format available?

Cheers.

-- Jesús
Re: [Moses-support] Question about the format of search graphs generated by moses-chart
Thanks Christian. I have read the code of OutputSearchNode and it seems to be designed to write a word graph, not a hypergraph. Could it be that OutputSearchNode is the function called when the -osg option is passed to moses, and a different function is called for the same option of moses-chart? These are two example lines of the search graph generated by moses-chart, and they do not seem to match the format followed by OutputSearchNode:

0 228 i go :: pC=-2.69474, c=-4.29439 [1..2] [total=-4.29439] -0.868589, 0, -4.93649, -0.538997, -4.52822, -4.26469, -5.14167, 0.999896, 0
0 245-228 X go :0-0 : pC=-3.04609, c=-4.20798 [1..2] 1 [total=-4.70939] -0.868589, 0, -4.93649, -4.14475, -4.52822, -3.73385, -5.14167, 1.99979, 0

Cheers.

2013/1/24 Christian Buck cb...@lantis.de:

> Hi,
>
> I am not aware of updated documentation on this. Your best chance is
> probably to read through void OutputSearchNode in moses/src/Manager.cpp,
> which is pretty readable.
>
> cheers, Christian
>
> On 24/01/13 17:24, Jesús González Rubio wrote:
>
>> Hi,
>>
>> I'm generating some translations using the -osg option of moses-chart
>> and I am having some difficulty fully understanding the format in which
>> the search hypergraph is output. Is there a description of the osg format
>> available?
>>
>> Cheers.

-- Jesús
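To make the comparison concrete, here is a hedged sketch that pulls the recognizable fields out of the two example lines with plain shell. The field interpretation (sentence id, hypothesis id, source span, total score) is my reading of the output, not official documentation:

```shell
# parse_osg: extract a few fields from moses-chart -osg lines (assumed
# layout: "<sent-id> <hyp-id> <target> ... [i..j] ... [total=<score>] ...").
parse_osg() {
  while read -r line; do
    sent=${line%% *}    # first token: sentence id
    rest=${line#* }
    hyp=${rest%% *}     # second token: hypothesis id
                        # ("245-228" seems to mark recombination)
    span=$(printf '%s\n' "$line" | sed -n 's/.*\(\[[0-9]*\.\.[0-9]*\]\).*/\1/p')
    total=$(printf '%s\n' "$line" | sed -n 's/.*\[total=\([^]]*\)\].*/\1/p')
    echo "sent=$sent hyp=$hyp span=$span total=$total"
  done
}

printf '%s\n' \
  '0 228 i go :: pC=-2.69474, c=-4.29439 [1..2] [total=-4.29439] -0.868589, 0' \
  '0 245-228 X go :0-0 : pC=-3.04609, c=-4.20798 [1..2] 1 [total=-4.70939] -0.868589, 0' \
  | parse_osg
```

This prints one summary line per hypothesis, which makes it easy to see that both example lines cover the same source span but carry different total scores.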
Re: [Moses-support] Incremental training for SMT
2011/10/6 HOANG Cong Duy Vu duyvu...@gmail.com:

> Hi all,
>
> I am working on the problem of developing an SMT system that can learn
> incrementally. The scenario is as follows:
> - A state-of-the-art SMT system tries to translate a source-language
>   sentence from users.
> - Users identify some translation errors in the translated sentence and
>   then give the correction.
> - The SMT system gets the correction and learns from it immediately.
> What I mean is whether an SMT system can learn the user corrections
> (without re-training) incrementally. Do you know any similar ideas or
> have any advice or suggestions? Thanks in advance!
>
> --
> Cheers,
> Vu

Hi Vu,

You can try searching for "interactive machine translation". For example, this paper covers the details of the online retraining of an MT system:

Online Learning for Interactive Statistical Machine Translation
aclweb.org/anthology/N/N10/N10-1079.pdf

Cheers

-- Jesús