Re: [Moses-support] In-memory loading of compact phrases

2015-03-12 Thread Jesús González Rubio
Thanks for the quick response, I will try as you suggest.

Nevertheless, my main concern is the time spent collecting options. Is the
difference observed with respect to the gzip'ed tables normal? With the
tables cached, shouldn't the timings be closer?

2015-03-11 18:52 GMT+00:00 Marcin Junczys-Dowmunt junc...@amu.edu.pl:

  Hi,
 Try measuring the differences again after a full system reboot (a fresh
 reboot before each measurement) or after purging the OS read/write caches. Your
 phrase tables are most likely cached, which means they are in fact in
 memory.
 Best,
 Marcin
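
To make the cache-purging step concrete, on a Linux host it can be sketched as
follows (requires root; this is a sketch under the assumption of a standard
Linux /proc interface):

```shell
# Flush dirty pages to disk, then drop the page cache, dentries and inodes,
# so that the next decoder run reads the phrase tables cold from disk.
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches
```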

 On 11.03.2015 at 19:31, Jesús González Rubio wrote:

 Hi,

  I'm obtaining some unintuitive timing results when using compact phrase
 tables. The average translation time per sentence is much higher for them
 in comparison to using gzip'ed phrase tables. Particularly important is the
 difference in time required to collect the options. This table summarizes
 the timings (in seconds):

                 Compact              Gzip'ed
                 on-disk   in-memory
 Init:             5.9        6.3      1882.8
 Per-sentence:
  - Collect:       5.9        5.8         0.2
  - Search:        1.6        1.6         3.3

 Results in the table were computed using Moses v2.1 with a single
 thread (-th 1), but I've seen similar results using the pre-compiled binary
 for Moses v3.0. The model comprises two phrase tables (~2G and ~3M), two
 lexicalized reordering tables (~700M and ~1M) and two language models (~31G
 and ~38M). You can see the exact configuration in the attached moses.ini
 file.

 Interestingly, there is virtually no difference for the compact table
 between the on-disk and in-memory options. Additionally, timings were
 higher for the initial sentences in both cases, which I think should not
 happen with the in-memory option.

 Could it be that the in-memory option of the compact tables
 (-minpht-memory -minlexr-memory) is not working properly?

  Cheers.
 --
 Jesús


 ___
 Moses-support mailing list
 Moses-support@mit.edu
 http://mailman.mit.edu/mailman/listinfo/moses-support



 ___
 Moses-support mailing list
 Moses-support@mit.edu
 http://mailman.mit.edu/mailman/listinfo/moses-support




-- 
Jesús
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] In-memory loading of compact phrases

2015-03-11 Thread Jesús González Rubio
2015-03-11 19:21 GMT+00:00 Marcin Junczys-Dowmunt junc...@amu.edu.pl:

  Maybe someone will correct me, but if I am not wrong, the gzip'ed version
 already calculates the future score while loading (i.e. each phrase is
 scored by the language model). The compact phrase table cannot do this
 during loading and instead does it on-line; this is likely the reason for
 the slow speed. I suppose your phrase table has not been pruned? If so,
 function words like "the" can have hundreds of thousands of counterparts
 that need to be scored by the LM during collection.


That makes sense.

You can limit your phrase table using Barry's prunePhraseTable tool. With
 this you can limit it to, say, the 20 best phrases (corresponding to the
 ttable limit) and score only those 20 phrases during collection. That should
 be orders of magnitude faster.
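
A rough approximation of such pruning on a plain-text phrase table can be
sketched as follows (a sketch only: the real prunePhraseTable tool re-scores
entries with the full model, while this simply keeps the first 20 entries per
source phrase; file names are placeholders, and the table is assumed sorted by
source phrase, as Moses phrase tables are):

```shell
# Keep at most 20 target phrases per source phrase.
# Fields are separated by ' ||| '; the table is assumed sorted by field 1.
zcat phrase-table.gz \
  | awk -F' \\|\\|\\| ' '{ if ($1 != prev) { prev = $1; n = 0 }
                           if (++n <= 20) print }' \
  | gzip > phrase-table.pruned.gz
```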


OK.


 Best,
 Marcin

 On 11.03.2015 at 20:12, Jesús González Rubio wrote:

 Thanks for the quick response, I will try as you suggest.

 Nevertheless, my main concern is the time spent collecting options. Is
 the difference observed with respect to the gzip'ed tables normal? With the
 tables cached, shouldn't the timings be closer?

 2015-03-11 18:52 GMT+00:00 Marcin Junczys-Dowmunt junc...@amu.edu.pl:

  Hi,
 Try measuring the differences again after a full system reboot (a fresh
 reboot before each measurement) or after purging the OS read/write caches. Your
 phrase tables are most likely cached, which means they are in fact in
 memory.
 Best,
 Marcin

 On 11.03.2015 at 19:31, Jesús González Rubio wrote:

  Hi,

  I'm obtaining some unintuitive timing results when using compact phrase
 tables. The average translation time per sentence is much higher for them
 in comparison to using gzip'ed phrase tables. Particularly important is the
 difference in time required to collect the options. This table summarizes
 the timings (in seconds):

                 Compact              Gzip'ed
                 on-disk   in-memory
 Init:             5.9        6.3      1882.8
 Per-sentence:
  - Collect:       5.9        5.8         0.2
  - Search:        1.6        1.6         3.3

 Results in the table were computed using Moses v2.1 with a single
 thread (-th 1), but I've seen similar results using the pre-compiled binary
 for Moses v3.0. The model comprises two phrase tables (~2G and ~3M), two
 lexicalized reordering tables (~700M and ~1M) and two language models (~31G
 and ~38M). You can see the exact configuration in the attached moses.ini
 file.

 Interestingly, there is virtually no difference for the compact table
 between the on-disk and in-memory options. Additionally, timings were
 higher for the initial sentences in both cases, which I think should not
 happen with the in-memory option.

 Could it be that the in-memory option of the compact tables
 (-minpht-memory -minlexr-memory) is not working properly?

  Cheers.
 --
 Jesús


  ___
 Moses-support mailing list
 Moses-support@mit.edu
 http://mailman.mit.edu/mailman/listinfo/moses-support



 ___
 Moses-support mailing list
 Moses-support@mit.edu
 http://mailman.mit.edu/mailman/listinfo/moses-support






-- 
Jesús
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] In-memory loading of compact phrases

2015-03-11 Thread Jesús González Rubio
Hi,

I'm obtaining some unintuitive timing results when using compact phrase
tables. The average translation time per sentence is much higher for them
in comparison to using gzip'ed phrase tables. Particularly important is the
difference in time required to collect the options. This table summarizes
the timings (in seconds):

               Compact              Gzip'ed
               on-disk   in-memory
Init:            5.9        6.3      1882.8
Per-sentence:
 - Collect:      5.9        5.8         0.2
 - Search:       1.6        1.6         3.3

Results in the table were computed using Moses v2.1 with a single thread
(-th 1), but I've seen similar results using the pre-compiled binary for
Moses v3.0. The model comprises two phrase tables (~2G and ~3M), two
lexicalized reordering tables (~700M and ~1M) and two language models (~31G
and ~38M). You can see the exact configuration in the attached moses.ini
file.

Interestingly, there is virtually no difference for the compact table
between the on-disk and in-memory options. Additionally, timings were
higher for the initial sentences in both cases, which I think should not
happen with the in-memory option.

Could it be that the in-memory option of the compact tables (-minpht-memory
-minlexr-memory) is not working properly?

Cheers.
-- 
Jesús


moses.ini
Description: Binary data
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] Constrained decoding

2015-01-20 Thread Jesús González Rubio
Hi all,

I have some questions about the constrained decoding feature implemented in
moses, (ConstrainedDecoding).

What is the meaning of the 'max-unknowns' parameter?

I understand 'max-unknowns' as something like the maximum edit distance
allowed between the final translation and the reference, i.e. the maximum
number of words in the final translation that are allowed to differ
from the reference. Is this interpretation correct?

Also, what is the interpretation of the 'negate' and 'soft' parameters?
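
For context, the feature being asked about is enabled via a line in moses.ini
along these lines (a sketch only: the reference path and parameter value are
placeholders, and the exact semantics of the parameters are precisely what
the questions above try to pin down):

```ini
# Hypothetical configuration; 'reference.en' is a placeholder file of
# reference translations, one per input sentence.
[feature]
ConstrainedDecoding path=reference.en max-unknowns=-1
```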

Thanks in advance.

Cheers.
-- 
Jesús
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] Failed check on moses_chart

2013-03-28 Thread Jesús González Rubio
Dear Moses supporters,

I am experiencing some problems using the chart decoder implemented in
Moses. Specifically, the decoder exits, without even loading the rule
table, and outputs the following message:

$ cat ../data/dev.utf8.es | ~/bin/moses/bin/moses_chart -f
tm-chart/model/moses.ini
.
.
.
Start loading text SCFG phrase table. Moses  format : [2.795] seconds
Reading
/home/jegonzalez/Escritorio/hierarchicalIMT/data/eu-tt2/es-en.utf8/tm-chart/model/rule-table.gz
5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
Check !fit failed in moses/Word.cpp:109
Aborted
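
One thing worth checking (an assumption, not a confirmed diagnosis): the
failed check is in the factor-parsing code of Word.cpp, and Moses uses '|' as
its factor separator, so stray '|' characters in the input or the rule table
can trip it. A quick scan of the input might look like this (the file name is
the one from the command above):

```shell
# Look for tokens containing a bare '|' in the input; such tokens are split
# into factors by Moses and can make the factor-count check fail.
grep -n '|' ../data/dev.utf8.es | head
```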

A standard phrase-based model trained on the same corpora works perfectly
fine.

Any help would be greatly appreciated!

Regards.
-- 
Jesús
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] Question about the format of search graphs generated by moses-chart

2013-01-24 Thread Jesús González Rubio
Hi,

I'm generating some translations using the -osg option of moses-chart and I
am having difficulty fully understanding the format in which the search
hypergraph is output. Is there a description of the -osg format
available?

Cheers.
-- 
Jesús
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Question about the format of search graphs generated by moses-chart

2013-01-24 Thread Jesús González Rubio
Thanks Christian.

I have read the code of OutputSearchNode and it seems to be designed to
write a word graph, not a hypergraph. Could it be that OutputSearchNode
is the function called when the -osg option is passed to moses, and a
different function is called for the same option of moses-chart?

Here are two example lines of the search graph generated by moses-chart;
they do not seem to match the format followed by OutputSearchNode.
0 228 i go :: pC=-2.69474, c=-4.29439 [1..2] [total=-4.29439] -0.868589,
0, -4.93649, -0.538997, -4.52822, -4.26469, -5.14167, 0.999896, 0
0 245-228 X go :0-0 : pC=-3.04609, c=-4.20798 [1..2] 1 [total=-4.70939]
-0.868589, 0, -4.93649, -4.14475, -4.52822, -3.73385, -5.14167, 1.99979,
0

Cheers.


2013/1/24 Christian Buck cb...@lantis.de

 Hi,

 I am not aware of updated documentation on this. Your best chance is
 probably to read through

 void OutputSearchNode

 in moses/src/Manager.cpp which is pretty readable.

 cheers,
 Christian

 On 24/01/13 17:24, Jesús González Rubio wrote:
  Hi,
 
  I'm generating some translations using the -osg option of moses-chart
  and I am having difficulty fully understanding the format in which the
  search hypergraph is output. Is there a description of the -osg
  format available?
 
  Cheers.
  --
  Jesús
 
 
  ___
  Moses-support mailing list
  Moses-support@mit.edu
  http://mailman.mit.edu/mailman/listinfo/moses-support
 
 ___
 Moses-support mailing list
 Moses-support@mit.edu
 http://mailman.mit.edu/mailman/listinfo/moses-support




-- 
Jesús
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Incremental training for SMT

2011-10-06 Thread Jesús González Rubio
2011/10/6 HOANG Cong Duy Vu duyvu...@gmail.com

 Hi all,

 I am working on the problem of developing an SMT system that can
 learn incrementally. The scenario is as follows:

 - A state-of-the-art SMT system translates a source-language
 sentence for a user.
 - The user identifies some translation errors in the translated sentence
 and gives a correction.
 - The SMT system gets the correction and learns from it immediately.

 What I mean is whether the SMT system can learn from the user corrections
 (without re-training) incrementally.

 Do you know any similar ideas or have any advice or suggestion?

 Thanks in advance!

 --
 Cheers,
 Vu

 ___
 Moses-support mailing list
 Moses-support@mit.edu
 http://mailman.mit.edu/mailman/listinfo/moses-support


Hi Vu,

You can try searching for "interactive machine translation"; for example, this
paper covers the details of the online retraining of an MT system:

Online Learning for Interactive Statistical Machine Translation
aclweb.org/anthology/N/N10/N10-1079.pdf

Cheers
-- 
Jesús
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support