[Moses-support] KenLM scoring of long target phrases

2016-04-19 Thread Evgeny Matusov
Hi,


my colleagues and I noticed the following in the KenLM code when a Hypo is 
evaluated with the LM:


https://github.com/moses-smt/mosesdecoder/blob/master/moses/LM/Ken.cpp#L203


Do we understand correctly that, because of this line, for phrases longer 
than the LM order N only the first N words are scored with the LM and the 
subsequent words are not scored? At least I don't see a call that adds their 
scores anywhere; they are just passed on to update the LM state in lines 
222-225.


Please clarify. It seems like a phrase should be scored by the LM completely, 
otherwise longer phrases which start with frequent n-grams but have unlikely 
word sequences afterwards are wrongly preferred. Also, longer phrases are 
preferred in general with such scoring.
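
To make the concern concrete, here is a toy sketch. It is not KenLM or Moses code: the n-gram table and the `limit_to_order` switch are hypothetical, and whether Ken.cpp really behaves like the truncated variant is exactly the open question above.

```python
from typing import Callable, Sequence, Tuple

LogProb = Callable[[Tuple[str, ...], str], float]


def score_phrase(words: Sequence[str], order: int, logprob: LogProb,
                 limit_to_order: bool = False) -> float:
    """Sum n-gram log-probabilities over a phrase.

    Each word is conditioned on up to order-1 preceding words of the phrase.
    With limit_to_order=True only the first `order` words contribute, which
    is the (hypothetical) truncated behaviour the question worries about.
    """
    end = min(len(words), order) if limit_to_order else len(words)
    total = 0.0
    for i in range(end):
        context = tuple(words[max(0, i - order + 1):i])
        total += logprob(context, words[i])
    return total


def toy_logprob(context: Tuple[str, ...], word: str) -> float:
    """Hypothetical bigram table; unseen events get a flat penalty."""
    table = {
        ((), "the"): -1.0,
        (("the",), "cat"): -1.5,
        (("cat",), "zzz"): -6.0,  # unlikely continuation
    }
    return table.get((context, word), -4.0)


phrase = ["the", "cat", "zzz"]
full = score_phrase(phrase, order=2, logprob=toy_logprob)
truncated = score_phrase(phrase, order=2, logprob=toy_logprob,
                         limit_to_order=True)
print(full, truncated)
```

In this toy model the truncated score ignores the unlikely third word, so the phrase looks much better than it should.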


Thanks,


Evgeny.

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] use placeholder with mosesserver

2015-11-17 Thread Evgeny Matusov
Hi active Moses developers,


As promised, I am working on adding the correct placeholder output to Moses in 
the server mode. Currently, passing the n-best size through xmlrpc is commented 
out. Any reason why this is the case? It might be convenient to run server 
queries with/without N-best output and with different N-best sizes without 
having to restart the server (currently, restarting is the only way to change 
these settings).
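
A client-side sketch of what a per-request N-best size could look like. The parameter keys `nbest` and `nbest-distinct`, the `translate` method name, and the URL are assumptions for illustration, not a confirmed description of the current server interface.

```python
import xmlrpc.client
from typing import Any, Dict, Optional


def build_translate_request(text: str, nbest: Optional[int] = None,
                            distinct: bool = False) -> Dict[str, Any]:
    """Build the parameter struct for one translation request.

    The keys "nbest" and "nbest-distinct" are assumed names.
    """
    params: Dict[str, Any] = {"text": text}
    if nbest is not None:
        if nbest < 1:
            raise ValueError("nbest must be a positive integer")
        params["nbest"] = nbest
        params["nbest-distinct"] = distinct
    return params


def translate(url: str, text: str, nbest: Optional[int] = None,
              distinct: bool = False) -> Any:
    """Send one request; `url` is a hypothetical endpoint such as
    http://localhost:8080/RPC2."""
    proxy = xmlrpc.client.ServerProxy(url)
    return proxy.translate(build_translate_request(text, nbest, distinct))
```

With such a parameter, two consecutive requests could ask for different N-best sizes without a server restart.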


If there are no objections, I will put this parameter back into the code.


Best,

Evgeny.



From: Evgeny Matusov
Sent: Wednesday, November 11, 2015 10:30 AM
To: ugerm...@inf.ed.ac.uk
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] use placeholder with mosesserver


Hi Uli,


thanks a lot! We will try to add some test cases for Mosesserver, including XML 
input with/without placeholders.


Best,

Evgeny.



From: Ulrich Germann 
Sent: Wednesday, November 11, 2015 2:58 AM
To: Evgeny Matusov
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] use placeholder with mosesserver

Hi all,

I've just pushed what I believe might address a few of the issues in this 
thread:

- the more fine-grained configuration options for request handling and queuing, 
server timeouts etc. (added in August due to a threading issue) have been 
transferred to the main moses executable.

- the server now pays attention to the xml-input option specified via json; the 
range of accepted values is the same as when specified on the command line. I 
have not written the xml-input handling and do not actively use it, so it may 
or may not work. I don't think there are any regression tests that test this 
right now. Reports from the trenches are welcome.

- mosesserver.cpp is deprecated. It is now merely a shell around the regular 
moses call with --server. I did not remove it from the code base entirely, as I 
assume that there's a plethora of setups out there that rely on the existence 
of mosesserver. What the wrapper does is add --server to the options and then 
call regular moses.

- anyone adding stuff to mosesserver.cpp from now on owes me a lifetime supply 
of the finest Laphroaig. Just send me a quarter cask every year for Burns Nicht 
for the rest of my life if you do. If I haven't pushed anything for two years, 
you may assume I'm dead.
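
The wrapper behaviour described above (add --server, then run regular moses) can be sketched as follows. The real wrapper is C++; this Python version, the binary name `moses`, and the choice not to duplicate an existing `--server` flag are assumptions for illustration.

```python
from typing import List


def wrapper_argv(argv: List[str], moses_binary: str = "moses") -> List[str]:
    """Return the command line a mosesserver-style wrapper would run.

    argv[0] (the wrapper's own name) is dropped and --server is appended
    unless the caller already passed it.
    """
    args = list(argv[1:])
    if "--server" not in args:
        args.append("--server")
    return [moses_binary] + args


cmd = wrapper_argv(["mosesserver", "-f", "moses.ini"])
print(cmd)
```

A real wrapper would then hand `cmd` to something like `os.execvp(cmd[0], cmd)`.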


- Uli

On Tue, Nov 10, 2015 at 2:58 PM, Ulrich Germann <ulrich.germ...@gmail.com> wrote:
Hi all,

mosesserver is deprecated and should not be used any more. I'll transfer the 
threading-related changes to the server implementation in the regular moses 
executable and let you know once I'm done so that other things (like 
passthrough) can be added. By the looks of it, the changes are fairly 
straightforward, so it shouldn't take long. However, I can't guarantee that the 
new server will do everything the old server did (or do it the same way).

It would be fantastic if a few people could design and contribute test cases so 
that we can do some regression testing for the server. Ideally a test case 
should provide:

- tiny models to work with (or we may be able to recycle some that already 
exist)
- sample input (json)
- expected output (json)
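
A minimal sketch of how such a test case could be checked, assuming each case is a pair of JSON documents (request and expected response). The `translate` callable stands in for a call to the server, and `fake_server` is a hypothetical stub used only to make the example runnable.

```python
import json
from typing import Callable


def run_case(input_json: str, expected_json: str,
             translate: Callable[[dict], dict]) -> bool:
    """Return True if the server response equals the expected output.

    Comparing parsed JSON (not raw strings) ignores key order and whitespace.
    """
    request = json.loads(input_json)
    expected = json.loads(expected_json)
    return translate(request) == expected


def fake_server(request: dict) -> dict:
    """Stub standing in for the real server: upper-cases the input text."""
    return {"text": request["text"].upper()}


ok = run_case('{"text": "hello"}', '{"text": "HELLO"}', fake_server)
print(ok)
```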

Cheers - Uli




Re: [Moses-support] use placeholder with mosesserver

2015-11-11 Thread Evgeny Matusov
Hi Uli,


thanks a lot! We will try to add some test cases for Mosesserver, including XML 
input with/without placeholders.


Best,

Evgeny.





Re: [Moses-support] use placeholder with mosesserver

2015-11-10 Thread Evgeny Matusov

Hi,

can any of the more active recent developers advise what is the latest stable 
mosesserver implementation?

It seems to be the one in moses/server, but the one in 
contrib/server/mosesserver.cpp has been updated in August of this year with an 
important fix related to multiple threads:

https://github.com/moses-smt/mosesdecoder/commit/3c682fa8b05af6bff1a09f420141795875cf9685
https://www.mail-archive.com/moses-support%40mit.edu/msg12875.html

As Gregor mentioned, we would like to share our fix so that Mosesserver 
correctly supports placeholders. I want to make sure that this is a fix for 
something that many people use without problems.

Thanks,
Evgeny.



From: moses-support-boun...@mit.edu  on behalf 
of moses-support-requ...@mit.edu 
Sent: Monday, November 9, 2015 6:02 PM
To: moses-support@mit.edu
Subject: Moses-support Digest, Vol 109, Issue 16



Today's Topics:

   1. Question about output alignment info (Marta Ruiz)
   2. Re: use placeholder with mosesserver (Leusch, Gregor)


--

Message: 1
Date: Mon, 9 Nov 2015 16:10:26 +0100
From: Marta Ruiz 
Subject: [Moses-support] Question about output alignment info
To: moses-support@mit.edu
Message-ID:

Content-Type: text/plain; charset="utf-8"

Hi all,

When I use the option "-alignment-output-file [file]", I get just a few
alignments. Most sentences are blank, except for some that have one
alignment...

best,
Marta




--
Marta Ruiz Costa-jussà
martaruizcostaju...@gmail.com
http://www.costa-jussa.com

--

Message: 2
Date: Mon, 9 Nov 2015 15:37:32 +
From: "Leusch, Gregor" 
Subject: Re: [Moses-support] use placeholder with mosesserver
To: Vito Mandorino , moses-support

Message-ID: 
Content-Type: text/plain; charset="utf-8"

Hi,

we saw the same issue a while ago in an older version of Moses. Mosesserver and 
moses use different routines to parse the input string; in particular, the code 
in mosesserver did not parse placeholder input correctly. It seems to me that 
this is fixed in the most recent version of mosesserver (though I have not 
tested this; I just looked at the code); in addition, our team is currently 
discussing whether it makes sense to make available our patches to the 
mosesserver code either on the version we are using, or on a more recent 
version, available end of this week.

Best,

Gregor




From: <moses-support-boun...@mit.edu> on behalf of Vito Mandorino 
<vito.mandor...@linguacustodia.com>
Date: Friday 6 November 2015 16:22
To: moses-support <moses-support@mit.edu>
Subject: [Moses-support] use placeholder with mosesserver

Dear all,

I have been unsuccessful so far in using the placeholder approach with 
mosesserver. The translated segments contain the placeholder token @num@ 
instead of numbers.
Do you know how to get the numbers in the output?
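
For context, a sketch of how numbers might be marked up on the input side so that the decoder can put them back. The `<ne translation=... entity=...>` markup is assumed here from the placeholder feature as I understand it, and the regular expression is a hypothetical choice; check both against your Moses version.

```python
import re
from xml.sax.saxutils import quoteattr

# Hypothetical number pattern: digits with optional . or , groups.
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def mark_numbers(sentence: str, placeholder: str = "@num@") -> str:
    """Replace each number by the placeholder token, keeping the original
    value in XML attributes (assumed markup)."""
    def repl(match: "re.Match[str]") -> str:
        value = quoteattr(match.group(0))  # adds quotes, escapes if needed
        return f"<ne translation={value} entity={value}>{placeholder}</ne>"
    return NUMBER.sub(repl, sentence)


marked = mark_numbers("you owe me 200 dollars .")
print(marked)
```

If the server parses this markup differently from the command-line decoder, the output would contain the bare @num@ token, as described above.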

Many thanks,

Vito Mandorino


--
M. Vito MANDORINO -- Chief Scientist



 The Translation Trustee

1, Place Charles de Gaulle, 78180 Montigny-le-Bretonneux

Tel : +33 1 30 44 04 23   Mobile : +33 6 84 65 68 89

Email :  
vito.mandor...@linguacustodia.com

Website :  www.linguacustodia.com - 
www.thetranslationtrustee.com 



[Moses-support] Dependencies in EMS/Experiment.perl

2015-06-19 Thread Evgeny Matusov
Hi,


to those of you using Experiment.perl for experiments, maybe you can help me 
solve the following problem:


I added a step to filter full segment overlap of evaluation and tuning data 
with the training data. This step removes from each CORPUS all sentences that 
are also found in the EVALUATION and TUNING sentences. Thus, one of the CORPUS 
steps depends on EVALUATION and TUNING.
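
The filtering step itself can be sketched as below. This is not the actual EMS step: the function name, the exact-match criterion on the source side, and the whitespace stripping are assumptions for illustration.

```python
from typing import Iterable, List, Tuple


def filter_overlap(corpus_pairs: Iterable[Tuple[str, str]],
                   held_out_sources: Iterable[str]) -> List[Tuple[str, str]]:
    """Drop training pairs whose source sentence also occurs in the
    evaluation or tuning data (exact match after stripping whitespace)."""
    held_out = {s.strip() for s in held_out_sources}
    return [(src, tgt) for src, tgt in corpus_pairs
            if src.strip() not in held_out]


training = [("das ist gut", "this is good"), ("hallo welt", "hello world")]
held_out = ["hallo welt\n"]
filtered = filter_overlap(training, held_out)
print(filtered)
```

The filter depends only on the union of the EVALUATION and TUNING sentences, so its output is unchanged when that union stays the same.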


Now I want to exchange the tuning corpus I am using, picking another one that 
was already declared in the EVALUATION section. The set against which the 
overlap is checked therefore does not change, so the training data does not 
need to be filtered again, and neither alignment training nor LM training nor 
anything else should be repeated; only the tuning step should restart. 
However, Experiment.perl is not smart enough to realize this. I tried adding 
"pass-if" or "ignore-if" to the filter-overlap step that I declared and 
setting a variable to pass it, but this did not help: all steps after it are 
still executed. Setting TRAINING:config to a valid moses.ini file prevents 
the alignment training from running, but not the LM training, nor (more 
importantly) the several cleaning/lowercasing steps that follow the overlap 
step for each training corpus.


Is there an easy way to block everything below tuning from being repeated, even 
if the tuning data changes?


Thanks,

Evgeny.






Re: [Moses-support] Memory leak when producing distinct N-best lists?

2009-11-16 Thread Evgeny Matusov
Hi,

the strange thing is that I don't see the leak when I run moses in valgrind. 
Valgrind makes the process run much slower, which might explain the problem.

What I see in valgrind is a stable usage of 7.5G of memory up until sentence 
130 (where I stopped the process, and valgrind reported a memory leak of only 
256 MB). This took about 3 hours to run. 

When running the process without valgrind, I reach 11GB on sentence 70 after 
about 5-10 minutes.



Best,
Evgeny.

P.S. The generally high memory usage is dominated by a 5.9 GB IRST LM.

On Monday 16 November 2009 13:13:47 Hieu Hoang wrote:
> Hi Evgeny
>
> I've tried to reproduce the mem leak with distinct n-best, I don't see a
> problem.
>
> are you able to run valgrind and see where the leak is coming from?
>
> Evgeny Matusov wrote:
> > Hi,
> >
> > I tried to run Moses to produce N-best lists with distinct hypotheses
> > (setting -nbest-list  500 distinct). Although I also
> > set "-use-persistent-cache false", the memory usage continues to grow
> > from sentence to sentence during translation. The memory consumption
> > drops only 2-3 times in the first 100 sentences as opposed to after every
> > sentence. After 100 to 150 sentences, the program dies with bad_alloc.
> >
> > Do you know what causes this problem?
> >
> > I run this on a task with about 700K sentence pairs (GALE-type data),
> > setting stack size to 100, stack diversity to 10 and
> > max-trans-opt-per-coverage to 15, and had no problems running translation
> > with single-best output only.
> >
> >
> > Thanks much,
> > Evgeny.
> >
> > P.S. I just found that this issue had been discussed and presumably fixed
> > sometime ago; however, I still have this problem with the latest svn
> > commit of Moses.



-- 
===
Dipl.-Inform. Evgeny Matusov
Senior Speech and Language Engineer
Applications Technology, Inc.
E-mail: ematu...@apptek.com
Tel. +49-241-939-19815
Fax. +49-241-939-19816
===




[Moses-support] Memory leak when producing distinct N-best lists?

2009-11-13 Thread Evgeny Matusov
Hi,

I tried to run Moses to produce N-best lists with distinct hypotheses 
(setting -nbest-list  500 distinct). Although I also 
set "-use-persistent-cache false", the memory usage continues to grow from 
sentence to sentence during translation. The memory consumption drops only 
2-3 times in the first 100 sentences as opposed to after every sentence. 
After 100 to 150 sentences, the program dies with bad_alloc.

Do you know what causes this problem?

I run this on a task with about 700K sentence pairs (GALE-type data), setting 
stack size to 100, stack diversity to 10 and max-trans-opt-per-coverage to 
15, and had no problems running translation with single-best output only.
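
For readers unfamiliar with the "distinct" option: the idea is to keep only the best-scoring entry per surface string. The sketch below illustrates that idea on a plain list; it is not how Moses extracts distinct N-best lists from its search graph, and the function name is hypothetical.

```python
from typing import Iterable, List, Tuple


def distinct_nbest(hypotheses: Iterable[Tuple[float, str]],
                   n: int) -> List[Tuple[float, str]]:
    """Return up to n hypotheses with distinct text, best score first.

    hypotheses: (score, text) pairs; a higher score is better.
    """
    if n <= 0:
        return []
    seen = set()
    result = []
    for score, text in sorted(hypotheses, key=lambda h: h[0], reverse=True):
        if text in seen:
            continue  # same surface string reached via another derivation
        seen.add(text)
        result.append((score, text))
        if len(result) == n:
            break
    return result


hyps = [(-1.0, "a b"), (-1.2, "a b"), (-1.5, "a c"), (-2.0, "b c")]
top2 = distinct_nbest(hyps, 2)
print(top2)
```

Producing 500 distinct entries can require visiting many more than 500 derivations, which is one plausible reason for the higher memory use compared with single-best output.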


Thanks much,
Evgeny.

P.S. I just found that this issue had been discussed and presumably fixed 
some time ago; however, I still have this problem with the latest svn commit 
of Moses.
