[Moses-support] zlib bug causing Moses training to fail

2014-01-10 Thread Barry Haddow
Hi

I've discovered that an apparent bug in zlib (specifically version 
1.2.3.4 installed in Ubuntu 12.04) can cause the Moses training pipeline 
to fail. Upgrading to the latest zlib (1.2.8) fixes the problem.

The problem appears when the extract files are being read in order to 
create the reordering tables or translation tables. For the reordering 
table, an error message like this

terminate called after throwing an instance of 'util::GZException'
  what():  util/read_compressed.cc:163 in virtual std::size_t util::{anonymous}::GZip::Read(void*, std::size_t, util::ReadCompressed) threw GZException'.
zlib encountered invalid distances set code -3

is printed, whilst for the translation table, the Moses scorer swallows 
the zlib error but eventually fails in 'consolidate' like this:

ERROR: source phrase does not match in line 289: 'clean .' != 'definitely recommend to have'

So far I haven't found any reports of this zlib bug, and it seems to be 
quite intermittent.

cheers - Barry
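
One way to narrow down this kind of failure is to check which zlib a process is linked against and whether the extract files decompress cleanly from start to end. The following is a minimal Python sketch, not part of Moses: the function names are made up for illustration, and it reports the zlib that Python itself is linked against, which may differ from the one the Moses binaries use.

```python
import gzip
import zlib


def zlib_runtime_version():
    """Return the version string of the zlib library loaded by this interpreter."""
    return zlib.ZLIB_RUNTIME_VERSION


def gzip_is_readable(path, chunk_size=1 << 20):
    """Decompress the whole file, discarding the output.

    Returns True if the file decompressed to the end without error,
    False if gzip/zlib reported corrupt or truncated data.
    """
    try:
        with gzip.open(path, "rb") as handle:
            # Read in chunks so large extract files do not fill memory.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True
```

If a file reads cleanly here under a newer zlib but the Moses tools still fail on it, that would point at the library rather than at the file.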


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Prereordering was: Re: Google Translate - technology paradigms behind it

2014-01-10 Thread Hieu Hoang

Not that I know of. If you create one, please check it in.

On 09/01/2014 15:49, Per Tunedal wrote:

Hi,
are there any tools to do prereordering with Moses? I suppose it has 
to be done both before training and translation. And that it 
presupposes tagging.

How much does prereordering improve translation?
Yours,
Per Tunedal
On Wed, Jan 8, 2014, at 22:42, Philipp Koehn wrote:

Hi,
Google lists its research papers here:
http://research.google.com/pubs/MachineTranslation.html
--- snip---
maybe hierarchical, with pre-reordering.
---snip ---
-phi


On Wed, Jan 8, 2014 at 8:47 PM, Ivan Dunđer ivandun...@gmail.com wrote:


Hi everybody,

is there any reading material (e.g. scientific articles etc.) on what
MT approaches Google Translate uses?
Sorry for off-topic.

Regards,
Ivan


Re: [Moses-support] Moses-support Digest, Vol 87, Issue 20

2014-01-10 Thread Andreas Søeborg Kirkedal
I know Jakob Elming at UNICPH did prereordering before training MT models
with Moses. He might be able to help.

-Andreas


Re: [Moses-support] static build of mosesdecoder

2014-01-10 Thread Hieu Hoang

Double-check that libz.a and librt.a exist in
   /usr/lib
or
   /usr/lib64
I compiled statically on SUSE Linux and it worked. My bjam command was:

./bjam --with-boost=/home/s0565741/workspace/boost/boost_1_54_0/ \
  --with-irstlm=/home/s0565741/workspace/irstlm/ \
  --with-cmph=/home/s0565741/workspace/cmph-2.0 \
  --with-dalm=/home/s0565741/workspace/github/DALM -j16 -static -a
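
The check for the static archives can be scripted. This is a small sketch in Python, assuming only the two directories and two archive names mentioned in this reply; the function name find_static_libs is hypothetical, and other distributions may keep static archives elsewhere.

```python
import os

# Directories named in the reply; adjust for your distribution.
DEFAULT_LIB_DIRS = ("/usr/lib", "/usr/lib64")


def find_static_libs(names=("libz.a", "librt.a"), lib_dirs=DEFAULT_LIB_DIRS):
    """Map each archive name to the first directory that contains it, or None."""
    found = {}
    for name in names:
        found[name] = None
        for directory in lib_dirs:
            if os.path.isfile(os.path.join(directory, name)):
                found[name] = directory
                break
    return found


if __name__ == "__main__":
    for name, directory in find_static_libs().items():
        print(name, "->", directory or "NOT FOUND")
```

Any archive reported as NOT FOUND is a likely cause of a "cannot find -lz" or "cannot find -lrt" error from the static link step.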





On 10/01/2014 09:56, Tak Kuya wrote:

Hi. How do you build mosesdecoder using static libraries like
http://www.statmt.org/moses/RELEASE-1.0/binaries/  ? To make things
simple, I use the master branch instead of Rel.1.0 and edit
mosesdecoder/Jamroot a little and build the decoder only.

I can build the normal shared version of moses without any problems
and usually use it (on CentOS 6.5 with boost 1.41). But I failed to
build moses with --static option. So I installed boost 1.41 to my home
directory. But I still couldn't build moses. It said cannot find -lz
and -lrt so I installed zlib-static and glibc-static with yum.

And... It failed again. Here are build.log and Jamroot I used. Could
anyone teach me how to solve the problem? Thank you.

Best regards,
Tak Kuya


[Moses-support] Moses kbmira/pro with non-Moses decoder

2014-01-10 Thread Kenneth Heafield
Hi,

What's the best way to use Moses optimization without using Moses
the decoder?  I've got my own system combination decoder (MEMT).
Currently, the scripts use Z-MERT but 1) MERT supports a limited
number of features and 2) Users keep finding bugs in Z-MERT.

I can run pro or kbmira directly, but is there documentation of
their file formats?  Another option is to use mert-moses.pl and try to
look like Moses.  How much does my decoder have to look like Moses for
that to work?  Keep in mind the input isn't a flat text file anymore.

Kenneth


[Moses-support] does mosesserver use -threads argument?

2014-01-10 Thread Wang Pidong
Hi all,

Does anyone know whether mosesserver uses the -threads argument to control the
number of threads?
As we know, mosesserver uses XMLRPC to set up an HTTP server, and I
found that XMLRPC has only two threading modes: serial (using the runOnce
method) or unlimited threads (using the run method), according to:
http://xmlrpc-c.sourceforge.net/doc/libxmlrpc_server_abyss++.html

I had a look at mosesserver.cpp, and I think mosesserver does not use
-threads to control the number of threads.
Is anyone familiar with this?
Thank you in advance!

Best wishes!

-- 
Wang Pidong

Department of Computer Science
School of Computing
National University of Singapore


Re: [Moses-support] does mosesserver use -threads argument?

2014-01-10 Thread Barry Haddow
Hi Wang

Yes, you're right. mosesserver uses the threading model from xmlrpc,  
which is either a thread per connection, or single threaded.

cheers - Barry
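
For readers unfamiliar with the two models, here is an analogy using Python's standard library. It is not xmlrpc-c and not how mosesserver is written; the class and function names and the toy "translate" method are invented for illustration. The serial server handles one request at a time, while the mixin variant starts a new thread for every request with no upper limit.

```python
from socketserver import ThreadingMixIn
from xmlrpc.server import SimpleXMLRPCServer


class ThreadPerRequestServer(ThreadingMixIn, SimpleXMLRPCServer):
    """Starts a new thread for each incoming request; nothing caps the count."""

    daemon_threads = True


def make_server(threaded, host="127.0.0.1", port=0):
    """Build a serial or thread-per-request XML-RPC server.

    Port 0 asks the operating system for any free port.
    """
    server_class = ThreadPerRequestServer if threaded else SimpleXMLRPCServer
    server = server_class((host, port), logRequests=False)
    # Toy stand-in for a translation call.
    server.register_function(lambda text: text.upper(), "translate")
    return server
```

In neither model is there a place where a fixed thread count would take effect, which matches the observation about -threads.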


Re: [Moses-support] Moses kbmira/pro with non-Moses decoder

2014-01-10 Thread Barry Haddow
Hi Kenneth

If your decoder can output the n-best list in the same format as
Moses, then you can run the Moses extractor (to get BLEU sufficient
statistics and features from the n-best list), then the optimiser. For PRO
this means running the Moses pro binary, followed by MegaM. For k-best
MIRA, I think you just have to run the kbmira binary. You should be
able to see what arguments to use by looking at the output of a
mert-moses.pl run, as it reports the commands as it runs them.
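
As a rough illustration of what "the same format as Moses" involves, the sketch below writes one n-best entry using the |||-separated layout of sentence id, hypothesis, feature scores and total score. The helper name and the feature labels (LM0, TM0) are hypothetical, and the labelling of feature scores has differed between Moses versions, so compare against a real n-best list from your own Moses build before relying on it.

```python
def format_nbest_line(sentence_id, hypothesis, features, total_score):
    """Format one hypothesis as 'id ||| text ||| Name= v1 v2 ... ||| total'.

    features is a list of (label, [scores]) pairs; labels are illustrative.
    Scores are printed with up to six significant digits.
    """
    feature_text = " ".join(
        "%s= %s" % (label, " ".join("%g" % score for score in scores))
        for label, scores in features
    )
    return "%d ||| %s ||| %s ||| %g" % (
        sentence_id,
        hypothesis,
        feature_text,
        total_score,
    )


if __name__ == "__main__":
    print(format_nbest_line(0, "das ist gut",
                            [("LM0", [-12.5]), ("TM0", [-1.5, -2.25])],
                            -4.75))
```

The order and number of feature scores have to stay the same across all hypotheses and iterations, since the optimiser's weights are matched to them by position.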

Each of the optimisers outputs the new weight set, which you then have  
to parse and send back to your decoder. I don't think these formats  
are documented, but they should be clear enough.

If you want to make your own decoder look like Moses, then
outputting the n-best list is probably straightforward, since you have to
do that anyway. The hassle is that mert-moses.pl will expect your
decoder to understand all the Moses command line options for
specifying new weights.

cheers - Barry


[Moses-support] How to Improve the translation

2014-01-10 Thread Asad A.Malik
Hi All,

I've developed an Urdu-to-English SMT system using Moses, and it currently
gives a BLEU score of 8. I would now like to improve its translations so
that it achieves a higher BLEU score.
 
Regards 


Asad A.Malik