Re: [Moses-support] Moses-support Digest, Vol 109, Issue 19

2015-11-12 Thread Barry Haddow
Hi Tomasz, the mosesserver is just the decoder, so it doesn't do any of the pre- and post-processing steps that you also need. In particular, it does not do tokenisation. You need to send it tokenised text and then de-tokenise the output. Cheers - Barry On 12/11/15 13:40, Tomasz Gawryl wrote:
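For anyone following the thread, here is a rough sketch of the workflow Barry describes: tokenise, send to the server, de-tokenise the result. It is only an illustration under some assumptions. The `tokenize` and `detokenize` functions are naive regex stand-ins written for this example, not the Moses tokenizer.perl / detokenizer.perl scripts, and the URL `http://localhost:8080/RPC2` is assumed, so adjust it to wherever your mosesserver is listening. The XML-RPC call uses the `translate` method with a `text` field.

```python
import re
import xmlrpc.client


def tokenize(text):
    """Naive stand-in tokeniser (NOT tokenizer.perl): split punctuation from words."""
    return " ".join(re.findall(r"\w+|[^\w\s]", text))


def detokenize(text):
    """Naive stand-in detokeniser (NOT detokenizer.perl): reattach punctuation."""
    text = re.sub(r"\s+([.,!?;:)\]])", r"\1", text)  # no space before closing marks
    text = re.sub(r"([(\[])\s+", r"\1", text)        # no space after opening brackets
    return text.strip()


def mosesserver_translate(tokenised, url="http://localhost:8080/RPC2"):
    """Send already-tokenised text to a running mosesserver (URL is an assumption)."""
    proxy = xmlrpc.client.ServerProxy(url)
    return proxy.translate({"text": tokenised})["text"]


def translate_segment(segment, translate_fn=mosesserver_translate):
    """Tokenise, translate with the given function, then de-tokenise."""
    return detokenize(translate_fn(tokenize(segment)))
```

Passing the translation function as an argument keeps the pre- and post-processing testable without a server, e.g. `translate_segment("Hello, world!", lambda s: s)` exercises just the tokenise/de-tokenise round trip. In a real setup you would call the Moses scripts (or an equivalent) in place of the two stand-in functions, and add truecasing if your model expects it.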

Re: [Moses-support] Moses-support Digest, Vol 109, Issue 19

2015-11-12 Thread Tomasz Gawryl
Hi Ulrich, I have a question about Moses server too. I'm testing it as a wrapper for Across server to check pre-translation possibilities. It generally works, but there is one problem. Input segments are translated without tokenization, so every word next to a special character (for example `this

Re: [Moses-support] Moses-support Digest, Vol 109, Issue 19

2015-11-12 Thread Panos Kanavos
Hi Barry, have there ever been any thoughts about implementing tokenization/detokenization directly in Moses? I suppose this would take some work, as Moses would have to become language-aware, but I can only see advantages from this. Besides, Moses is a language tool so these concepts shouldn't be so

Re: [Moses-support] Moses-support Digest, Vol 109, Issue 19

2015-11-12 Thread Hieu Hoang
There have been thoughts. There is a C++ tokenizer in contrib/c++tokenizer; it compiles into a library file, ready for integration. The last time I checked, it gave a slightly worse BLEU. Not much, but consistent. If anyone wants to carry on with it, they're welcome to. Hieu Hoang

Re: [Moses-support] Moses-support Digest, Vol 109, Issue 19

2015-11-12 Thread Panos Kanavos
Thanks for the info, Hieu, didn't know that :) I'll try it sometime. Best, Panos On 12/11/2015 4:41 PM, Hieu Hoang wrote: there has been thoughts. There is a c++ tokenizer in contrib/c++tokenizer it compiles into a library file, ready for integration. The last time i checked, it gave a

Re: [Moses-support] Moses-support Digest, Vol 109, Issue 19

2015-11-12 Thread Panos Kanavos
Hi Dingyuan, I was actually thinking about implementing the logic and rules from the Perl scripts, which seem to do the job, in a separate library (which thankfully exists, as Hieu informed us :)). Asking an end user to add a programming layer themselves in a seemingly straightforward process is

Re: [Moses-support] Moses-support Digest, Vol 109, Issue 19

2015-11-12 Thread Philipp Koehn
Hi, there are a lot of different pre- and post-processing steps that you may want to apply for any given language pair, so it makes sense to keep them out of the decoder. If you are interested in a server implementation that integrates tokenization, truecasing, etc., check out Christian Buck's