[Moses-support] Slow downloads for CCaligned data sets on statmt.org

2021-01-28 Thread Mathias Müller
Dear all

Downloads from http://www.statmt.org/cc-aligned/ are currently very slow (a
1 GB file takes hours). We checked from different European countries and on
different machines.

Is there anyone I can contact regarding this problem? Is this slow download 
speed expected?

Thanks
Mathias

—

Mathias Müller
AND-2-20
Institute of Computational Linguistics
University of Zurich
Switzerland
+41 44 635 75 81
mmuel...@cl.uzh.ch

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Tool for aligning corpora

2018-04-20 Thread Mathias Müller
Hi Miguel

Two tools you can try are

Hunalign: https://github.com/danielvarga/hunalign
Bleualign: https://github.com/rsennrich/Bleualign 


I don’t know what exactly the effect of wildly different sentence lengths is 
though.

Regards
Mathias

> On 20 Apr 2018, at 09:24, Miguel Domingo  wrote:
> 
> Good morning,
> 
> I have two documents which have the same text (in different languages) but 
> different structure (one language was written using very short sentences 
> while the other was written using longer sentences). Does anybody know of a 
> tool with which to align the sentences to obtain a parallel corpus suitable 
> for MT? (So far I've tried Gargantua, but it's deleting most of the text.)
> 
> Thanks in advance,
> 
> Miguel
> 
> 


Re: [Moses-support] detecting if a translation is machine or human translation

2018-04-06 Thread Mathias Müller
Hi Ryan

My two cents:

First of all, a way of detecting Google-translated text might be to take the 
original text and run it through Google Translate (if they have access to an 
API, I mean). Then see whether the allegedly machine-translated text is similar 
to that output.
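
A minimal sketch of that comparison in Python, assuming you already have the MT system's output for the same source sentence. The function names and the 0.9 threshold are illustrative choices of mine, not tuned values, and token overlap is only one possible similarity measure:

```python
from difflib import SequenceMatcher


def similarity(candidate, reference_mt):
    """Return a ratio between 0 and 1 for how similar two token sequences are."""
    candidate_tokens = candidate.lower().split()
    reference_tokens = reference_mt.lower().split()
    return SequenceMatcher(None, candidate_tokens, reference_tokens).ratio()


def looks_machine_translated(candidate, reference_mt, threshold=0.9):
    """Flag a submission that is nearly identical to the MT output."""
    # The threshold is an arbitrary illustrative value.
    return similarity(candidate, reference_mt) >= threshold
```

A submission that matches the MT output almost token for token would be flagged; a freely worded human translation most likely would not.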

Here is another pointer to literature for you:

http://www.aclweb.org/anthology/P14-2048 


Regards
Mathias

> On 6 Apr 2018, at 04:14, Ryan Coughlin  wrote:
> 
> Hi all,
> 
>   Hope you all had a happy Easter and are having a good spring (or autumn).
> 
>   One of the admins from Open Subtitles asked me if I knew of a way to detect 
> if a translation was machine or human translation. The website 
> seems to have a lot of submissions that are simply Google translated, and 
> they're looking for a script to flag these submissions. Do you have any ideas 
> how they might go about this?
> 
>   I was thinking of running the SL against a few of the popular SMT programs, 
> and seeing if the BLEU scores were too similar, but the method seems too 
> random and computationally heavy. I'd appreciate any thoughts.
> 
> thank you,
> Ryan


Re: [Moses-support] Web Based Translation, Moses2 Production

2018-01-18 Thread Mathias Müller
Hi,

What exactly do you mean by deploying to a web app? Moses itself does not 
include a web app, but any web app can request translations from Moses.

For a web app you would certainly want to use Moses in server mode. I don’t 
know if the server works with Moses2 though. (Hieu?)

Did you see this page? http://www.statmt.org/moses/?n=Advanced.Moses

If you have a web app, it should make XML-RPC requests to a Moses server 
running on a port that is accessible.
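
As a rough sketch of what the web app's backend would do, here is a small Python 3 client. It assumes the server answers with a struct that has a "text" field, and the URL in the commented usage lines is only an example:

```python
import xmlrpc.client


def translate(proxy, sentence):
    """Send one sentence to a Moses server and return the translated text."""
    response = proxy.translate({"text": sentence})
    return response["text"]


# Usage, assuming a Moses server is listening on localhost, port 8080:
# proxy = xmlrpc.client.ServerProxy("http://localhost:8080/RPC2")
# print(translate(proxy, "das ist ein kleines haus"))
```

The web framework itself does not matter here; any request handler can call a function like this one.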

Regards,
Mathias

> On 18 Jan 2018, at 18:17, Alain Patience Mizero  
> wrote:
> 
> Hi
> 
> I have a small sample of Moses2 for a specific field (translating medical 
> communication). How do I deploy a complete Moses2 from a server (Ubuntu) to a 
> web app to be used by visitors on the web page? Thank you for any input or 
> recommendation.
> 
> Alain Patience Mizero
> Professional Web Developer and IT Specialist
> patience.miz...@gmail.com .
> 1-425-200-5112 
> 


Re: [Moses-support] writing a wrapper for the Moses decoder

2017-12-22 Thread Mathias Müller
Hi Ryan

(Did you see Tom’s message?)

Here is what I meant by writing a wrapper.

The Moses decoder can read from STDIN, so this will translate:

echo "this is a test" | moses -f moses.ini

Now, if you have a script that properly segments a Thai input sentence:

echo "notproperlysegmentedThaisentence" | segment.sh
properly segmented Thai sentence

you can plug it in the middle:

echo "notproperlysegmentedThaisentence" | segment.sh | moses -f moses.ini

For English-Thai, you would need a script that undoes the segmentation (i.e. 
restores the proper use of spacing, thanks Tom for explaining this) and that 
can also read from STDIN and write to STDOUT:

echo "this is a test" | moses -f moses.ini | unsegment.sh

For instance, if the task were just to remove spaces:

echo "this is a test" | moses -f moses.ini | sed 's/ //g'
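
The same chain can also be driven from a small Python script instead of the shell. This is only a sketch: run_pipeline is a name I made up, and the commands in the commented usage lines are the ones from the examples above.

```python
import subprocess


def run_pipeline(text, stages):
    """Feed text through each command in turn, like a shell pipeline."""
    data = text
    for argv in stages:
        # Each stage reads from STDIN and writes to STDOUT.
        completed = subprocess.run(
            argv, input=data, capture_output=True, text=True, check=True
        )
        data = completed.stdout
    return data


# Usage with the commands from this message:
# stages = [["./segment.sh"], ["moses", "-f", "moses.ini"]]
# print(run_pipeline("notproperlysegmentedThaisentence\n", stages))
```

Note that this starts the decoder once per call, so the models are loaded again every time, just as in the shell version.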

Regards!
Mathias

> On 22 Dec 2017, at 02:26, Ryan Coughlin  wrote:
> 
> Hi all,
> 
>   Mathias recommended to me that I should write a lite wrapper for the Moses 
> decoder. Is anyone aware of any documentation for doing such a thing? I'm not 
> able to find it with any kind of search.
> 
> thank you,
> Ryan


Re: [Moses-support] handling no-space languages in the decoder

2017-12-18 Thread Mathias Müller
Hi Ryan

Conceptually, the easiest way is to regard segmentation as a preprocessing (and 
postprocessing) step that the core model has nothing to do with. You should not 
bother to modify the decoder itself in this case.

You will need a light wrapper around the Moses decoder. If you have a way to 
segment the training data, you can apply the same segmentation right before 
translation for Thai-English, and undo it right after translation for 
English-Thai. A simple shell script is enough for this.

I suspect that removing spaces is even less of a problem.
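
To illustrate how small these two steps can be, here is a toy sketch in Python. It uses greedy longest matching against a word list, which is only one of several ways to segment. The vocabulary and the function names are made up for the example, and a real Thai segmenter would take the place of segment:

```python
def segment(text, vocabulary):
    """Greedy longest-match segmentation: put spaces between known words."""
    longest = max((len(word) for word in vocabulary), default=1)
    tokens = []
    position = 0
    while position < len(text):
        # Try the longest possible word first, then shorter ones.
        for length in range(min(longest, len(text) - position), 0, -1):
            piece = text[position:position + length]
            if piece in vocabulary or length == 1:
                # Unknown single characters are passed through as tokens.
                tokens.append(piece)
                position += length
                break
    return " ".join(tokens)


def unsegment(text):
    """Undo segmentation by removing all whitespace."""
    return "".join(text.split())
```

Here segment would run before decoding for Thai-English, and unsegment after decoding for English-Thai.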

Regards
Mathias

> On 17 Dec 2017, at 13:08, Ryan Coughlin  wrote:
> 
> Hi all,
> 
>   I'm trying to use Moses to handle Thai-English translation. As far as I 
> know, this never has been done.
> 
>   Thai is a language without spacing between words. Running a 
> word-segmentation script to put spaces in between words is rather trivial. 
> When training, I've pre-segmented the sentences with spaces between the words 
> and the training seems to go OK.
> 
>   My problem is with the decoder. Is there a way to modify it so that a Thai 
> sentence without spaces will be segmented to a sentence with spaces and then 
> decoded to a proper English sentence. And the reverse would be an English 
> sentence would be input and the Thai no space sentence would be output. Does 
> that make sense? Sorry for the noob question.
> 
>   Thank you for any and all help that you may give me.
> 
> take care,
> Ryan


Re: [Moses-support] a tool for extracting specific terms from the corpora

2017-07-31 Thread Mathias Müller
Hi Mariusz

Sorry for the delay.

If your problem is so dynamic that it cannot be described with rules, then you 
cannot extract such a list of terms automatically.

A semi-automatic method would be to define rules with low precision and high 
recall. This gets you an overly long list of terms that includes false 
positives. Then go through this list manually, e.g. by looking at each term in 
its sentence context. Inspecting the data in this way might even suggest 
patterns you did not see before.

Another option is to still extract terms only automatically, with rules that 
work most of the time (probably more precision-oriented rules) and live with 
the margin of error.
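
Both variants can be sketched in a few lines of Python. The regular expression and the filter below are illustrative guesses on my part, not rules derived from your data: candidate_terms is the high-recall step that keeps the sentence context for manual review, and precision_filter is one possible stricter rule.

```python
import re
from collections import defaultdict

# A run of one or more capitalized tokens, e.g. "Acoustic Output".
CAP_RUN = re.compile(r"\b[A-Z][\w-]*(?:\s+[A-Z][\w-]*)*")


def candidate_terms(sentences):
    """High recall: map each capitalized run to (sentence, is_initial) pairs."""
    occurrences = defaultdict(list)
    for sentence in sentences:
        for match in CAP_RUN.finditer(sentence):
            occurrences[match.group(0)].append((sentence, match.start() == 0))
    return dict(occurrences)


def precision_filter(occurrences):
    """Higher precision: drop single words seen only at the sentence start."""
    kept = []
    for term, hits in occurrences.items():
        multiword = len(term.split()) > 1
        mid_sentence = any(not initial for _, initial in hits)
        if multiword or mid_sentence:
            kept.append(term)
    return sorted(kept)
```

On your two example sentences, the first step also collects "Acoustic" and "Each", while the filter keeps only "Acoustic Output" and "TX".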

If the terms to be extracted are a finite set (i.e. one that can be enumerated) 
that changes infrequently, consider taking the time to simply list all of the 
terms, for highest precision.

(We still don’t know what you will use the exported list for. Intended use also 
dictates the approach to a certain extent.)

Regards
Mathias

> On 4 Jul 2017, at 10:41, Mariusz Hawryłkiewicz 
> <mariusz.hawrylkiew...@gmail.com> wrote:
> 
> Hi Mathias, thank you for getting back - let me give you an example from a 
> monolingual EN corpora:
> 
> Acoustic measurement precision and uncertainty.
> Each press of the Acoustic Output – key decreases the transmission power 
> setting (TX) displayed in the monitor display.
> 
> In the first sentence the word Acoustic should not be exported. In the second 
> sentence Acoustic Output should.
> Now I have written a program in Java that exports all the terms or group of 
> terms with first capital letter, but this obviously includes the words like 
> from the first example and it should not.
> 
> The purpose is that the proper names only should be exported to a separate 
> file.
> 
> Best regards
> Mariusz
> 
> 
> 
> 2017-07-04 10:02 GMT+02:00 Mathias Müller <mmuel...@ifi.uzh.ch>:
> Hi Mariusz
> 
> What do you mean by “extracting” this content? What do you need the list of 
> proper names for? What are the languages involved?
> 
> Regards,
> Mathias
> 
> —
> 
> Mathias Müller
> AND-2-20
> Institute of Computational Linguistics
> University of Zurich
> Switzerland
> +41 44 635 75 81
> mmuel...@cl.uzh.ch
>> On 4 Jul 2017, at 09:39, Mariusz Hawryłkiewicz 
>> <mariusz.hawrylkiew...@gmail.com> wrote:
>> 
>> Dear all,
>> 
>> I have been searching for the most efficient way to extract untranslatable 
>> content from the corpora that always begin from the capital letter (product 
>> names etc.), the problem is that all the segments begin with the capital 
>> letter and what's obvious, the sentence may also begin with the 
>> untranslatable content (product name) :-). 
>> 
>> I want to avoid using common dictionaries to eliminate common words.
>> 
>> Would you have any other suggestions?
>> 
>> Thank you very much!
>> Mariusz
> 
> 


—

Mathias Müller
AND-2-20
Institute of Computational Linguistics
University of Zurich
Switzerland
+41 44 635 75 81
mathias.muel...@uzh.ch



Re: [Moses-support] Eliminating load times for MOSES phrase table, language model and reordering model

2017-04-06 Thread Mathias Müller

Hi Roko

Apart from compact phrase tables, which you should try:

You could start several Moses servers on different ports, with different 
decoding parameters. Would that not be feasible for you?
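
A small Python sketch of how the start-up commands for such servers could be generated, one port per parameter set. The helper name is mine, and the "-b" values in the commented usage lines are placeholders for whatever decoding parameters you want to explore:

```python
def server_commands(ini_path, settings, base_port=8080):
    """Build one Moses server command line per decoding parameter set."""
    commands = []
    for offset, extra_args in enumerate(settings):
        commands.append(
            ["moses", "--server", "--server-port", str(base_port + offset),
             "-f", ini_path, *extra_args]
        )
    return commands


# Usage: two servers that differ in a single decoding parameter.
# for argv in server_commands("run11.moses.ini", [["-b", "0.1"], ["-b", "0.5"]]):
#     print(" ".join(argv))
```

Each command could then be started with subprocess.Popen, so that every server loads its models only once.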


Regards

Mathias


On 05.04.17 13:33, RR wrote:

Hi,

I am currently working on a system to explore MOSES decoding parameter 
space - running MOSES in decoding mode with one parameter set, getting 
a BLEU score and rerunning with different parameters.


More than 50% of my time is being spent repeatedly "loading" the 
phrase table, language model and reordering model. For example,


$ ./example.fr  | .../moses -f ./run11.moses.ini 
-threads 31   > ./trans_baseline


gives: "Start loading text phrase table. Moses format : [37.296] 
seconds."


37 seconds just to load a 590MB phrase table, even if it is on 
RAMDISK. htop indicates that for those 37 seconds, a single core is 
100% utilized.


My understanding is that some kind of optimized data structure (a hash 
table of some kind?) is being created in those 37 seconds, which is 
then lost and recreated when I re-run MOSES with a different value of 
a decoding parameter.


If I want to eliminate this loading time, what is the best way 
forward? Is there a phrase table format that avoids a significant 
loading time (i.e. a 500MB phrase table loads in < 1 sec)? Should I 
try to find a way to run a single MOSES server but with different 
decoding parameters? I have a system with 130GB RAM and I am using a 
small phrase table that has been filtered for my decoding set.


Thanks in advance,

Roko




Re: [Moses-support] memory problems running moses server

2017-03-23 Thread Mathias Müller
Hi Sarah

What version of the Moses decoder (including the code for Moses server) do
you use exactly? Do you get an error message from the Moses server, i.e.
does it crash? Is the process killed by the OS because of memory problems?

If you can also supply a toy model and ini file, other people can try and
reproduce your issue.

Regards!
Mathias

—

Mathias Müller
AND-2-20
Institute of Computational Linguistics
University of Zurich
Switzerland
+41 44 635 75 81
mathias.muel...@uzh.ch

On Thu, Mar 23, 2017 at 11:17 AM, Sarah Schulz <
sarah.sch...@ims.uni-stuttgart.de> wrote:

> Hi everyone,
>
> It seems like I run into memory problems when using moses server even
> though I have plenty of memory available.
>
> I isolated the problem in a little script (attached). I start the server
> and send strings for translation iteratively. At a certain point it will
> send the request and wait for a reply forever. The number of iterations
> decreases, when I increase the length of the string.
> It seems as if the already sent requests are "piling up" somewhere in
> the server's or client's memory and at a certain point it doesn't have
> the capacity to process the new request.
>
> Has anyone gone through anything similar?
>
> Cheers
> Sarah
> --
> Sarah Schulz
> University of Stuttgart
> Institute for Natural Language Processing (IMS)
> Pfaffenwaldring 5B, 70569 Stuttgart
> Germany
>
> http://www.ims.uni-stuttgart.de/institut/mitarbeiter/schulzsh/
>


Re: [Moses-support] (no subject)

2017-02-25 Thread Mathias Müller
Hi,

Did you perhaps forget the leading "/" in the file path to an existing
installation of XML-RPC?

Regards
Mathias



On Sat, Feb 25, 2017 at 10:46 AM, G/her G/libanos 
wrote:

> hello there I am doing on the Moses server but on installing the xmlrpc
> create an error
> what I can do
>
> thanks for every thing you doing all
>
> On Wed, Feb 8, 2017 at 3:07 AM, Marwa Refaie 
> wrote:
>
>> For web server dig little on those links I think could help
>>
>> http://www.statmt.org/moses/?n=Moses.WebTranslation
>>
>> https://github.com/moses-smt/mosesdecoder/tree/master/contrib/iSenWeb/Introduction
>>
>> *Marwa N Refaie*
>> On 7 Feb 2017, at 21:05, G/her G/libanos  wrote:
>>>
>>> the Moses decoder works for our system on Ubuntu using terminal
>>> but we want to make user interactive whether in the form of web page or
>>> application based
>>> could you help me any information to done that
>>> 10qu...
>>>
>>> On Mon, Dec 12, 2016 at 8:43 AM, G/her G/libanos 
>>> wrote:
>>>
>>>> we create the translation model and we need to uses the window
>>>> application using
>>>> http://www.statmt.org/moses/?n=Moses.Packages
>>>> but when we import our model into the mainWindow application it create
>>>> an error
>>>> what we can do.
>>>> since it works for EN into  Fr
>>>> 10qu for ...
>>>>
>>>> On Mon, Dec 12, 2016 at 7:17 PM, G/her G/libanos 
>>>> wrote:
>>>>
>>>>> hello there...
>>>>> First of all I would like to thanks for your response for the previous
>>>>> comments on the question I rise
>>>>>
>>>>> next my system trained by 1000 parallel sentence and I try to
>>>>> calculate the Bleu score of the translation system using the vedio on TAUS
>>>>> and I got this one
>>>>>
>>>>> On Fri, Dec 9, 2016 at 5:17 PM, G/her G/libanos 
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> hello there
>>>>>> 1. we done our work using baseline and the system work for the train
>>>>>> data
>>>>>> but we need to localized the source code in the tokinization to leave
>>>>>> abrivated word ወ/ሮ as in english Adm. as Notbreaking_prefix.en that is 
>>>>>> use
>>>>>> the dot(.)
>>>>>>
>>>>>> 2. where we to change the code in c++ or in the perl part of the codes
>>>>>>
>>>>>>
>>>>>>
>>>>>> 3. we need to uses the translation system in window, since the end
>>>>>> users are not expert in ubuntu and we see the how to change the window  
>>>>>> see
>>>>>> the Moses GUI and which part of the our file will be load to the model 
>>>>>> only
>>>>>> or all the train including the train data and can we modify the GUI of 
>>>>>> the
>>>>>> Moses.
>>>>>>
>>>>>>
>>>>>>
>>>>>> 10qs for every thing
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> ...education is door of one's life...
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> ...education is door of one's life...
>>>>
>>>
>>>
>>>
>
>
> --
>
> ...education is door of one's life...
>


Re: [Moses-support] delete a segment from existing trained data

2017-02-06 Thread Mathias Müller
Hi Adel

Please always CC the Moses support list in your replies.

What do you mean by "wrong" segment? Anyway, it is unlikely that a single
pair of segments jeopardizes your trained system, since SMT systems are
fairly robust to noise. If you indeed have evidence that removing one
single wrong segment from the training set increases translation quality,
retraining would be far easier than trying to re-estimate counts in order
to remove the influence of this segment, in my opinion.
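
If you do decide to retrain, removing the pair from the training files first is straightforward. A minimal Python sketch, assuming the corpus is held as two aligned lists of lines (the function name is made up for the example):

```python
def drop_segment_pairs(source_lines, target_lines, bad_pairs):
    """Return both sides of the corpus without the listed (source, target) pairs."""
    if len(source_lines) != len(target_lines):
        raise ValueError("source and target must have the same number of lines")
    kept = [
        (src, tgt)
        for src, tgt in zip(source_lines, target_lines)
        if (src.strip(), tgt.strip()) not in bad_pairs
    ]
    # Split the surviving pairs back into two aligned lists.
    return [src for src, _ in kept], [tgt for _, tgt in kept]
```

Because both sides are filtered together, the two files stay aligned line by line.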

Regards
Mathias

On Mon, Feb 6, 2017 at 11:17 AM, Adel Khalifa <adelkhali...@gmail.com>
wrote:

> Hi Mathias,
>
> I found some wrong segment appeared after training so o want to remove it
> without retraining all data again
>
> Regards,
> Adel
>
> 2017-02-06 12:08 GMT+02:00 Mathias Müller <mathias.muel...@uzh.ch>:
>
>> Hi Adel
>>
>> Why do you need to remove a single pair of segments after training your
>> system?
>>
>> Regards
>> Mathias
>>
>
> On Mon, Feb 6, 2017 at 10:50 AM, Adel Khalifa <adelkhali...@gmail.com>
> wrote:
>
>> Hello All,
>>
>> Is there's any way to delete any row (Source segment and Target segment)
>> from data after train data to moses Engine without retrain all data again.
>>
>> Regards,
>> Adel Khalifa
>>


Re: [Moses-support] Need help for parallelisation in mosesserver

2016-12-28 Thread Mathias Müller
Hi Shubham

There is not necessarily a need to start several server processes running
on different ports. A Moses server can serve more than one client at the
same time, except when the "serial" option is used explicitly. For other
useful server options, see e.g. :

https://github.com/moses-smt/mosesdecoder/blob/master/moses/Parameter.cpp#L219

Parallelization has to kick in in another place, namely in your client code
that requests the translations. One solution is to use multithreading to
submit several requests at the same time and, of course, also do the pre-
and postprocessing in parallel.
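
A rough Python 3 sketch of such a client, using a thread pool. The URL and the worker count are assumptions, and the "text" field of the response is what I would expect the server to return. Each call creates its own proxy object so that threads do not share one connection:

```python
from concurrent.futures import ThreadPoolExecutor
import xmlrpc.client


def translate_one(sentence, url="http://localhost:8080/RPC2"):
    """Translate a single sentence with a fresh proxy (one per call)."""
    proxy = xmlrpc.client.ServerProxy(url)
    return proxy.translate({"text": sentence})["text"]


def translate_all(sentences, translate=translate_one, workers=4):
    """Submit several sentences at once; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(translate, sentences))
```

Tokenisation, truecasing and their inverses could be done inside the function passed as translate, so that they run in parallel as well.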

You did not really explain to us how exactly you are requesting
translations from the server, though. Also, this part of your message:

*"... as splitting a single sentence to multiple group of sentence and then
translate them on different ports separately, can give different meaning
rather than translate the whole single sentence at single port."*

Is a bit confusing. Currently, decoding always happens sentence-by-sentence
and there is no document context. So, documents can safely be translated
sentence by sentence. If you mean splitting a single sentence into words
and sending those to different Moses servers, then yes, that's a bad idea.

At any rate, your questions are very vague and general and it is difficult
to give specific advice.

Regards,
Mathias
—

Mathias Müller
AND-2-20
Institute of Computational Linguistics
University of Zurich
Switzerland
+41 44 635 75 81
mathias.muel...@uzh.ch

On Wed, Dec 28, 2016 at 8:23 PM, Shubham Khandelwal <skhlnm...@gmail.com>
wrote:

> Hello,
>
> As mosesserver accepts only one sentence at a time. So I am creating one
> another component in front of mosesserver to handle tokenisation, casing
> and splitting taking care of parallelisation.
>
> Following is my procedure to do it, let me know whether am I heading
> correctly or not to do this:
> *---*
> *So suppose, if I have 5 different sentences (as a paragraph) to translate
> at once (fr-en). So I will be creating mosesserver on 5 different ports
> firstly and pass those 5 different sentences after doing parallely
> tokenisaton, casing and splitting on those different ports and then
> concatenate the output after recasing and detokenisation parallely. *
> *--*
> Let me know whether this is correct or not ? If no, then please suggest me
> better solution to do this.
>
> Also, I have one more question in this that if a sentence is composed of
> around 10 words. Then when I pass this sentence to translate as follows:
> -> ~/mosesdecoder/bin/mosesserver -f moses.ini  -threads 16  -b 0.1
>
> then it takes around 10 seconds to translate. To make it fast, I can run
> this on different ports but that is not a good idea I think, as splitting a
> single sentence to multiple group of sentence and then translate them on
> different ports separately, can give different meaning rather than
> translate the whole single sentence at single port.
> So basically, my doubt is how to make better splitting in such cases which
> can take care of parallelisation aswell ?
>
> --
> Yours Sincerely,
>
> Shubham Khandelwal
>


Re: [Moses-support] Moses-support Digest, Vol 122, Issue 29

2016-12-16 Thread Mathias Müller
Hi Shubham

You could start Moses in server mode:

$ moses -f /path/to/moses.ini --server --server-port 12345 --server-log /path/to/log

This will load the models, keep them in memory and the server will wait for
client requests and serve them until you terminate the process. Translating
is a bit different in this case, you have to send an XML-RPC request to the
server.

But first you'd have to make sure Moses is built with XML-RPC.

Regards and good luck
Mathias
—

Mathias Müller
AND-2-20
Institute of Computational Linguistics
University of Zurich
Switzerland
+41 44 635 75 81
mathias.muel...@uzh.ch

On Fri, Dec 16, 2016 at 10:32 AM, Shubham Khandelwal <skhlnm...@gmail.com>
wrote:

> Hey Thomas,
>
> Thanks for your reply.
> Using Cube Pruning, the speed is a little bit higher, but not that much. I
> will try to play with these parameters.
>
> I have a moses2 binary which supports it as well, but it is taking more time
> than moses. Can you please send/share your moses2 binary somewhere,
> if possible?
>
> Also, I do not wish to run this command ( ~/mosesdecoder/bin/moses
> -f moses.ini -threads all) every time for every input. Is there any way in
> Moses by which all models will load in memory for forever and I can just
> pass a input and get output in real time without using this command again
> and again.
>
> Looking forward for your response.
>
> Thanks again.
>
> On Fri, Dec 16, 2016 at 1:20 PM, Tomasz Gawryl <tomasz.gaw...@skrivanek.pl
> > wrote:
>
>> Hi,
>> If you want to speed up decoding time maybe you should consider changing
>> searching algorithm. I'm also using compact phrase tables and after some
>> test I realised that cube pruning gives almost exactly the same quality
>> but
>> is much faster. For example you can add something like this to your config
>> file:
>>
>> # Cube Pruning
>> [search-algorithm]
>> 1
>> [cube-pruning-pop-limit]
>> 1000
>> [stack]
>> 50
>>
>>  If your model allows you may also try moses2 binary which is faster than
>> original.
>>
>> Regards,
>> Thomas
>>
>> --
>>
>> Message: 1
>> Date: Thu, 15 Dec 2016 19:12:01 +0530
>> From: Shubham Khandelwal <skhlnm...@gmail.com>
>> Subject: Re: [Moses-support] Regarding Decoding Time
>> To: Hieu Hoang <hieuho...@gmail.com>
>> Cc: moses-support <moses-support@mit.edu>
>> Message-ID:
>> <cahwentvyealyrafjdgdih51t5_ahsprv0kwlcabc2td27yo...@mail.gm
>> ail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Hello,
>>
>> Currently, I am using phrase-table.minphr , reordering-table.minlexr and
>> language model (total size of these 3 are 6 GB). Now, I tried to decode on
>> two different machines (8 core-16GB RAM  *&* 4 core-40GB RAM) using them.
>> So, During decoding of around 500 words, it took 90 seconds and 100
>> seconds
>> respectively on those machines. I am already using compact phrase and
>> reordering table representations for faster decoding. Is there any other
>> way
>> to reduce this decoding time.
>>
>> Also, In Moses, Do we have distributed way of decoding on multiple
>> machines
>> ?
>>
>> Looking forward for your response.
>>
>>
>
>
>
> --
> Yours Sincerely,
>
> Shubham Khandelwal
> Masters in Informatics (M2-MoSIG),
> University Joseph Fourier-Grenoble INP,
> Grenoble, France
> Webpage: https://sites.google.com/site/skhandelwl21/
>


Re: [Moses-support] Moses as a service...

2016-09-29 Thread Mathias Müller
Hi Raymond

On Thu, Sep 29, 2016 at 12:07 AM, Raymond Monette <rmone...@ubiqus.com>
wrote:

> “This can be launched using the same command-line arguments as moses, with
> two additional arguments to specify the listening port and log-file (
> --server-port and --server-log). These default to 8080 and /dev/null
> respectively.”
>
>
>
> Would it be possible to get a full command line example please? Or
> something that illustrates/confirms that the “as a service” component
> runs/works. I tried this, but I'm sure I'm not getting something.
>
> mosesdecoder/bin/moses -f phrase-model/moses.ini < phrase-model/in > out --server-port 8080 --server-log /dev/null
>

To start a Moses server you can either use the "mosesserver" binary, which
by now is nothing more than an empty wrapper for "moses":

$ mosesserver -f /path/to/moses.ini

or start the "usual" Moses with the "--server" option:

$ moses --server -f /path/to/moses.ini

Port 8080 might already be in use, in which case you have to run the server
on another port.
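
If you are not sure whether a port is taken, a quick check from Python could look like the sketch below. The helper name is made up, and it only tests whether something accepts connections on that port of the local machine:

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

If port_in_use(8080) returns True before you start Moses, pick another value for --server-port.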

And here is how to find out whether the server works: first of all, if it
outputs "Listening on port 12345" at the end of startup, then usually this
means the server is working. Testing translations is not done from the
command line, because the point of using Mosesserver is that clients
request translations with XML-RPC. You test it by writing a few lines of
code in your favourite programming language that, preferably,  has an
XML-RPC library.

In Python, for example, the code could look like this:

# the Python 2 implementation of XML-RPC
# (in Python 3, use "import xmlrpc.client as xmlrpclib" instead)
import xmlrpclib

# if the client is on the same machine and the --server-port is 12345
url = 'http://localhost:12345/RPC2'

# verbose prints the actual XML-RPC request and answer, have a look
server = xmlrpclib.ServerProxy(url, verbose=True)

# request the translation of a single sentence
server.translate({"text": "das ist ein kleines haus"})

Regards
Mathias

—

Mathias Müller
AND-2-20
Institute of Computational Linguistics
University of Zurich
+41 44 635 75 81
mathias.muel...@uzh.ch






> I haven’t tried the perl client script yet. I'd eventually like to invoke
> maas via a web service, which I can build pretty easily on the Windows
> side.
>
>
>
> On a side note: I'd like to contribute if possible… I'm pretty decent and
> verbose when it comes to documenting stuff, but am not a Linux expert. If
> someone is willing to review my stuff I could send some docs. So far, I've
> just documented stuff for Ubuntu 16.04… The challenge for me was
> understanding where things are supposed to go (i.e. root folders for
> installations, etc…)
>
>
>
> Any clarifications would be appreciated.
>
>
>
> Thanks
>
> R
>


[Moses-support] Parameters in server mode

2016-07-05 Thread Mathias Müller
Hi all

I have two questions about using Moses in server mode.

1) behaviour of certain parameters in server mode

It seems that parameters related to word alignment do not work in server
mode. For instance, "--print-alignment-info" does not have any effect if
I start the server as follows:

$ mosesserver -f model/moses.ini --print-alignment-info

If a translation is requested, the server does not output alignment
information after translation. In non-server mode, the word alignment is
displayed.

I then hypothesized that perhaps all options that relate to direct output
are disabled in server mode, because output is usually sent as a server
response. But other parameters that have to do with output apparently do
work in server mode, e.g. "--n-best-list".

So, my question is:

*Why do certain parameters not have any effect in server mode and is there
any documentation about this?*

2) Parameters that are accepted in a request

The examples on Github and elsewhere suggest that the only two parameters
to the translate() function are "text" and "align". But apparently,
"word-align" is also possible. Now I am wondering:

*Does anyone have a complete list of parameters of the translate() function
that can be sent to a Moses server in an XML-RPC request? Again, is there
documentation about this?*

XML-RPC introspection functions are not really helpful at the moment, since

system.methodHelp("translate")

returns

'Does translation'

and

system.methodSignature("translate")

returns

[['struct', 'struct']]

Are there plans to make this more informative?

Thanks a lot for your help.
Mathias

—

Mathias Müller
AND-2-20
Institute of Computational Linguistics
University of Zurich
+41 44 635 75 81
mathias.muel...@uzh.ch


Re: [Moses-support] Language model interpolation without SRILM

2016-07-01 Thread Mathias Müller
Thanks Philipp and Kenneth!

So, does this mean that finding the weights and log-linear interpolation of
LMs is actually implemented in KenLM, but there is no ready-made,
higher-level script to use this functionality, as there is for SRILM
(interpolate-lm.perl)?
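
To make the distinction between the two kinds of interpolation in this thread concrete, here is a toy sketch in Python over a single context with a two-word vocabulary. It is not how KenLM or SRILM implement it; the distributions and weights are invented, and all probabilities are assumed to be non-zero:

```python
import math


def linear_interpolate(dists, weights):
    """Weighted sum of probabilities: p(w) = sum_i lambda_i * p_i(w)."""
    vocab = dists[0].keys()
    return {w: sum(lam * d[w] for lam, d in zip(weights, dists)) for w in vocab}


def log_linear_interpolate(dists, weights):
    """Weighted product: p(w) is proportional to prod_i p_i(w) ** lambda_i."""
    vocab = dists[0].keys()
    scores = {
        w: math.exp(sum(lam * math.log(d[w]) for lam, d in zip(weights, dists)))
        for w in vocab
    }
    # Unlike the linear case, the product must be renormalized explicitly.
    total = sum(scores.values())
    return {w: score / total for w, score in scores.items()}
```

The explicit renormalization in the log-linear case has to be done for every context, which is one intuition for why it is the more expensive variant.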

@Kenneth Since KenLM is already distributed with Moses, why do you
recommend that I compile the code separately again? Does
github.com/kpu/kenlm have different code than what comes with Moses?

Thanks again,
Mathias

On Tue, Jun 28, 2016 at 6:08 PM, Kenneth Heafield <mo...@kheafield.com>
wrote:

> Log-linear interpolation is in KenLM in the lm/interpolate directory.
> You'll want to get KenLM from github.com/kpu/kenlm and compile with Eigen.
>
> Tuning log-linear weights is super slow, but applying them is reasonably
> fast.  In total the tuning + applying weights time is comparable to SRILM.
>
> https://kheafield.com/professional/edinburgh/interpolate_paper.pdf
>
> Kenneth
>
> On 06/28/2016 03:27 PM, Philipp Koehn wrote:
> > Hi,
> >
> > unfortunately, the interpolation of language models requires two pieces
> > of code that only exist in SRILM: The EM training method to find weights
> > for the language models, and the linear interpolation of the language
> > models.
> >
> > Maybe Ken and Lane can weigh in, if/when a replacement in KENLM will be
> > available.
> >
> > -phi
> >
> > On Tue, Jun 28, 2016 at 10:10 AM, Mathias Müller <mathias.muel...@uzh.ch>
> > wrote:
> >
> > Hi all
> >
> > I have trained several language models and would like to combine
> > them with interpolate-lm.perl:
> >
> > https://github.com/moses-smt/mosesdecoder/blob/master/scripts/ems/support/interpolate-lm.perl
> >
> > As the language model tool, I always use KenLM, but looking at the
> > code of interpolate-lm.perl, it seems that the use of SRILM is
> > hard-coded in the script. I would like to avoid SRILM because, if I
> > understand correctly, its license does not permit use in commercial
> > products.
> >
> > My question is:
> >
> > Can I simply replace the call to SRILM with KenLM in my copy of
> > interpolate-lm.perl? Does KenLM have the functionality necessary for
> > language model combination, e.g. a substitute for SRILM's
> > "compute-best-mix"?
> >
> > Thanks for your help.
> > Mathias
> >
> > —
> >
> > Mathias Müller
> > AND-2-20
> > Institute of Computational Linguistics
> > University of Zurich
> > +41 44 635 75 81
> > mathias.muel...@uzh.ch


[Moses-support] Language model interpolation without SRILM

2016-06-28 Thread Mathias Müller
Hi all

I have trained several language models and would like to combine them with
interpolate-lm.perl:

https://github.com/moses-smt/mosesdecoder/blob/master/scripts/ems/support/interpolate-lm.perl

As the language model tool, I always use KenLM, but looking at the code of
interpolate-lm.perl, it seems that the use of SRILM is hard-coded in the
script. I would like to avoid SRILM because, if I understand correctly, its
license does not permit use in commercial products.

My question is:

Can I simply replace the call to SRILM with KenLM in my copy of
interpolate-lm.perl? Does KenLM have the functionality necessary for
language model combination, e.g. a substitute for SRILM's
"compute-best-mix"?
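
For context, any substitute would have to estimate one weight per language model on held-out text. Below is a minimal sketch of the usual EM procedure for linear interpolation weights. It is not SRILM's or KenLM's code: the function name em_mixture_weights and the input layout (one list of per-token probabilities per model) are hypothetical, and how those probabilities are extracted from each model is left out.

```python
def em_mixture_weights(stream_probs, iterations=50, tol=1e-8):
    """Estimate linear interpolation weights with EM.

    stream_probs[i][t] is the probability that model i assigns to held-out
    token t. At least one model must give each token a non-zero probability.
    """
    n_models = len(stream_probs)
    n_tokens = len(stream_probs[0])
    weights = [1.0 / n_models] * n_models  # start from uniform weights
    for _ in range(iterations):
        totals = [0.0] * n_models
        for t in range(n_tokens):
            mix = sum(w * probs[t] for w, probs in zip(weights, stream_probs))
            for i in range(n_models):
                # E-step: posterior probability that model i produced token t
                totals[i] += weights[i] * stream_probs[i][t] / mix
        # M-step: the new weight is the average posterior over all tokens
        new_weights = [total / n_tokens for total in totals]
        converged = max(abs(a - b) for a, b in zip(new_weights, weights)) < tol
        weights = new_weights
        if converged:
            break
    return weights
```

The weights always sum to 1, because the posteriors for each token do. A model that assigns consistently higher probabilities to the held-out text ends up with most of the weight.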

Thanks for your help.
Mathias

—

Mathias Müller
AND-2-20
Institute of Computational Linguistics
University of Zurich
+41 44 635 75 81
mathias.muel...@uzh.ch


Re: [Moses-support] compile.sh with --static

2016-04-05 Thread Mathias Müller
Hi Liling

On Mon, Apr 4, 2016 at 11:28 PM, liling tan wrote:

>
> Is that the normal/expected behavior of the .bjam in compile.sh for Ubuntu?
>

Yes, without --static in the bjam command, static linking is not
guaranteed. As this page:

http://www.statmt.org/moses/?n=Development.GetStarted

explains, the default linking method is static, but the build will still
fall back to shared in some cases. For instance, your libboost library was
not linked statically the first time you built Moses. This is a problem
especially if you move Moses to another system after building it.
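
One way to see what a built binary actually depends on is to inspect the output of ldd. The following sketch only parses that output. The function name shared_dependencies is hypothetical, and the two marker strings are the messages ldd commonly prints for static binaries, which may differ between systems.

```python
def shared_dependencies(ldd_output):
    """Return the library names listed in the text printed by `ldd <binary>`."""
    static_markers = ("not a dynamic executable", "statically linked")
    if any(marker in ldd_output for marker in static_markers):
        return []  # the binary does not load any shared libraries
    libraries = []
    for line in ldd_output.splitlines():
        line = line.strip()
        if line:
            # Each line starts with the library name, e.g. "libc.so.6 => ..."
            libraries.append(line.split()[0])
    return libraries
```

The input would come from running something like "ldd bin/moses" (the path is an example). A libboost entry in the result means that Boost was linked dynamically.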

@all In what circumstances does the build process fall back to "shared"
with the default settings? Why is "--static" not the default?

Regards
Mathias


>
>
> Regards,
> Liling


[Moses-support] Mosesserver terminates with "girerr::error"

2016-03-22 Thread Mathias Müller
Dear list

Since I got recent mosesdecoder code from GitHub (mid-February 2016) and
built a new version on our servers, I can no longer run mosesserver. The
non-server version of moses works fine.

The specific error I get when I request a translation from a running server
is:


*terminate called after throwing an instance of 'girerr::error'
  what():  Not string type.  See type() method*

This happens even with the simplest of models, e.g. with the sample models
taken from:

http://www.statmt.org/moses/?n=Development.GetStarted

What does this error mean? Can anyone help me solve my problem?

I have attached the following files:

   - commands.txt: the commands that lead to this error
   - minimal-client.py: a few lines of Python code that request a
   translation
   - server.err: the server STDERR output
   - python-output.txt: the output of minimal-client.py

If you need more information, I will gladly provide it.

Thanks for your help

Mathias


—

Mathias Müller
BIN 2.B.04
Institute of Computational Linguistics
University of Zurich
+41 44 635 75 81
mathias.muel...@uzh.ch

python-output.txt:

send: "POST /RPC2 HTTP/1.1\r\nHost: localhost:12345\r\nAccept-Encoding: gzip\r\nUser-Agent: xmlrpclib.py/1.0.1 (by www.pythonware.com)\r\nContent-Type: text/xml\r\nContent-Length: 327\r\n\r\n<?xml version='1.0'?>\n<methodCall>\n<methodName>translate</methodName>\n<params>\n<param>\n<value><struct>\n<member>\n<name>text</name>\n<value><string>das ist ein kleines haus</string></value>\n</member>\n<member>\n<name>word-align</name>\n<value><boolean>1</boolean></value>\n</member>\n</struct></value>\n</param>\n</params>\n</methodCall>\n"
reply: ''
Traceback (most recent call last):
  File "minimal-client.py", line 6, in <module>
    server.translate({"text": "das ist ein kleines haus", "word-align": True})
  File "/usr/lib/python2.7/xmlrpclib.py", line 1233, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1587, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.7/xmlrpclib.py", line 1273, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1301, in single_request
    self.send_content(h, request_body)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1448, in send_content
    connection.endheaders(request_body)
  File "/usr/lib/python2.7/httplib.py", line 975, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 835, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 797, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 778, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
socket.error: [Errno 101] Network is unreachable

commands.txt:

wget http://www.statmt.org/moses/download/sample-models.tgz
tar xzf sample-models.tgz
cd sample-models

mosesserver -f phrase-model/moses.ini --server-port 12345 2> server.err

python minimal-client.py

server.err
Description: Binary data

minimal-client.py:

import xmlrpclib

url = 'http://localhost:12345/RPC2'

server = xmlrpclib.ServerProxy(url, verbose=True)
server.translate({"text": "das ist ein kleines haus", "word-align": True})