[Moses-support] Unable to install moses on windows 8

2013-12-10 Thread Asad A.Malik
I am using Windows 8 and want to use Moses on it. I've installed Cygwin and
the packages it requires, but after that I am unable to install Boost.
I've also tried the binary package that doesn't require building Moses from
source, but that isn't working either.


Re: [Moses-support] EMS-Decoder-Problem

2013-12-10 Thread Barry Haddow
Hi Nadeem

It looks like something went wrong earlier in the EVALUATION section, 
possibly in the input-from-sgm step. I would check all the steps in this 
section for errors.
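
In particular, check that your test-src.sgm wraps each sentence in its own
<seg id="..."> element inside a <doc> and a <srcset>; if that markup is
missing or malformed, the wrapping scripts can end up seeing a single segment
and nist-bleu-c will complain about a missing id. Roughly (the setid, srclang
and docid values below are just placeholders, not something EMS prescribes),
the file should look something like:

   <srcset setid="test" srclang="hi">
   <doc docid="doc1">
   <seg id="1"> first source sentence </seg>
   <seg id="2"> second source sentence </seg>
   </doc>
   </srcset>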

It is also not clear to me that the truecaser will work with Hindi, as it
is designed for languages written in the Latin script.

cheers - Barry

On 07/12/13 18:51, nadeem khan wrote:
>
>
>
> Hello Sir;
>
> I am using EMS now and running into a problem with my Hindi data.
> I ran EMS on config.toy just fine, without a single error, but with
> my own data and experiment I get stuck with BLEU and BLEU-c crashed.
> When I investigated the problem, there is only one input segment in
> test.input.tc.1. Why is EMS taking only one segment from my input
> test-src.sgm file? When I investigated further, there is a fatal
> error in EVALUATION_test_nist-bleu-c.1.STDERR saying "no id in
> srcset". Why am I getting that, given that I am giving it the
> complete sgm frame for wrapping the output?
>
> I am sending you the test-data sgm file as well as the input and
> output generated by EMS for my dataset.
> Please have a look at them and reply with your comments to help
> resolve these issues.
> Waiting for your kind response.
>
> THANK YOU
> Regards
> nadeem


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



Re: [Moses-support] Increasing context scope during training

2013-12-10 Thread Rūdolfs Mazurs
Thanks for the pointer, Dimitri!

Although I don't use EMS, I guess the script irstlm/bin/build-lm.sh is
responsible for the LM part, and the relevant option is
   -n  Order of language model (default 3)
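
For example, to build a 5-gram model, I suppose the call would be something
like this (just a sketch along the lines of the Moses baseline walkthrough;
the corpus and output names are placeholders, and I assume IRSTLM points at
your installation):

   export IRSTLM=~/irstlm
   $IRSTLM/bin/add-start-end.sh < corpus.tok > corpus.sb
   $IRSTLM/bin/build-lm.sh -i corpus.sb -n 5 -p -s improved-kneser-ney -o lm.5gram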

Thanks again!

On Tue, 2013-12-10 at 15:12 +0200, Dimitris Mavroeidis wrote:
> Dear Rūdolfs,
> 
> You must be referring to the language model's n-gram size. If you are 
> using EMS, then you can set "order" in the "LM" portion of the 
> configuration file.
> 
> Setting a higher n-gram order (not more than 5) usually helps, but that 
> depends on various factors, especially the target language, the size of 
> your corpus, etc. Just give it a try and see what order gives the best 
> results for your situation.
> 
> Best regards,
> Dimitris
> 
> On 09/12/2013 11:21 PM, Rūdolfs Mazurs wrote:
> > Hi all,
> >
> > I am looking to improve the quality of translation on my limited corpus.
> > During the training process I noticed that the n-grams only go up to 3. Is
> > there a way to increase this upper limit on the n-gram order? And is there
> > a chance it would improve the translation results?
> >
> 




Re: [Moses-support] (no subject)

2013-12-10 Thread Philipp Koehn
Hi,

TAUS put together a basic slide presentation:
https://www.taus.net/press-releases/free-open-source-machine-translation-tutorial-is-made-available-by-taus

-phi

On Tue, Dec 10, 2013 at 11:27 AM, Kalyani Baruah  wrote:
> Hi,
> Can you provide me with a PPT (PowerPoint presentation) about
> statistical translation using the Moses toolkit?
>
>
>
>
>
>
> Regards,
>
>
> Kalyanee Kanchan Baruah
> Department of Information Technology,
> Institute of Science and Technology,
> Gauhati University,Guwahati,India
> Phone- +91-9706242124
>


Re: [Moses-support] Increasing context scope during training

2013-12-10 Thread Dimitris Mavroeidis
Dear Rūdolfs,

You must be referring to the language model's n-gram size. If you are 
using EMS, then you can set "order" in the "LM" portion of the 
configuration file.
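
For example, the relevant lines of the EMS config would look something like
this (the section name below is just a placeholder; keep whatever lm-training
and corpus settings you already have):

   [LM:my-corpus]
   ### order of the n-gram language model
   order = 5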

Setting a higher n-gram order (not more than 5) usually helps, but that 
depends on various factors, especially the target language, the size of 
your corpus, etc. Just give it a try and see what order gives the best 
results for your situation.

Best regards,
Dimitris

On 09/12/2013 11:21 PM, Rūdolfs Mazurs wrote:
> Hi all,
>
> I am looking to improve the quality of translation on my limited corpus.
> During the training process I noticed that the n-grams only go up to 3. Is
> there a way to increase this upper limit on the n-gram order? And is there
> a chance it would improve the translation results?
>



[Moses-support] single-input-sentence-problem-EMS

2013-12-10 Thread nadeem khan
I ran EMS on config.toy with the toy data successfully; all runs went fine and
there was not a single error. But with my own Hindi data and experiment I get
stuck with BLEU and BLEU-c crashed, and the input file produced by EMS contains
only one sentence out of my more than 900 sentences.
When I investigated the problem, there is only one input segment in
test.input.tc.1. Why is EMS taking only one segment from my input
test-src.sgm file? When I investigated further, there is a fatal error in
EVALUATION_test_nist-bleu-c.1.STDERR saying "no id in srcset". Why am I getting
that, given that I am giving it the complete sgm frame for wrapping the output?
Help out, please.


[Moses-support] (no subject)

2013-12-10 Thread Kalyani Baruah
Hi,
Can you provide me with a PPT (PowerPoint presentation) about
statistical translation using the Moses toolkit?






Regards,


Kalyanee Kanchan Baruah
Department of Information Technology,
Institute of Science and Technology,
Gauhati University,Guwahati,India
Phone- +91-9706242124


Re: [Moses-support] word alignment viewer

2013-12-10 Thread Hieu Hoang
Thanks, everyone, for all your suggestions. I've found two programs which were
complementary and perfect for my needs:
   1. Picaro, by Jason Riesa. Displays the alignments as a matrix on the
command line. Now included in Moses:
   https://github.com/moses-smt/mosesdecoder/tree/master/contrib/picaro
   2. Q. A Java GUI that displays parallel sentences in two rows with links
between the words. Not yet officially downloadable.

Most of the others seem to be for doing manual alignment, whereas I'm just
looking for a visualiser. I tried Cairo; it had compile problems (easy to fix)
and it didn't seem to run properly even when fixed.



On 9 December 2013 18:28, Jason Riesa  wrote:

> Philipp, thanks. I sent Hieu the code you are referring to; ISI recently
> took my site offline, since I have moved to Google. I haven't had time to
> put something else up yet. Amin, if you're interested, I can also send it
> to you.
>
> Best,
> Jason
>
>
> On Mon, Dec 9, 2013 at 9:24 AM, Philipp Koehn  wrote:
>
>> Hi,
>>
>> Jason Riesa has a nice command line word alignment visualization tool
>> http://nlg.isi.edu/demos/picaro/
>> but the download site is not available anymore.
>>
>> -phi
>>
>>
>> On Mon, Dec 9, 2013 at 5:10 PM, Amin Farajian wrote:
>>
>>>  Dear Hieu,
>>>
>>> For this task we recently modified the tool implemented by Chris
>>> Callison-Burch; you can find the original code here:
>>> http://cs.jhu.edu/~ccb/interface-word-alignment.html
>>>
>>> The modified version of the code reads the source, target and word
>>> alignment information from the input files and enables the user to modify
>>> the alignment points.
>>>
>>> I've tried different tools, but found this one easy to use and very
>>> helpful.
>>> If you are interested, let me know and I will share the code with you.
>>>
>>> Bests,
>>> Amin
>>>
>>> P.S. Here is a screenshot of the tool:
>>>
>>>
>>>
>>>
>>> On 12/09/2013 05:37 PM, Matthias Huck wrote:
>>>
>>> It's called "Cairo":
>>>
>>> Cairo: An Alignment Visualization Tool. Noah A. Smith and Michael E.
>>> Jahr. In Proceedings of the Language Resources and Evaluation Conference
>>> (LREC 2000), pages 549–552, Athens, Greece, May/June 2000.
>>> http://www.cs.cmu.edu/~nasmith/papers/smith+jahr.lrec00.pdf
>>> http://old-site.clsp.jhu.edu/ws99/projects/mt/toolkit/cairo.tar.gz
>>>
>>> Never tried that one, though. The code seems to be kind of prehistoric.
>>>
>>>
>>> On Mon, 2013-12-09 at 11:15 -0500, Lane Schwartz wrote:
>>>
>>> I don't have a copy, but I believe there was a tool called Chiro
>>> or Cairo that does this, which I'm told provided the theme for the
>>> Egypt-themed JHU summer workshop on machine translation.
>>>
>>> On Mon, Dec 9, 2013 at 10:25 AM, Hieu Hoang wrote:
>>>
>>> Does anyone have a nice GUI word alignment viewer they can share? I.e.,
>>> given the source, target, and alignment files, display each parallel
>>> sentence with links between the aligned words.
>>>
>>> No webapp or complicated install procedure would be best.
>>>
>>> --
>>> Hieu Hoang
>>> Research Associate
>>> University of Edinburgh
>>> http://www.hoang.co.uk/hieu


-- 
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu


Re: [Moses-support] using Moses in Monolingual dialogue setting

2013-12-10 Thread Read, James C
Yep, you've hit the nail right on the head. This is why I said my main concern 
would be the inordinate amounts of training data you would need to get 
something useful up and running.

When translating sentences from one language to another there can be a lot of
variance, but there can also be a lot of consistency at some level, so it is
possible to identify a limited number of patterns. The domain you are trying to
train on seems to me to be so much more open to variance that I would expect
you to need much larger training sets and/or much more intelligent learning
algorithms to be able to extract useful generalisations.

Of course, I could be wrong. The only way to tell would be to suck it and see. 
We would need to set up some kind of empirical pipeline to train and test the 
system with varying amounts and types of data to see how it performs. I'm not 
sure how we would test such a system.

I guess a quick approximation of the performance of your translation model
would be to see how highly the output sentences score under a well-trained
language model. This would give you an idea of how fluent the generated
utterances are, but no idea of how appropriate a user would find the responses.
I guess you could also use one of the bag-of-words style metrics to measure the
distance of the output sentences from the responses in a test corpus. Again,
I'm not sure how good a predictor of user judgements this would be.
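
For the first check, the query tool that gets built along with Moses/KenLM can
score a file of candidate responses against an ARPA language model, something
along these lines (the paths are placeholders):

   ~/mosesdecoder/bin/query responses.lm.arpa < candidate-responses.txt

It prints per-word scores, sentence totals and an overall perplexity, which
would at least let you compare systems on fluency.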

I suppose you could measure the average time a user is willing to chat with 
your bot to get an idea of how well it's performing. But if the output is 
particularly bad then some users may keep chatting with the bot just for the 
comical value.

Have you got a system running yet? Could you show us some sample output?

James


From: Andrew [rave...@hotmail.com]
Sent: 09 December 2013 21:46
To: Read, James C; moses-support@mit.edu
Subject: RE: [Moses-support] using Moses in Monolingual dialogue setting

Thanks for the insights.

I've already done approach 2, and the result didn't seem bad to me, so I became
curious whether it would have made a significant difference had I chosen the
first approach.
I was worried that approach 2 might have resulted in over-training, but judging
from your comments, I guess it's only a matter of having broader entries. (Or
could it have been over-trained?)

> I suppose my main concern would be the inordinate amounts of training data 
> you would need to get something useful up and running.

This leads me to my next question.
I trained my system with about 650k stimulus-response pairs collected from
Twitter.
Each pair is part of a conversation which consists of 3 to 10 utterances.
For example, suppose we have a conversation with 4 utterances labeled
A, B, C, D, where A is the "root" of the conversation, B is the response to A,
C is the response to B, and D is the response to C.
Following my second approach, A and B, B and C, and C and D are pairs, so the
source file will contain A, B, C and the target file will contain B, C, D,
making 3 pairs from 1 conversation. In this way, I have 650k pairs from about
80k conversations.
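
So, lined up line by line, the two files look like:

   source file:   target file:
   A              B
   B              C
   C              D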

I've seen that when Moses is used for an actual translation task, say German to
English, the amount of training data seems pretty low, somewhere around 50k
pairs, so my 650k is already much bigger than that. However, in the paper I
mentioned, http://aritter.github.io/mt_chat.pdf, the author used about 1.3M
pairs, which is twice as many as mine, and I've seen research in a similar
setting, http://www.aclweb.org/anthology/P13-1095, which used 4M pairs(!).

So, given the unpredictable nature of the monolingual conversation setting,
what would you say is the appropriate, or minimum, amount of training data? And
how much does the quality of the response-generation task depend on the amount
of training data?

I know this is an out-of-nowhere question which may be hard to answer, but even
a rough guess would greatly assist me. Thank you very much in advance.






> From: jcr...@essex.ac.uk
> To: kgim...@cs.cmu.edu
> Date: Mon, 9 Dec 2013 17:33:00 +
> CC: moses-support@mit.edu
> Subject: Re: [Moses-support] using Moses in Monolingual dialogue setting
>
> I guess if you were to change the subject and ask a question from a list of
> well-formed common questions whenever the probability of the response is
> below some sensible threshold, then you could make a system which fools a
> user some of the time.
>
> James
>
> 
> From: moses-support-boun...@mit.edu [moses-support-boun...@mit.edu] on behalf 
> of Read, James C [jcr...@essex.ac.uk]
> Sent: 09 December 2013 17:14
> To: Kevin Gimpel
> Cc: moses-support@mit.edu
> Subject: Re: [Moses-support] using Moses in Monolingual dialogue setting
>
> I'm guessing he wants to make a conversational agent that produces the most
> likely response based on the stimulus.
>
> In any case, the distinction between 1 and 2 is probably redundant if GIZA++ 
> is being used to train in both