[Moses-support] manipulating jamfile

2015-11-11 Thread koormoosh
Hi,

I am modifying the Jamfile to integrate my LM into Moses. This is what I
have added to the Jamfile:

local with-mylm = [ option.get "with-mylm" ] ;
if $(with-mylm) {
  lib mylm-lib : : $(with-mylm)/lib ;
  ...
}

Everything else works and it picks up the includes, etc., except that the
linker complains about mylm-lib:

gcc.link moses-cmd/bin/gcc-4.9.2/release/link-static/threading-multi/moses
/usr/bin/ld: cannot find -lmylm-lib
collect2: error: ld returned 1 exit status

Am I doing something wrong? The path is correct, but for some reason it
doesn't take the mylm-lib.

On Sat, Nov 7, 2015 at 10:33 PM, Hieu Hoang wrote:

> it's been a while since anyone looked at the SRILM code. It should still
> work but I can't remember exactly what's going on
>
> On 05/11/2015 14:04, koormoosh wrote:
>
> I am integrating my LM in mosesdecoder. I started by looking into the
> Skeleton files, and the SRI code. Things are clear except for these lines
> in SRI.cpp, which I cannot wrap my head around:
>
>   ngram[count] = Vocab_None;
>
>   if (finalState) {
>     ngram[0] = lmId;
>     unsigned int dummy;
>     *finalState = m_srilmModel->contextID(ngram, dummy);
>   }
>
> Assuming that lmId is the id of the last word of the sequence, I don't
> understand the purpose of finalState (probably because I lack the MT
> background). I wonder if you could kindly clarify these things if you are
> familiar with LM integration, or SRI integration in particular:
>
> 1) why are we adding ngram[count] = Vocab_None
>
> *I guess this is initialising the array element, e.g. if your LM is a
> trigram but you only want to calculate the score for a unigram, then set
> the 1st element in the array to the word, and the other elements to
> Vocab_None. This is how SRILM does it; you don't have to follow the same
> design in your LM.*
>
> 2) what is being checked in the if-condition if(finalState),
>
> *Don't know*
>
> 3) what is happening in: *finalState = m_srilmModel->contextID(ngram, dummy);
>
> *This is state information required by the decoder to decide whether to
> recombine the hypothesis with another hypothesis. For the language model,
> if the trigram is 'a b c', the state information is a unique id for the
> BIGRAM 'b c'. This could be the hash of the bigram, the memory address of
> the node where the bigram is stored, or the string itself, as long as it is
> different from 'b d', 'd e' etc. This is the basic description - there are
> some optimizations you can do, but it's important you understand this 1st. I
> recommend looking at Philipp Koehn's book. This paper describes a similar
> thing, but for syntactic MT:
> https://kheafield.com/professional/edinburgh/left_paper.pdf*
>
>
> Thank you!
>
>
> ___
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
> --
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
>
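The recombination state described in the quoted reply can be illustrated with a minimal Python sketch. It assumes the state is simply the tuple of the last (order - 1) words. The names lm_state and can_recombine are made up for this illustration and are not part of Moses or SRILM, which return their own context ids.

```python
from typing import Sequence, Tuple


def lm_state(words: Sequence[str], order: int = 3) -> Tuple[str, ...]:
    """Return the last (order - 1) words, used here as the LM state."""
    context_len = order - 1
    if context_len <= 0:
        # A unigram model needs no context, so every state is identical.
        return ()
    return tuple(words[-context_len:])


def can_recombine(hyp_a: Sequence[str], hyp_b: Sequence[str], order: int = 3) -> bool:
    """Two hypotheses are interchangeable for the LM if their states match."""
    return lm_state(hyp_a, order) == lm_state(hyp_b, order)


# With a trigram LM, 'a b c' and 'x b c' share the state ('b', 'c'),
# while 'a b d' ends in a different bigram and must be kept apart.
print(lm_state(["a", "b", "c"]))
print(can_recombine(["a", "b", "c"], ["x", "b", "c"]))
print(can_recombine(["a", "b", "c"], ["a", "b", "d"]))
```

A tuple is used because it is hashable and compares by content. A hash or a pointer into the LM's trie would serve the same purpose, as the reply notes.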
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Training with multiple text files

2015-11-11 Thread Philipp Koehn
Hi,

you have to convert your parallel text files yourself into the format that
Moses expects, i.e., two files, one for English, one for Hindi, where
line x in the English file corresponds to line x in the Hindi file.
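
As a rough sketch of that conversion (not a Moses tool), the Python below concatenates several English/Hindi file pairs into the two line-aligned files and refuses any pair whose line counts differ. The function name merge_parallel and the file names in the usage note are hypothetical, and it assumes each input pair is already sentence-aligned with one sentence per line.

```python
from typing import Iterable, List, Tuple


def read_lines(path: str) -> List[str]:
    """Read a UTF-8 text file and return its lines without the newline."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def merge_parallel(pairs: Iterable[Tuple[str, str]], out_en: str, out_hi: str) -> int:
    """Append every (English, Hindi) file pair to two line-aligned files.

    Returns the number of sentence pairs written. A pair with unequal
    line counts raises ValueError before any of its lines are written.
    """
    total = 0
    with open(out_en, "w", encoding="utf-8") as en_out, \
         open(out_hi, "w", encoding="utf-8") as hi_out:
        for en_path, hi_path in pairs:
            en_lines = read_lines(en_path)
            hi_lines = read_lines(hi_path)
            if len(en_lines) != len(hi_lines):
                raise ValueError(
                    f"{en_path} has {len(en_lines)} lines but "
                    f"{hi_path} has {len(hi_lines)}"
                )
            for en, hi in zip(en_lines, hi_lines):
                en_out.write(en.strip() + "\n")
                hi_out.write(hi.strip() + "\n")
                total += 1
    return total
```

A call such as merge_parallel([("health.en", "health.hi"), ("tourism.en", "tourism.hi")], "corpus.en", "corpus.hi") would then produce the two files to feed into the usual tokenisation and cleaning steps.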

If this is rather raw data, you may have to run sentence alignment on
your data, using tools such as Hunalign.
http://mokk.bme.hu/en/resources/hunalign/

-phi

On Wed, Nov 11, 2015 at 5:18 AM, Sunayana Gawde wrote:

> Hello all,
> I have developed a baseline machine translation system as described on the
> Moses website. I also built an MT system from an English-Hindi parallel
> corpus available online, but it gets a very low BLEU score (5.31). Now
> I have parallel English-Hindi text from a health and tourism corpus,
> which is spread over many text files. How do I train the system with
> multiple text files? I am only familiar with building the baseline system.
> Is there anything else I need specifically for Hindi?
> Please help. Thanks.
>
> --
> *Thanks & Regards*
>
> Ms. Sunayana R. Gawde.
>
> DCST, Goa University.
> *Please don't print this e-mail unless you really need to.*
>


[Moses-support] Segmentation Fault on tuning phase

2015-11-11 Thread Alex Martinez

Hi,
I've just pulled the code and rebuilt the MT system, and I'm getting a 
segmentation fault during the tuning step using EMS in an experiment that was 
working well with a version of the code pulled on October 2.

I have pulled and refreshed the code because I was facing some problems getting 
the word alignments with moses --server and I saw that the code that deals with 
the server params has been updated recently.

The model is a factored model and the error in the log is:

 Using SCRIPTS_ROOTDIR: /opt/moses/scripts
Asking moses for feature names and values from 
/mnt/a62/devel/en_es/process/model/moses.bin.ini.4
Executing: /opt/moses/bin/moses -threads all -v 0 -config 
/mnt/a62/devel/en_es/process/model/moses.bin.ini.4 -show-weights
exec: /opt/moses/bin/moses -threads all -v 0 -config 
/mnt/a62/devel/en_es/process/model/moses.bin.ini.4 -show-weights
Executing: /opt/moses/bin/moses -threads all -v 0 -config 
/mnt/a62/devel/en_es/process/model/moses.bin.ini.4 -show-weights > ./features.list 
2> /dev/null
MERT starting values and ranges for random generation:
LexicalReordering0 = 0.300 ( 0.00 .. 1.00)
LexicalReordering0 = 0.300 ( 0.00 .. 1.00)
LexicalReordering0 = 0.300 ( 0.00 .. 1.00)
LexicalReordering0 = 0.300 ( 0.00 .. 1.00)
LexicalReordering0 = 0.300 ( 0.00 .. 1.00)
LexicalReordering0 = 0.300 ( 0.00 .. 1.00)
Distortion0 = 0.300 ( 0.00 .. 1.00)
LM0 = 0.500 ( 0.00 .. 1.00)
LM1 = 0.500 ( 0.00 .. 1.00)
LM2 = 0.500 ( 0.00 .. 1.00)
WordPenalty0 = -1.000 ( 0.00 .. 1.00)
PhrasePenalty0 = 0.200 ( 0.00 .. 1.00)
TranslationModel0 = 0.200 ( 0.00 .. 1.00)
TranslationModel0 = 0.200 ( 0.00 .. 1.00)
TranslationModel0 = 0.200 ( 0.00 .. 1.00)
TranslationModel0 = 0.200 ( 0.00 .. 1.00)
TranslationModel1 = 0.200 ( 0.00 .. 1.00)
TranslationModel1 = 0.200 ( 0.00 .. 1.00)
TranslationModel1 = 0.200 ( 0.00 .. 1.00)
TranslationModel1 = 0.200 ( 0.00 .. 1.00)
TranslationModel2 = 0.200 ( 0.00 .. 1.00)
TranslationModel2 = 0.200 ( 0.00 .. 1.00)
TranslationModel2 = 0.200 ( 0.00 .. 1.00)
TranslationModel2 = 0.200 ( 0.00 .. 1.00)
GenerationModel0 = 0.300 ( 0.00 .. 1.00)
GenerationModel0 = 0.000 ( 0.00 .. 1.00)
GenerationModel1 = 0.300 ( 0.00 .. 1.00)
GenerationModel1 = 0.000 ( 0.00 .. 1.00)
featlist: LexicalReordering0=0.30 
featlist: LexicalReordering0=0.30 
featlist: LexicalReordering0=0.30 
featlist: LexicalReordering0=0.30 
featlist: LexicalReordering0=0.30 
featlist: LexicalReordering0=0.30 
featlist: Distortion0=0.30 
featlist: LM0=0.50 
featlist: LM1=0.50 
featlist: LM2=0.50 
featlist: WordPenalty0=-1.00 
featlist: PhrasePenalty0=0.20 
featlist: TranslationModel0=0.20 
featlist: TranslationModel0=0.20 
featlist: TranslationModel0=0.20 
featlist: TranslationModel0=0.20 
featlist: TranslationModel1=0.20 
featlist: TranslationModel1=0.20 
featlist: TranslationModel1=0.20 
featlist: TranslationModel1=0.20 
featlist: TranslationModel2=0.20 
featlist: TranslationModel2=0.20 
featlist: TranslationModel2=0.20 
featlist: TranslationModel2=0.20 
featlist: GenerationModel0=0.30 
featlist: GenerationModel0=0.00 
featlist: GenerationModel1=0.30 
featlist: GenerationModel1=0.00 
Saved: ./run1.moses.ini

Normalizing lambdas: 0.30 0.30 0.30 0.30 0.30 0.30 
0.30 0.50 0.50 0.50 -1.00 0.20 0.20 0.20 
0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 
0.20 0.20 0.30 0.00 0.30 0.00
DECODER_CFG = -weight-overwrite 'WordPenalty0= -0.128205 PhrasePenalty0= 
0.025641 LexicalReordering0= 0.038462 0.038462 0.038462 0.038462 0.038462 
0.038462 Distortion0= 0.038462 GenerationModel0= 0.038462 0.00 
TranslationModel1= 0.025641 0.025641 0.025641 0.025641 LM1= 0.064103 
TranslationModel0= 0.025641 0.025641 0.025641 0.025641 LM0= 0.064103 LM2= 
0.064103 TranslationModel2= 0.025641 0.025641 0.025641 0.025641 
GenerationModel1= 0.038462 0.00'
Executing: /opt/moses/bin/moses -threads all -v 0 -config /mnt/a62/devel/en_es/process/model/moses.bin.ini.4 -weight-overwrite 'WordPenalty0= -0.128205 PhrasePenalty0= 0.025641 LexicalReordering0= 0.038462 0.038462 0.038462 0.038462 0.038462 0.038462 Distortion0= 0.038462 GenerationModel0= 0.038462 0.00 TranslationModel1= 0.025641 0.025641 0.025641 0.025641 LM1= 0.064103 TranslationModel0= 0.025641 0.025641 0.025641 0.025641 LM0= 0.064103 LM2= 0.064103 TranslationModel2= 0.025641 0.025641 0.025641 0.025641 GenerationModel1= 0.038462 0.00' -n-best-list run1.best100.out 100 distinct -input-file /mnt/a62/devel/en_es/data/corpora.tuning.en > run1.out 
Executing: /opt/moses/bin/moses -threads all -v 0 -config /mnt/a62/devel/en_es/process/model/moses.bin.ini.4 -weight-overwrite 'WordPenalty0= -0.128205 PhrasePenalty0= 0.025641 LexicalReordering0= 0.038462 0.038462 0.038462 0.038462 0.038462 0.038462 Distortion0= 0.038462 GenerationModel0= 0.038462 0.00 TranslationModel1= 0.025641 0.025641 0.025641 0.02
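
The normalized values in DECODER_CFG above are consistent with dividing each starting weight by the sum of the absolute values of all 28 weights (7.8 here). The Python below is only a check of that arithmetic, not the tuning script's own code; normalize_l1 is a name chosen for this sketch.

```python
from typing import List, Sequence


def normalize_l1(weights: Sequence[float]) -> List[float]:
    """Scale the weights so that their absolute values sum to 1."""
    total = sum(abs(w) for w in weights)
    if total == 0:
        raise ValueError("all weights are zero")
    return [w / total for w in weights]


# Starting values in the order printed after "Normalizing lambdas:".
lambdas = [0.30] * 7 + [0.50] * 3 + [-1.00] + [0.20] * 13 + [0.30, 0.00, 0.30, 0.00]
normalized = normalize_l1(lambdas)

print(round(normalized[0], 6))   # a 0.30 weight such as LexicalReordering0
print(round(normalized[7], 6))   # LM0
print(round(normalized[10], 6))  # WordPenalty0
```

This prints 0.038462, 0.064103 and -0.128205, the same values that appear in the -weight-overwrite string.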

[Moses-support] Training with multiple text files

2015-11-11 Thread Sunayana Gawde
Hello all,
I have developed a baseline machine translation system as described on the
Moses website. I also built an MT system from an English-Hindi parallel
corpus available online, but it gets a very low BLEU score (5.31). Now
I have parallel English-Hindi text from a health and tourism corpus,
which is spread over many text files. How do I train the system with
multiple text files? I am only familiar with building the baseline system.
Is there anything else I need specifically for Hindi?
Please help. Thanks.

-- 
*Thanks & Regards*

Ms. Sunayana R. Gawde.

DCST, Goa University.
*Please don't print this e-mail unless you really need to.*


Re: [Moses-support] use placeholder with mosesserver

2015-11-11 Thread Evgeny Matusov
Hi Uli,


thanks a lot! We will try to add some test cases for Mosesserver, including XML 
input with/without placeholders.


Best,

Evgeny.



From: Ulrich Germann 
Sent: Wednesday, November 11, 2015 2:58 AM
To: Evgeny Matusov
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] use placeholder with mosesserver

Hi all,

I've just pushed what I believe might address a few of the issues in this 
thread:

- the more fine-grained configuration options for request handling and queuing, 
server timeouts etc. (added in August due to a threading issue) have been 
transferred to the main moses executable.

- the server now pays attention to the xml-input option specified via json; the 
range of accepted values is the same as when specified on the command line. I 
have not written the xml-input handling and do not actively use it, so it may 
or may not work. I don't think there are any regression tests that test this 
right now. Reports from the trenches are welcome.

- mosesserver.cpp is deprecated. It is now merely a shell around the regular 
moses call with --server. I did not remove it from the code base entirely, as I 
assume that there's a plethora of setups out there that rely on the existence 
of mosesserver. What the wrapper does is add --server to the options and then 
call regular moses.

- anyone adding stuff to mosesserver.cpp from now on owes me a lifetime supply 
of the finest Laphroaig. Just send me a quarter cask every year for Burns Nicht 
for the rest of my life if you do. If I haven't pushed anything for two years, 
you may assume I'm dead.


- Uli

On Tue, Nov 10, 2015 at 2:58 PM, Ulrich Germann 
<ulrich.germ...@gmail.com> wrote:
Hi all,

mosesserver is deprecated and should not be used any more. I'll transfer the 
threading-related changes to the server implementation in the regular moses 
executable and let you know once I'm done so that other things (like 
passthrough) can be added. By the looks of it, the changes are fairly 
straightforward, so it shouldn't take long. However, I can't guarantee that the 
new server will do everything the old server did (or do it the same way).

It would be fantastic if a few people could design and contribute test cases so 
that we can do some regression testing for the server. Ideally a test case 
should provide:

- tiny models to work with (or we may be able to recycle some that already 
exist)
- sample input (json)
- expected output (json)
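
One possible layout for such a test case, sketched in Python: a directory holding input.json and expected.json, and a runner that compares the decoder's answer with the expected one. The file names, run_case, and the translate callable (a stand-in for whatever actually sends the request to the server) are assumptions for illustration, not an existing Moses test harness.

```python
import json
from pathlib import Path
from typing import Callable, Union


def run_case(case_dir: Union[str, Path], translate: Callable[[dict], dict]) -> bool:
    """Run one regression case and report whether the output matches.

    case_dir is assumed to contain input.json (the request) and
    expected.json (the response the server should produce).
    """
    case = Path(case_dir)
    request = json.loads((case / "input.json").read_text(encoding="utf-8"))
    expected = json.loads((case / "expected.json").read_text(encoding="utf-8"))
    # Compare parsed JSON so key order and whitespace do not matter.
    return translate(request) == expected
```

Keeping the transport behind the translate callable means the same cases could be replayed against the old and the new server.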

Cheers - Uli

On Tue, Nov 10, 2015 at 11:37 AM, Evgeny Matusov 
<ematu...@apptek.com> wrote:

Hi,

can any of the more active recent developers advise what is the latest stable 
mosesserver implementation?

It seems to be the one in moses/server, but the one in 
contrib/server/mosesserver.cpp was updated in August of this year with an 
important fix related to multiple threads:

https://github.com/moses-smt/mosesdecoder/commit/3c682fa8b05af6bff1a09f420141795875cf9685
https://www.mail-archive.com/moses-support%40mit.edu/msg12875.html

As Gregor mentioned, we would like to share our fix so that Mosesserver 
correctly supports placeholders. I want to make sure that this is a fix for 
something that many people use without problems.

Thanks,
Evgeny.



From: moses-support-boun...@mit.edu <moses-support-boun...@mit.edu> on behalf 
of moses-support-requ...@mit.edu <moses-support-requ...@mit.edu>
Sent: Monday, November 9, 2015 6:02 PM
To: moses-support@mit.edu
Subject: Moses-support Digest, Vol 109, Issue 16


Today's Topics:

   1. Question about output alignment info (Marta Ruiz)
   2. Re: use placeholder with mosesserver (Leusch, Gregor)


--

Message: 1
Date: Mon, 9 Nov 2015 16:10:26 +0100
From: Marta Ruiz <martaruizcostaju...@gmail.com>
Subject: [Moses-support] Question about output alignment info
To: moses-support@mit.edu
Message-ID: <vrajg5btd3amfvmhpcynda3nj-0ynyqwf3xuzd...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi all,

When I use the option "-alignment-output-file [file]", I get just a few
alignments. Most sentences are blank, except some that have one
alignment...