Re: [Moses-support] Adding sentence-level flag features

2010-03-28 Thread Suzy Howlett
Thanks, Chris, I'll give it a shot. I'll be back if I have trouble  
getting the lattice input to work as expected.

Suzy

On 27/03/2010, at 12:45 AM, Chris Dyer wrote:

> The first weight in the lattice format is called "transition
> probability", but it can be anything you want.  It just becomes a
> feature in the system's log-linear model.  The weight used to bias
> this feature is weight-i.
>
> Chris
>
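A minimal sketch of the point above, assuming the usual log-linear form in which a hypothesis score is the weighted sum of its feature values. The feature names, the numbers and the `loglinear_score` helper are invented for illustration and are not Moses code; only the name weight-i comes from the thread.

```python
def loglinear_score(features, weights):
    """Weighted sum of feature values, i.e. a log-linear model score."""
    return sum(weights[name] * value for name, value in features.items())


# Hypothetical feature values for two paths through the same lattice.
# "input" stands for the value carried on the lattice edges.
path_original = {"lm": -4.0, "tm": -3.0, "input": 0.0}
path_reordered = {"lm": -3.5, "tm": -3.0, "input": -1.0}

# The "input" weight plays the role of weight-i; the others are placeholders.
weights = {"lm": 0.5, "tm": 0.25, "input": 0.5}

for name, feats in (("original", path_original), ("reordered", path_reordered)):
    print(name, loglinear_score(feats, weights))
```

Changing the weight on `input` changes how strongly the lattice value pulls the model toward one path or the other, which is what tuning weight-i would do.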
> On Fri, Mar 26, 2010 at 1:17 AM, Suzy Howlett wrote:
>> Thanks, that sounds like a good thing to try. But where would you specify
>> the feature value? The numbers in the lattice format (as far as I know) are
>> transition probability and distance to next node, so unless you can extend
>> the list of numbers, I'm still not clear on how you incorporate the feature.
>> Also, what weight is used, weight-i?
>>
>> Suzy
>>
>> On 26/03/2010, at 12:30 PM, Chris Dyer wrote:
>>
>>> That sounds reasonable.  And, I don't think you'll need to add an
>>> extra feature to Moses to do this.  The lattice input format lets you
>>> have a feature associated with a transition (in fact, I think you can
>>> have an arbitrary number of features), so you can use that to encode
>>> whether the path you're on corresponds to the reordered variant or not.
>>> -Chris
>>>
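The idea above can be sketched as follows: build a two-path lattice in which one path carries the original word order and the other the reordered one, with a number on the edges that tells the paths apart. This is only an illustration under stated assumptions. The nested-tuple output follows the lattice input layout as I understand it (word, value, distance to the next node); the function names, the placeholder values and the choice to mark only the first edge of each path are all hypothetical. How Moses transforms these numbers internally is not covered in the thread and should be checked against the Moses documentation.

```python
def two_path_lattice(original, reordered,
                     orig_value=0.5, reord_value=0.25, neutral=1.0):
    """Return per-node lists of (word, value, distance) edges.

    Both paths share the start and end node. The first edge of each path
    carries that path's value; every later edge carries `neutral`.
    All three values are placeholders chosen for illustration.
    """
    if not original or not reordered:
        raise ValueError("both sentences must contain at least one word")
    n, m = len(original), len(reordered)
    end = n + m - 1  # index of the shared final node
    # Interior nodes: 1..n-1 for the original path, n..n+m-2 for the other.
    orig_nodes = [0] + list(range(1, n)) + [end]
    reord_nodes = [0] + list(range(n, n + m - 1)) + [end]
    edges = [[] for _ in range(end)]
    for words, nodes, value in ((original, orig_nodes, orig_value),
                                (reordered, reord_nodes, reord_value)):
        for i, word in enumerate(words):
            src, dst = nodes[i], nodes[i + 1]
            edges[src].append((word, value if i == 0 else neutral, dst - src))
    return edges


def _quote(word):
    """Single-quote a word, escaping backslashes and quotes."""
    return "'" + word.replace("\\", "\\\\").replace("'", "\\'") + "'"


def to_plf(edges):
    """Serialise the edge lists as one nested-tuple line."""
    nodes = []
    for outgoing in edges:
        arcs = "".join(
            "({},{!r},{}),".format(_quote(w), float(v), d)
            for w, v, d in outgoing
        )
        nodes.append("({}),".format(arcs))
    return "(" + "".join(nodes) + ")"


if __name__ == "__main__":
    print(to_plf(two_path_lattice(["a", "b"], ["b", "a"])))
```

The distances are differences between node indices, so the last edge of the first path has to jump over every interior node of the second path to reach the shared end node.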
>>> On Thu, Mar 25, 2010 at 8:51 PM, Suzy Howlett wrote:

 Hi Chris,

 The preprocessing I referred to is a reordering of the words of the source
 sentence before translation. The overall idea would be to have a single
 Moses model that can handle both reordered and non-reordered sentences. The
 only way I've thought of to do this is to combine the sentence-level feature
 I mentioned with two phrase translation tables and a lattice input combining
 the reordered and non-reordered versions of a single sentence. Then we could
 have a number of other features that would influence the system's choice of
 which version to use. There are obviously a number of points at which this
 scheme could break down, and I have no idea if any of it will work, but I
 figured the only way to find out would be to try. I appreciate any
 suggestions you have.

 Suzy

 On 26/03/2010, at 11:32 AM, Chris Dyer wrote:

> Moses uses features to discriminate between alternative translations
> of individual sentences, so if the value is constant for all possible
> translations (for example, because it is a function of the input), the
> model won't be able to take advantage of it.  It sounds like you might
> be proposing something like this.  What are you trying to do?
>
> -Chris
>
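A small sketch of why a per-sentence constant cannot help, assuming the decoder picks the hypothesis with the highest weighted feature sum. The hypotheses, feature names and numbers are invented for illustration; `flag` stands for the sentence-level value, which is the same for every translation of a given input.

```python
def score(features, weights):
    """Weighted sum of feature values."""
    return sum(weights[name] * value for name, value in features.items())


def best(hypotheses, weights):
    """Text of the highest-scoring hypothesis."""
    return max(hypotheses, key=lambda h: score(h["features"], weights))["text"]


def with_flag(hypotheses, value):
    """Copy the hypotheses, adding the same 'flag' value to each one."""
    return [
        {"text": h["text"], "features": {**h["features"], "flag": value}}
        for h in hypotheses
    ]


# Hypothetical competing translations of one input sentence.
hypotheses = [
    {"text": "translation A", "features": {"lm": -2.0, "tm": -1.0}},
    {"text": "translation B", "features": {"lm": -1.0, "tm": -1.5}},
]
weights = {"lm": 1.0, "tm": 1.0}

flagged = with_flag(hypotheses, 1.0)
for flag_weight in (-5.0, 0.0, 5.0):
    print(flag_weight, best(flagged, {**weights, "flag": flag_weight}))
```

Whatever weight the flag receives, it shifts every hypothesis for that sentence by the same amount, so the ranking stays the same. The feature only matters once it can differ between competing paths, as in the lattice setup discussed later in the thread.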
> On Thu, Mar 25, 2010 at 8:14 PM, Suzy Howlett wrote:
>>
>> Hi,
>>
>> I am just starting my foray into the world of adding features to Moses
>> and haven't quite got my head around it yet. Could someone please
>> check I'm on the right track, or tell me if I've overlooked an easier
>> alternative?
>>
>> The feature that I want to add is essentially a sentence-level flag to
>> say whether a given input sentence has undergone a particular kind of
>> preprocessing before being passed to Moses. My best guess is that I
>> need to create a file containing a look-up table to indicate which
>> sentences have been preprocessed, e.g.
>>
>>  ||| 0
>>  ||| 0
>>  ||| 1
>>  ||| 0
>> ...
>>
>> where 1 and 0 indicate that the sentence has and has not been
>> preprocessed, respectively. Is this the best way to do it? Does anyone
>> know of anyone doing something similar before?
>>
>> I imagine I will need a StatelessFeatureFunction that will open the
>> file and read off the value for the input sentence, and two parameters
>> added with AddParam (one for the weight and one to specify the file
>> containing the table above). Does that sound right so far? If anyone
>> has any pointers for getting started implementing this feature, I'd
>> appreciate them.
>>
>> Thanks,
>> Suzy
>>
>> ___
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>




Re: [Moses-support] Moses-support Digest, Vol 41, Issue 36

2010-03-28 Thread Miles Osborne
a quick question.  will this break compatibility with existing training runs?

also, adding new features --even if they are not used-- can impact
upon MERT and may slow things down / make things worse.  have you
verified (using multiple runs) that this new feature doesn't make
things worse than before?

Miles

On 28 March 2010 19:46, Lane Schwartz wrote:
> On 28 Mar 2010, at 11:02 AM, moses-support-requ...@mit.edu wrote:
>
>> Hiya Mosers and Mosettes,
>>
>> It's been a year since the last release & there's been lots of changes, by
>> lots of people, that we thought you should know about.
>>
>> A new release tar ball and zip file are on sourceforge, or svn update as
>> usual
>>    https://sourceforge.net/projects/mosesdecoder/
>>
>> Also, there are likely to be big changes in the next month as we merge the
>> hierarchical/syntax branch into trunk. Please avoid svn up after today, and 
>> double check with someone else before committing large chunks of code to the 
>> trunk.
>
> Hieu,
>
> I've got a handful of changes from last week that I was planning to merge 
> from my new branch back into trunk tomorrow. The changes pretty much involve 
> adding one new feature, and should not affect anyone not using the new 
> feature.
>
> I'll wait for your go-ahead before I do this merge. If there are plans for 
> lots of updates to trunk tomorrow, I could probably do my merge later today 
> (Sunday) instead, if that would help.
>
> Lane
>
>



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



Re: [Moses-support] Moses-support Digest, Vol 41, Issue 36

2010-03-28 Thread Lane Schwartz
On 28 Mar 2010, at 11:02 AM, moses-support-requ...@mit.edu wrote:

> Hiya Mosers and Mosettes,
> 
> It's been a year since the last release & there's been lots of changes, by
> lots of people, that we thought you should know about.
>
> A new release tar ball and zip file are on sourceforge, or svn update as usual
> https://sourceforge.net/projects/mosesdecoder/
>
> Also, there are likely to be big changes in the next month as we merge the
> hierarchical/syntax branch into trunk. Please avoid svn up after today, and 
> double check with someone else before committing large chunks of code to the 
> trunk.

Hieu,

I've got a handful of changes from last week that I was planning to merge from 
my new branch back into trunk tomorrow. The changes pretty much involve adding 
one new feature, and should not affect anyone not using the new feature.

I'll wait for your go-ahead before I do this merge. If there are plans for lots 
of updates to trunk tomorrow, I could probably do my merge later today (Sunday) 
instead, if that would help.

Lane




[Moses-support] Moses release

2010-03-28 Thread Hieu Hoang
Hiya Mosers and Mosettes,

It's been a year since the last release & there's been lots of changes, by lots
of people, that we thought you should know about.

A new release tar ball and zip file are on sourceforge, or svn update as usual
https://sourceforge.net/projects/mosesdecoder/

Also, there are likely to be big changes in the next month as we merge the
hierarchical/syntax branch into trunk. Please avoid svn up after today, and 
double check with someone else before committing large chunks of code to the 
trunk.

Changes since the last time:
1. minor bug fixes & tweaks, especially to the decoder and MERT scripts
(thanks to too many people to mention)
2. fixes to make the decoder compile with most versions of gcc, Visual
Studio and other compilers (thanks to Tom Hoar, Jean-Bapist Fouet)
3. multi-threaded decoder (thanks to Barry Haddow)
4. update for IRSTLM (thanks to Nicola Bertoldi & Marcello Federico)
5. run MERT on a subset of features (thanks to Nicola Bertoldi)
6. training using different alignment models (thanks to Mark Fishel)
7. "a handy script to get many translations from Google" (thanks to
Ondrej Bojar)
8. Lattice MBR (thanks to Abhishek Arun & Barry Haddow)
9. option to compile Moses as a dynamic library (thanks to
Jean-Bapist Fouet)
10. hierarchical re-ordering model (thanks to Christian Harmeier,
Sara Styme, Nadi, Marcello, Ankit Srivastava, Gabriele Antonio Musillo,
Philip Williams, Barry Haddow)
11. Global Lexical re-ordering model (thanks to Philipp Koehn)
12. Experiment.perl scripts for automating the whole MT pipeline (thanks to
Philipp Koehn)





[Moses-support] training fails on 1.4million fr-en sentence pairs

2010-03-28 Thread Niraj Aswani
Hi,

After following the step-by-step guide, I was able to train a model on
44K sentences for the language pair fr-en.  I am now trying to repeat the
training on approx. 1.4 million sentences from the Europarl corpus.  The
experiment runs fine up to the step of building n-gram models. However, it
seems to fail while training a phrase-based model.

I am using Cygwin on Windows XP with 3.25 GB of RAM.

Exception: STATUS_ACCESS_VIOLATION at eip=610D3C8E
eax= ebx= ecx=61150140 edx= esi= 
edi=7FDB2CD8
ebp=0022C258 esp=0022C240 
program=C:\cygwin\home\moses\tools\bin\snt2cooc.out, pid 4348, thread main
cs=001B ds=0023 es=0023 fs=003B gs= ss=0023
Stack trace:
Frame Function  Args
0022C258  610D3C8E  (, , , )
0022C368  610D75D0  (0044B000, , 0014, 0001)
0022C468  610DB0F3  (0044B000, 0001, , )
0022C4D8  610B5178  (0004, 0022C480, 0022C67C, 0041623F)
0022C538  004055BA  (0018, 5F9D58D0, 745B46A8, 7FDB2CDC)
0022C568  00443DA4  (7FDB2CD8, , 7816CCB0, 0022C658)
0022C598  00443B40  (7FDB2CD8, 7E5CE968, 0022C658, 00402000)
0022CD78  00402DB8  (0004, 61210304, 007100F8, 61004A1D)
0022CDA8  61006DDA  (, 0022CDE0, 610066E0, 7FFDF000)
End of stack trace
  1 [main] snt2cooc.out 4348 
C:\cygwin\home\moses\tools\bin\snt2cooc.out: *** fatal error - cmalloc 
would have returned NULL
=

Am I running short of RAM?  Could someone help?

Thanks,
Niraj