Re: [Moses-support] Adding sentence-level flag features
Thanks, Chris, I'll give it a shot. I'll be back if I have trouble
getting the lattice input to work as expected.

Suzy

On 27/03/2010, at 12:45 AM, Chris Dyer wrote:

> The first weight in the lattice format is called "transition
> probability", but it can be anything you want. It just becomes a
> feature in the system's log-linear model. The weight used to bias
> this feature is weight-i.
>
> Chris
>
> On Fri, Mar 26, 2010 at 1:17 AM, Suzy Howlett wrote:
>> Thanks, that sounds like a good thing to try. But where would you
>> specify the feature value? The numbers in the lattice format (as
>> far as I know) are transition probability and distance to next
>> node, so unless you can extend the list of numbers, I'm still not
>> clear on how you incorporate the feature. Also, what weight is
>> used, weight-i?
>>
>> Suzy
>>
>> On 26/03/2010, at 12:30 PM, Chris Dyer wrote:
>>
>>> That sounds reasonable. And, I don't think you'll need to add an
>>> extra feature to Moses to do this. The lattice input format lets
>>> you have a feature associated with a transition (in fact, I think
>>> you can have an arbitrary number of features), so you can use that
>>> to encode whether the path you're on corresponds to the reordered
>>> variant or not.
>>> -Chris
>>>
>>> On Thu, Mar 25, 2010 at 8:51 PM, Suzy Howlett wrote:
>>>> Hi Chris,
>>>>
>>>> The preprocessing I referred to is a reordering of the words of
>>>> the source sentence before translation. The overall idea would be
>>>> to have a single Moses model that can handle both reordered and
>>>> non-reordered sentences. The only way I've thought of to do this
>>>> is to combine the sentence-level feature I mentioned with two
>>>> phrase translation tables and a lattice input combining the
>>>> reordered and non-reordered versions of a single sentence. Then
>>>> we could have a number of other features that would influence the
>>>> system's choice of which version to use.
>>>>
>>>> There are obviously a number of points at which this scheme could
>>>> break down, and I have no idea if any of it will work, but I
>>>> figured the only way to find out would be to try. I appreciate
>>>> any suggestions you have.
>>>>
>>>> Suzy
>>>>
>>>> On 26/03/2010, at 11:32 AM, Chris Dyer wrote:
>>>>
>>>>> Moses uses features to discriminate between alternative
>>>>> translations of individual sentences, so if the value is
>>>>> constant for all possible translations (for example, because it
>>>>> is a function of the input), the model won't be able to take
>>>>> advantage of it. It sounds like you might be proposing something
>>>>> like this. What are you trying to do?
>>>>>
>>>>> -Chris
>>>>>
>>>>> On Thu, Mar 25, 2010 at 8:14 PM, Suzy Howlett wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am just starting my foray into the world of adding features
>>>>>> to Moses and haven't quite got my head around it yet. Could
>>>>>> someone please check I'm on the right track, or tell me if I've
>>>>>> overlooked an easier alternative?
>>>>>>
>>>>>> The feature that I want to add is essentially a sentence-level
>>>>>> flag to say whether a given input sentence has undergone a
>>>>>> particular kind of preprocessing before being passed to Moses.
>>>>>> My best guess is that I need to create a file containing a
>>>>>> look-up table to indicate which sentences have been
>>>>>> preprocessed, e.g.
>>>>>>
>>>>>> ||| 0
>>>>>> ||| 0
>>>>>> ||| 1
>>>>>> ||| 0
>>>>>> ...
>>>>>>
>>>>>> where 1 and 0 indicate that the sentence has and has not been
>>>>>> preprocessed, respectively. Is this the best way to do it? Does
>>>>>> anyone know of anyone doing something similar before?
>>>>>>
>>>>>> I imagine I will need a StatelessFeatureFunction that will open
>>>>>> the file and read off the value for the input sentence, and two
>>>>>> parameters added with AddParam (one for the weight and one to
>>>>>> specify the file containing the table above). Does that sound
>>>>>> right so far? If anyone has any pointers for getting started
>>>>>> implementing this feature, I'd appreciate them.
>>>>>>
>>>>>> Thanks,
>>>>>> Suzy

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
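As an illustration of the lattice idea discussed in this thread, here is a minimal sketch in Python that writes one lattice with two paths, one for the original word order and one for the reordered variant. It assumes the nested-tuple lattice notation in which each edge is (word, score, distance to next node); the function names, the placement of the variant score on the first edge of each path, and the default score values are all hypothetical choices made for this example. How the decoder transforms the score (for example, whether it takes a logarithm) is not covered in the thread and should be checked before picking values.

```python
def escape(word):
    """Escape backslashes and single quotes so the word can sit inside '...'."""
    return word.replace("\\", "\\\\").replace("'", "\\'")


def two_path_lattice(original, reordered, orig_score=1.0, reord_score=0.5):
    """Return a lattice string with one path per sentence variant.

    Each edge is (word, score, distance_to_next_node). The variant-specific
    score (a placeholder value) sits on the first edge of each path; every
    other edge gets 1.0.
    """
    if not original or not reordered:
        raise ValueError("both variants need at least one token")
    m, n = len(original), len(reordered)
    final = m + n - 1                    # index of the shared end node
    edges = [[] for _ in range(final)]   # outgoing edges of each non-final node

    # Original word order: nodes 0 .. m-1, last edge jumps to the end node.
    for i, word in enumerate(original):
        src = i
        dst = i + 1 if i < m - 1 else final
        edges[src].append((word, orig_score if i == 0 else 1.0, dst - src))

    # Reordered variant: starts at node 0, then uses nodes m .. m+n-2.
    for j, word in enumerate(reordered):
        src = 0 if j == 0 else m - 1 + j
        dst = m + j if j < n - 1 else final
        edges[src].append((word, reord_score if j == 0 else 1.0, dst - src))

    nodes = []
    for out in edges:
        body = "".join(
            "('%s',%s,%d)," % (escape(w), repr(float(s)), d) for w, s, d in out
        )
        nodes.append("(" + body + "),")
    return "(" + "".join(nodes) + ")"


if __name__ == "__main__":
    # Hypothetical sentence pair, one lattice per input line.
    print(two_path_lattice("das haus ist klein".split(),
                           "das haus klein ist".split()))
```

Because the two paths only share the start and end nodes, the score on the first edge is enough to tell the decoder which variant a hypothesis follows; the weight applied to it would be weight-i, as Chris notes above.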
Re: [Moses-support] Moses-support Digest, Vol 41, Issue 36
A quick question: will this break compatibility with existing
training runs?

Also, adding new features, even if they are not used, can affect MERT
and may slow things down or make results worse. Have you verified
(using multiple runs) that this new feature doesn't make things worse
than before?

Miles

On 28 March 2010 19:46, Lane Schwartz wrote:
> On 28 Mar 2010, at 11:02 AM, moses-support-requ...@mit.edu wrote:
>
>> Hiya Mosers and Mosettes,
>>
>> It's been a year since the last release & there's been lots of
>> changes, by lots of people, that we thought you should know about.
>>
>> A new release tar ball and zip file are on sourceforge, or svn
>> update as usual
>> https://sourceforge.net/projects/mosesdecoder/
>>
>> Also, there is likely to be big changes in the next month as we
>> merge the hierarchical/syntax branch into trunk. Please avoid svn
>> up after today, and double check with someone else before
>> committing large chunks of code to the trunk.
>
> Hieu,
>
> I've got a handful of changes from last week that I was planning to
> merge from my new branch back into trunk tomorrow. The changes
> pretty much involve adding one new feature, and should not affect
> anyone not using the new feature.
>
> I'll wait for your go-ahead before I do this merge. If there are
> plans for lots of updates to trunk tomorrow, I could probably do my
> merge later today (Sunday) instead, if that would help.
>
> Lane

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
Re: [Moses-support] Moses-support Digest, Vol 41, Issue 36
On 28 Mar 2010, at 11:02 AM, moses-support-requ...@mit.edu wrote:

> Hiya Mosers and Mosettes,
>
> It's been a year since the last release & there's been lots of
> changes, by lots of people, that we thought you should know about.
>
> A new release tar ball and zip file are on sourceforge, or svn
> update as usual
> https://sourceforge.net/projects/mosesdecoder/
>
> Also, there is likely to be big changes in the next month as we
> merge the hierarchical/syntax branch into trunk. Please avoid svn up
> after today, and double check with someone else before committing
> large chunks of code to the trunk.

Hieu,

I've got a handful of changes from last week that I was planning to
merge from my new branch back into trunk tomorrow. The changes pretty
much involve adding one new feature, and should not affect anyone not
using the new feature.

I'll wait for your go-ahead before I do this merge. If there are plans
for lots of updates to trunk tomorrow, I could probably do my merge
later today (Sunday) instead, if that would help.

Lane
[Moses-support] Moses release
Hiya Mosers and Mosettes,

It's been a year since the last release & there's been lots of
changes, by lots of people, that we thought you should know about.

A new release tar ball and zip file are on sourceforge, or svn update
as usual
https://sourceforge.net/projects/mosesdecoder/

Also, there is likely to be big changes in the next month as we merge
the hierarchical/syntax branch into trunk. Please avoid svn up after
today, and double check with someone else before committing large
chunks of code to the trunk.

Changes since the last time:

1. minor bug fixes & tweaks, especially to the decoder, MERT scripts
   (thanks to too many people to mention)
2. fixes to make decoder compile with most versions of gcc, Visual
   Studio and other compilers (thanks to Tom Hoar, Jean-Bapist Fouet)
3. multi-threaded decoder (thanks to Barry Haddow)
4. update for IRSTLM (thanks to Nicola Bertoldi & Marcello Federico)
5. run MERT on a subset of features (thanks to Nicola Bertoldi)
6. training using different alignment models (thanks to Mark Fishel)
7. "a handy script to get many translations from Google" (thanks to
   Ondrej Bojar)
8. Lattice MBR (thanks to Abhishek Arun & Barry Haddow)
9. option to compile Moses as a dynamic library (thanks to Jean-Bapist
   Fouet)
10. hierarchical re-ordering model (thanks to Christian Harmeier, Sara
    Styme, Nadi, Marcello, Ankit Srivastava, Gabriele Antonio Musillo,
    Philip Williams, Barry Haddow)
11. Global Lexical re-ordering model (thanks to Philipp Koehn)
12. Experiment.perl scripts for automating the whole MT pipeline
    (thanks to Philipp Koehn)
[Moses-support] training fails on 1.4 million fr-en sentence pairs
Hi,

After following the step-by-step guide, I was able to train a model on
44K sentences for the language pair fr-en. I am now trying to repeat
the training on approx. 1.4 million sentences from the Europarl
corpus. The experiment runs fine up to the step of building the n-gram
models, but it seems to fail while training the phrase-based model.

I am using Cygwin on Windows XP with 3.25 GB of RAM.

Exception: STATUS_ACCESS_VIOLATION at eip=610D3C8E
eax= ebx= ecx=61150140 edx= esi= edi=7FDB2CD8
ebp=0022C258 esp=0022C240 program=C:\cygwin\home\moses\tools\bin\snt2cooc.out, pid 4348, thread main
cs=001B ds=0023 es=0023 fs=003B gs= ss=0023
Stack trace:
Frame     Function  Args
0022C258  610D3C8E  (, , , )
0022C368  610D75D0  (0044B000, , 0014, 0001)
0022C468  610DB0F3  (0044B000, 0001, , )
0022C4D8  610B5178  (0004, 0022C480, 0022C67C, 0041623F)
0022C538  004055BA  (0018, 5F9D58D0, 745B46A8, 7FDB2CDC)
0022C568  00443DA4  (7FDB2CD8, , 7816CCB0, 0022C658)
0022C598  00443B40  (7FDB2CD8, 7E5CE968, 0022C658, 00402000)
0022CD78  00402DB8  (0004, 61210304, 007100F8, 61004A1D)
0022CDA8  61006DDA  (, 0022CDE0, 610066E0, 7FFDF000)
End of stack trace
      1 [main] snt2cooc.out 4348 C:\cygwin\home\moses\tools\bin\snt2cooc.out: *** fatal error - cmalloc would have returned NULL

Am I running short of RAM? Could someone help?

Thanks,
Niraj