There seem to be multiple issues here. As I said, I have no experience with EMS, so maybe someone else can help with that.
The message in extract.err seems to mean that you did succeed in calling the M2 scorer in EMS; the only problem is that it dies 😊

The Levenshtein message is part of a failsafe that is meant to avoid exponentially long searches. The scorer does not calculate the M2 metric for a sentence pair that would have excessively many edits (these are usually wrong). These messages by themselves should not be a reason to worry.

The std::bad_alloc, on the other hand, is not good. It seems the scorer tries to allocate some huge piece of memory, probably because of a negative index somewhere, and then dies. I have not seen this before. Is it possible that your system is creating a lot of superfluous edits and the graph algorithm in M2 is going crazy because of that?

From: Kelly Marchisio
Sent: Saturday, January 13, 2018 7:46 PM
To: Marcin Junczys-Dowmunt; moses-support
Subject: Re: [Moses-support] M2 Scorer in EMS for Grammatical Error Correction

looping back in mailing-list and copying message :)

Thanks so much for the response, Marcin! I did see your original repo, thanks for sending along. I'd love to get this going with EMS because it looks like I can just pass in the M2 scorer with:

tuning-settings = "-mertdir $moses-bin-dir -mertargs='--sctype M2SCORER' -threads $cores"

However it fails with:

ERROR: Failed to run '/Users/kellymarchisio/L101Final/experiments/tuning/tmp.1/extractor.sh'. at /Users/kellymarchisio/L101Final/programs/mosesdecoder/scripts/training/mert-moses.pl line 1775.
cp: /Users/kellymarchisio/L101Final/experiments/tuning/tmp.1/moses.ini: No such file or directory

There may be an error with the mert-moses script itself used with M2, because moses.ini was never created within tmp.1

Additionally, in extract.err, I see:

Binary write mode is NOT selected
Scorer type: M2SCORER
name: case value: true
Data::m_score_type M2Scorer
Data::Scorer type from Scorer: M2Scorer
loading nbest from run1.best100.out.gz
Levenshtein distance is greater than source size.
Levenshtein distance is greater than source size.
extractor(67381,0x7fffde7dd3c0) malloc: *** mach_vm_map(size=3368542481395712) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Exception: std::bad_alloc

I'm curious if you've come across these issues (I'm interested why I'm seeing "Levenshtein distance is greater than source size.") and if you have any pointers for how I can get mert-moses.pl to work for me with M2Scorer.

Best,
Kelly

On Fri, Jan 12, 2018 at 9:53 PM, Marcin Junczys-Dowmunt <junc...@amu.edu.pl> wrote:

Hi,

We never really used it with EMS, so I do not think anyone can help you here. Did you have a look at the original repo: https://github.com/grammatical/baselines-emnlp2016 ?

Otherwise we can probably take this off-list and try to help you personally 😊

From: Kelly Marchisio
Sent: Friday, January 12, 2018 6:20 PM
To: moses-support
Subject: [Moses-support] M2 Scorer in EMS for Grammatical Error Correction

Does anyone have experience using the M2 scorer for grammatical error correction with EMS for tuning and evaluation? Junczys-Dowmunt & Grundkiewicz (2016) use M2 (https://github.com/grammatical/baselines-emnlp2016/tree/c4fbcc09b45a46c7c46bdda2ba10484fa16e8f82), but I see no examples of using it with EMS.

Does anyone have experience or advice on how I can use the M2 scorer for GEC in my project? I'm having trouble figuring out how to incorporate it without an example. (for instance, how best to setup experiment.meta & the config file to incorporate it)
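
For illustration, the Levenshtein failsafe described at the top of this thread can be sketched roughly as follows. This is only a minimal sketch and not the actual Moses scorer code: the function names levenshtein and should_skip are made up for this example, and both the token-level comparison and the exact threshold (edit distance greater than the number of source tokens) are assumptions based on the wording of the log message.

```python
def levenshtein(source, target):
    """Edit distance between two sequences, using a rolling DP row."""
    previous = list(range(len(target) + 1))
    for i, src_item in enumerate(source, start=1):
        current = [i]
        for j, tgt_item in enumerate(target, start=1):
            cost = 0 if src_item == tgt_item else 1
            current.append(min(
                previous[j] + 1,         # delete src_item
                current[j - 1] + 1,      # insert tgt_item
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def should_skip(source_sentence, hypothesis_sentence):
    """Hypothetical guard: skip scoring when the hypothesis is too far
    from the source, so the edit-graph search is never started.
    The threshold is an assumption, not taken from the scorer itself."""
    source = source_sentence.split()
    hypothesis = hypothesis_sentence.split()
    return levenshtein(source, hypothesis) > len(source)


if __name__ == "__main__":
    # A small correction stays well below the threshold and is scored.
    print(should_skip("he go to school", "he goes to school"))
    # A hypothesis with many extra tokens trips the guard and is skipped.
    print(should_skip("a b", "x y z w q"))
```

A guard of this kind only skips individual sentence pairs, which is why the messages alone are harmless; it would not by itself explain the std::bad_alloc.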
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support