I haven't tested kenlm on Cygwin, but it could work. Can you run the tests?

1) Install Boost. Cygwin's package manager should provide it.

2) Run the kenlm tests:

   wget http://kheafield.com/code/kenlm.tar.gz
   tar xzf kenlm.tar.gz
   cd kenlm
   ./test.sh
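
If the test script fails to build, it may be worth first confirming that Cygwin's g++ can actually see the Boost headers. A minimal check along these lines (the file name is just an example):

   # quick sanity check that g++ finds the Boost headers under Cygwin
   echo '#include <boost/version.hpp>' >  boost_check.cpp
   echo 'int main() { return 0; }'     >> boost_check.cpp
   g++ boost_check.cpp -o boost_check && echo "Boost headers found"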

On 03/25/11 06:44, Sudip Datta wrote:
> I've used gcc in cygwin to compile both Moses and IRSTLM. But as you and
> Barry pointed out I'll try to use kenlm (can't use srilm due to
> licensing restrictions) and if that doesn't work try srilm at my college.
>
> I think the segfault occurs in Hypothesis.cpp at:
>
>     m_ffStates[i] = ffs[i]->Evaluate(
>         *this,
>         m_prevHypo ? m_prevHypo->m_ffStates[i] : NULL,
>         &m_scoreBreakdown);
>
> Maybe it gives some clue in identifying the issue.
>
> Thanks again --Sudip.
>
> On Fri, Mar 25, 2011 at 3:51 PM, Hieu Hoang <hieuho...@gmail.com> wrote:
>
> If you've compiled with gcc in cygwin, you can use any lm. The
> stipulation of using only the internal lm only applies if you use
> visual studio.
>
> However, I would personally use srilm to start with as I'm not sure if
> the other lm are fully tested on cygwin
>
> Hieu
> Sent from my flying horse
>
> On 25 Mar 2011, at 10:06 AM, Barry Haddow <bhad...@inf.ed.ac.uk> wrote:
>
> > Hi Sudip
> >
> > If you're using windows, then you should use the internal LM. See here:
> > http://www.statmt.org/moses/?n=Moses.FAQ#ntoc9
> > afaik this is still the case.
> >
> > Also, there are a couple of odd things in your setup. Firstly, you've built a
> > 3-gram LM, but you're telling moses that it's 2-gram:
> >> [lmodel-file]
> >> 1 0 2 /cygdrive/d/moses/fi-en/en.irstlm.gz
> > This shouldn't matter, but just in case you're unaware.
> >
> > Also, both the words in your input sentence are unknown. Did the phrase table
> > build OK? Maybe you could use zless or zcat to extract and post the first few
> > lines of it,
> >
> > best regards - Barry
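
For the phrase-table check Barry suggests, something like the following should do (the path is taken from the moses.ini quoted further down); if it prints well-formed lines of the form "source ||| target ||| scores", the table itself built correctly:

   zcat /cygdrive/d/moses/fi-en/fienModel//model/phrase-table.gz | head -5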
>
> > On Friday 25 March 2011 08:13, Sudip Datta wrote:
> >> Hi,
> >>
> >> I am a noob at using Moses and have been trying to build a model and then
> >> use the decoder to translate test sentences. I used the following command
> >> for training:
> >>
> >>   train-model.perl --root-dir /cygdrive/d/moses/fi-en/fienModel/ --corpus /cygdrive/d/moses/fi-en/temp/clean --f fi --e en --lm 0:3:/cygdrive/d/moses/fi-en/en.irstlm.gz:1
> >>
> >> The process ended cleanly with the following moses.ini file:
> >>
> >>   # input factors
> >>   [input-factors]
> >>   0
> >>
> >>   # mapping steps
> >>   [mapping]
> >>   0 T 0
> >>
> >>   # translation tables: table type (hierarchical(0), textual (0), binary (1)),
> >>   # source-factors, target-factors, number of scores, file
> >>   # OLD FORMAT is still handled for back-compatibility
> >>   # OLD FORMAT translation tables: source-factors, target-factors, number of scores, file
> >>   # OLD FORMAT a binary table type (1) is assumed
> >>   [ttable-file]
> >>   0 0 0 5 /cygdrive/d/moses/fi-en/fienModel//model/phrase-table.gz
> >>
> >>   # no generation models, no generation-file section
> >>
> >>   # language models: type(srilm/irstlm), factors, order, file
> >>   [lmodel-file]
> >>   1 0 2 /cygdrive/d/moses/fi-en/en.irstlm.gz
> >>
> >>   # limit on how many phrase translations e for each phrase f are loaded
> >>   # 0 = all elements loaded
> >>   [ttable-limit]
> >>   20
> >>
> >>   # distortion (reordering) weight
> >>   [weight-d]
> >>   0.6
> >>
> >>   # language model weights
> >>   [weight-l]
> >>   0.5000
> >>
> >>   # translation model weights
> >>   [weight-t]
> >>   0.2
> >>   0.2
> >>   0.2
> >>   0.2
> >>   0.2
> >>
> >>   # no generation models, no weight-generation section
> >>
> >>   # word penalty
> >>   [weight-w]
> >>   -1
> >>
> >>   [distortion-limit]
> >>   6
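
Echoing Barry's point above: the LM was built as a 3-gram (--lm 0:3:...) but the lmodel-file line declares order 2. That shouldn't cause a crash, but the consistent entry would be:

   [lmodel-file]
   1 0 3 /cygdrive/d/moses/fi-en/en.irstlm.gz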
> >> But the decoding step ends with a segfault, with the following output for -v 3:
> >>
> >>   Defined parameters (per moses.ini or switch):
> >>     config: /cygdrive/d/moses/fi-en/fienModel/model/moses.ini
> >>     distortion-limit: 6
> >>     input-factors: 0
> >>     lmodel-file: 1 0 2 /cygdrive/d/moses/fi-en/en.irstlm.gz
> >>     mapping: 0 T 0
> >>     ttable-file: 0 0 0 5 /cygdrive/d/moses/fi-en/fienModel//model/phrase-table.gz
> >>     ttable-limit: 20
> >>     verbose: 100
> >>     weight-d: 0.6
> >>     weight-l: 0.5000
> >>     weight-t: 0.2 0.2 0.2 0.2 0.2
> >>     weight-w: -1
> >>   input type is: text input
> >>   Loading lexical distortion models...have 0 models
> >>   Start loading LanguageModel /cygdrive/d/moses/fi-en/en.irstlm.gz : [0.000] seconds
> >>   In LanguageModelIRST::Load: nGramOrder = 2
> >>   Loading LM file (no MAP)
> >>   iARPA
> >>   loadtxt()
> >>   1-grams: reading 3195 entries
> >>   2-grams: reading 13313 entries
> >>   3-grams: reading 20399 entries
> >>   done
> >>   OOV code is 3194
> >>   OOV code is 3194
> >>   IRST: m_unknownId=3194
> >>   creating cache for storing prob, state and statesize of ngrams
> >>   Finished loading LanguageModels : [1.000] seconds
> >>   About to LoadPhraseTables
> >>   Start loading PhraseTable /cygdrive/d/moses/fi-en/fienModel//model/phrase-table.gz : [1.000] seconds
> >>   filePath: /cygdrive/d/moses/fi-en/fienModel//model/phrase-table.gz
> >>   using standard phrase tables
> >>   PhraseDictionaryMemory: input=FactorMask<0> output=FactorMask<0>
> >>   Finished loading phrase tables : [1.000] seconds
> >>   IO from STDOUT/STDIN
> >>   Created input-output object : [1.000] seconds
> >>   The score component vector looks like this:
> >>   Distortion
> >>   WordPenalty
> >>   !UnknownWordPenalty
> >>   LM_2gram
> >>   PhraseModel_1
> >>   PhraseModel_2
> >>   PhraseModel_3
> >>   PhraseModel_4
> >>   PhraseModel_5
> >>   Stateless: 1 Stateful: 2
> >>   The global weight vector looks like this: 0.600 -1.000 1.000 0.500 0.200 0.200 0.200 0.200 0.200
> >>   Translating: istuntokauden uudelleenavaaminen
> >>
> >>   DecodeStep():
> >>     outputFactors=FactorMask<0>
> >>     conflictFactors=FactorMask<>
> >>     newOutputFactors=FactorMask<0>
> >>   Translation Option Collection
> >>
> >>   Total translation options: 2
> >>   Total translation options pruned: 0
> >>   translation options spanning from 0 to 0 is 1
> >>   translation options spanning from 0 to 1 is 0
> >>   translation options spanning from 1 to 1 is 1
> >>   translation options generated in total: 2
> >>   future cost from 0 to 0 is -100.136
> >>   future cost from 0 to 1 is -200.271
> >>   future cost from 1 to 1 is -100.136
> >>   Collecting options took 0.000 seconds
> >>   added hyp to stack, best on stack, now size 1
> >>   processing hypothesis from next stack
> >>
> >>   creating hypothesis 1 from 0 ( ... )
> >>     base score 0.000
> >>     covering 0-0: istuntokauden
> >>     translated as: istuntokauden|UNK|UNK|UNK
> >>     score -100.136 + future cost -100.136 = -200.271
> >>     unweighted feature scores: <<0.000, -1.000, -100.000, -2.271, 0.000, 0.000, 0.000, 0.000, 0.000>>
> >>   added hyp to stack, best on stack, now size 1
> >>   Segmentation fault (core dumped)
> >>
> >> The only suspicious thing I found in the above is the message 'creating
> >> hypothesis 1 from 0', but I don't know whether it is the actual problem or
> >> why it is happening. I believe the problem is with the training step, since
> >> the sample models that I downloaded from
> >> http://www.statmt.org/moses/download/sample-models.tgz work fine.
> >>
> >> Prior to this, I constructed an IRST LM and used clean-corpus-n.perl for
> >> cleaning the decoder input. Looking at the archives, the closest message I
> >> could find was http://thread.gmane.org/gmane.comp.nlp.moses.user/1478 but I
> >> don't think I'm committing the same mistake as the author of that message.
> >>
> >> I'll be delighted if anybody could provide any insights into this problem or
> >> requires any further details from me.
> >>
> >> Thanks,
> >>
> >> --Sudip.
> >
> > --
> > The University of Edinburgh is a charitable body, registered in
> > Scotland, with registration number SC005336.
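
Since the crash reproduces with a single input sentence, a gdb backtrace would show exactly where the decoder dies (whether it is the Hypothesis.cpp call Sudip points at or somewhere else). A rough sketch, assuming the moses binary is on PATH; the config path and input sentence are the ones from the thread:

   # reproduce the segfault under gdb and capture a backtrace
   echo "istuntokauden uudelleenavaaminen" > input.txt
   gdb --args moses -f /cygdrive/d/moses/fi-en/fienModel/model/moses.ini
   # then at the (gdb) prompt:
   #   run < input.txt
   #   bt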

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support