Hi, Two potential bugs I found trolling the source code.
Bug 1: FName::init in FeatureVector.cpp can renumber features in a race condition. void FName::init(const string& name) { #ifdef WITH_THREADS //reader lock boost::shared_lock<boost::shared_mutex> lock(m_idLock); #endif Name2Id::iterator i = name2id.find(name); if (i != name2id.end()) { m_id = i->second; } else { #ifdef WITH_THREADS //release the reader lock, and upgrade to writer lock lock.unlock(); boost::unique_lock<boost::shared_mutex> write_lock(m_idLock); #endif //Need to check again if the id is in the map, as someone may have added //it while we were waiting on the writer lock. if (i != name2id.end()) { m_id = i->second; } else { m_id = name2id.size(); name2id[name] = m_id; id2name.push_back(name); } } } The value of i does not change because some locking is done. It will still be name2id.end(). Therefore, the code m_id = i->second will never never be executed. Bug 2: I'm not sure what alreadyScored in moses/GlobalLexicalModelUnlimited.cpp does, but it looks like the intended use isn't being met. void GlobalLexicalModelUnlimited::AddFeature(ScoreComponentCollection* accumulator, StringHash alreadyScored, string sourceTrigger, string sourceWord, string targetTrigger, string targetWord) const { // snip bunch of code. alreadyScored[sourceWord] = 1; } The line alreadyScored[sourceWord] = 1 does nothing useful because the parameter alreadyScored was passed by value. Note: typedef std::map< std::string, short > StringHash; I think it was meant to be passed by reference. Finally, a performance rant. A lot of these features are using maps whose key is std::string. Instead, the features should use FactorCollection to intern their strings then do pointer comparison inside Evaluate. Kenneth _______________________________________________ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support