Hi Barry

  Thanks for the tip, uninitialised memory does sound likely. I'll try
valgrind again, but the last time I ran the software through it I got so
many errors in external libs that I just gave up.

In the meantime, here is the complete function that handles the
decoding, in case someone sees something obviously wrong in it...

static void moses_translate_phonemes(manager_data_t * pool,
                                     translation_pair_t * pair) {
    debug("starting");

    /* there is only one translation system for now */
    const TranslationSystem& system =
        StaticData::Instance().GetTranslationSystem(TranslationSystem::DEFAULT);
    const StaticData &staticData = StaticData::Instance();
    const vector<FactorType> &inputFactorOrder =
        staticData.GetInputFactorOrder();

    MyConfusionNet * cn = phonemes_to_cn(pool->mp_engine->phonemes_cm,
                                         pair->source->phonemes,
                                         pool->mp_config->cn_width,
                                         pool->mp_config->cn_thresh,
                                         inputFactorOrder);

    Manager * manager = new Manager(*cn, staticData.GetSearchAlgorithm(),
                                    &system);
    manager->ProcessSentence();
    const Hypothesis* hypo = manager->GetBestHypothesis();

    string hyp = moses_get_hyp(hypo);
    char * hyp_ret = (char*)malloc((strlen(hyp.c_str())+1)*sizeof(char));
    strcpy(hyp_ret,hyp.c_str());

    pair->translation_score = UntransformScore(hypo->GetScore());
    translation_pair_set_target(pair, hyp_ret,NULL);

    delete manager;
    delete cn;
}
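
One possibility worth checking: if GetTargetPhraseStringRep() only returns
the phrase attached to the last hypothesis in the search path (this is an
assumption, not verified against the Moses source), then moses_get_hyp()
would yield only the final phrase, which would be consistent with getting
just "of". Below is a minimal sketch of rebuilding the full output by
following back-pointers. MockHypothesis and full_translation are
hypothetical names made up for illustration, not Moses classes.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a decoder hypothesis: each node holds only the
// phrase added at that step plus a back-pointer to the previous step.
struct MockHypothesis {
    const MockHypothesis* prev;  // NULL for the initial (empty) hypothesis
    std::string phrase;          // target phrase added at this step
};

// Rebuild the whole output by following back-pointers from the best final
// hypothesis, then reversing, since the walk visits phrases last-to-first.
std::string full_translation(const MockHypothesis* last) {
    std::vector<std::string> phrases;
    for (const MockHypothesis* h = last; h != NULL; h = h->prev) {
        if (!h->phrase.empty()) {
            phrases.push_back(h->phrase);
        }
    }
    std::reverse(phrases.begin(), phrases.end());

    // Join the collected phrases with single spaces.
    std::string out;
    for (std::size_t i = 0; i < phrases.size(); ++i) {
        if (i > 0) out += " ";
        out += phrases[i];
    }
    return out;
}
```

With the real classes, the loop would presumably step through the previous
hypothesis of each Hypothesis instead of the prev field, but that mapping
is untested here.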

cheers,

Sylvain

On 26/04/12 13:49, Barry Haddow wrote:
> Hi Sylvain
> 
> I'm not familiar with this part of the code, but the strange score suggests 
> that there's some uninitialised memory. You could try running through
> valgrind and it might give some clues,
> 
> cheers - Barry
> 
> On Thursday 26 Apr 2012 12:24:11 Sylvain Raybaud wrote:
>> Hi all
>>
>>   I'm using the Moses API to decode a confusion network. The CN is
>> created from the output of an ASR engine and a confusion matrix. More
>> precisely (even though it's probably irrelevant to my problem), the ASR
>> engine provides a string of phonemes (1-best) and the confusion matrix
>> provides alternatives for each phoneme (the idea was described in Jiang
>> et al., _Phonetic representation based speech translation_, MT Summit
>> XIII, 2011).
>>
>> When the CN is dumped into a file and I use
>> moses -f moses.phonemes.cn.ini < CN
>> to decode it, everything is fine.
>>
>> But when I use the Moses API (loading the same configuration file), I
>> get incomplete translations, like:
>>
>> ASR output (French): "nous font sont toujours chimistes plume
>> rassembleront ch je trouve que le office de ce tout de suite"
>> Phonetic representation: "n u f on s on t t u ge u r ch i m i s t z p l
>> y m r a s an b l swa r on ch ge swa t r u v k swa l swa oh f i s swa d
>> swa s swa t u d s h i t"
>> Translation: "of"
>> score: 903011968.000000
>>
>> Note that the transcription is poor (I haven't really tuned the ASR
>> engine), but still, the translation ought to be more than just "of".
>> Sometimes it's several words, which I guess form a single phrase in the
>> phrase table. The output generally seems to be the translation of one
>> word in the source sentence.
>> When I use moses on the command line to translate either the 1-best or
>> the CN, I get a reasonable translation. When I use the API to translate
>> the 1-best phonetic representation, I also get a reasonable translation.
>> I think the CN object is created correctly because moses loads it and
>> prints it prior to decoding (this is normal verbose behavior). I also
>> tried to create a PCN object, and got exactly the same results. So I
>> guess the problem is either how I tell moses to decode it or how I
>> extract the result from the Hypothesis object. But I have no idea
>> what the problem is here, since the code is working when I just
>> translate a string. The translation score seems ridiculously high too.
>> I'll give below the corresponding code.
>>
>> Decoding and hypothesis extraction:
>> ***********************************
>> [...]
>> Manager * manager = new Manager(*cn,staticData.GetSearchAlgorithm(),
>>                                 &system);
>> manager->ProcessSentence();
>> const Hypothesis* hypo = manager->GetBestHypothesis();
>> string hyp = moses_get_hyp(hypo);
>> [...]
>> pair->translation_score = UntransformScore(hypo->GetScore());
>> [...]
>>
>> string moses_get_hyp(const Hypothesis* hypo) {
>>     return hypo->GetTargetPhraseStringRep();
>> }
>>
>>
>> Creation of the CN:
>> *******************
>>
>> /** new class derived from ConfusionNet, with a new method for directly
>> creating CN */
>> class MyConfusionNet : public ConfusionNet {
>>   public:
>>     void addCol(Column);
>> };
>>
>> void MyConfusionNet::addCol(Column col) {
>>     data.push_back(col);
>> }
>>
>> /** create a column of the CN */
>> static MyConfusionNet::Column create_phoneme_col(confusion_matrix_t * cm,
>>         const char * ph, int width, double thresh,
>>         const vector<FactorType> &factor_order) {
>>
>>     MyConfusionNet::Column col;
>>
>>     phoneme_conf_t * ph_conf =
>>         (phoneme_conf_t*)g_hash_table_lookup(cm->matrix,ph);
>>     if(ph_conf==NULL) {
>>         return col;
>>     }
>>
>>     int i;
>>     for(i = 0; i<cm->n_phonemes; i++) {
>>         vector<float> scores;
>>         float score = float(ph_conf[i].p);
>>         if((width<=0 || i<width) && (thresh<=0 || score>=thresh)) {
>>             string wd(cm->phonemes[ph_conf[i].phoneme]);
>>             Word word;
>>             word.CreateFromString(Input,factor_order,wd,false);
>>             scores.push_back(score);
>>             pair<Word,vector<float> > linkdata(word,scores);
>>             col.push_back(linkdata);
>>         }
>>     }
>>
>>     return col;
>> }
>>
>> /** Creates a confusion network from a NULL terminated phonemes list and
>>     a phonemes confusion matrix */
>> static MyConfusionNet * phonemes_to_cn(confusion_matrix_t * cm,
>>         const char ** phonemes, int width, double thresh,
>>         const vector<FactorType> &factor_order) {
>>     debug("start");
>>
>>     MyConfusionNet * cn = new MyConfusionNet();
>>
>>     int i = 0;
>>     while(phonemes[i]!=NULL) {
>>         debug("%s",phonemes[i]);
>>         MyConfusionNet::Column col =
>>             create_phoneme_col(cm,phonemes[i],width,thresh,factor_order);
>>         cn->addCol(col);
>>         i += 1;
>>     }
>>
>>     return cn;
>> }
>>
>> So, if anyone has an idea about what's wrong here.... thanks!
>>
>> cheers,
>>


-- 
Sylvain Raybaud
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
