Wild guess here: in TranslationTask::Run, I see there are many
alternatives for processing the sentence, like doLatticeMBR etc., not
just running Manager::ProcessSentence().
Maybe one of these alternatives must be run for processing confusion
networks?
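
A related thing that may be worth ruling out (an assumption, not checked
against the Moses source): if GetTargetPhraseStringRep() only returns the
phrase attached to the last hypothesis, a one-phrase output like "of" is
what one would expect, and the full translation would have to be
assembled by following the back-pointers from the best hypothesis. A
minimal self-contained sketch of that idea, using stand-in types (Hyp and
collect_output are hypothetical names, not Moses classes or functions):

```cpp
#include <string>
#include <vector>

// Stand-in for a decoder hypothesis: NOT the Moses Hypothesis class.
// Each node holds only the phrase added at that step and a back-pointer.
struct Hyp {
    std::string phrase;  // target phrase produced by this step
    const Hyp* prev;     // previous hypothesis, or nullptr at the start
};

// Walk the back-pointers from the best (last) hypothesis to the start,
// then join the phrases in left-to-right order.
std::string collect_output(const Hyp* best) {
    std::vector<const Hyp*> chain;
    for (const Hyp* h = best; h != nullptr; h = h->prev) {
        chain.push_back(h);
    }
    std::string out;
    for (std::vector<const Hyp*>::const_reverse_iterator it = chain.rbegin();
         it != chain.rend(); ++it) {
        if ((*it)->phrase.empty()) continue;  // skip an empty initial hypothesis
        if (!out.empty()) out += ' ';
        out += (*it)->phrase;
    }
    return out;
}
```

With a real decoder the same walk would go over the decoder's own
hypothesis objects; which accessor gives the previous hypothesis is left
open here.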

cheers

Sylvain


On 26/04/12 15:53, Sylvain Raybaud wrote:
> Hi Barry
> 
>   Thanks for the tip, that sounds likely indeed. I'll try it again but
> last time I ran the software through valgrind, I got so many errors in
> external libs that I just gave up.
> 
> In the meantime, here is the complete function that handles the
> decoding, in case someone sees something obviously wrong in here...
> 
> static void moses_translate_phonemes(manager_data_t * pool, translation_pair_t * pair) {
>     debug("starting");
> 
>     /* there is only one translation system for now */
>     const TranslationSystem& system =
>         StaticData::Instance().GetTranslationSystem(TranslationSystem::DEFAULT);
>     const StaticData &staticData = StaticData::Instance();
>     const vector<FactorType> &inputFactorOrder = staticData.GetInputFactorOrder();
> 
>     MyConfusionNet * cn = phonemes_to_cn(pool->mp_engine->phonemes_cm,
>                                          pair->source->phonemes,
>                                          pool->mp_config->cn_width,
>                                          pool->mp_config->cn_thresh,
>                                          inputFactorOrder);
> 
>     Manager * manager = new Manager(*cn, staticData.GetSearchAlgorithm(), &system);
>     manager->ProcessSentence();
>     const Hypothesis* hypo = manager->GetBestHypothesis();
> 
>     string hyp = moses_get_hyp(hypo);
>     char * hyp_ret = (char*)malloc((strlen(hyp.c_str())+1)*sizeof(char));
>     strcpy(hyp_ret,hyp.c_str());
> 
>     pair->translation_score = UntransformScore(hypo->GetScore());
>     translation_pair_set_target(pair, hyp_ret,NULL);
> 
>     delete manager;
>     delete cn;
> 
> }
> 
> cheers,
> 
> Sylvain
> 
> On 26/04/12 13:49, Barry Haddow wrote:
>> Hi Sylvain
>>
>> I'm not familiar with this part of the code, but the strange score suggests
>> that there's some uninitialised memory. You could try running through
>> valgrind and it might give some clues,
>>
>> cheers - Barry
>>
>> On Thursday 26 Apr 2012 12:24:11 Sylvain Raybaud wrote:
>>> Hi all
>>>
>>>   I'm using the Moses API for decoding a confusion network. The CN is
>>> created from the output of an ASR engine and a confusion matrix. More
>>> precisely (even though it's probably irrelevant to my problem), the ASR
>>> engine provides a string of phonemes (1-best) and the confusion matrix
>>> provides alternatives for each phoneme (the idea was described in Jiang
>>> et al., _Phonetic representation based speech translation_, MT Summit
>>> XIII, 2011).
>>>
>>> When the CN is dumped into a file and I use
>>> moses -f moses.phonemes.cn.ini < CN
>>> to decode it, everything is fine.
>>>
>>> But when I use the Moses API (loading the same configuration file), I get
>>> incomplete translations, like:
>>>
>>> ASR output (French): "nous font sont toujours chimistes plume
>>> rassembleront ch je trouve que le office de ce tout de suite"
>>> Phonetic representation: "n u f on s on t t u ge u r ch i m i s t z p l
>>> y m r a s an b l swa r on ch ge swa t r u v k swa l swa oh f i s swa d
>>> swa s swa t u d s h i t"
>>> Translation: "of"
>>> score: 903011968.000000
>>>
>>> Note that the transcription is poor (I haven't really tuned the ASR
>>> engine), but still, the translation ought to be more than just "of".
>>> Sometimes it's several words; I guess it's a phrase in the phrase table.
>>> The word generally seems to be the translation of a word in the source
>>> sentence.
>>> When I use moses on the command line to translate either the 1-best or
>>> the CN, I get a reasonable translation. When I use the API to translate
>>> the 1-best phonetic representation, I also get a reasonable translation.
>>> I think the CN object is created correctly because moses loads it and
>>> prints it prior to decoding (this is normal verbose behavior). I also
>>> tried to create a PCN object, and got exactly the same results. So I
>>> guess the problem is either how I tell moses to decode it or how I
>>> extract the result from the Hypothesis object. But I'm clueless about
>>> what the problem is here, since the code is working when I just
>>> translate a string. The translation score seems ridiculously high too.
>>> I'll give below the corresponding code.
>>>
>>> Decoding and hypothesis extraction:
>>> ***********************************
>>> [...]
>>> Manager * manager = new Manager(*cn, staticData.GetSearchAlgorithm(), &system);
>>> manager->ProcessSentence();
>>> const Hypothesis* hypo = manager->GetBestHypothesis();
>>> string hyp = moses_get_hyp(hypo);
>>> [...]
>>> pair->translation_score = UntransformScore(hypo->GetScore());
>>> [...]
>>>
>>> string moses_get_hyp(const Hypothesis* hypo) {
>>>     return hypo->GetTargetPhraseStringRep();
>>> }
>>>
>>>
>>> Creation of the CN:
>>> *******************
>>>
>>> /** new class derived from ConfusionNet, with a new method for directly
>>> creating CN */
>>> class MyConfusionNet : public ConfusionNet {
>>>   public:
>>>     void addCol(Column);
>>> };
>>>
>>> void MyConfusionNet::addCol(Column col) {
>>>     data.push_back(col);
>>> }
>>>
>>> /** create a column of the CN */
>>> static MyConfusionNet::Column create_phoneme_col(confusion_matrix_t * cm,
>>>         const char * ph, int width, double thresh,
>>>         const vector<FactorType> &factor_order) {
>>>
>>>     MyConfusionNet::Column col;
>>>
>>>     phoneme_conf_t * ph_conf =
>>>         (phoneme_conf_t*)g_hash_table_lookup(cm->matrix,ph);
>>>     if(ph_conf==NULL) {
>>>         return col;
>>>     }
>>>
>>>     int i;
>>>     for(i = 0; i<cm->n_phonemes; i++) {
>>>         vector<float> scores;
>>>         float score = float(ph_conf[i].p);
>>>         if((width<=0 || i<width) && (thresh<=0 || score>=thresh)) {
>>>             string wd(cm->phonemes[ph_conf[i].phoneme]);
>>>             Word word;
>>>             word.CreateFromString(Input,factor_order,wd,false);
>>>             scores.push_back(score);
>>>             pair<Word,vector<float> > linkdata(word,scores);
>>>             col.push_back(linkdata);
>>>         }
>>>     }
>>>
>>>     return col;
>>> }
>>>
>>> /** Creates a confusion network from a NULL terminated phonemes list and
>>> a phonemes confusion matrix */
>>> static MyConfusionNet * phonemes_to_cn(confusion_matrix_t * cm,
>>>         const char ** phonemes, int width, double thresh,
>>>         const vector<FactorType> &factor_order) {
>>>     debug("start");
>>>
>>>     MyConfusionNet * cn = new MyConfusionNet();
>>>
>>>     int i = 0;
>>>     while(phonemes[i]!=NULL) {
>>>         debug("%s",phonemes[i]);
>>>         MyConfusionNet::Column col =
>>>             create_phoneme_col(cm,phonemes[i],width,thresh,factor_order);
>>>         cn->addCol(col);
>>>         i += 1;
>>>     }
>>>
>>>     return cn;
>>> }
>>>
>>> So, if anyone has an idea about what's wrong here.... thanks!
>>>
>>> cheers,
>>>
> 
> 
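
One detail in the quoted CN-building code may be worth a check:
create_phoneme_col returns an empty column when the lookup fails (and
also when no entry passes the width/thresh filter), and phonemes_to_cn
adds that column anyway. A column with no arcs leaves no complete path
through the network; how Moses reacts to that is not established in this
thread. A minimal sketch of a guard, using stand-in types instead of Word
and ConfusionNet::Column (Arc, Column, build_column and the fallback to
the 1-best phoneme with probability 1 are all assumptions for
illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-ins for illustration only; the quoted code uses Moses' Word and
// ConfusionNet::Column types instead.
typedef std::pair<std::string, std::vector<float> > Arc;
typedef std::vector<Arc> Column;

// Build one column from (phoneme, probability) alternatives, applying the
// same width/threshold filter as the quoted create_phoneme_col. If nothing
// survives the filter, fall back to the 1-best phoneme with probability 1
// so that the column is never empty.
Column build_column(const std::vector<std::pair<std::string, float> >& alts,
                    int width, double thresh, const std::string& one_best) {
    assert(!one_best.empty());
    Column col;
    for (std::size_t i = 0; i < alts.size(); ++i) {
        const float score = alts[i].second;
        const bool in_width =
            (width <= 0 || i < static_cast<std::size_t>(width));
        const bool above_thresh = (thresh <= 0 || score >= thresh);
        if (in_width && above_thresh) {
            col.push_back(Arc(alts[i].first, std::vector<float>(1, score)));
        }
    }
    if (col.empty()) {
        // Hypothetical fallback: keep the ASR 1-best as the only arc.
        col.push_back(Arc(one_best, std::vector<float>(1, 1.0f)));
    }
    return col;
}
```

Skipping the empty column altogether would be another option, at the
price of dropping that phoneme from the input.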


-- 
Sylvain Raybaud