the original question was about decoding speed, not potential
quality improvements from filtering

clearly, if you can identify phrases to prune, then you will get a
speed boost.  but this does not hold in the general case, and my advice
was for the general case.
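
(for concreteness, a minimal sketch of that kind of pruning, assuming
the usual plain-text Moses phrase-table layout `source ||| target |||
scores ...`; the 0.01 threshold and which score column to test are
arbitrary illustrations, not recommended values -- and, as said, this
is not safe in general:)

```python
THRESHOLD = 0.01  # illustrative cut-off, not a recommendation

def prune_phrase_table(lines, threshold=THRESHOLD):
    """Drop phrase-table entries whose first score is below threshold."""
    kept = []
    for line in lines:
        fields = line.split(" ||| ")
        if len(fields) < 3:
            continue  # skip malformed entries
        scores = fields[2].split()
        if float(scores[0]) >= threshold:
            kept.append(line)
    return kept

if __name__ == "__main__":
    sample = [
        "w domu ||| at home ||| 0.8 0.5 0.7 0.4",
        "w domu ||| in the house ||| 0.005 0.1 0.2 0.1",
    ]
    for line in prune_phrase_table(sample):
        print(line)
```

fewer entries per source phrase means fewer hypotheses to expand, hence
the speed-up -- but also fewer translation options, hence the risk.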

Miles

2009/5/4 Marcin Miłkowski <milek...@o2.pl>:
> Miles Osborne writes:
>>
>> filtering etc. might give you a speed-up (e.g. a constant one -- less
>> stuff to load), but if the filtering is safe w.r.t. the source data, then
>> you shouldn't see much here.
>>
>> (pruning the table should make it faster since there will be fewer
>> options to consider, but this is not safe)
>
> Actually, this is contrary to what Johnson et al. say in their paper, and my
> subjective (not measured) experience was definitely in their favor. As long
> as you have really clean data, you don't want to lose any of it, but if the
> alignments are lousy, the translations ambiguous, etc., you want to cut them
> out, and Jan wants to do that (see his post).
>
> I was even filtering more aggressively and got better results by heuristically
> discarding improbable phrases from the phrase table (based on Fran's idea
> about discarding improbable alignments). Again, this is subjective, anecdotal,
> etc., but before that I was getting complete garbage.
>
> Note: my pairs were English-Polish and Polish-English.
>
>> i guess you might also see fewer page faults and the like with a
>> smaller model and that will help matters.
>
> btw, quantising and binarising language models helps as well
>
> Marcin
>
>> but in general, the beam size is the most direct way to make it faster.
>
>> Miles
>>
>> 2009/5/4 Francis Tyers <fty...@prompsit.com>:
>>>
>>> On Mon, 04-05-2009 at 14:08 +0100, Miles Osborne wrote:
>>>>
>>>> actually, i think Jan wants a speedup, not a space saving.
>>>
>>> Does filtering the phrase table before translation not decrease the
>>> total time to make a translation (including the time taken to load the
>>> phrase table etc.)?  That was my experience, and it appears to be
>>> something that he hasn't done, but perhaps my setup is unusual...
>>>
>>> Fran
>>>
>>>> your best bet is to reduce the size of the beam:
>>>>
>>>> http://www.statmt.org/moses/?n=Moses.Tutorial#ntoc6
>>>>
>>>> Miles
>>>> 2009/5/4 Francis Tyers <fty...@prompsit.com>:
>>>>>
>>>>> On Mon, 04-05-2009 at 14:54 +0200, Jan Helak wrote:
>>>>>>
>>>>>> Hello everyone :)
>>>>>>
>>>>>> I am trying to build a two-way translator for the Polish and English
>>>>>> languages as a project for one of my courses. So far, I have created
>>>>>> a one-way translator (Polish->English) as a beta version, but several
>>>>>> problems have come up:
>>>>>>
>>>>>> (1) The translator must work in two directions. How can I achieve this?
>>>>>
>>>>> Make another directory and train two models.
>>>>>
>>>>>> (2) The translation time is too long (4 min. for one
>>>>>> sentence). How can I accelerate this (decreasing the translation
>>>>>> quality is acceptable)?
>>>>>
>>>>> You can try filtering the phrase table before translating (see PART V -
>>>>> Filtering Test Data), or using a binarised phrase table (see Memory-Map
>>>>> LM and Phrase Table).
>>>>>
>>>>>
>>>>> http://ufallab2.ms.mff.cuni.cz/~bojar/teaching/NPFL087/export/HEAD/lectures/02-phrase-based-Moses-installation-tutorial.html
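
In case it helps to see the idea: filtering keeps only the phrase-table
entries whose source side actually occurs in the test input, which is
roughly what the Moses filtering script does. A toy sketch (the file
format, the function names, and the max phrase length of 7 are all
illustrative assumptions):

```python
def source_phrases(sentences, max_len=7):
    """Collect every source-side n-gram (up to max_len words) in the input."""
    phrases = set()
    for sent in sentences:
        words = sent.split()
        for i in range(len(words)):
            for j in range(i + 1, min(i + 1 + max_len, len(words) + 1)):
                phrases.add(" ".join(words[i:j]))
    return phrases

def filter_phrase_table(lines, sentences):
    """Keep only entries whose source phrase appears in the test sentences."""
    needed = source_phrases(sentences)
    return [line for line in lines if line.split(" ||| ")[0] in needed]
```

Unlike pruning, this is safe: the decoder could never have used the
discarded entries for this particular input anyway, so the output is
unchanged while loading time and memory drop.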
>>>>>
>>>>> Regards,
>>>>>
>>>>> Fran
>>>>>
>>>>> _______________________________________________
>>>>> Moses-support mailing list
>>>>> Moses-support@mit.edu
>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

