On Thu, 24 Mar 2011, Lane Schwartz wrote:

> While re-running an experiment (using the exact same configuration, same
> models, translating the same data) I noticed that occasionally, I get a
> slightly different 1-best list.
> 
> Upon further examination, running Moses with the same config multiple times
> often (but not always) produces slightly different n-best lists. This is a
> bit worrisome from the perspective of being able to re-run an experiment to
> reproduce results.
> 
> Is this a known issue?

An observation that came up a while ago is that nondeterminism during
decoding can occur because comparison among tied hypotheses falls back
to pointer comparisons as a last resort. If the pruning limit of a stack
falls in the middle of a bunch of tied hypotheses, the memory layout
of the decoder determines which ones are pruned and which ones are kept.
The layout can differ between architectures and, with multiple threads,
even from run to run on the same machine. Could this be what you're
observing?

I remember discussing this with Hieu and Barry, and I think it was
decided not to do anything about it because it was too difficult to fix.
I had tried to fix it by falling back on hypothesis ids as a last
resort, but this caused some other problems.
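
To make the mechanism concrete, here is a minimal sketch of the two tie-breaking strategies. It is not the Moses code: `Hyp`, `ByScoreThenPointer`, `ByScoreThenId` and `pruned_ids` are hypothetical names invented for illustration, and it assumes that ids are assigned in the same order on every run, which may not hold in the real decoder (as noted above, the id fallback caused other problems there).

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a decoder hypothesis; not the Moses class.
struct Hyp {
    int id;        // creation order, ASSUMED identical across runs
    double score;  // model score, higher is better
};

// Last resort is the address: which tied hypotheses survive pruning
// depends on where the allocator happened to place each object.
struct ByScoreThenPointer {
    bool operator()(const Hyp* a, const Hyp* b) const {
        if (a->score != b->score) return a->score > b->score;
        return std::less<const Hyp*>()(a, b);
    }
};

// Last resort is the id: the outcome depends only on the hypotheses
// themselves, provided the ids are reproducible.
struct ByScoreThenId {
    bool operator()(const Hyp* a, const Hyp* b) const {
        if (a->score != b->score) return a->score > b->score;
        return a->id < b->id;
    }
};

// Sort a copy of the stack, keep the best `limit` entries, return their ids.
template <class Compare>
std::vector<int> pruned_ids(std::vector<const Hyp*> stack,
                            std::size_t limit, Compare cmp) {
    std::sort(stack.begin(), stack.end(), cmp);
    if (stack.size() > limit) stack.resize(limit);
    std::vector<int> ids;
    for (const Hyp* h : stack) ids.push_back(h->id);
    return ids;
}
```

With one clear winner and three tied hypotheses, pruning to a limit of 2 under `ByScoreThenId` keeps the same pair regardless of insertion order or allocation. Under `ByScoreThenPointer` the second survivor is whichever tied hypothesis has the lowest address.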

Cheers,
Christian
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
