This is an old chestnut that gets trotted out often, but I doubt that the effects the OP was worried about were on the same scale. The order-dependence (strictly speaking, non-associativity rather than non-commutativity) of FP arithmetic on doubles rarely has a very large effect.
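To put a rough number on that, here is a minimal Python sketch (not taken from Mahout; the helper name `relative_gap` and the seed 42 are made up for illustration). It uses 0.1, 0.2 and 0.3 because small integers such as 1.0, 10.0 and 100.0 are exactly representable as doubles and sum to the same value in any order. It then sums the same 100,000 values in two different orders, roughly what a parallel reduction might do, and measures how far apart the totals land.

```python
import random


def relative_gap(a: float, b: float) -> float:
    """Relative difference between two floats, scaled by the larger magnitude."""
    return abs(a - b) / max(abs(a), abs(b))


# Regrouping the same three operands changes the rounded result.
left = (0.1 + 0.2) + 0.3   # 0.6000000000000001
right = 0.1 + (0.2 + 0.3)  # 0.6

# Swapping the two operands of a single addition does not.
swapped_equal = (0.1 + 0.2) == (0.2 + 0.1)

# Sum the same values in two different orders, as a parallel reduction might.
rng = random.Random(42)  # arbitrary seed, used only to make the sketch repeatable
values = [rng.uniform(0.0, 5.0) for _ in range(100_000)]
shuffled = values[:]
rng.shuffle(shuffled)
gap = relative_gap(sum(values), sum(shuffled))

print(left == right, swapped_equal, relative_gap(left, right), gap)
```

The two groupings differ only in the last bit, and the reordered sum should agree with the original to many significant digits, which is far too small to reorder a recommendation list unless two scores are already nearly tied.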
On Mon, Jun 24, 2013 at 11:17 PM, Michael Kazekin <kazm...@hotmail.com> wrote:

> Any algorithm is non-deterministic because of non-deterministic behavior
> of underlying hardware, of course :) But that's an offtop. I'm talking
> about specific implementation of specific algorithm, and in general I'd
> like to know that at least some very general properties of the algorithm
> implementation conserve (and why did authors added intentional
> non-deterministic component to implementation).
>
> > Date: Mon, 24 Jun 2013 14:43:59 -0700
> > Subject: Re: Consistent repeatable results for distributed ALS-WR recommender
> > From: dlie...@gmail.com
> > To: user@mahout.apache.org
> >
> > The point of non-determinism of parallel processing is well known. It
> > was a joke to remind to be careful with absolute statements like
> > "never exists", as they are very hard to prove. Bringing more positive
> > examples still does not prove an absolute statement made, or make it
> > any stronger from the math logic point of view. Whereas there's enough
> > just one counter-example to disprove them. :)
> >
> > On Mon, Jun 24, 2013 at 2:29 PM, Koobas <koo...@gmail.com> wrote:
> >
> > > On Mon, Jun 24, 2013 at 5:07 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> > >
> > > > On Mon, Jun 24, 2013 at 1:35 PM, Michael Kazekin <kazm...@hotmail.com> wrote:
> > > >
> > > > > I agree with you, I should have mentioned earlier that it would
> > > > > be good to separate "noise from data" and deal with only what is
> > > > > separable. Of course there is no truly deterministic
> > > > > implementation of any algorithm,
> > > >
> > > > I am pretty sure "2.0 + 2.0" is pretty deterministic :)
> > >
> > > Few things are naturally deterministic in parallel computing.
> > > Many parallel sorting algorithms are non-deterministic.
> > >
> > > In floating point commutativity is gone.
> > > So, while 2.0 + 2.0 is deterministic, 1.0 + 10.0 + 100.0 is not
> > > 1.0 + 100.0 + 10.0.
> > > Again, you don't know what happens if the reduction is done in
> > > parallel.
> > >
> > > > > but I would expect to see "credible" results on a macro-level
> > > > > (in our case it would be nice to see the same order of
> > > > > recommendations given the fixed seed). It seems important for
> > > > > experiments (and for testing, as mentioned), isn't it?
> > > >
> > > > Yes for unit tests you usually would want to fix the seed if it
> > > > means that assertion may fail with a non-zero probability. There
> > > > are definitely a lot of such cases in Mahout.
> > > >
> > > > > Another question is that afaik ALS-WR is deterministic by its
> > > > > inception, so I'm trying to understand the reasons (and I'm
> > > > > assuming there are some) for the specific implementation design.
> > > > >
> > > > > Thanks for a free lunch! ;)
> > > > > Cheers, Mike.
> > > > >
> > > > > > Date: Mon, 24 Jun 2013 13:13:20 -0700
> > > > > > Subject: Re: Consistent repeatable results for distributed ALS-WR recommender
> > > > > > From: dlie...@gmail.com
> > > > > > To: user@mahout.apache.org
> > > > > >
> > > > > > On Mon, Jun 24, 2013 at 1:07 PM, Michael Kazekin <kazm...@hotmail.com> wrote:
> > > > > >
> > > > > > > Thank you, Ted!
> > > > > > > Any feedback on the usefulness of such functionality? Could
> > > > > > > it increase the 'playability' of the recommender?
> > > > > >
> > > > > > Almost all methods -- even deterministic ones -- will have a
> > > > > > "credible interval" of prediction simply because method
> > > > > > assumptions do not hold 100% in real life, real data. So what
> > > > > > you really want to know in such cases is the credible interval
> > > > > > rather than whether method is deterministic or not.
> > > > > > Non-deterministic methods might very well be more accurate
> > > > > > than deterministic ones in this context, and, therefore, more
> > > > > > "useful". Also see: "no free lunch theorem".
> > > > > >
> > > > > > > > From: ted.dunn...@gmail.com
> > > > > > > > Date: Mon, 24 Jun 2013 20:46:43 +0100
> > > > > > > > Subject: Re: Consistent repeatable results for distributed ALS-WR recommender
> > > > > > > > To: user@mahout.apache.org
> > > > > > > >
> > > > > > > > See org.apache.mahout.common.RandomUtils#useTestSeed
> > > > > > > >
> > > > > > > > It provides the ability to freeze the initial seed.
> > > > > > > > Normally this is only used during testing, but you could
> > > > > > > > use it.
> > > > > > > >
> > > > > > > > On Mon, Jun 24, 2013 at 8:44 PM, Michael Kazekin <kazm...@hotmail.com> wrote:
> > > > > > > >
> > > > > > > > > Thanks a lot!
> > > > > > > > > Do you know by any chance what are the underlying
> > > > > > > > > reasons for including such mandatory random seed
> > > > > > > > > initialization?
> > > > > > > > > Do you see any sense in providing another option, such
> > > > > > > > > as filling them with zeroes in order to ensure the
> > > > > > > > > consistency and repeatability? (for example we might
> > > > > > > > > want to track and compare the generated recommendation
> > > > > > > > > lists for different parameters, such as the number of
> > > > > > > > > features or number of iterations etc.)
> > > > > > > > > M.
> > > > > > > > >
> > > > > > > > > > Date: Mon, 24 Jun 2013 19:51:44 +0200
> > > > > > > > > > Subject: Re: Consistent repeatable results for distributed ALS-WR recommender
> > > > > > > > > > From: s...@apache.org
> > > > > > > > > > To: user@mahout.apache.org
> > > > > > > > > >
> > > > > > > > > > The matrices of the factorization are initialized
> > > > > > > > > > randomly. If you fix the random seed (would require
> > > > > > > > > > modification of the code) you should get exactly the
> > > > > > > > > > same results.
> > > > > > > > > >
> > > > > > > > > > On 24.06.2013 at 13:49, "Michael Kazekin" <kazm...@hotmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi!
> > > > > > > > > > > Should I assume that under same dataset and same
> > > > > > > > > > > parameters for factorizer and recommender I will get
> > > > > > > > > > > the same results for any specific user?
> > > > > > > > > > > My current understanding that theoretically ALS-WR
> > > > > > > > > > > algorithm could guarantee this, but I was wondering
> > > > > > > > > > > could be there any numeric method issues and/or
> > > > > > > > > > > implementation-specific concerns.
> > > > > > > > > > > Would appreciate any highlight on this issue.
> > > > > > > > > > > Mike.
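On the seed question in the quoted thread, a minimal Python sketch of why freezing the seed makes a randomly initialized factorization start from the same point every run. This is an illustration only, not Mahout's implementation: `init_factors`, its parameters and the seed values are hypothetical, and the real mechanism mentioned in the thread is `RandomUtils#useTestSeed`.

```python
import random
from typing import List, Optional


def init_factors(rows: int, features: int,
                 seed: Optional[int] = None) -> List[List[float]]:
    """Fill a rows x features matrix with pseudo-random values in [0, 1).

    Hypothetical helper: with a fixed seed the generator replays the same
    sequence, so the starting matrix is identical on every call.
    """
    rng = random.Random(seed)  # seed=None falls back to an OS-provided seed
    return [[rng.random() for _ in range(features)] for _ in range(rows)]


fixed_a = init_factors(3, 2, seed=1234)
fixed_b = init_factors(3, 2, seed=1234)
other = init_factors(3, 2, seed=4321)

print(fixed_a == fixed_b)  # same seed, same starting point
print(fixed_a == other)    # different seed, different starting point
```

With the starting point pinned, whatever run-to-run variation remains would come from sources such as the order of parallel reductions, not from the initialization.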