On Mon, Jun 24, 2013 at 1:35 PM, Michael Kazekin <kazm...@hotmail.com> wrote:

> I agree with you, I should have mentioned earlier that it would be good to
> separate "noise from data" and deal with only what is separable. Of course
> there is no truly deterministic implementation of any algorithm,


I am pretty sure "2.0 + 2.0" is pretty deterministic  :)


> but I would expect to see "credible" results on a macro-level (in our case
> it would be nice to see the same order of recommendations given the fixed
> seed). It seems important for experiments (and for testing, as mentioned),
> isn't it?
>

Yes, for unit tests you would usually want to fix the seed if an assertion
could otherwise fail with a non-zero probability. There are definitely a lot
of such cases in Mahout.
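
As a minimal sketch of what fixing the seed buys you (plain java.util.Random,
not Mahout code; the class SeededInit and the helper randomFactors are made up
for illustration): two factor matrices initialized from the same seed come out
identical, so an assertion that depends on them stops being probabilistic. In
Mahout itself the switch mentioned further down in this thread is
RandomUtils#useTestSeed.

```java
import java.util.Arrays;
import java.util.Random;

class SeededInit {

    // Hypothetical helper (not a Mahout API): fills a rows x cols factor
    // matrix with pseudo-random values from a generator built on the seed.
    public static double[][] randomFactors(int rows, int cols, long seed) {
        Random random = new Random(seed);
        double[][] m = new double[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                m[i][j] = random.nextDouble();
            }
        }
        return m;
    }

    public static void main(String[] args) {
        double[][] a = randomFactors(3, 2, 42L);
        double[][] b = randomFactors(3, 2, 42L);
        double[][] c = randomFactors(3, 2, 43L);
        // Same seed: the generator replays the same sequence.
        System.out.println("same seed, identical:  " + Arrays.deepEquals(a, b));
        // Different seed: the starting point of the factorization changes.
        System.out.println("other seed, identical: " + Arrays.deepEquals(a, c));
    }
}
```

This only pins down the starting point. Whether the rest of a distributed run
is repeatable is a separate question that the sketch does not address.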

> Another question is that afaik ALS-WR is deterministic by its inception, so
> I'm trying to understand the reasons (and I'm assuming there are some) for
> the specific implementation design.
>
> Thanks for a free lunch! ;)
> Cheers,Mike.
>
> > Date: Mon, 24 Jun 2013 13:13:20 -0700
> > Subject: Re: Consistent repeatable results for distributed ALS-WR recommender
> > From: dlie...@gmail.com
> > To: user@mahout.apache.org
> >
> > On Mon, Jun 24, 2013 at 1:07 PM, Michael Kazekin <kazm...@hotmail.com> wrote:
> >
> > > Thank you, Ted!
> > > Any feedback on the usefulness of such functionality? Could it increase
> > > the 'playability' of the recommender?
> > >
> >
> > Almost all methods -- even deterministic ones -- will have a "credible
> > interval" of prediction simply because method assumptions do not hold 100%
> > in real life, real data. So what you really want to know in such cases is
> > the credible interval rather than whether method is deterministic or not.
> > Non-deterministic methods might very well be more accurate than
> > deterministic ones in this context, and, therefore, more "useful". Also
> > see: "no free lunch theorem".
> >
> >
> > > > From: ted.dunn...@gmail.com
> > > > Date: Mon, 24 Jun 2013 20:46:43 +0100
> > > > Subject: Re: Consistent repeatable results for distributed ALS-WR recommender
> > > > To: user@mahout.apache.org
> > > >
> > > > See org.apache.mahout.common.RandomUtils#useTestSeed
> > > >
> > > > It provides the ability to freeze the initial seed.  Normally this is
> > > > only used during testing, but you could use it.
> > > >
> > > >
> > > > On Mon, Jun 24, 2013 at 8:44 PM, Michael Kazekin <kazm...@hotmail.com> wrote:
> > > >
> > > > > Thanks a lot!
> > > > > Do you know by any chance what are the underlying reasons for
> > > > > including such mandatory random seed initialization?
> > > > > Do you see any sense in providing another option, such as filling
> > > > > them with zeroes in order to ensure the consistency and
> > > > > repeatability? (for example we might want to track and compare the
> > > > > generated recommendation lists for different parameters, such as the
> > > > > number of features or number of iterations etc.)
> > > > > M.
> > > > >
> > > > >
> > > > > > Date: Mon, 24 Jun 2013 19:51:44 +0200
> > > > > > Subject: Re: Consistent repeatable results for distributed ALS-WR recommender
> > > > > > From: s...@apache.org
> > > > > > To: user@mahout.apache.org
> > > > > >
> > > > > > The matrices of the factorization are initialized randomly. If you
> > > > > > fix the random seed (would require modification of the code) you
> > > > > > should get exactly the same results.
> > > > > > On 24.06.2013 13:49, "Michael Kazekin" <kazm...@hotmail.com> wrote:
> > > > > >
> > > > > > > Hi!
> > > > > > > Should I assume that under same dataset and same parameters for
> > > > > > > factorizer and recommender I will get the same results for any
> > > > > > > specific user?
> > > > > > > My current understanding is that theoretically ALS-WR algorithm
> > > > > > > could guarantee this, but I was wondering could be there any
> > > > > > > numeric method issues and/or implementation-specific concerns.
> > > > > > > Would appreciate any highlight on this issue.
> > > > > > > Mike.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > >
> > >
> > >
>
>
