Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Sebastian Schelter
The matrices of the factorization are initialized randomly. If you fix the
random seed (which would require modifying the code) you should get exactly
the same results.
On 24.06.2013 13:49, "Michael Kazekin" wrote:

> Hi!
> Should I assume that under the same dataset and the same parameters for the
> factorizer and recommender I will get the same results for any specific user?
> My current understanding is that theoretically the ALS-WR algorithm could
> guarantee this, but I was wondering whether there could be any numeric-method
> issues and/or implementation-specific concerns.
> Would appreciate any insight on this issue.
> Mike.


RE: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Michael Kazekin
Thanks a lot!
Do you know, by any chance, the underlying reasons for including such
mandatory random-seed initialization?
Do you see any sense in providing another option, such as filling the matrices
with zeroes, in order to ensure consistency and repeatability? (For example, we
might want to track and compare the generated recommendation lists for
different parameters, such as the number of features or the number of
iterations, etc.)
M.



Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Ted Dunning
See org.apache.mahout.common.RandomUtils#useTestSeed

It provides the ability to freeze the initial seed.  Normally this is only
used during testing, but you could use it.
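
For illustration, a minimal sketch of how this could be used with the
in-memory factorizer. This assumes ALSWRFactorizer draws its initial feature
values through RandomUtils.getRandom(), which is what useTestSeed() pins; for
the distributed job every worker JVM would need the same call, hence the
earlier remark about modifying the code. The file name, user ID, and
hyperparameter values are placeholders:

import java.io.File;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.svd.ALSWRFactorizer;
import org.apache.mahout.cf.taste.impl.recommender.svd.SVDRecommender;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.common.RandomUtils;

public class RepeatableAls {
  public static void main(String[] args) throws Exception {
    // Freeze Mahout's global seed before anything initializes randomly.
    RandomUtils.useTestSeed();

    DataModel model = new FileDataModel(new File("ratings.csv"));
    // 20 features, lambda = 0.065, 10 iterations -- illustrative values only
    ALSWRFactorizer factorizer = new ALSWRFactorizer(model, 20, 0.065, 10);
    SVDRecommender recommender = new SVDRecommender(model, factorizer);
    // With the seed frozen, repeated runs should print the same list.
    System.out.println(recommender.recommend(123L, 5));
  }
}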




Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Koobas
I am guessing (comments welcome) that it is going to be difficult
to guarantee reproducibility under parallel execution conditions.
MapReduce has reduction in its name. Reduction operations are the main
cause of irreproducibility in parallel codes, because changing the order
of summations changes the impact of roundoff errors.
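
To illustrate, a self-contained demo (plain Java, no Mahout involved; the
array contents and the split point are arbitrary choices): summing the same
million doubles left to right versus in the pairwise tree order a parallel
reduction typically uses generally yields two slightly different results.

import java.util.Random;

public class SumOrder {
  public static void main(String[] args) {
    double[] x = new double[1_000_000];
    Random r = new Random(42);
    for (int i = 0; i < x.length; i++) {
      x[i] = r.nextDouble() * Math.pow(10, r.nextInt(10));  // mixed magnitudes
    }

    // Sequential left-to-right sum, as a single reducer might compute it.
    double seq = 0.0;
    for (double v : x) seq += v;

    // Pairwise (tree) sum, the shape a parallel reduction typically takes.
    double tree = treeSum(x, 0, x.length);

    System.out.println(seq == tree);  // usually false
    System.out.println(seq - tree);   // small but non-zero difference
  }

  static double treeSum(double[] x, int from, int to) {
    if (to - from <= 2) {
      double s = 0.0;
      for (int i = from; i < to; i++) s += x[i];
      return s;
    }
    int mid = (from + to) / 2;
    return treeSum(x, from, mid) + treeSum(x, mid, to);
  }
}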




RE: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Michael Kazekin
Thank you, Ted!
Any feedback on the usefulness of such functionality? Could it increase the 
'playability' of the recommender? 


Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Dmitriy Lyubimov
On Mon, Jun 24, 2013 at 1:07 PM, Michael Kazekin wrote:

> Thank you, Ted!
> Any feedback on the usefulness of such functionality? Could it increase
> the 'playability' of the recommender?
>

Almost all methods -- even deterministic ones -- will have a "credible
interval" of prediction, simply because method assumptions never hold 100%
on real-life data. So what you really want to know in such cases is the
credible interval, rather than whether the method is deterministic or not.
Non-deterministic methods might very well be more accurate than
deterministic ones in this context, and therefore more "useful". Also
see: the "no free lunch" theorem.




RE: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Michael Kazekin
I agree with you; I should have mentioned earlier that it would be good to
separate "noise from data" and deal only with what is separable. Of course
there is no truly deterministic implementation of any algorithm, but I would
expect to see "credible" results on a macro level (in our case it would be nice
to see the same order of recommendations given a fixed seed). That seems
important for experiments (and for testing, as mentioned), doesn't it?
Another question: afaik ALS-WR is deterministic by its inception, so I'm
trying to understand the reasons (and I'm assuming there are some) for the
specific implementation design.

Thanks for a free lunch! ;)
Cheers, Mike.


Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Dmitriy Lyubimov
On Mon, Jun 24, 2013 at 1:35 PM, Michael Kazekin wrote:

> Another question: afaik ALS-WR is deterministic by its inception,


I am not sure I know a deterministic version of any flavor of ALS, ALS-WR
included. You can make it such by fixing the seed, but there's no benefit to
that w.r.t. prediction credibility. The procedure is guaranteed to converge,
but there's always going to be an infinitesimally small delta between the
actual loss and the best loss; at some point further improvement does not
cover the computational cost. I usually stop whenever I don't see more than a
5% training-cost improvement w.r.t. the previous iteration. In fact, model
parameters will often have much more effect on model credibility than
achieving the ideal training cost.
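
As an illustrative sketch of that stopping rule (nothing Mahout-specific; the
step supplier standing in for one full pass of alternating U- and V-updates is
a hypothetical placeholder, assumed to return the training cost it reached):

import java.util.function.DoubleSupplier;

public class EarlyStopping {
  // Runs up to maxIterations alternating steps, stopping early once the
  // relative training-cost improvement falls below tol (e.g. 0.05 for 5%).
  // Returns the number of iterations actually performed.
  static int runUntilConverged(DoubleSupplier step, int maxIterations, double tol) {
    double previous = Double.POSITIVE_INFINITY;
    for (int it = 1; it <= maxIterations; it++) {
      double cost = step.getAsDouble();  // one U-step + one V-step, new cost
      if (previous != Double.POSITIVE_INFINITY
          && (previous - cost) / previous < tol) {
        return it;  // < tol relative improvement: stop iterating
      }
      previous = cost;
    }
    return maxIterations;
  }
}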





Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Dmitriy Lyubimov
On Mon, Jun 24, 2013 at 1:35 PM, Michael Kazekin wrote:

> Of course there is no truly deterministic implementation of any algorithm,


I am pretty sure "2.0 + 2.0" is pretty deterministic  :)


> but I would expect to see "credible" results on a macro-level (in our case
> it would be nice to see the same order of recommendations given the fixed
> seed). It seems important for experiments (and for testing, as mentioned),
> isn't it?
>

Yes, for unit tests you usually want to fix the seed whenever an assertion
might otherwise fail with non-zero probability. There are definitely a lot
of such cases in Mahout.



Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Koobas
On Mon, Jun 24, 2013 at 5:07 PM, Dmitriy Lyubimov  wrote:

> I am pretty sure "2.0 + 2.0" is pretty deterministic  :)

Few things are naturally deterministic in parallel computing.
Many parallel sorting algorithms are non-deterministic.

In floating point, associativity is gone.
So, while 2.0 + 2.0 is deterministic, (0.1 + 0.2) + 0.3 does not equal
0.1 + (0.2 + 0.3) in doubles.
Again, you don't know how the operands get grouped if the reduction is done
in parallel.
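
That claim is easy to check directly in plain Java:

public class FpAssociativity {
  public static void main(String[] args) {
    System.out.println((0.1 + 0.2) + 0.3);  // prints 0.6000000000000001
    System.out.println(0.1 + (0.2 + 0.3));  // prints 0.6
  }
}

A parallel reduction that regroups its operands differently between runs can
therefore produce a slightly different sum each time.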





Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Dmitriy Lyubimov
The point about the non-determinism of parallel processing is well known. Mine
was a joke, a reminder to be careful with absolute statements like "never
exists", as they are very hard to prove. Bringing more positive examples still
does not prove an absolute statement, or make it any stronger from the
mathematical-logic point of view, whereas a single counter-example is enough
to disprove it. :)



Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Koobas
Oops, sorry, I did not get that.
It just annoys me that reproducibility in machine learning is taken so
lightly. I think this will change as soon as CaaS machine learning goes
mainstream.


> On Mon, Jun 24, 2013 at 2:29 PM, Koobas  wrote:
>
> > On Mon, Jun 24, 2013 at 5:07 PM, Dmitriy Lyubimov 
> > wrote:
> >
> > > On Mon, Jun 24, 2013 at 1:35 PM, Michael Kazekin  > > >wrote:
> > >
> > > > I agree with you, I should have mentioned earlier that it would be
> good
> > > to
> > > > separate "noise from data" and deal with only what is separable. Of
> > > course
> > > > there is no truly deterministic implementation of any algorithm,
> > >
> > >
> > > I am pretty sure "2.0 + 2.0" is pretty deterministic  :)
> > >
> > >
> > Few things are naturally deterministic in parallel computing.
> > Many parallel sorting algorithms are non-deterministic.
> >
> > In floating point commutativity is gone.
> > So, while 2.0 + 2.0 is deterministic, 1.0 + 10.0 + 100.0 is not 1.0 +
> 100.0
> > + 10.0.
> > Again, you don't know what happens if the reduction is done in parallel.
> >
> >
> >
> > > > but I would expect to see "credible" results on a macro-level (in our
> > > case
> > > > it would be nice to see the same order of recommendations given the
> > fixed
> > > > seed). It seems important for experiments (and for testing, as
> > > mentioned),
> > > > isn't it?
> > > >
> > >
> > > Yes for unit tests you usually would want to fix the seed if it means
> > that
> > > assertion may fail  with a non-zero probability. There are definitely a
> > lot
> > > of such cases in Mahout.
> > >
> > > Another question is that afaik ALS-WR is deterministic by its
> inception,
> > so
> > > > I'm trying to understand the reasons (and I'm assuming there are
> some)
> > > for
> > > > the specific implementation design.
> > > >
> > > > Thanks for a free lunch! ;)
> > > > Cheers,Mike.
> > > >
> > > > > Date: Mon, 24 Jun 2013 13:13:20 -0700
> > > > > Subject: Re: Consistent repeatable results for distributed ALS-WR
> > > > recommender
> > > > > From: dlie...@gmail.com
> > > > > To: user@mahout.apache.org
> > > > >
> > > > > On Mon, Jun 24, 2013 at 1:07 PM, Michael Kazekin <
> > kazm...@hotmail.com
> > > > >wrote:
> > > > >
> > > > > > Thank you, Ted!
> > > > > > Any feedback on the usefulness of such functionality? Could it
> > > increase
> > > > > > the 'playability' of the recommender?
> > > > > >
> > > > >
> > > > > Almost all methods -- even deterministic ones -- will have a
> > "credible
> > > > > interval" of prediction simply because method assumptions do not
> hold
> > > > 100%
> > > > > in real life, real data. So what you really want to know in such
> > cases
> > > is
> > > > > the credible interval rather than whether method is deterministic
> or
> > > not.
> > > > > Non-deterministic methods might very well be more accurate than
> > > > > deterministic ones in this context, and, therefore, more "useful".
> > Also
> > > > > see: "no free lunch theorem".
> > > > >
> > > > >
> > > > > > > From: ted.dunn...@gmail.com
> > > > > > > Date: Mon, 24 Jun 2013 20:46:43 +0100
> > > > > > > Subject: Re: Consistent repeatable results for distributed
> ALS-WR
> > > > > > recommender
> > > > > > > To: user@mahout.apache.org
> > > > > > >
> > > > > > > See org.apache.mahout.common.RandomUtils#useTestSeed
> > > > > > >
> > >

RE: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Michael Kazekin
Any algorithm is non-deterministic because of the non-deterministic behavior
of the underlying hardware, of course :) But that's off-topic. I'm talking
about a specific implementation of a specific algorithm, and in general I'd
like to know that at least some very general properties of the algorithm
implementation hold (and why the authors added an intentional
non-deterministic component to the implementation).

Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Ted Dunning
This is a common chestnut that gets trotted out regularly, but I doubt that
the effects the OP was worried about were on the same scale.
Non-associativity of FP arithmetic on doubles rarely has a very large
effect.



Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Sean Owen
Yeah, this has gone well off-road.

ALS is not non-deterministic because of hardware errors or cosmic rays.
It also has nothing to do with floating-point round-off; or certainly,
that is not the primary source of non-determinism, by several orders of
magnitude.

ALS starts from a random initial solution, and different starts will result
in different final solutions. The overall problem is non-convex, and the
process will not necessarily converge to the same solution.

Randomness is a common feature of machine learning: centroid selection
in k-means, the 'stochastic' in SGD, random forests, etc. I don't
think the question is why randomness is useful, right?

For ALS... I don't quite understand the question; what's the alternative?
Certainly I have always seen it formulated in terms of a random initial
solution. You don't want to always start from the same point, because of
local minima. Ideally you start from many points and take the best solution.



Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Koobas
Well, you know, the issue is there whether we like it or not.
Maybe replication is enough, maybe not.
If there is a workshop on the issue, it's on the radar:
http://beamtenherrschaft.blogspot.com/2013/06/acm-recsys-2013-workshop-on.html




RE: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Michael Kazekin
Thanks for the clarification, Owen!
> ALS starts from a random solution and this will result in a different
> solution. The overall problem is non-convex and the process will not
> necessarily converge to the same solution.
But doesn't alternation guarantee convexity?
> Randomness is a common feature of machine learning: centroid selection
> in k-means, the 'stochastic' in SGD, random forests, etc. I don't
> think the question is why randomness is useful right?
It isn't! :)

> For ALS... I don't quite understand the question, what's the
> alternative? certainly I have always seen it formulated in terms of a
> random initial solution. You don't want to always start from the same
> point because of local minima. Ideally you start from many points and
> take the best solution.
Yeah, but then you start dealing with another problem: how to blend all the
results together, and how doing this affects the overall quality of the
results (in our case, recommendations), right?
But that's another story, I agree. In general, I've understood the reasons
behind seeding the matrix values.


Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Sean Owen
On Tue, Jun 25, 2013 at 12:44 AM, Michael Kazekin  wrote:
> But doesn't alternation guarantee convexity?

No, the problem remains non-convex. At each step, where half the
parameters are held fixed, yes, that constrained subproblem is convex. But
each of these is not the same as the overall global problem being solved.
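
To make that concrete, here is a sketch of the ALS-WR objective (following
the weighted-lambda formulation of Zhou et al.; \Omega denotes the set of
observed ratings, and n_{u_i}, n_{v_j} are per-user and per-item rating
counts):

\min_{U,V}\; \sum_{(i,j)\in\Omega} \bigl( r_{ij} - u_i^{\top} v_j \bigr)^2
  \;+\; \lambda \Bigl( \sum_i n_{u_i} \lVert u_i \rVert^2
  + \sum_j n_{v_j} \lVert v_j \rVert^2 \Bigr)

The bilinear term u_i^{\top} v_j is what makes the joint problem non-convex;
with V held fixed, each u_i is the solution of a ridge regression (convex),
and symmetrically for each v_j with U fixed, which is exactly the per-step
convexity described above.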

> Yeah, but then you start dealing with another problem: how to blend all the
> results together, and how doing this affects the overall quality of the
> results (in our case, recommendations), right?

No, you would usually just take the best solution and use it alone. Or
at least, that's a fine thing to do.


Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-25 Thread Robin East
I think Sean has put it well, but just to emphasise the point:

- optimisation problems are either convex or non-convex
- if you have a convex problem, then the fundamental result that everyone should
know is that if you find a local minimum, it is also the global minimum
- if you don't have a convex problem (or you can't prove that it is convex),
then you cannot be sure the minimum you have found is a global minimum

So if you know something is non-convex (e.g. ALS, k-means, convnets) you
usually have to trade something off. You could try to find a global minimum,
but that may take an impossibly long time. The other trade-off is that you
accept a local minimum and hope that it is close to the global one. Which local
minimum you get will depend on your initialisation values - if you try
different initialisation values you may get a different local minimum. So a
common practice is to run several sets of iterations using different initial
values - often randomly generated - and take the best minimum (see the sketch
below). And of course you know the best one by evaluating the objective
function and choosing the one that gives the lowest value.

Robin

Robin
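
A minimal sketch of that best-of-restarts practice (illustrative only;
Trainer is a hypothetical stand-in for one full ALS run that initialises from
the given seed and returns its final objective value):

import java.util.Random;

public class BestOfRestarts {

  // Hypothetical stand-in for one full ALS run: random init from 'seed',
  // returning the final value of the objective function for that run.
  interface Trainer {
    double train(long seed);
  }

  // Run several restarts with different random initial values and keep the
  // seed whose run achieved the lowest objective; the caller can then retrain
  // (or cache) the model for that seed.
  static long bestSeed(Trainer trainer, int restarts, long metaSeed) {
    Random seeds = new Random(metaSeed);  // fixed meta-seed: repeatable overall
    long best = 0L;
    double bestObjective = Double.POSITIVE_INFINITY;
    for (int i = 0; i < restarts; i++) {
      long seed = seeds.nextLong();
      double objective = trainer.train(seed);
      if (objective < bestObjective) {
        bestObjective = objective;
        best = seed;
      }
    }
    return best;
  }
}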



RE: Consistent repeatable results for distributed ALS-WR recommender

2013-06-25 Thread Michael Kazekin
Thank you for the clarification, Robin! My math was a very long time ago, so
it was not obvious to me that this specific optimization problem lacks that
property (and thus a global optimum cannot be guaranteed). But if anyone is
interested in further reading, I found some decent material on the subject:
http://www.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf

Again, thank you all for a fruitful (at least for me :-P) discussion!

