Niklas,

http://en.wikipedia.org/wiki/Cross-validation_(statistics)
http://statweb.stanford.edu/~tibs/sta306b/cvwrong.pdf


On Sun, Mar 30, 2014 at 12:41 PM, Niklas Ekvall <niklas.ekv...@gmail.com> wrote:

Hello Sebastian, could you give a deeper explanation, or refer to an article that handles the subject?

Best regards, Niklas


2014-03-30 20:50 GMT+02:00 Sebastian Schelter <s...@apache.org>:

Use k-fold cross-validation or hold-out tests for estimating the quality of different parameter combinations.

--sebastian


On 03/30/2014 11:53 AM, Niklas Ekvall wrote:

Hi,

My name is Niklas Ekvall and I have an implementation of the recommender algorithm "Large-scale Parallel Collaborative Filtering for the Netflix Prize", and now I'm wondering how to choose the number of features and lambda. Could any of you help me by explaining a stepwise strategy for choosing or optimizing these two parameters?

Best regards, Niklas


2014-03-27 19:07 GMT+01:00 j.barrett Strausser <j.barrett.straus...@gmail.com>:

Thanks Ted,

Yes for the time problem. We tend to use aggregations of session data, so instead of asking for user recommendations we do things like user+session recommendations.

Of course, deciding when sessions start and stop isn't trivial. Ideally, what I would want to do is time-weight views using a kernel or convolution. That's a bit heavy, so we typically have a global model, which is basically all preferences over time, and then these user+session-type models. We can then combine these at another level to give recommendations based on what you like throughout time versus what you have been doing recently.

-b


On Thu, Mar 27, 2014 at 1:59 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

For the poly-syllable challenged,

heteroscedasticity - the degree of variation changes.
This is common with counts because you expect the standard deviation of count data to be proportional to sqrt(n).

time inhomogeneity - behavior changes over time. One way to handle this (roughly) is to first remove variation in personal and item means over time (if using ratings) and then to segment user histories into episodes. By including both short and long episodes you get some repair for changes in personal preference. A great example of how this works/breaks is Christmas music. On December 26th, you want to *stop* recommending this music, so it really pays to limit histories at this point. By having an episodic user session that starts around November and runs to Christmas, you can get good recommendations for seasonal songs and not pollute the rest of the universe.


On Thu, Mar 27, 2014 at 8:30 AM, j.barrett Strausser <j.barrett.straus...@gmail.com> wrote:

For my team it has usually been heteroscedasticity and time inhomogeneity.


On Thu, Mar 27, 2014 at 10:18 AM, Tevfik Aytekin <tevfik.ayte...@gmail.com> wrote:

Interesting topic. Ted, can you give examples of those mathematical assumptions under-pinning ALS which are violated by the real world?


On Thu, Mar 27, 2014 at 3:43 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

How can there be any other practical method? Essentially all of the mathematical assumptions under-pinning ALS are violated by the real world.
Why would any mathematical consideration of the number of features be much more than heuristic?

That said, you can make an information-content argument. You can also make the argument that if you take too many features it doesn't hurt much, so you should always take as many as you can compute.


On Thu, Mar 27, 2014 at 6:33 AM, Sebastian Schelter <s...@apache.org> wrote:

Hi,

does anyone know of a principled approach to choosing the number of features for ALS (other than cross-validation)?

--sebastian


--
https://github.com/bearrito
@deepbearrito
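Sebastian's suggestion above (hold-out tests over parameter combinations) can be made concrete. What follows is only an illustrative sketch, not Mahout's implementation: a toy dense-matrix ALS in NumPy with a hold-out grid search over the number of features k and lambda. The synthetic data, grid values, and all function names are invented for the example.

```python
import numpy as np

def als(R, mask, k, lam, iters=10, seed=0):
    """Toy alternating least squares on a dense ratings matrix R.
    mask[i, j] is True where R[i, j] is observed; lam is the L2
    regularization weight, k the number of latent features."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = rng.normal(scale=0.1, size=(m, k))
    V = rng.normal(scale=0.1, size=(n, k))
    for _ in range(iters):
        for i in range(m):                      # solve for each user factor
            obs = mask[i]
            if obs.any():
                Vo = V[obs]
                U[i] = np.linalg.solve(Vo.T @ Vo + lam * np.eye(k),
                                       Vo.T @ R[i, obs])
        for j in range(n):                      # solve for each item factor
            obs = mask[:, j]
            if obs.any():
                Uo = U[obs]
                V[j] = np.linalg.solve(Uo.T @ Uo + lam * np.eye(k),
                                       Uo.T @ R[obs, j])
    return U, V

def rmse(R, mask, U, V):
    err = (R - U @ V.T)[mask]
    return float(np.sqrt(np.mean(err ** 2)))

# Synthetic rank-3 ratings with noise; roughly half the cells observed.
rng = np.random.default_rng(1)
R = rng.normal(size=(40, 3)) @ rng.normal(size=(3, 25)) \
    + rng.normal(scale=0.1, size=(40, 25))
observed = rng.random(R.shape) < 0.5
holdout = observed & (rng.random(R.shape) < 0.2)   # held-out test cells
train = observed & ~holdout

best = None
for k in (2, 3, 5):                  # candidate feature counts
    for lam in (0.01, 0.1, 1.0):     # candidate regularization weights
        U, V = als(R, train, k, lam)
        score = rmse(R, holdout, U, V)
        if best is None or score < best[0]:
            best = (score, k, lam)
print("best (rmse, k, lambda):", best)
```

k-fold cross-validation is the same idea with the observed cells partitioned into k folds and the scores averaged; the hold-out version is shown only because it is shorter.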
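Ted's sqrt(n) remark about count data is easy to check empirically: for Poisson-distributed counts the standard deviation grows like the square root of the mean, so the variance is not constant across users or items (heteroscedasticity). A small illustrative check:

```python
import numpy as np

rng = np.random.default_rng(0)
for mean in (10, 100, 1000):
    counts = rng.poisson(lam=mean, size=200_000)
    # the sample standard deviation tracks sqrt(mean), not a constant
    print(f"mean={mean:5d}  sd={counts.std():8.2f}  sqrt(mean)={mean ** 0.5:8.2f}")
```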
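Ted's idea of segmenting user histories into episodes (and barrett's session aggregation) can be sketched as a simple gap-based splitter. The two-week threshold and the toy listening history below are arbitrary illustrative choices, not anything from this thread:

```python
from datetime import datetime, timedelta

def split_episodes(events, max_gap=timedelta(days=14)):
    """Split (timestamp, item) events into episodes wherever the gap
    between consecutive events exceeds max_gap."""
    episodes, current = [], []
    for ts, item in sorted(events):
        if current and ts - current[-1][0] > max_gap:
            episodes.append(current)   # long silence: close the episode
            current = []
        current.append((ts, item))
    if current:
        episodes.append(current)
    return episodes

# A seasonal listener: carols through November-December, then
# unrelated music in March.
history = [
    (datetime(2013, 11, 20), "carol_a"),
    (datetime(2013, 11, 30), "carol_b"),
    (datetime(2013, 12, 10), "carol_c"),
    (datetime(2013, 12, 24), "carol_d"),
    (datetime(2014, 3, 5), "rock_a"),
    (datetime(2014, 3, 6), "rock_b"),
]
episodes = split_episodes(history)
print(len(episodes))   # -> 2: the Christmas run separates from the March run
```

Training on episodes rather than whole histories is what lets the Christmas episode stop influencing recommendations after December without discarding the user's longer-term preferences.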