Niklas,

http://en.wikipedia.org/wiki/Cross-validation_(statistics)
http://statweb.stanford.edu/~tibs/sta306b/cvwrong.pdf


On Sun, Mar 30, 2014 at 12:41 PM, Niklas Ekvall <niklas.ekv...@gmail.com> wrote:

Hello Sebastian, could you give a deeper explanation, or refer to an article that handles the subject?

Best regards, Niklas


2014-03-30 20:50 GMT+02:00 Sebastian Schelter <s...@apache.org>:

Use k-fold cross-validation or hold-out tests for estimating the quality of different parameter combinations.

--sebastian


On 03/30/2014 11:53 AM, Niklas Ekvall wrote:

Hi,

My name is Niklas Ekvall and I have an implementation of the recommender algorithm "Large-scale Parallel Collaborative Filtering for the Netflix Prize", and now I'm wondering how to choose the number of features and lambda. Could any of you help me by explaining a stepwise strategy for choosing or optimizing these two parameters?

Best regards, Niklas


2014-03-27 19:07 GMT+01:00 j.barrett Strausser <j.barrett.straus...@gmail.com>:

Thanks Ted,

Yes for the time problem. We tend to use aggregations of session data, so instead of asking for user recommendations we do things like user+session recommendations.

Of course, deciding when sessions start and stop isn't trivial. Ideally, what I would want to do is time-weight views using a kernel or convolution. That's a bit heavy, so we typically have a global model, which is basically all preferences over time, and then these user+session-type models. We can then combine these at another level to give recommendations based on what you like throughout time versus what you have been doing recently.

-b


On Thu, Mar 27, 2014 at 1:59 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

For the poly-syllable challenged,

heteroscedasticity - the degree of variation changes.
This is common with counts because you expect the standard deviation of count data to be proportional to sqrt(n).

time inhomogeneity - behavior changes over time. One way to handle this (roughly) is to first remove variation in personal and item means over time (if using ratings) and then to segment user histories into episodes. By including both short and long episodes you get some repair for changes in personal preference. A great example of how this works/breaks is Christmas music. On December 26th, you want to *stop* recommending this music, so it really pays to limit histories at this point. By having an episodic user session that starts around November and runs to Christmas, you can get good recommendations for seasonal songs and not pollute the rest of the universe.


On Thu, Mar 27, 2014 at 8:30 AM, j.barrett Strausser <j.barrett.straus...@gmail.com> wrote:

For my team it has usually been heteroscedasticity and time inhomogeneity.


On Thu, Mar 27, 2014 at 10:18 AM, Tevfik Aytekin <tevfik.ayte...@gmail.com> wrote:

Interesting topic. Ted, can you give examples of those mathematical assumptions under-pinning ALS which are violated by the real world?


On Thu, Mar 27, 2014 at 3:43 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

How can there be any other practical method? Essentially all of the mathematical assumptions under-pinning ALS are violated by the real world.
Why would any mathematical consideration of the number of features be much more than heuristic?

That said, you can make an information-content argument. You can also make the argument that if you take too many features it doesn't hurt much, so you should always take as many as you can compute.


On Thu, Mar 27, 2014 at 6:33 AM, Sebastian Schelter <s...@apache.org> wrote:

Hi,

does anyone know of a principled approach to choosing the number of features for ALS (other than cross-validation)?

--sebastian


--
https://github.com/bearrito
@deepbearrito
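Sebastian's suggestion above (hold-out tests over parameter combinations) can be made concrete. What follows is only an illustrative sketch, not Mahout's implementation: a toy dense-matrix ALS in NumPy with a hold-out grid search over the number of features k and lambda. The synthetic data, grid values, and all function names are invented for the example.

```python
import numpy as np

def als(R, mask, k, lam, iters=10, seed=0):
    """Toy alternating least squares on a dense ratings matrix R.
    mask[i, j] is True where R[i, j] is observed; lam is the L2
    regularization weight, k the number of latent features."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = rng.normal(scale=0.1, size=(m, k))
    V = rng.normal(scale=0.1, size=(n, k))
    for _ in range(iters):
        for i in range(m):                      # solve for each user factor
            obs = mask[i]
            if obs.any():
                Vo = V[obs]
                U[i] = np.linalg.solve(Vo.T @ Vo + lam * np.eye(k),
                                       Vo.T @ R[i, obs])
        for j in range(n):                      # solve for each item factor
            obs = mask[:, j]
            if obs.any():
                Uo = U[obs]
                V[j] = np.linalg.solve(Uo.T @ Uo + lam * np.eye(k),
                                       Uo.T @ R[obs, j])
    return U, V

def rmse(R, mask, U, V):
    err = (R - U @ V.T)[mask]
    return float(np.sqrt(np.mean(err ** 2)))

# Synthetic rank-3 ratings with noise; roughly half the cells observed.
rng = np.random.default_rng(1)
R = rng.normal(size=(40, 3)) @ rng.normal(size=(3, 25)) \
    + rng.normal(scale=0.1, size=(40, 25))
observed = rng.random(R.shape) < 0.5
holdout = observed & (rng.random(R.shape) < 0.2)   # held-out test cells
train = observed & ~holdout

best = None
for k in (2, 3, 5):                  # candidate feature counts
    for lam in (0.01, 0.1, 1.0):     # candidate regularization weights
        U, V = als(R, train, k, lam)
        score = rmse(R, holdout, U, V)
        if best is None or score < best[0]:
            best = (score, k, lam)
print("best (rmse, k, lambda):", best)
```

k-fold cross-validation is the same idea with the observed cells partitioned into k folds and the scores averaged; the hold-out version is shown only because it is shorter.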
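Ted's sqrt(n) remark about count data is easy to check empirically: for Poisson-distributed counts the standard deviation grows like the square root of the mean, so the variance is not constant across users or items (heteroscedasticity). A small illustrative check:

```python
import numpy as np

rng = np.random.default_rng(0)
for mean in (10, 100, 1000):
    counts = rng.poisson(lam=mean, size=200_000)
    # the sample standard deviation tracks sqrt(mean), not a constant
    print(f"mean={mean:5d}  sd={counts.std():8.2f}  sqrt(mean)={mean ** 0.5:8.2f}")
```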
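Ted's idea of segmenting user histories into episodes (and barrett's session aggregation) can be sketched as a simple gap-based splitter. The two-week threshold and the toy listening history below are arbitrary illustrative choices, not anything from this thread:

```python
from datetime import datetime, timedelta

def split_episodes(events, max_gap=timedelta(days=14)):
    """Split (timestamp, item) events into episodes wherever the gap
    between consecutive events exceeds max_gap."""
    episodes, current = [], []
    for ts, item in sorted(events):
        if current and ts - current[-1][0] > max_gap:
            episodes.append(current)   # long silence: close the episode
            current = []
        current.append((ts, item))
    if current:
        episodes.append(current)
    return episodes

# A seasonal listener: carols through November-December, then
# unrelated music in March.
history = [
    (datetime(2013, 11, 20), "carol_a"),
    (datetime(2013, 11, 30), "carol_b"),
    (datetime(2013, 12, 10), "carol_c"),
    (datetime(2013, 12, 24), "carol_d"),
    (datetime(2014, 3, 5), "rock_a"),
    (datetime(2014, 3, 6), "rock_b"),
]
episodes = split_episodes(history)
print(len(episodes))   # -> 2: the Christmas run separates from the March run
```

Training on episodes rather than whole histories is what lets the Christmas episode stop influencing recommendations after December without discarding the user's longer-term preferences.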