Re: Why is Lanczos deprecated?

2013-08-04 Thread Dmitriy Lyubimov
On Aug 4, 2013 4:32 AM, "Fernando Fernández" <fernando.fernandez.gonza...@gmail.com> wrote:
> > I thought both methods accept exactly the same drm format so u could just feed the same thing to them?
>
> That's exactly what I'm doing, (or I think I'm doing at least... I will look deeply into

Re: Why is Lanczos deprecated?

2013-08-04 Thread Ted Dunning
It is very easy but I expect to see no difference.
On Aug 4, 2013, at 4:31, Fernando Fernández wrote:
> Also, maybe it's easy to adapt SSVD prototype in R to use uniform vectors, right?

Re: Why is Lanczos deprecated?

2013-08-04 Thread Fernando Fernández
> I thought both methods accept exactly the same DRM format so you could just feed the same thing to them?
That's exactly what I'm doing (or I think I'm doing, at least... I will look deeply into this in three or four weeks when I get back to work). As for the MovieLens example I will try to replic

Re: Why is Lanczos deprecated?

2013-08-03 Thread Ted Dunning
On Sat, Aug 3, 2013 at 3:05 AM, Fernando Fernández <fernando.fernandez.gonza...@gmail.com> wrote:
> > svd.r$d[1:10]
> [1] 640.63362 244.83635 217.84622 159.15360 158.21191 145.87261 126.57977 121.90770 106.82918 99.74794
> [1] "three runs with q=0"
> [1] 640.63362 244.83613 217.84493 159.14512 158.

Re: Why is Lanczos deprecated?

2013-08-03 Thread Dmitriy Lyubimov
On Sat, Aug 3, 2013 at 3:05 AM, Fernando Fernández <fernando.fernandez.gonza...@gmail.com> wrote:
> Definition of "so" is Mahout Lanczos and R yielding eigenvalues like (I'm inventing the numbers here because I don't remember exact figures): 1834.58, 756.34, 325.67, 125.67 and providing very good

Re: Why is Lanczos deprecated?

2013-08-03 Thread Dmitriy Lyubimov
On Aug 3, 2013 3:06 AM, "Fernando Fernández" <fernando.fernandez.gonza...@gmail.com> wrote:
> Definition of "so" is Mahout Lanczos and R yielding eigenvalues like (I'm inventing the numbers here because I don't remember exact figures): 1834.58, 756.34, 325.67, 125.67 and providing very good recom

Re: Why is Lanczos deprecated?

2013-08-03 Thread Fernando Fernández
Definition of "so" is Mahout Lanczos and R yielding eigenvalues like (I'm inventing the numbers here because I don't remember exact figures): 1834.58, 756.34, 325.67, 125.67 and providing very good recommendations in the recommender system, and SSVD giving eigenvalues (invented numbers again) 723.56, 35

Re: Why is Lanczos deprecated?

2013-08-02 Thread Dmitriy Lyubimov
On Fri, Aug 2, 2013 at 3:08 PM, Dmitriy Lyubimov wrote:
> On Fri, Aug 2, 2013 at 2:52 PM, Fernando Fernández <fernando.fernandez.gonza...@gmail.com> wrote:
>> I don't agree with k>10 being unlikely to be meaningful. I've used SVD in text mining problems where k~150 yielded the best results (n

Re: Why is Lanczos deprecated?

2013-08-02 Thread Dmitriy Lyubimov
On Fri, Aug 2, 2013 at 2:52 PM, Fernando Fernández <fernando.fernandez.gonza...@gmail.com> wrote:
> I don't agree with k>10 being unlikely to be meaningful. I've used SVD in text mining problems where k~150 yielded the best results (not only a good choice based on plotting eigenvalues and seeing elbow

Re: Why is Lanczos deprecated?

2013-08-02 Thread Fernando Fernández
I don't agree with k>10 being unlikely to be meaningful. I've used SVD in text mining problems where k~150 yielded the best results (not only was it a good choice based on plotting the eigenvalues and seeing that the elbow in the decay was near 150, but checking results with different k's also showed that around 150 made much more sense).

Re: Why is Lanczos deprecated?

2013-08-02 Thread Dmitriy Lyubimov
The only time you would not get good results is if the spectrum does not have a good decay, which is equivalent to having mostly the same variance in most of the original basis directions. This problem is similar to the problem that arises with PCA when you try to do dimensionality reduction while retaining a certain %-tage
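
The point about spectrum decay can be illustrated with a small sketch. It is not taken from the thread or from Mahout: the helper name `components_for_variance` and the two synthetic spectra are assumptions chosen for illustration. It counts how many leading singular values are needed to retain a given share of the total variance, for a fast-decaying spectrum and for a flat one.

```python
import numpy as np

def components_for_variance(singular_values, fraction):
    """Smallest k whose leading singular values retain `fraction` of the variance.

    Hypothetical helper for illustration; variance is taken as the squared
    singular values, as in PCA.
    """
    variance = np.asarray(singular_values, dtype=float) ** 2
    cumulative = np.cumsum(variance) / variance.sum()
    # Index of the first cumulative share that reaches the target
    # (small tolerance guards against rounding at exact boundaries).
    return int(np.searchsorted(cumulative, fraction - 1e-12) + 1)

# Two synthetic spectra with 100 singular values each (assumed, not real data).
decaying = 0.5 ** np.arange(100)   # geometric decay
flat = np.ones(100)                # same variance in every direction

k_decaying = components_for_variance(decaying, 0.9)
k_flat = components_for_variance(flat, 0.9)
print(k_decaying, k_flat)
```

With a flat spectrum almost every direction has to be kept to reach the target share, so a low-rank approximation of any kind has little to work with.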

Re: Why is Lanczos deprecated?

2013-08-02 Thread Dmitriy Lyubimov
If you use k > 40 you are already beating Lanczos for larger datasets. k>10 is unlikely meaningful. p need not be more than 15% of k (default is 15). Use q=1; q>1 does not yield tangible improvements in the real world. Again, see Nathan Halko's dissertation on accuracy comparison.
On Fri, Aug 2, 201
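
For readers unfamiliar with the k, p and q parameters mentioned above, here is a minimal in-memory sketch of a randomized (stochastic) SVD in the style of Halko et al. It is an assumption-laden illustration, not Mahout's distributed SSVD: the function name `randomized_svd`, the NumPy implementation and the synthetic test matrix are all hypothetical. k is the requested rank, p the oversampling, and q the number of power iterations.

```python
import numpy as np

def randomized_svd(a, k, p=15, q=1, seed=0):
    """Approximate rank-k SVD of `a` (illustrative sketch, not Mahout's code).

    k: number of singular triplets wanted
    p: oversampling, extra random directions beyond k
    q: number of power iterations
    Requires k + p <= min(a.shape).
    """
    rng = np.random.default_rng(seed)
    n = a.shape[1]
    omega = rng.standard_normal((n, k + p))      # Gaussian test matrix
    basis, _ = np.linalg.qr(a @ omega)           # orthonormal range estimate
    for _ in range(q):
        # Power iteration with re-orthonormalization for numerical stability.
        z, _ = np.linalg.qr(a.T @ basis)
        basis, _ = np.linalg.qr(a @ z)
    b = basis.T @ a                              # small (k+p) x n problem
    u_small, s, vt = np.linalg.svd(b, full_matrices=False)
    return (basis @ u_small)[:, :k], s[:k], vt[:k]

# Synthetic 200 x 120 matrix with a geometrically decaying spectrum (assumed).
rng = np.random.default_rng(42)
u, _ = np.linalg.qr(rng.standard_normal((200, 120)))
v, _ = np.linalg.qr(rng.standard_normal((120, 120)))
spectrum = 0.7 ** np.arange(120)
a = (u * spectrum) @ v.T

u_k, s_approx, vt_k = randomized_svd(a, k=10, p=15, q=1)
s_exact = np.linalg.svd(a, compute_uv=False)
print(np.max(np.abs(s_approx - s_exact[:10]) / s_exact[:10]))
```

The printed value is the worst relative error among the leading ten singular values. Setting q=0 in the call is a quick way to see how much the power iteration contributes on a given matrix.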

Re: Why is Lanczos deprecated?

2013-08-02 Thread Fernando Fernández
Keeping Lanczos would be nice. Like I said, it's currently being used in some projects with good results and I think it's easier to tune, so it would be my first choice for future developments. I still need to further test SSVD, especially because in the current example I'm working on it yields very dif

Re: Why is Lanczos deprecated?

2013-08-01 Thread Sebastian Schelter
I would also be fine with keeping it if there is demand. I just proposed to deprecate it and nobody voted against that at that point in time.
--sebastian
On 02.08.2013 03:12, Dmitriy Lyubimov wrote:
> There's a part of Nathan Halko's dissertation referenced on algorithm page running comparison.

Re: Why is Lanczos deprecated?

2013-08-01 Thread Dmitriy Lyubimov
There's a part of Nathan Halko's dissertation, referenced on the algorithm page, that runs a comparison. In particular, he was not able to compute more than 40 eigenvectors with Lanczos on the Wikipedia dataset. You may refer to that study. On the accuracy part, it was not observed that it was a problem, assum

Re: Why is Lanczos deprecated?

2013-08-01 Thread Jake Mannix
On Thu, Aug 1, 2013 at 7:08 AM, Sebastian Schelter wrote:
> IIRC the main reasons for deprecating Lanczos were that, in contrast to SSVD, it does not use a constant number of MapReduce jobs and that our implementation has the constraint that all the resulting vectors have to fit into the memo

Re: Why is Lanczos deprecated?

2013-08-01 Thread Sebastian Schelter
IIRC the main reasons for deprecating Lanczos were that, in contrast to SSVD, it does not use a constant number of MapReduce jobs and that our implementation has the constraint that all the resulting vectors have to fit into the memory of the driver machine.
Best, Sebastian
On 01.08.2013 12:15, Fer
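
The two constraints named above can be seen in a textbook Lanczos iteration. The sketch below is a plain in-memory NumPy version with full reorthogonalization, written for illustration only. It is not Mahout's implementation, and the function name `lanczos_ritz_values` and the synthetic matrix are assumptions. Each step needs one matrix-vector product (one pass over the data, which in a MapReduce setting would plausibly mean at least one job per step), and every Lanczos vector produced so far is kept in memory.

```python
import numpy as np

def lanczos_ritz_values(a, steps, seed=0):
    """Ritz values of symmetric `a` after `steps` Lanczos iterations.

    Returns (ritz_values_descending, number_of_matrix_vector_products).
    Illustrative sketch only.
    """
    n = a.shape[0]
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    basis = np.zeros((steps, n))     # all Lanczos vectors stay in memory
    alphas, betas = [], []
    passes = 0
    for j in range(steps):
        basis[j] = v
        w = a @ v                    # one full pass over the matrix per step
        passes += 1
        alphas.append(v @ w)
        # Full reorthogonalization against every stored vector.
        w -= basis[: j + 1].T @ (basis[: j + 1] @ w)
        beta = np.linalg.norm(w)
        if beta < 1e-12:             # Krylov subspace exhausted
            break
        betas.append(beta)
        v = w / beta
    size = len(alphas)
    off = betas[: size - 1]
    tridiagonal = np.diag(alphas) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(tridiagonal)[::-1], passes

# Synthetic symmetric 40 x 40 matrix with known eigenvalues (assumed data).
rng = np.random.default_rng(1)
q_mat, _ = np.linalg.qr(rng.standard_normal((40, 40)))
eigenvalues = 100.0 * 0.85 ** np.arange(40)
a = (q_mat * eigenvalues) @ q_mat.T

ritz, passes = lanczos_ritz_values(a, steps=20)
print(passes, ritz[0], eigenvalues[0])
```

The number of passes grows with the number of steps, whereas a randomized SVD with fixed q touches the matrix a fixed number of times regardless of k.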

Why is Lanczos deprecated?

2013-08-01 Thread Fernando Fernández
Hi everyone, Sorry if I duplicate the question but I've been looking for an answer and I haven't found an explanation other than that it's not being used (together with some other algorithms). If it's been discussed in depth before, maybe you can point me to a link to the discussion. I have succes