On Aug 4, 2013 4:32 AM, "Fernando Fernández" <
fernando.fernandez.gonza...@gmail.com> wrote:
>
> > I thought both methods accept exactly the same drm format so you could just
> > feed the same thing to them?
>
> That's exactly what I'm doing, (or I think I'm doing at least... I will
> look deeply into
It is very easy but I expect to see no difference.
On Aug 4, 2013, at 4:31, Fernando Fernández
wrote:
>
> Also, maybe it's easy to adapt SSVD prototype in R to use uniform vectors,
> right?
> I thought both methods accept exactly the same drm format so you could just
> feed the same thing to them?
That's exactly what I'm doing (or at least I think I am... I will
look deeply into this in three or four weeks when I get back to work).
As for the movielens example I will try to replic
On Sat, Aug 3, 2013 at 3:05 AM, Fernando Fernández <
fernando.fernandez.gonza...@gmail.com> wrote:
> > svd.r$d[1:10]
> [1] 640.63362 244.83635 217.84622 159.15360 158.21191
> 145.87261 126.57977 121.90770 106.82918 99.74794
> [1] "three runs with q=0"
> [1] 640.63362 244.83613 217.84493 159.14512 158.
On Sat, Aug 3, 2013 at 3:05 AM, Fernando Fernández <
fernando.fernandez.gonza...@gmail.com> wrote:
> Definition of "so" is Mahout Lanczos and R yielding eigenvalues like (I'm
> inventing the numbers here because I don't remember the exact figures): 1834.58,
> 756.34, 325.67, 125.67 and providing very good
On Aug 3, 2013 3:06 AM, "Fernando Fernández" <
fernando.fernandez.gonza...@gmail.com> wrote:
>
> Definition of "so" is Mahout Lanczos and R yielding eigenvalues like (I'm
> inventing the numbers here because I don't remember the exact figures): 1834.58,
> 756.34, 325.67, 125.67 and providing very good recom
Definition of "so" is Mahout Lanczos and R yielding eigenvalues like (I'm
inventing the numbers here because I don't remember the exact figures): 1834.58,
756.34, 325.67, 125.67 and providing very good recommendations in the
recommender system, and SSVD giving eigenvalues (invented numbers again)
723.56, 35
On Fri, Aug 2, 2013 at 3:08 PM, Dmitriy Lyubimov wrote:
>
>
>
> On Fri, Aug 2, 2013 at 2:52 PM, Fernando Fernández <
> fernando.fernandez.gonza...@gmail.com> wrote:
>
>> I don't agree with k>10 being unlikely meaningful. I've used SVD in text
>> mining problems where k~150 yielded best results (n
On Fri, Aug 2, 2013 at 2:52 PM, Fernando Fernández <
fernando.fernandez.gonza...@gmail.com> wrote:
> I don't agree with k>10 being unlikely meaningful. I've used SVD in text
> mining problems where k~150 yielded best results (not only a good choice
> based on plotting eigenvalues and seeing elbow
I don't agree that k>10 is unlikely to be meaningful. I've used SVD in text
mining problems where k~150 yielded the best results (not only was it a good
choice based on plotting the eigenvalues and seeing that the elbow in the decay
was near 150, but checking results with different k's showed that values around
150 made much more sense).
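
To make the "plot the eigenvalues and look for the elbow" check concrete, here is a minimal Python sketch. The helper names cumulative_energy and smallest_k are made up for illustration, and the input is just the first ten singular values quoted elsewhere in this thread (svd.r$d[1:10]), so the fractions are relative to those ten values only, not to the full spectrum of the matrix.

```python
from itertools import accumulate


def cumulative_energy(singular_values):
    """Fraction of the total squared singular values captured by the first k, for k = 1..n."""
    squares = [s * s for s in singular_values]
    total = sum(squares)
    return [c / total for c in accumulate(squares)]


def smallest_k(singular_values, target):
    """Smallest k whose leading singular values capture at least `target` of the energy."""
    for k, frac in enumerate(cumulative_energy(singular_values), start=1):
        if frac >= target:
            return k
    # Rounding can leave the last fraction a hair below 1.0; fall back to all values.
    return len(singular_values)


# First ten singular values quoted in this thread (svd.r$d[1:10]); illustrative input only.
d = [640.63362, 244.83635, 217.84622, 159.15360, 158.21191,
     145.87261, 126.57977, 121.90770, 106.82918, 99.74794]

if __name__ == "__main__":
    # Print the captured fraction for each k to eyeball where the decay flattens.
    for k, frac in enumerate(cumulative_energy(d), start=1):
        print(k, round(frac, 3))
```

A flat tail in these fractions is the situation where no single k stands out, so the choice still has to be checked against downstream results with different k's, as described above.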
The only time you would not get good results is if the spectrum does not have a
good decay, which is equivalent to having mostly the same variance in most of the
original basis directions. This problem is similar to the one that arises with PCA
when you try to do dimensionality reduction while retaining a certain percentage
If you use k > 40, you are already beating Lanczos for larger datasets. k>10
is unlikely to be meaningful. p need not be more than 15% of k (the default is 15).
Use q=1; q>1 does not yield tangible improvements in the real world. Again,
see Nathan Halko's dissertation for the accuracy comparison.
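
As a rough sketch of where k, p and q enter, this is the generic randomized SVD scheme of the kind studied in Halko's work, not necessarily the exact steps of Mahout's SSVD job. The symbols A (the m x n input matrix), Omega, Y, Q and B are notation introduced here for illustration.

```latex
\begin{align*}
\Omega &\in \mathbb{R}^{n \times (k+p)} && \text{random test matrix with $p$ oversampling columns} \\
Y &= (A A^{\top})^{q} A \Omega && \text{$q$ power iterations sharpen the spectrum decay} \\
Y &= Q R && \text{orthonormal basis $Q$ for the range of $Y$} \\
B &= Q^{\top} A, \quad B = \hat{U} \Sigma V^{\top} && \text{small SVD of the $(k+p) \times n$ matrix $B$} \\
A &\approx (Q \hat{U}) \Sigma V^{\top} && \text{keep the leading $k$ singular triplets}
\end{align*}
```

With q=0 the basis comes from A Omega alone, which is where a slowly decaying spectrum hurts accuracy the most; each extra power iteration costs additional passes over A.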
On Fri, Aug 2, 201
Keeping Lanczos would be nice. Like I said, it's currently being used in
some projects with good results, and I think it's easier to tune, so it would
be my first choice for future developments. I still need to test SSVD further,
especially because in the current example I'm working on it yields very
dif
I would also be fine with keeping it if there is demand. I just proposed to
deprecate it and nobody voted against that at that point in time.
--sebastian
On 02.08.2013 03:12, Dmitriy Lyubimov wrote:
> There's a part of Nathan Halko's dissertation, referenced on the algorithm
> page, that runs a comparison.
There's a part of Nathan Halko's dissertation, referenced on the algorithm
page, that runs a comparison. In particular, he was not able to compute more than 40
eigenvectors with Lanczos on the Wikipedia dataset. You may refer to that
study.
On the accuracy part, it was not observed that it was a problem, assum
On Thu, Aug 1, 2013 at 7:08 AM, Sebastian Schelter wrote:
> IIRC the main reasons for deprecating Lanczos were that, in contrast to
> SSVD, it does not use a constant number of MapReduce jobs and that our
> implementation has the constraint that all the resulting vectors have to
> fit into the memo
IIRC the main reasons for deprecating Lanczos were that, in contrast to
SSVD, it does not use a constant number of MapReduce jobs and that our
implementation has the constraint that all the resulting vectors have to
fit into the memory of the driver machine.
Best,
Sebastian
On 01.08.2013 12:15, Fer
Hi everyone,
Sorry if I duplicate the question but I've been looking for an answer and I
haven't found an explanation other than it's not being used (together with
some other algorithms). If it's been discussed in depth before maybe you
can point me to some link with the discussion.
I have succes