Like Jake said.

On Sun, Aug 29, 2010 at 4:48 PM, Ted Dunning <[email protected]> wrote:

>
> In particular, our sparse representation requires an int (4 bytes) and a
> double (8 bytes) to store one non-zero entry, while a dense row requires
> only 8 bytes per entry.  With rank 200, your original data would therefore
> require less storage than the dense result if it has fewer than
> 200 * 8 / 12 = 133 non-zero entries per row on average.  Depending on the
> data-set, this could be very likely or totally implausible.
>
> SVD is still useful in these cases because it can provide useful smoothing.
>
>
> On Sun, Aug 29, 2010 at 3:29 PM, Akshay Bhat <[email protected]> wrote:
>
>> Even though the SVD is supposed to reduce dimensionality, it does not mean
>> that your results will be smaller [in terms of memory], since U, S
>> and V are dense matrices, unless you keep only a few eigenvectors.
>> Your input matrix is sparse; had it been represented as a dense matrix, it
>> would have been far larger.
>>
>>
>> On Sun, Aug 29, 2010 at 5:13 PM, Grant Ingersoll <[email protected]> wrote:
>>
>> > It should be noted that cranking the rank down to 20 produces a
>> > significantly smaller result.
>> >
>> >
>> > On Aug 29, 2010, at 4:38 PM, Grant Ingersoll wrote:
>> >
>> > > I'm running SVD as:
>> > > ./mahout svd --input /tmp/solr-clust-n2/part-out.vec --tempDir /tmp/solr-clust-n2/svdTemp --output /tmp/solr-clust-n2/svdOut --rank 200 --numCols 65458 --numRows 130103
>> > > ./mahout cleansvd --eigenInput /tmp/solr-clust-n2/svdOut --corpusInput /tmp/solr-clust-n2/part-out.vec --output /tmp/solr-clust-n2/svdFinal --maxError 0.1 --minEigenvalue 10.0
>> > >
>> > > part-out.vec is 52 MB.  The output from SVD (svdOut) is 104 MB and
>> > > largestCleanEigens is 88 MB.  For some reason, this really doesn't feel
>> > > right.
>> > >
>> > > Is there a guide on interpreting the output of SVD anywhere?
>> > > Intuitively, I believe the output should be a lot smaller?  I mean,
>> > > that's the point, right?
>> > >
>> > > I can share the vector if you want.
>> > >
>> > > -Grant
>> > >
>> > > --------------------------
>> > > Grant Ingersoll
>> > > http://lucenerevolution.org Lucene/Solr Conference, Boston Oct 7-8
>> > >
>> >
>> > --------------------------
>> > Grant Ingersoll
>> > http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 7-8
>> >
>> >
>>
>>
>> --
>> Akshay Uday Bhat.
>> Graduate Student, Computer Science, Cornell University
>> Website: http://www.akshaybhat.com
>>
>
>
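
For anyone who wants to check the arithmetic quoted above, here is a rough
back-of-the-envelope sketch in Python. It rests on assumptions that are not
confirmed anywhere in this thread: that the output holds one dense vector of
numCols doubles per rank, that serialization overhead can be ignored, and
that MB means 10^6 bytes. The function names are made up for illustration
and are not part of Mahout.

```python
BYTES_PER_DOUBLE = 8  # one dense entry, as stated in the quoted message
BYTES_PER_INT = 4     # index stored alongside each sparse non-zero entry


def dense_bytes(num_vectors, vector_length):
    """Estimated size of num_vectors dense vectors of doubles (no overhead)."""
    return num_vectors * vector_length * BYTES_PER_DOUBLE


def sparse_break_even(rank):
    """Non-zeros per row below which a sparse row is smaller than a dense
    row of `rank` doubles: rank * 8 / (4 + 8)."""
    return rank * BYTES_PER_DOUBLE / (BYTES_PER_INT + BYTES_PER_DOUBLE)


if __name__ == "__main__":
    num_cols = 65458  # --numCols from the svd command in the thread
    for rank in (200, 20):
        size_mb = dense_bytes(rank, num_cols) / 1e6
        print(f"rank {rank}: about {size_mb:.1f} MB of dense eigenvectors")
    print(f"break-even at rank 200: {sparse_break_even(200):.1f} non-zeros per row")
```

Under those assumptions the rank 200 estimate comes to roughly 104.7 MB, which
is in the same range as the 104 MB reported for svdOut, and the rank 20
estimate is ten times smaller. Treat it as a sanity check on the reasoning
rather than a description of the actual file layout.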
