Hey Lars,

So, my model built successfully last night with your new code, thanks! I'll
see if I can build larger models as well this week and put memory usage to
the test a bit, but this is a *massive* improvement. Our scientist is
making sure the output is sane, but I'm assuming you already did basic
sanity checks.

Thanks so much for spending some time on this. I'll give it a try first
thing tomorrow and report back. Thanks, Lars!

-Will
On Fri, Aug 30, 2013 at 1:32 AM, Lars Buitinck wrote:
> 2013/8/30 Will Buckner :
> > Damn, hmm. This just seems so so heavy to calculate reconstruction_err,
> > which isn't even used inside the algorithm.
2013/8/30 Will Buckner :
> Damn, hmm. This just seems so so heavy to calculate reconstruction_err,
> which isn't even used inside the algorithm. I don't even use it in the
> pipeline. My current best idea is just to subclass ProjectedGradientNMF()
> and overload fit_transform(), not computing reconstruction_err at all.
Damn, hmm. This just seems so so heavy to calculate reconstruction_err,
which isn't even used inside the algorithm. I don't even use it in the
pipeline. My current best idea is just to subclass ProjectedGradientNMF()
and overload fit_transform(), not computing reconstruction_err at all.
This looks great, and I'll talk to my lead scientist about incorporating it
and evaluating, thanks! I must warn you all, I'm not an algorithms guy; I'm
on the software/performance/"make this shit work" side of things. For the
task at hand, we're using NMF for a reason, and I've gotta make this
work.
2013/8/29 Lars Buitinck :
> W, H = csr_matrix(W), csr_matrix(H)
> reconstruction_err = euclidean_distances(X, W * H).sum()
Never mind. Even after fixing the formula, W * H is actually too dense
for this to be any good, regardless of the initialization and
sparseness parameter.
--
Lars Buitinck
2013/8/29 Will Buckner :
> Er, it looks like safe_sparse_dot() returns sparse unless dense_output=True.
No, it returns dense output when one of its two arguments is dense.
dense_output only exists to force dense output in the sparse-sparse
case.
safe_sparse_norm(X - safe_sparse_dot(W, H)) would not avoid the dense
intermediate either.
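(The output rules Lars describes are easy to check directly; a minimal
sketch, assuming a scikit-learn version where safe_sparse_dot is importable
from sklearn.utils.extmath:)

```python
import numpy as np
import scipy.sparse as sp
from sklearn.utils.extmath import safe_sparse_dot

A = sp.random(5, 4, density=0.3, format="csr", random_state=0)
B = np.ones((4, 3))

# sparse @ dense: the result is dense, regardless of dense_output
print(sp.issparse(safe_sparse_dot(A, B)))                       # False
# sparse @ sparse: the result stays sparse by default...
print(sp.issparse(safe_sparse_dot(A, A.T)))                     # True
# ...and dense_output=True only has an effect in this sparse-sparse case
print(sp.issparse(safe_sparse_dot(A, A.T, dense_output=True)))  # False
```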
> Er, it looks like safe_sparse_dot() returns sparse unless dense_output=True.
> And, I'm confused as to how this would result in more memory. Aren't we
> allocating more in the lines above for the issparse(X) case? I'm stuck right
> now because my 40k x 220k CSR matrix can't make it past computing the
> reconstruction error.
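(The 40k x 220k figure makes the memory objection concrete; a quick
back-of-envelope calculation, my numbers, assuming float64 entries:)

```python
# Rough memory cost of materializing a dense matrix the shape of X,
# which is what a dense W @ H (or X - W @ H) intermediate would require.
n_samples, n_features = 40_000, 220_000   # the CSR matrix mentioned above
gib = n_samples * n_features * 8 / 2**30  # 8 bytes per float64 entry
print(round(gib, 1))  # → 65.6
```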
2013/8/29 Will Buckner :
>> the motivation for these lines is that even if X is sparse
>> safe_sparse_dot(W, H)
>> will not be. So you will allocate a matrix of size X but dense which is
>> unacceptable in many cases.
>
> Er, it looks like safe_sparse_dot() returns sparse unless dense_output=True.
the motivation for these lines is that even if X is sparse,
safe_sparse_dot(W, H)
will not be. So you will allocate a matrix the size of X, but dense, which
is unacceptable in many cases.
hi Will,

> if not sp.issparse(X):
>     self.reconstruction_err_ = norm(X - np.dot(W, H))
> else:
>     norm2X = np.sum(X.data ** 2)  # Ok because X is CSR
>     normWHT = np.trace(np.dot(np.dot(H.T, np.dot(W.T, W)), H))
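(The quoted branch cuts off before the cross term, but the identity it
relies on, ||X - W*H||_F^2 = ||X||^2 + ||W*H||^2 - 2*tr(H'W'X), can be
completed and checked against the dense computation. A minimal sketch; the
cross_prod line and the final sqrt are my reconstruction of the missing
part, not necessarily the exact code in the patch:)

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.RandomState(0)
X = sp.random(50, 80, density=0.1, format="csr", random_state=rng)
W = np.abs(rng.randn(50, 5))
H = np.abs(rng.randn(5, 80))

# Dense reference: ||X - W H||_F
dense_err = np.linalg.norm(X.toarray() - np.dot(W, H))

# Sparse-friendly expansion: ||X||^2 + ||W H||^2 - 2 tr(H' W' X),
# never materializing the dense n_samples x n_features product W H.
norm2X = np.sum(X.data ** 2)                   # Ok because X is CSR
normWHT = np.trace(np.dot(np.dot(H.T, np.dot(W.T, W)), H))
cross_prod = np.trace(np.dot((X * H.T).T, W))  # X * H.T is only n_samples x k
sparse_err = np.sqrt(norm2X + normWHT - 2.0 * cross_prod)

print(abs(dense_err - sparse_err) < 1e-8)      # True
```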