2014-11-27 17:26 GMT+01:00 Ian Ozsvald <[email protected]>:
> If safe_sparse_dot is called with dense_output=False then I get a sparse
> result and everything looks sensible with low RAM usage.
>
> I'm using 0.15; the current GitHub master shows the line:
> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/metrics/pairwise.py#L692
>
> Was there a design decision to force dense matrices at this point? Maybe
> some call paths assume a dense result?
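Practically all consumers of kernels and distances expect dense
outputs. If you just want pairwise cosine similarities and you expect
lots of zeros, try X_normalized * Y_normalized.T, where both operands
are scipy.sparse matrices with L2-normalized rows (for scipy.sparse
matrices, * is the matrix product, not an elementwise one).
But note that pairwise_distances(X, Y, metric='cosine') computes the
"cosine distance", which is 1 - cosine similarity. Every zero in the
similarity matrix becomes a 1 in the distance matrix, so if the result
of the matrix multiplication is sparse, the "distance" result is
guaranteed not to be sparse.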
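
A minimal sketch of the above, using made-up toy data. The helper
l2_normalize_rows is hypothetical, written here only so the example
depends on nothing but NumPy and SciPy; sklearn.preprocessing.normalize
should do the same job for CSR input. It keeps the similarity sparse and
then shows why the distance is dense:

```python
import numpy as np
import scipy.sparse as sp


def l2_normalize_rows(M):
    """Hypothetical helper: scale each row of a sparse matrix to unit L2 norm."""
    M = sp.csr_matrix(M, dtype=np.float64)
    norms = np.sqrt(np.asarray(M.multiply(M).sum(axis=1)).ravel())
    norms[norms == 0.0] = 1.0  # leave all-zero rows untouched
    return sp.diags(1.0 / norms).dot(M).tocsr()


# Toy data (an assumption for illustration): 3 and 2 samples, 3 features.
X = sp.csr_matrix(np.array([[1.0, 0.0, 0.0],
                            [0.0, 2.0, 0.0],
                            [0.0, 3.0, 4.0]]))
Y = sp.csr_matrix(np.array([[0.0, 0.0, 5.0],
                            [2.0, 0.0, 0.0]]))

X_normalized = l2_normalize_rows(X)
Y_normalized = l2_normalize_rows(Y)

# Sparse matrix product: the result stays sparse, so rows that share no
# features cost no memory.
similarity = X_normalized.dot(Y_normalized.T)

# Cosine distance is 1 - similarity: every stored-nothing zero turns
# into a 1, so the distance has to be materialised as a dense array.
distance = 1.0 - similarity.toarray()

print("stored similarity entries:", similarity.nnz)
print("nonzero distance entries:", np.count_nonzero(distance))
```

If memory is the concern, it is probably better to keep working with the
sparse similarity and only convert to a distance for the entries you
actually need.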