Would it be useful to propagate the dense_output=True flag further up
through the interfaces, so that callers could request a sparse result?
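
To make the question concrete, here is a rough sketch of what I mean. The
wrapper name cosine_similarity_sketch and its signature are hypothetical (my
own assumption, not scikit-learn API); it only relies on normalize and
safe_sparse_dot, and simply forwards the flag instead of hard-coding it.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.preprocessing import normalize
from sklearn.utils.extmath import safe_sparse_dot


def cosine_similarity_sketch(X, Y=None, dense_output=True):
    """Hypothetical wrapper: cosine similarity that forwards dense_output.

    Not the scikit-learn implementation; the name and signature are
    assumptions made for illustration only.
    """
    # L2-normalise the rows so the dot product equals the cosine similarity.
    X_normalized = normalize(X, norm="l2", axis=1)
    Y_normalized = X_normalized if Y is None else normalize(Y, norm="l2", axis=1)
    # safe_sparse_dot only densifies a sparse product when dense_output is True.
    return safe_sparse_dot(X_normalized, Y_normalized.T, dense_output=dense_output)


# Tiny sparse input: three rows with no terms in common.
X = sp.csr_matrix(np.array([[1.0, 0.0, 0.0],
                            [0.0, 2.0, 0.0],
                            [0.0, 0.0, 3.0]]))

K_sparse = cosine_similarity_sketch(X, dense_output=False)  # stays sparse
K_dense = cosine_similarity_sketch(X)  # default keeps the dense behaviour
```

Defaulting to dense_output=True would leave existing callers untouched, since
most consumers expect dense kernels anyway.
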
On 27 Nov 2014 17:40, "Ian Ozsvald" <[email protected]> wrote:
> Hi Andy, Lars. I was picking up a colleague's code here; I think he used
> pairwise_kernels just because it was handy. I agree that if I just compute
> X*X.T I get the sparse result that I'm after. It was more that I was
> confused that various methods in that call stack made a point of preserving
> sparsity until the final step, which deliberately took it away.
>
> Anyhow, I've now sparsified this particular routine so I'm back in the
> happy state that 16GB is not a bottleneck (at all).
>
> Cheers, i.
>
> On 27 November 2014 at 16:37, Lars Buitinck <[email protected]> wrote:
>
>> 2014-11-27 17:26 GMT+01:00 Ian Ozsvald <[email protected]>:
>> > If safe_sparse_dot is called with dense_output=False then I get a sparse
>> > result and everything looks sensible with low RAM usage.
>> >
>> > I'm using 0.15; the current GitHub master shows the line:
>> >
>> > https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/metrics/pairwise.py#L692
>> >
>> > Was there a design decision to force dense matrices at this point? Maybe
>> > some call paths assume a dense result?
>>
>> Practically all the consumers of kernels/distances expect to get dense
>> outputs. If you just want pairwise cosine similarities and you expect
>> lots of zeros, try X_normalized * Y_normalized.T.
>>
>> But note that pairwise_distances(X, metric='cosine') computes the
>> "cosine distance", which is 1 - cosine similarity. Every zero in a
>> sparse similarity matrix becomes a distance of 1, so if the result of
>> the matrix multiplication is sparse, the "distance" result is
>> guaranteed not to be sparse.
>>
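
A small illustration of the point above, as a sketch using only numpy and
scipy.sparse. The toy matrix is made up for the example, and its rows are
one-hot so they already have unit norm and need no separate normalisation
step.

```python
import numpy as np
import scipy.sparse as sp

# Four "documents" over three terms; each row is one-hot, hence unit L2 norm.
rows = np.array([0, 1, 2, 3])
cols = np.array([0, 0, 1, 2])
X = sp.csr_matrix((np.ones(4), (rows, cols)), shape=(4, 3))

# Cosine similarity of unit-norm rows is a plain sparse matrix product.
S = X @ X.T
stored_similarities = S.nnz  # only pairs that share a term are stored

# Cosine distance is 1 - similarity, so every zero similarity becomes 1.0.
D = 1.0 - S.toarray()
nonzero_distances = int(np.count_nonzero(D))

print(stored_similarities, nonzero_distances, D.size)
```

Here only 6 of the 16 similarities are stored, while 10 of the 16 distances
are non-zero, and the gap grows as the similarity matrix gets sparser.
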
>>
>> ------------------------------------------------------------------------------
>> Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server
>> from Actuate! Instantly Supercharge Your Business Reports and Dashboards
>> with Interactivity, Sharing, Native Excel Exports, App Integration & more
>> Get technology previously reserved for billion-dollar corporations, FREE
>>
>> http://pubads.g.doubleclick.net/gampad/clk?id=157005751&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Scikit-learn-general mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>
>
>
> --
> Ian Ozsvald (A.I. researcher)
> [email protected]
>
> http://IanOzsvald.com
> http://ModelInsight.io
> http://MorConsulting.com
> http://Annotate.IO
> http://SocialTiesApp.com
> http://TheScreencastingHandbook.com
> http://FivePoundApp.com
> http://twitter.com/IanOzsvald
> http://ShowMeDo.com
>