2013/8/3 Gael Varoquaux <[email protected]>:
> On Tue, Jul 30, 2013 at 11:15:45AM -0700, Jacob Vanderplas wrote:
>> Additionally, the results of the benchmarks will be highly dependent on the
>> structure of the data.  Data with a low intrinsic dimensionality is usually
>> handled better by Ball Tree, while dense data (even if it is in blobs) will
>> not be handled especially well by any method.
>
> Indeed. However, I tried two real-world datasets, Labeled Faces in the
> Wild and MNIST, that I would have expected to have structure, and even
> in high dimension, your KDTree was still faster than your BallTree.
>
> I am attaching the benchmark scripts, as well as the results. They are
> somewhat interesting. I'd be interested in any feedback on these.
>
> The good news is that we are most often faster than scipy's KDTree.
> However, these benchmarks seem to suggest that the 'auto' method, for
> euclidean distance, should not switch to BallTree.

Indeed, I think the 'auto' heuristic was written for the previous
implementation and has not been updated to take into account the
performance behavior of the new implementation.
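
A quick side-by-side timing is one way to check this on a given dataset.
The following is only a minimal sketch, assuming random uniform data as a
stand-in for the real datasets; the array sizes, the `time_tree` helper and
`leaf_size=30` are illustrative choices, not taken from the attached
benchmark scripts:

```python
import time

import numpy as np
from sklearn.neighbors import BallTree, KDTree


def time_tree(tree_cls, X, queries, k=5, leaf_size=30):
    """Return (build_seconds, query_seconds, distances) for one tree class.

    `time_tree` is a hypothetical helper for this sketch, not a
    scikit-learn function.
    """
    t0 = time.perf_counter()
    tree = tree_cls(X, leaf_size=leaf_size)   # build the index
    t1 = time.perf_counter()
    dist, _ = tree.query(queries, k=k)        # k-nearest-neighbor query
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1, dist


# Assumed toy data: dense uniform points, so no low-dimensional structure.
rng = np.random.RandomState(0)
X = rng.rand(2000, 20)
queries = rng.rand(200, 20)

results = {cls.__name__: time_tree(cls, X, queries)
           for cls in (KDTree, BallTree)}

for name, (build, query, _) in results.items():
    print(f"{name:8s} build={build:.4f}s  query={query:.4f}s")
```

Both trees are exact, so they should return the same distances and only
the timings should differ. Until the heuristic is revisited, passing
algorithm='kd_tree' to NearestNeighbors bypasses 'auto' entirely.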


-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
