orhankislal opened a new pull request #505: URL: https://github.com/apache/madlib/pull/505
JIRA: MADLIB-1017

The brute-force DBSCAN runs in O(N^2) time. To improve this, we added two layers of indexing:

1. The data is broken into chunks with a kd-tree.
2. Each leaf of the kd-tree creates an R-tree index for efficient range queries.

In addition, we added a separate process that reduces the number of edges to consider during the weakly connected components (wcc) operation.

- [ ] Add the module name, JIRA# to PR/commit and description.
- [ ] Add tests for the change.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
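
For readers unfamiliar with the two-level indexing that the pull request describes, the idea can be sketched roughly as below. This is not the MADlib implementation: the function names are hypothetical, the per-leaf index is a plain linear scan standing in for an R-tree, and the handling of neighbors that fall in adjacent leaves (checking every leaf whose bounding box lies within `eps` of the query point) is an assumption made only for illustration.

```python
import math


def build_leaves(points, depth=0, max_depth=2):
    """kd-tree style chunking: split at the median along alternating axes.

    Returns the list of leaf chunks (each a list of point tuples).
    """
    if not points:
        return []
    if depth == max_depth or len(points) <= 1:
        return [points]
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (build_leaves(ordered[:mid], depth + 1, max_depth)
            + build_leaves(ordered[mid:], depth + 1, max_depth))


def bounding_box(leaf):
    """Axis-aligned bounding box of a non-empty leaf as (lows, highs)."""
    dims = range(len(leaf[0]))
    return ([min(p[d] for p in leaf) for d in dims],
            [max(p[d] for p in leaf) for d in dims])


def box_distance(point, box):
    """Euclidean distance from a point to a box (0 if the point is inside)."""
    lows, highs = box
    return math.sqrt(sum(max(lo - x, 0.0, x - hi) ** 2
                         for x, lo, hi in zip(point, lows, highs)))


def range_query(point, eps, leaves, boxes):
    """Return all points within eps of `point`.

    Leaves whose bounding box is farther than eps are skipped entirely;
    the remaining leaves are scanned linearly (an R-tree would go here).
    """
    neighbors = []
    for leaf, box in zip(leaves, boxes):
        if box_distance(point, box) > eps:
            continue  # no point in this leaf can be within eps
        neighbors.extend(q for q in leaf if math.dist(point, q) <= eps)
    return neighbors
```

A caller would build the leaves once with `build_leaves`, compute one box per leaf with `bounding_box`, and then run `range_query` for each point. The saving over the brute-force approach comes from skipping whole leaves whose boxes are too far away.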