What if you had Arrays of Matrices, or Arrays of Arrays of Matrices? (e.g. 3d and 4d tensors)?
I implemented these for the MLPs (still WIP) https://github.com/apache/mahout/pull/323/files#diff-cd8a7c5e2cf42b91b5aa47c96daf19c0R25 But those functions were specifically to overcome the challenges you describe wrt neural network weight sets. If you could use those as is- that would be awesome, if not maybe you'll find inspiration there. On Thu, Jun 22, 2017 at 6:43 PM, Aditya <adityasarma...@gmail.com> wrote: > Hello everyone, > > I've been working for the past few weeks on coding an in-core DBSCAN > algorithm. > > A more efficient version with an O(n*log(n)) complexity does exist but it > uses the R-Tree data structure to index the data. I have a few > concerns/questions and I'm hoping you would be able to help me out. > > 1. Based on my knowledge, using an indexing data structure like an R-Tree > or a Kd-Tree is the only way to reduce the complexity of the dbscan > algorithm. If there's any other method that you are familiar with, please > let me know. > > 2. From what I've read in the book Apache Mahout: Beyond MapReduce written > by Andrew and Dmitry, I don't see how I can directly exploit the existing > data structures and operations to get the functionality of an R-Tree. > > 3. On the off chance that an R-Tree module has to built in Mahout, because > it is not possible to easily use existing operations I need some insights > as to how to go about it. I learned that everything in Mahout at the end > should be serializable to a vector. I can't fathom how to do that > intuitively in the case of an R-Tree > > There are a couple of other concerns that need to be discussed but these > are vital at the moment. > > PS: The research paper that proposed the DBSCAN algorithm can be found here > <https://www.aaai.org/Papers/KDD/1996/KDD96-037.pdf>. > > Thanks! > > -Aditya >