An attempt to implement dbscan algorithm on top of Spark

2014-06-12 Thread Aliaksei Litouka
Hi. I'm not sure if messages like this are appropriate in this list; I just want to share with you an application I am working on. This is my personal project which I started to learn more about Spark and Scala, and, if it succeeds, to contribute it to the Spark community. Maybe someone will find

Re: An attempt to implement dbscan algorithm on top of Spark

2014-06-12 Thread Vipul Pandey
Great! I was going to implement one of my own - but I may not need to do that any more :) I haven't had a chance to look deeply into your code, but I would recommend accepting an RDD[Double,Double] as well, instead of just a file. val data = IOHelper.readDataset(sc, "/path/to/my/data.csv") And other

Re: An attempt to implement dbscan algorithm on top of Spark

2014-06-12 Thread Aliaksei Litouka
Vipul, Thanks for your feedback. As far as I understand, you mean RDD[(Double, Double)] (note the parentheses), where each of these Double values is supposed to contain one coordinate of a point. That limits us to 2-dimensional space, which is not suitable for many tasks. I want the algorithm to be able
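
To illustrate the dimensionality point raised above, here is a minimal sketch in plain Scala (no Spark dependency) of how a pair of coordinates could be lifted into a representation that supports any number of dimensions. The names PointSketch, Point, fromPair and distance are hypothetical and are not taken from the project discussed in this thread; the Euclidean metric is likewise only an assumption for the example.

```scala
// Hypothetical sketch: none of these names come from the project in this thread.
object PointSketch {
  // A point with any number of coordinates, not just two.
  type Point = Array[Double]

  // Lift a 2-D (x, y) pair into the general representation.
  def fromPair(p: (Double, Double)): Point = Array(p._1, p._2)

  // Euclidean distance between two points of equal dimension
  // (an assumed metric, chosen only for illustration).
  def distance(a: Point, b: Point): Double = {
    require(a.length == b.length, "points must have the same dimension")
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)
  }
}
```

With Spark, the same idea would presumably mean working with an RDD[Array[Double]], so that a caller holding an RDD[(Double, Double)] could convert it with something like pairs.map(PointSketch.fromPair) before clustering.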