You need to build the Spark assembly with your modification and deploy it to the cluster.
Sincerely,

DB Tsai
----------------------------------------------------------
Blog: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D

On Wed, Jun 17, 2015 at 5:11 PM, Raghav Shankar <raghav0110...@gmail.com> wrote:
> I've implemented this in the suggested manner. When I build Spark and attach
> the new spark-core jar to my Eclipse project, I am able to use the new
> method. In order to conduct the experiments, I need to launch my app on a
> cluster. I am using EC2. When I set up my master and slaves using the EC2
> setup scripts, they set up Spark, but I think my custom-built spark-core jar
> is not being used. How do I set it up on EC2 so that my custom version of
> spark-core is used?
>
> Thanks,
> Raghav
>
> On Jun 9, 2015, at 7:41 PM, DB Tsai <dbt...@dbtsai.com> wrote:
>
> Having the following code in RDD.scala works for me. P.S.: in the following
> code, I merge the smaller queue into the larger one. I wonder if this will
> help performance. Let me know when you do the benchmark.
>
> def treeTakeOrdered(num: Int)(implicit ord: Ordering[T]): Array[T] = withScope {
>   if (num == 0) {
>     Array.empty
>   } else {
>     val mapRDDs = mapPartitions { items =>
>       // The priority queue keeps the largest elements, so let's reverse the ordering.
>       val queue = new BoundedPriorityQueue[T](num)(ord.reverse)
>       queue ++= util.collection.Utils.takeOrdered(items, num)(ord)
>       Iterator.single(queue)
>     }
>     if (mapRDDs.partitions.length == 0) {
>       Array.empty
>     } else {
>       mapRDDs.treeReduce { (queue1, queue2) =>
>         if (queue1.size > queue2.size) {
>           queue1 ++= queue2
>           queue1
>         } else {
>           queue2 ++= queue1
>           queue2
>         }
>       }.toArray.sorted(ord)
>     }
>   }
> }
>
> def treeTop(num: Int)(implicit ord: Ordering[T]): Array[T] = withScope {
>   treeTakeOrdered(num)(ord.reverse)
> }
>
> Sincerely,
>
> DB Tsai
> ----------------------------------------------------------
> Blog: https://www.dbtsai.com
> PGP Key ID: 0xAF08DF8D
>
> On Tue, Jun 9, 2015 at 10:09 AM, raggy <raghav0110...@gmail.com> wrote:
>> I am trying to implement top-k in Scala within Apache Spark. I am aware
>> that Spark has a top action. But top() uses reduce(). Instead, I would
>> like to use treeReduce(). I am trying to compare the performance of
>> reduce() and treeReduce().
>>
>> The main issue I have is that I cannot use these two lines of code, which
>> are used in the top() action, within my Spark application:
>>
>> val queue = new BoundedPriorityQueue[T](num)(ord.reverse)
>> queue ++= util.collection.Utils.takeOrdered(items, num)(ord)
>>
>> How can I go about implementing top() using treeReduce()?
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Implementing-top-using-treeReduce-tp23227.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
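For anyone who wants to experiment with the same idea outside of a Spark build (BoundedPriorityQueue and util.collection.Utils live in package-private Spark internals, which is why the two quoted lines cannot be called from an application), here is a minimal self-contained sketch of the same pattern using the public scala.collection.mutable.PriorityQueue. The names TreeTopK, takeSmallestK, and merge are illustrative helpers, not Spark APIs; merge folds the smaller heap into the larger one, which is the optimization mentioned in the thread.

```scala
import scala.collection.mutable.PriorityQueue

object TreeTopK {
  // Keep the k smallest elements of `items` using a size-k max-heap:
  // the heap head is the current worst (largest) of the kept elements,
  // so a new element replaces it only if the new element is smaller.
  def takeSmallestK[T](items: Iterator[T], k: Int)
                      (implicit ord: Ordering[T]): PriorityQueue[T] = {
    val queue = PriorityQueue.empty[T](ord) // max-heap under ord
    for (x <- items) {
      if (queue.size < k) queue.enqueue(x)
      else if (ord.lt(x, queue.head)) { queue.dequeue(); queue.enqueue(x) }
    }
    queue
  }

  // Merge two bounded heaps, folding the smaller one into the larger
  // (fewer insertions), while keeping the result capped at k elements.
  def merge[T](q1: PriorityQueue[T], q2: PriorityQueue[T], k: Int)
              (implicit ord: Ordering[T]): PriorityQueue[T] = {
    val (big, small) = if (q1.size >= q2.size) (q1, q2) else (q2, q1)
    for (x <- small) {
      if (big.size < k) big.enqueue(x)
      else if (ord.lt(x, big.head)) { big.dequeue(); big.enqueue(x) }
    }
    big
  }

  def main(args: Array[String]): Unit = {
    // Simulate per-partition heaps followed by a pairwise reduce of heaps,
    // which is the shape treeReduce gives you on real RDD partitions.
    val parts = Seq(Seq(9, 1, 7, 3), Seq(8, 2, 6, 4), Seq(5, 0))
    val k = 3
    val heaps = parts.map(p => takeSmallestK(p.iterator, k))
    val top = heaps.reduce((a, b) => merge(a, b, k)).toSeq.sorted
    println(top.mkString(",")) // 0,1,2
  }
}
```

In Spark itself, takeSmallestK corresponds to the mapPartitions step and merge to the treeReduce combiner above; the local reduce here only stands in for the cluster-side tree aggregation.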