So, would I add the assembly jar to just the master, or would I have to add it to all the slaves/workers too?
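A minimal sketch of what I have in mind, assuming the standard spark-ec2 layout (the exact paths and the copy-dir helper are assumptions on my part, not something confirmed in this thread):

```shell
# On the EC2 master: rebuild the assembly so it picks up the modified spark-core.
cd /root/spark
sbt/sbt assembly

# Every worker needs the modified build too -- executors run from each
# node's local Spark install, not from the master's. spark-ec2 clusters
# ship a copy-dir helper that rsyncs a directory out to all slaves:
/root/spark-ec2/copy-dir /root/spark
```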
Thanks,
Raghav

> On Jun 17, 2015, at 5:13 PM, DB Tsai <dbt...@dbtsai.com> wrote:
>
> You need to build the Spark assembly with your modification and deploy
> it to the cluster.
>
> Sincerely,
>
> DB Tsai
> ----------------------------------------------------------
> Blog: https://www.dbtsai.com
> PGP Key ID: 0xAF08DF8D
>
>
> On Wed, Jun 17, 2015 at 5:11 PM, Raghav Shankar <raghav0110...@gmail.com> wrote:
>> I’ve implemented this in the suggested manner. When I build Spark and attach
>> the new spark-core jar to my Eclipse project, I am able to use the new
>> method. To conduct the experiments, I need to launch my app on a cluster.
>> I am using EC2. When I set up my master and slaves using the EC2 setup
>> scripts, Spark gets installed, but I think my custom-built spark-core jar
>> is not being used. How do I set things up on EC2 so that my custom version
>> of spark-core is used?
>>
>> Thanks,
>> Raghav
>>
>> On Jun 9, 2015, at 7:41 PM, DB Tsai <dbt...@dbtsai.com> wrote:
>>
>> Having the following code in RDD.scala works for me. PS: in the following
>> code, I merge the smaller queue into the larger one. I wonder if this will
>> help performance. Let me know when you do the benchmark.
>>
>> def treeTakeOrdered(num: Int)(implicit ord: Ordering[T]): Array[T] = withScope {
>>   if (num == 0) {
>>     Array.empty
>>   } else {
>>     val mapRDDs = mapPartitions { items =>
>>       // Priority keeps the largest elements, so let's reverse the ordering.
>>       val queue = new BoundedPriorityQueue[T](num)(ord.reverse)
>>       queue ++= util.collection.Utils.takeOrdered(items, num)(ord)
>>       Iterator.single(queue)
>>     }
>>     if (mapRDDs.partitions.length == 0) {
>>       Array.empty
>>     } else {
>>       mapRDDs.treeReduce { (queue1, queue2) =>
>>         if (queue1.size > queue2.size) {
>>           queue1 ++= queue2
>>           queue1
>>         } else {
>>           queue2 ++= queue1
>>           queue2
>>         }
>>       }.toArray.sorted(ord)
>>     }
>>   }
>> }
>>
>> def treeTop(num: Int)(implicit ord: Ordering[T]): Array[T] = withScope {
>>   treeTakeOrdered(num)(ord.reverse)
>> }
>>
>>
>> Sincerely,
>>
>> DB Tsai
>> ----------------------------------------------------------
>> Blog: https://www.dbtsai.com
>> PGP Key ID: 0xAF08DF8D
>>
>> On Tue, Jun 9, 2015 at 10:09 AM, raggy <raghav0110...@gmail.com> wrote:
>>>
>>> I am trying to implement top-k in Scala within Apache Spark. I am aware
>>> that Spark has a top action. But top() uses reduce(). Instead, I would
>>> like to use treeReduce(). I am trying to compare the performance of
>>> reduce() and treeReduce().
>>>
>>> The main issue I have is that I cannot use these two lines of code, which
>>> are used in the top() action, within my Spark application:
>>>
>>> val queue = new BoundedPriorityQueue[T](num)(ord.reverse)
>>> queue ++= util.collection.Utils.takeOrdered(items, num)(ord)
>>>
>>> How can I go about implementing top() using treeReduce()?
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Implementing-top-using-treeReduce-tp23227.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org
>>>
>>
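For anyone finding this thread from the archive: the reason the two quoted lines cannot be used directly is that BoundedPriorityQueue lives in the private org.apache.spark.util package. The same per-partition-queue / pairwise-merge idea can be sketched outside Spark with a plain scala.collection.mutable.PriorityQueue. This is a local stand-in for mapPartitions + treeReduce, not DB Tsai's actual patch; the object and method names (TreeTopKSketch, insertBounded, merge, topK) are illustrative.

```scala
import scala.collection.mutable
import scala.reflect.ClassTag

object TreeTopKSketch {
  // Keep only the k largest elements: with ord.reverse the queue's head is
  // the *smallest* retained element under the original ordering, so we can
  // evict it whenever the queue grows past k.
  private def insertBounded[T](q: mutable.PriorityQueue[T], item: T, k: Int): Unit = {
    q.enqueue(item)
    if (q.size > k) q.dequeue()
  }

  // Merge the smaller queue into the larger one, as suggested in the thread,
  // re-trimming to k as we go.
  private def merge[T](a: mutable.PriorityQueue[T], b: mutable.PriorityQueue[T],
                       k: Int): mutable.PriorityQueue[T] = {
    val (big, small) = if (a.size >= b.size) (a, b) else (b, a)
    small.foreach(insertBounded(big, _, k))
    big
  }

  // Local stand-in for mapPartitions + treeReduce: build one bounded queue
  // per "partition", merge the queues pairwise, then sort the survivors.
  def topK[T](partitions: Seq[Seq[T]], k: Int)
             (implicit ord: Ordering[T], ct: ClassTag[T]): Array[T] = {
    val queues = partitions.map { part =>
      val q = mutable.PriorityQueue.empty[T](ord.reverse)
      part.foreach(insertBounded(q, _, k))
      q
    }
    if (queues.isEmpty) Array.empty[T]
    else queues.reduce(merge(_, _, k)).toArray.sorted(ord.reverse) // largest first
  }

  def main(args: Array[String]): Unit = {
    val parts = Seq(Seq(5, 1, 9), Seq(7, 3), Seq(2, 8, 6, 4))
    println(topK(parts, 3).mkString(", ")) // prints: 9, 8, 7
  }
}
```

On a real RDD, the per-partition loop becomes mapPartitions and the reduce becomes treeReduce, exactly as in the treeTakeOrdered code quoted above.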