Re: Submitting Spark Applications using Spark Submit

2015-06-20 Thread Raghav Shankar
…the assembly jar to the cluster instead of building it there. The EC2 machines often take much longer to build for some reason. Also, it's cumbersome to set up a proper IDE there. -Andrew. 2015-06-19 19:11 GMT-07:00 Raghav Shankar raghav0110...@gmail.com: Thanks Andrew! Is this all I have to do when …

Re: Submitting Spark Applications using Spark Submit

2015-06-19 Thread Raghav Shankar
Thanks Andrew! Is this all I have to do when using the spark-ec2 script to set up a Spark cluster? It seems to be getting an assembly jar that is not from my project (perhaps from a Maven repo). Is there a way to make the ec2 script use the assembly jar that I created? Thanks, Raghav

Re: Implementing top() using treeReduce()

2015-06-17 Thread Raghav Shankar
I’ve implemented this in the suggested manner. When I build Spark and attach the new spark-core jar to my Eclipse project, I am able to use the new method. In order to conduct the experiments, I need to launch my app on a cluster. I am using EC2. When I set up my master and slaves using the EC2 …

Re: Implementing top() using treeReduce()

2015-06-17 Thread Raghav Shankar
…DB Tsai — Blog: https://www.dbtsai.com, PGP Key ID: 0xAF08DF8D. On Wed, Jun 17, 2015 at 5:11 PM, Raghav Shankar raghav0110...@gmail.com wrote: I’ve implemented this in the suggested manner. When I build Spark and attach the new …

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Raghav Shankar
…script will upload this jar to the YARN cluster automatically, and then you can run your application as usual. It does not care about which version of Spark is in your YARN cluster. 2015-06-17 10:42 GMT+08:00 Raghav Shankar raghav0110...@gmail.com: …

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Raghav Shankar
I made the change so that I could implement top() using treeReduce(). A member on here suggested I make the change in RDD.scala to accomplish that. Also, this is for a research project, and not for commercial use. So, any advice on how I can get spark-submit to use my custom-built jars …
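
The thread is about expressing top() as a tree reduction. A minimal Spark-free sketch of that idea, simulating partitions as plain Seqs (the object and method names here are mine for illustration, not the poster's actual RDD.scala change): each partition first shrinks to its local top-k, then the buffers are merged pairwise, level by level, the way treeReduce combines partial results.

```scala
object TopViaTreeReduce {
  // Merge two top-k buffers, keeping the k largest elements overall.
  def mergeTopK(k: Int)(a: Seq[Int], b: Seq[Int]): Seq[Int] =
    (a ++ b).sorted(Ordering[Int].reverse).take(k)

  // Simulate treeReduce over partition-local top-k results:
  // combine buffers pairwise until a single buffer remains.
  def treeReduce(parts: Seq[Seq[Int]], k: Int): Seq[Int] = {
    var level = parts.map(_.sorted(Ordering[Int].reverse).take(k))
    while (level.size > 1)
      level = level.grouped(2).map(g => g.reduce(mergeTopK(k))).toSeq
    level.head
  }

  def main(args: Array[String]): Unit = {
    val parts = Seq(Seq(3, 9, 1), Seq(7, 2), Seq(8, 5, 4))
    println(TopViaTreeReduce.treeReduce(parts, 3).mkString(","))  // 9,8,7
  }
}
```

The point of the tree shape is that no single reducer ever merges more than a handful of k-sized buffers at once, unlike a plain reduce where the driver merges one buffer per partition.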

Re: Submitting Spark Applications using Spark Submit

2015-06-16 Thread Raghav Shankar
…/configuration.html On June 16, 2015, at 10:12 PM, Raghav Shankar raghav0110...@gmail.com wrote: I made the change so that I could implement top() using treeReduce(). A member on here suggested I make the change in RDD.scala to accomplish that. Also, this is for a research project, and not for commercial …
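
The configuration page referenced above lists classpath settings that can help here. One possible sketch of a spark-defaults.conf that puts a locally built spark-core jar ahead of the cluster's assembly (the keys are real Spark properties; the paths are hypothetical placeholders, and whether this fully overrides the assembly depends on the Spark version):

```
# Hypothetical paths - prepend the custom-built jar on driver and executors.
spark.driver.extraClassPath     /home/ec2-user/custom-spark/spark-core_2.10.jar
spark.executor.extraClassPath   /home/ec2-user/custom-spark/spark-core_2.10.jar
```

An alternative, also discussed in the thread, is simply to replace the assembly jar that spark-ec2 installed on the cluster with the one built locally.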

Re: Different Sorting RDD methods in Apache Spark

2015-06-09 Thread Raghav Shankar
Thank you for your responses! You mention that it only works as long as the data fits on a single machine. What I am trying to do is receive the sorted contents of my dataset. For this to be possible, the entire dataset should be able to fit on a single machine. Are you saying that sorting the …
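
The distinction being discussed is that Spark's sortBy is itself distributed; only collecting the result requires the data to fit on one machine. A rough sketch of how such a distributed sort works, without Spark (names are mine; real sortByKey samples the data to pick boundaries rather than sorting it up front): pick range boundaries, route each element to a partition, sort partitions locally, and concatenate in partition order.

```scala
object RangeSortSketch {
  // Sketch of a range-partitioned sort: elements are bucketed by range
  // boundaries, each bucket is sorted locally, and buckets concatenated
  // in order give a globally sorted sequence - no single node needs to
  // hold everything until the caller collects it.
  def rangeSort(data: Seq[Int], numPartitions: Int): Seq[Seq[Int]] = {
    // Stand-in for sampling-based boundary estimation.
    val sorted = data.sorted
    val bounds = (1 until numPartitions).map(i => sorted(i * data.size / numPartitions))
    val parts = Array.fill(numPartitions)(scala.collection.mutable.ListBuffer.empty[Int])
    for (x <- data) {
      val idx = bounds.indexWhere(x < _) match {
        case -1 => numPartitions - 1
        case i  => i
      }
      parts(idx) += x
    }
    parts.map(_.sorted.toSeq).toSeq
  }

  def main(args: Array[String]): Unit = {
    val parts = rangeSort(Seq(5, 1, 9, 3, 7, 2, 8), 3)
    println(parts.map(_.mkString(",")).mkString(" | "))  // 1,2 | 3,5 | 7,8,9
  }
}
```

If only the k largest or smallest elements are needed, takeOrdered/top avoid the collect-everything problem entirely, since each partition contributes at most k elements to the driver.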

Re: TreeReduce Functionality in Spark

2015-06-04 Thread Raghav Shankar
Hey Reza, Thanks for your response! Your response clarifies some of my initial thoughts. However, what I don't understand is how the depth of the tree is used to identify how many intermediate reducers there will be, and how many partitions are sent to the intermediate reducers. Could you …
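
On the depth question: from my reading of treeAggregate in Spark's RDD.scala, the depth is turned into a per-level shrink factor, roughly scale = max(ceil(numPartitions^(1/depth)), 2), and each level of intermediate reducers divides the partition count by that factor until few enough remain to hand to the driver. Treat the exact stopping condition below as an assumption; this sketch only tracks the partition counts per level, not the actual aggregation.

```scala
object TreeDepthSketch {
  // Approximate how treeAggregate-style reduction shrinks the partition
  // count per level: scale = max(ceil(numPartitions^(1/depth)), 2), and
  // each round of intermediate reducers keeps ceil(p / scale) partitions.
  def levels(numPartitions: Int, depth: Int): List[Int] = {
    require(numPartitions > 0 && depth > 0)
    val scale = math.max(math.ceil(math.pow(numPartitions, 1.0 / depth)).toInt, 2)
    var p = numPartitions
    val sizes = scala.collection.mutable.ListBuffer(p)
    while (p > scale + p / scale) {   // assumed stopping condition
      p = (p + scale - 1) / scale     // ceiling division
      sizes += p
    }
    sizes.toList
  }

  def main(args: Array[String]): Unit = {
    println(levels(64, 2).mkString(" -> "))  // 64 -> 8
    println(levels(64, 3).mkString(" -> "))
  }
}
```

So with 64 partitions and depth 2, one level of 8 intermediate reducers sits between the map tasks and the driver; a larger depth adds more, smaller levels.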

Re: TreeReduce Functionality in Spark

2015-06-04 Thread Raghav Shankar
…Blog: https://www.dbtsai.com. On Thu, Jun 4, 2015 at 10:46 AM, Raghav Shankar raghav0110...@gmail.com wrote: Hey Reza, Thanks for your response! Your response clarifies some of my initial thoughts. However, what I don't …

Re: Task result in Spark Worker Node

2015-04-17 Thread Raghav Shankar
Hey Imran, Thanks for the great explanation! This cleared up a lot of things for me. I am actually trying to utilize some of the features within Spark for a system I am developing. I am currently working on developing a subsystem that can be integrated into Spark and other Big Data …

Re: Task result in Spark Worker Node

2015-04-17 Thread Raghav Shankar
…at org.apache.spark.rdd.RDD.partitions(RDD.scala:217). On Apr 17, 2015, at 2:30 AM, Raghav Shankar raghav0110...@gmail.com wrote: Hey Imran, Thanks for the great explanation! This cleared up a lot of things for me. I am actually trying to utilize some of the features within Spark …

Re: Sending RDD object over the network

2015-04-06 Thread Raghav Shankar
Hey Akhil, Thanks for your response! No, I am not expecting to receive the values themselves. I am just trying to receive the RDD object in my second Spark application. However, I get an NPE when I try to use the object within my second program. Would you know how I can properly send the RDD …
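
The NPE is expected: an RDD is a driver-side handle bound to its SparkContext, not a container of data, and the context reference is marked @transient, so it is dropped during serialization and comes back null in the receiving application. This toy demo reproduces the mechanism with plain Java serialization (the Handle class is a made-up analogy, not Spark's RDD):

```scala
import java.io._

object TransientDemo {
  // Like an RDD, this holds a @transient reference to its "context".
  // Java serialization skips transient fields, so the deserialized copy
  // has context == null - touching it is what produces the NPE.
  class Handle(@transient val context: AnyRef, val name: String) extends Serializable

  // Serialize to bytes and read the object back, as a network send would.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val bytes = new ByteArrayOutputStream()
    new ObjectOutputStream(bytes).writeObject(obj)
    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
      .readObject().asInstanceOf[T]
  }

  def main(args: Array[String]): Unit = {
    val copy = roundTrip(new Handle(new Object, "rdd-42"))
    println(copy.name)             // survives the trip: rdd-42
    println(copy.context == null)  // dropped in transit: true
  }
}
```

The usual fixes are to collect the values and ship those, or to write the RDD to shared storage (HDFS, S3) and have the second application read it back as its own RDD.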