spark job server

2016-01-16 Thread Madabhattula Rajesh Kumar
Hi, I am not able to start the spark job server; I am facing the error below. Please let me know how to resolve this issue. I have configured one master and two workers in cluster mode. ./server_start.sh ./server_start.sh: line 52: kill: (19621) - No such process ./server_start.sh: line 78:

Re: Re: Re: Re: spark streaming context trigger invoke stop why?

2016-01-16 Thread Triones,Deng(vip.com)
Thanks for your response. I noticed that on Spark 1.4.1 this kind of error does not stop the driver, whereas on Spark 1.5.2 it does, so there must have been a change between the versions. Looking at the code in Spark 1.5.2, JobScheduler.scala:
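For readers hitting the same behavior, a minimal defensive sketch (not a fix for the internal change; the socket source is a placeholder assumption): because RDD actions inside foreachRDD rethrow job failures on the driver, catching them there keeps the error from reaching the JobScheduler as an uncaught failure.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object GuardedStreaming {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("guarded").setMaster("local[2]")
        val ssc = new StreamingContext(conf, Seconds(5))

        val lines = ssc.socketTextStream("localhost", 9999)  // hypothetical source

        lines.foreachRDD { rdd =>
          try {
            // rdd.foreach is an action: a failed job rethrows here on the driver.
            rdd.foreach(record => println(record.toUpperCase))
          } catch {
            case e: Exception =>
              // Swallow the failure so it never reaches the JobScheduler as an
              // uncaught error (which, per this thread, stops the driver on 1.5.2).
              System.err.println(s"batch failed, continuing: ${e.getMessage}")
          }
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }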

ClassNotFoundException interpreting a Spark job

2016-01-16 Thread milad bourhani
Hi everyone, I’m trying to use the Scala interpreter, IMain, to interpret some Scala code that executes a job with Spark: @Test public void countToFive() throws ScriptException { SparkConf conf = new SparkConf().setAppName("Spark interpreter").setMaster("local[2]"); SparkContext sc =
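The usual cause of this ClassNotFoundException is that IMain keeps generated classes in an in-memory virtual directory that Spark's task class loader cannot see. A hedged Scala sketch of one commonly suggested workaround, writing interpreter output to a real directory (the directory name and the interpreted snippet are illustrative assumptions, not the poster's code):

    import java.nio.file.Files
    import scala.tools.nsc.Settings
    import scala.tools.nsc.interpreter.IMain

    object InterpretSparkJob {
      def main(args: Array[String]): Unit = {
        val settings = new Settings
        settings.usejavacp.value = true  // expose the host JVM classpath to the interpreter

        // Write generated classes to a real directory instead of IMain's default
        // in-memory virtual directory, so Spark tasks can load them.
        val classDir = Files.createTempDirectory("repl-classes").toFile
        settings.outputDirs.setSingleOutput(classDir.getAbsolutePath)

        val interpreter = new IMain(settings)
        interpreter.interpret(
          s"""
             |import org.apache.spark.{SparkConf, SparkContext}
             |val conf = new SparkConf()
             |  .setAppName("Spark interpreter")
             |  .setMaster("local[2]")
             |  .set("spark.executor.extraClassPath", "${classDir.getAbsolutePath}")
             |val sc = new SparkContext(conf)
             |println("count = " + sc.parallelize(1 to 5).count())
             |sc.stop()
           """.stripMargin)
        interpreter.close()
      }
    }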

Re: spark job server

2016-01-16 Thread Madabhattula Rajesh Kumar
Hi, I am using "ooyala/spark-jobserver". Regards, Rajesh On Sat, Jan 16, 2016 at 8:36 PM, Ted Yu wrote: > Which distro are you using ? > > From the error message, compute-classpath.sh was not found. > I searched Spark 1.6 built for hadoop 2.6 but didn't find > either

Re: Consuming commands from a queue

2016-01-16 Thread Afshartous, Nick
Thanks Cody. One reason I was thinking of using Akka is that some of the copies take much longer than others (or get stuck). We've seen this with our current streaming job, and it can cause the entire streaming micro-batch to take longer. If we had a set of Akka actors then each copy would
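For what the actor route could look like, a hedged sketch with all names hypothetical: a fixed pool of workers behind a round-robin router, so a slow or stuck copy ties up one actor instead of stalling a whole micro-batch.

    import akka.actor.{Actor, ActorSystem, Props}
    import akka.routing.RoundRobinPool

    case class CopyCommand(src: String, dest: String)  // hypothetical command type

    class CopyWorker extends Actor {
      def receive = {
        case CopyCommand(src, dest) =>
          // Perform the (possibly slow) copy; a stuck copy blocks only this actor.
          println(s"copying $src -> $dest")
      }
    }

    object CopyService extends App {
      val system = ActorSystem("copies")
      val router = system.actorOf(RoundRobinPool(8).props(Props[CopyWorker]), "copy-router")
      router ! CopyCommand("s3://bucket/a", "s3://bucket/b")
    }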

How to apply mapPartitionsWithIndex to an emptyRDD?

2016-01-16 Thread LINChen
Hi all, I have some data on the driver side that I want to broadcast to all workers so each worker has the same data. Since there is no RDD in memory, I don't know how to make the workers start tasks that run transformations based on this data. I have tried to write code like
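One pattern that fits this question, sketched with hypothetical data: an emptyRDD has zero partitions, so no tasks ever launch from it. Instead, parallelize a dummy range with one element per desired task and read the broadcast inside mapPartitionsWithIndex.

    import org.apache.spark.{SparkConf, SparkContext}

    object BroadcastTasks {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("broadcast-tasks").setMaster("local[2]"))

        val driverData = Map("a" -> 1, "b" -> 2)  // hypothetical driver-side data
        val bc = sc.broadcast(driverData)

        val numTasks = 4
        // One dummy element per partition just to force numTasks tasks to run.
        val results = sc.parallelize(0 until numTasks, numTasks)
          .mapPartitionsWithIndex { (idx, _) =>
            val data = bc.value  // the same broadcast data on every worker
            Iterator(s"partition $idx saw ${data.size} entries")
          }
          .collect()

        results.foreach(println)
        sc.stop()
      }
    }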

Re: Sending large objects to specific RDDs

2016-01-16 Thread Daniel Imberman
Hi Koert, So I actually just mentioned something somewhat similar in the thread (your email actually came through as I was sending it :) ). One question I have: if I do a groupByKey and have been smart about my partitioning up to this point, would I get the benefit of not needing to shuffle

Re: Sending large objects to specific RDDs

2016-01-16 Thread Ted Yu
Both groupByKey() and join() accept a Partitioner as a parameter. Maybe you can specify a custom Partitioner so that the amount of shuffling is reduced. On Sat, Jan 16, 2016 at 9:39 AM, Daniel Imberman wrote: > Hi Ted, > > I think I might have figured something out! (Though I
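A hedged sketch of that suggestion (the prefix-based routing is a made-up example): pre-partition both RDDs with one shared custom Partitioner, then pass it to groupByKey and join so they can reuse the existing layout rather than shuffling again.

    import org.apache.spark.{Partitioner, SparkConf, SparkContext}

    // Routes keys by a two-character prefix so related records co-locate.
    class PrefixPartitioner(override val numPartitions: Int) extends Partitioner {
      def getPartition(key: Any): Int = {
        val h = key.toString.take(2).hashCode % numPartitions
        if (h < 0) h + numPartitions else h
      }
      // Spark skips the shuffle only when partitioners compare equal.
      override def equals(other: Any): Boolean = other match {
        case p: PrefixPartitioner => p.numPartitions == numPartitions
        case _                    => false
      }
      override def hashCode: Int = numPartitions
    }

    object PartitionerExample extends App {
      val sc = new SparkContext(
        new SparkConf().setAppName("partitioner").setMaster("local[2]"))
      val p = new PrefixPartitioner(4)

      val left  = sc.parallelize(Seq("aa1" -> 1, "bb2" -> 2)).partitionBy(p)
      val right = sc.parallelize(Seq("aa1" -> "x", "bb2" -> "y")).partitionBy(p)

      // Both RDDs already carry the same partitioner, so these avoid a re-shuffle.
      val grouped = left.groupByKey(p)
      val joined  = left.join(right, p)
      joined.collect().foreach(println)
      sc.stop()
    }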

Re: spark job server

2016-01-16 Thread Ted Yu
Which distro are you using? From the error message, compute-classpath.sh was not found. I searched Spark 1.6 built for Hadoop 2.6 but didn't find either compute-classpath.sh or server_start.sh. Cheers On Sat, Jan 16, 2016 at 5:33 AM, Madabhattula Rajesh Kumar < mrajaf...@gmail.com> wrote: >