Both groupByKey() and join() accept a Partitioner as a parameter.
Maybe you can specify a custom Partitioner so that the amount of shuffle is
reduced.
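Something along these lines might work (an untested sketch; `vectors` and `index` are placeholder RDDs and 8 is an arbitrary partition count):

    import scala.reflect.ClassTag
    import org.apache.spark.{HashPartitioner, Partitioner}
    import org.apache.spark.rdd.RDD

    // Pass the same Partitioner to both operations; the join then reuses the
    // layout that groupByKey produced instead of shuffling a second time.
    def groupThenJoin[V: ClassTag, W: ClassTag](
        vectors: RDD[(Int, V)],
        index: RDD[(Int, W)],
        p: Partitioner = new HashPartitioner(8)): RDD[(Int, (Iterable[V], W))] = {
      val grouped = vectors.groupByKey(p) // one shuffle to lay the data out by key
      grouped.join(index, p)              // same partitioner, so grouped is not re-shuffled
    }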
On Sat, Jan 16, 2016 at 9:39 AM, Daniel Imberman wrote:
> Hi Ted,
>
> I think I might have figured something out! (Though I haven't tested it at
> scale yet)
Thanks Cody.
One reason I was thinking of using Akka is that some of the copies take much
longer than others (or get stuck). We've seen this with our current streaming
job. This can cause the entire streaming micro-batch to take longer.
If we had a set of Akka actors then each copy would b
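Roughly the shape I have in mind (just an untested sketch; `CopyWorker` and the copy logic are made up for illustration):

    import akka.actor.{Actor, ActorSystem, Props}

    // One actor per copy: a copy that stalls only delays its own message,
    // not the whole micro-batch.
    class CopyWorker extends Actor {
      def receive = {
        case path: String =>
          // ... perform the actual copy for `path` here (placeholder) ...
          sender() ! s"done: $path"
      }
    }

    val system = ActorSystem("copies")
    val workers = (1 to 4).map(i => system.actorOf(Props[CopyWorker], s"worker-$i"))
    workers.zipWithIndex.foreach { case (w, i) => w ! s"/data/part-$i" }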
Hi Koert,
So I actually just mentioned something somewhat similar in the thread (your
email actually came through as I was sending it :) ).
One question I have is: if I do a groupByKey and I have been smart about my
partitioning up to this point, would I have the benefit of not needing to
shuffle
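In other words, something like this (a sketch; `pairs` is a placeholder pair RDD), where I'd hope the groupByKey adds no second shuffle:

    import scala.reflect.ClassTag
    import org.apache.spark.HashPartitioner
    import org.apache.spark.rdd.RDD

    // Two HashPartitioners with the same partition count compare equal, so
    // groupByKey should be able to reuse the layout partitionBy created.
    def groupPrePartitioned[V: ClassTag](pairs: RDD[(Int, V)]) = {
      val partitioned = pairs.partitionBy(new HashPartitioner(8)).cache()
      partitioned.groupByKey(new HashPartitioner(8)) // hopefully no second shuffle
    }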
Hi Ted,
I think I might have figured something out! (Though I haven't tested it at
scale yet.)
My current thought is that I can do a groupByKey on the RDD of vectors and
then do a join with the invertedIndex.
It would look something like this:
val InvIndexes: RDD[(Int, InvertedIndex)]
val partitione
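Filling in the rest of the idea (untested at scale; `InvertedIndex` here is just a stand-in case class, and the vector type is assumed):

    import org.apache.spark.HashPartitioner
    import org.apache.spark.rdd.RDD

    case class InvertedIndex(postings: Map[String, Seq[Long]]) // stand-in type

    def joinWithIndex(vectors: RDD[(Int, Seq[Double])],
                      invIndexes: RDD[(Int, InvertedIndex)]) = {
      val partitioner = new HashPartitioner(invIndexes.partitions.length)
      // Group the vectors and co-partition the index the same way, so the
      // join lines up partition-for-partition without an extra shuffle.
      vectors.groupByKey(partitioner)
        .join(invIndexes.partitionBy(partitioner))
    }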
Just doing a join is not an option? If you carefully manage your
partitioning then this can be pretty efficient (meaning no extra shuffle,
basically a map-side join).
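For example (an untested sketch; `left` and `right` are placeholder pair RDDs), you can confirm the join itself added no shuffle stage by inspecting the lineage:

    import scala.reflect.ClassTag
    import org.apache.spark.HashPartitioner
    import org.apache.spark.rdd.RDD

    def coPartitionedJoin[V: ClassTag, W: ClassTag](left: RDD[(Int, V)],
                                                    right: RDD[(Int, W)]) = {
      val p = new HashPartitioner(8)
      // Both sides share one partitioner, so join() builds narrow dependencies:
      // rows move during partitionBy, not during the join itself.
      val joined = left.partitionBy(p).join(right.partitionBy(p))
      println(joined.toDebugString) // no new ShuffledRDD above the partitionBy stages
      joined
    }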
On Jan 13, 2016 2:30 PM, "Daniel Imberman" wrote:
> I'm looking for a way to send structures to pre-determined partitions so
> that
>
Hi,
I am using "ooyala/spark-jobserver".
Regards,
Rajesh
On Sat, Jan 16, 2016 at 8:36 PM, Ted Yu wrote:
> Which distro are you using ?
>
> From the error message, compute-classpath.sh was not found.
> I searched Spark 1.6 built for hadoop 2.6 but didn't find
> either compute-classpath.sh or server_start.sh
Hi all,
I have some data on the driver side, and I will broadcast it to all workers so
that each worker has the same data. Since there is no RDD in memory, I don't
know how to make the workers start tasks that do some transformation based on
the data. I have tried to write code like this:
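Something along these lines is what I mean (a sketch with stand-in data; `sc` is the SparkContext):

    // Broadcast the driver-side data once; every executor gets a read-only copy.
    val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))

    // With no existing RDD, parallelize some keys so Spark schedules tasks on
    // the workers; each task then reads the broadcast value locally.
    val result = sc.parallelize(Seq("a", "b", "c"), 2)
      .map(k => k -> lookup.value.getOrElse(k, 0))
      .collect()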
Which distro are you using ?
From the error message, compute-classpath.sh was not found.
I searched Spark 1.6 built for hadoop 2.6 but didn't find
either compute-classpath.sh or server_start.sh
Cheers
On Sat, Jan 16, 2016 at 5:33 AM, Madabhattula Rajesh Kumar <mrajaf...@gmail.com> wrote:
> Hi
Hi everyone,
I’m trying to use the Scala interpreter, IMain, to interpret some Scala code
that executes a job with Spark:
@Test
public void countToFive() throws ScriptException {
    SparkConf conf = new SparkConf().setAppName("Spark interpreter").setMaster("local[2]");
    SparkContext sc = new SparkContext(conf);
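For reference, the same wiring in plain Scala (a sketch, assuming the Spark jars are already on the JVM classpath):

    import scala.tools.nsc.Settings
    import scala.tools.nsc.interpreter.IMain

    val settings = new Settings
    settings.usejavacp.value = true // let the interpreter see the host classpath

    val interpreter = new IMain(settings)
    interpreter.interpret("""
      import org.apache.spark.{SparkConf, SparkContext}
      val conf = new SparkConf().setAppName("Spark interpreter").setMaster("local[2]")
      val sc = new SparkContext(conf)
      val count = sc.parallelize(1 to 5).count()
      sc.stop()
    """)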
Hi,
I am not able to start the Spark job server; I am seeing the error below. Please
let me know how to resolve this issue.
I have configured one master and two workers in cluster mode.
./server_start.sh
./server_start.sh: line 52: kill: (19621) - No such process
./server_start.sh: line 78: /home/spar
Thanks for your response.
One thing I noticed is that with Spark 1.4.1 that kind of error would not cause
the driver to stop, whereas Spark 1.5.2 does stop the driver, so I think there
must have been some change. Looking at the code in Spark 1.5.2,
JobScheduler.scala: jobScheduler.reportError("Err