Re: truly bizarre behavior with local[n] on Spark 1.0.1
This is (obviously) spark streaming, by the way. On Mon, Jul 14, 2014 at 8:27 PM, Walrus theCat walrusthe...@gmail.com wrote: Hi, I've got a socketTextStream through which I'm reading input. I have three Dstreams, all of which are the same window operation over that socketTextStream. I have a four core machine. As we've been covering lately, I have to give a cores parameter to my StreamingSparkContext: ssc = new StreamingContext(local[4] /**TODO change once a cluster is up **/, AppName, Seconds(1)) Now, I have three dstreams, and all I ask them to do is print or count. I should preface this with the statement that they all work on their own. dstream1 // 1 second window dstream2 // 2 second window dstream3 // 5 minute window If I construct the ssc with local[8], and put these statements in this order, I get prints on the first one, and zero counts on the second one: ssc(local[8]) // hyperthread dat sheezy dstream1.print // works dstream2.count.print // always prints 0 If I do this, this happens: ssc(local[4]) dstream1.print // doesn't work, just gives me the Time: ms message dstream2.count.print // doesn't work, prints 0 ssc(local[6]) dstream1.print // doesn't work, just gives me the Time: ms message dstream2.count.print // works, prints 1 Sometimes these results switch up, seemingly at random. How can I get things to the point where I can develop and test my application locally? Thanks
Re: truly bizarre behavior with local[n] on Spark 1.0.1
This sounds really really weird. Can you give me a piece of code that I can run to reproduce this issue myself? TD On Tue, Jul 15, 2014 at 12:02 AM, Walrus theCat walrusthe...@gmail.com wrote: This is (obviously) spark streaming, by the way. On Mon, Jul 14, 2014 at 8:27 PM, Walrus theCat walrusthe...@gmail.com wrote: Hi, I've got a socketTextStream through which I'm reading input. I have three Dstreams, all of which are the same window operation over that socketTextStream. I have a four core machine. As we've been covering lately, I have to give a cores parameter to my StreamingSparkContext: ssc = new StreamingContext(local[4] /**TODO change once a cluster is up **/, AppName, Seconds(1)) Now, I have three dstreams, and all I ask them to do is print or count. I should preface this with the statement that they all work on their own. dstream1 // 1 second window dstream2 // 2 second window dstream3 // 5 minute window If I construct the ssc with local[8], and put these statements in this order, I get prints on the first one, and zero counts on the second one: ssc(local[8]) // hyperthread dat sheezy dstream1.print // works dstream2.count.print // always prints 0 If I do this, this happens: ssc(local[4]) dstream1.print // doesn't work, just gives me the Time: ms message dstream2.count.print // doesn't work, prints 0 ssc(local[6]) dstream1.print // doesn't work, just gives me the Time: ms message dstream2.count.print // works, prints 1 Sometimes these results switch up, seemingly at random. How can I get things to the point where I can develop and test my application locally? Thanks
Re: truly bizarre behavior with local[n] on Spark 1.0.1
Will do. On Tue, Jul 15, 2014 at 12:56 PM, Tathagata Das tathagata.das1...@gmail.com wrote: This sounds really really weird. Can you give me a piece of code that I can run to reproduce this issue myself? TD On Tue, Jul 15, 2014 at 12:02 AM, Walrus theCat walrusthe...@gmail.com wrote: This is (obviously) spark streaming, by the way. On Mon, Jul 14, 2014 at 8:27 PM, Walrus theCat walrusthe...@gmail.com wrote: Hi, I've got a socketTextStream through which I'm reading input. I have three Dstreams, all of which are the same window operation over that socketTextStream. I have a four core machine. As we've been covering lately, I have to give a cores parameter to my StreamingSparkContext: ssc = new StreamingContext(local[4] /**TODO change once a cluster is up **/, AppName, Seconds(1)) Now, I have three dstreams, and all I ask them to do is print or count. I should preface this with the statement that they all work on their own. dstream1 // 1 second window dstream2 // 2 second window dstream3 // 5 minute window If I construct the ssc with local[8], and put these statements in this order, I get prints on the first one, and zero counts on the second one: ssc(local[8]) // hyperthread dat sheezy dstream1.print // works dstream2.count.print // always prints 0 If I do this, this happens: ssc(local[4]) dstream1.print // doesn't work, just gives me the Time: ms message dstream2.count.print // doesn't work, prints 0 ssc(local[6]) dstream1.print // doesn't work, just gives me the Time: ms message dstream2.count.print // works, prints 1 Sometimes these results switch up, seemingly at random. How can I get things to the point where I can develop and test my application locally? Thanks
truly bizarre behavior with local[n] on Spark 1.0.1
Hi, I've got a socketTextStream through which I'm reading input. I have three Dstreams, all of which are the same window operation over that socketTextStream. I have a four core machine. As we've been covering lately, I have to give a cores parameter to my StreamingSparkContext: ssc = new StreamingContext(local[4] /**TODO change once a cluster is up **/, AppName, Seconds(1)) Now, I have three dstreams, and all I ask them to do is print or count. I should preface this with the statement that they all work on their own. dstream1 // 1 second window dstream2 // 2 second window dstream3 // 5 minute window If I construct the ssc with local[8], and put these statements in this order, I get prints on the first one, and zero counts on the second one: ssc(local[8]) // hyperthread dat sheezy dstream1.print // works dstream2.count.print // always prints 0 If I do this, this happens: ssc(local[4]) dstream1.print // doesn't work, just gives me the Time: ms message dstream2.count.print // doesn't work, prints 0 ssc(local[6]) dstream1.print // doesn't work, just gives me the Time: ms message dstream2.count.print // works, prints 1 Sometimes these results switch up, seemingly at random. How can I get things to the point where I can develop and test my application locally? Thanks