Yeah. I have been wondering how to check this in the general case, across all deployment modes, but thats a hard problem. Last week I realized that even if we can do it just for local, we can get the biggest bang of the buck.
TD On Tue, Jul 15, 2014 at 9:31 PM, Tobias Pfeiffer <t...@preferred.jp> wrote: > Hi, > > thanks for creating the issue. It feels like in the last week, more or > less half of the questions about Spark Streaming rooted in setting the > master to "local" ;-) > > Tobias > > > On Wed, Jul 16, 2014 at 11:03 AM, Tathagata Das < > tathagata.das1...@gmail.com> wrote: > >> Aah, right, copied from the wrong browser tab i guess. Thanks! >> >> TD >> >> >> On Tue, Jul 15, 2014 at 5:57 PM, Michael Campbell < >> michael.campb...@gmail.com> wrote: >> >>> I think you typo'd the jira id; it should be >>> https://issues.apache.org/jira/browse/SPARK-2475 "Check whether #cores >>> > #receivers in local mode" >>> >>> >>> On Mon, Jul 14, 2014 at 3:57 PM, Tathagata Das < >>> tathagata.das1...@gmail.com> wrote: >>> >>>> The problem is not really for local[1] or local. The problem arises >>>> when there are more input streams than there are cores. >>>> But I agree, for people who are just beginning to use it by running it >>>> locally, there should be a check addressing this. >>>> >>>> I made a JIRA for this. >>>> https://issues.apache.org/jira/browse/SPARK-2464 >>>> >>>> TD >>>> >>>> >>>> On Sun, Jul 13, 2014 at 4:26 PM, Sean Owen <so...@cloudera.com> wrote: >>>> >>>>> How about a PR that rejects a context configured for local or >>>>> local[1]? As I understand it is not intended to work and has bitten >>>>> several >>>>> people. >>>>> On Jul 14, 2014 12:24 AM, "Michael Campbell" < >>>>> michael.campb...@gmail.com> wrote: >>>>> >>>>>> This almost had me not using Spark; I couldn't get any output. It is >>>>>> not at all obvious what's going on here to the layman (and to the best of >>>>>> my knowledge, not documented anywhere), but now you know you'll be able >>>>>> to >>>>>> answer this question for the numerous people that will also have it. >>>>>> >>>>>> >>>>>> On Sun, Jul 13, 2014 at 5:24 PM, Walrus theCat < >>>>>> walrusthe...@gmail.com> wrote: >>>>>> >>>>>>> Great success! >>>>>>> >>>>>>> I was able to get output to the driver console by changing the >>>>>>> construction of the Streaming Spark Context from: >>>>>>> >>>>>>> val ssc = new StreamingContext("local" /**TODO change once a >>>>>>> cluster is up **/, >>>>>>> "AppName", Seconds(1)) >>>>>>> >>>>>>> >>>>>>> to: >>>>>>> >>>>>>> val ssc = new StreamingContext("local[2]" /**TODO change once a >>>>>>> cluster is up **/, >>>>>>> "AppName", Seconds(1)) >>>>>>> >>>>>>> >>>>>>> I found something that tipped me off that this might work by digging >>>>>>> through this mailing list. >>>>>>> >>>>>>> >>>>>>> On Sun, Jul 13, 2014 at 2:00 PM, Walrus theCat < >>>>>>> walrusthe...@gmail.com> wrote: >>>>>>> >>>>>>>> More strange behavior: >>>>>>>> >>>>>>>> lines.foreachRDD(x => println(x.first)) // works >>>>>>>> lines.foreachRDD(x => println((x.count,x.first))) // no output is >>>>>>>> printed to driver console >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Sun, Jul 13, 2014 at 11:47 AM, Walrus theCat < >>>>>>>> walrusthe...@gmail.com> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> Thanks for your interest. >>>>>>>>> >>>>>>>>> lines.foreachRDD(x => println(x.count)) >>>>>>>>> >>>>>>>>> And I got 0 every once in a while (which I think is strange, >>>>>>>>> because lines.print prints the input I'm giving it over the socket.) >>>>>>>>> >>>>>>>>> >>>>>>>>> When I tried: >>>>>>>>> >>>>>>>>> lines.map(_->1).reduceByKey(_+_).foreachRDD(x => println(x.count)) >>>>>>>>> >>>>>>>>> I got no count. >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> >>>>>>>>> >>>>>>>>> On Sun, Jul 13, 2014 at 11:34 AM, Tathagata Das < >>>>>>>>> tathagata.das1...@gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Try doing DStream.foreachRDD and then printing the RDD count and >>>>>>>>>> further inspecting the RDD. >>>>>>>>>> On Jul 13, 2014 1:03 AM, "Walrus theCat" <walrusthe...@gmail.com> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> I have a DStream that works just fine when I say: >>>>>>>>>>> >>>>>>>>>>> dstream.print >>>>>>>>>>> >>>>>>>>>>> If I say: >>>>>>>>>>> >>>>>>>>>>> dstream.map(_,1).print >>>>>>>>>>> >>>>>>>>>>> that works, too. However, if I do the following: >>>>>>>>>>> >>>>>>>>>>> dstream.reduce{case(x,y) => x}.print >>>>>>>>>>> >>>>>>>>>>> I don't get anything on my console. What's going on? >>>>>>>>>>> >>>>>>>>>>> Thanks >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>> >>> >> >