Yeah. I have been wondering how to check this in the general case, across
all deployment modes, but that's a hard problem. Last week I realized that
even if we can do it just for local mode, we get the biggest bang for the
buck.
TD
On Tue, Jul 15, 2014 at 9:31 PM, Tobias Pfeiffer t...@preferred.jp
I think you typo'd the jira id; it should be
https://issues.apache.org/jira/browse/SPARK-2475 Check whether #cores >
#receivers in local mode
On Mon, Jul 14, 2014 at 3:57 PM, Tathagata Das tathagata.das1...@gmail.com
wrote:
The problem is not really for local[1] or local. The problem arises
Aah, right, copied from the wrong browser tab, I guess. Thanks!
TD
On Tue, Jul 15, 2014 at 5:57 PM, Michael Campbell
michael.campb...@gmail.com wrote:
I think you typo'd the jira id; it should be
https://issues.apache.org/jira/browse/SPARK-2475 Check whether #cores >
#receivers in local
Hi,
thanks for creating the issue. It feels like in the last week, more or less
half of the questions about Spark Streaming were rooted in setting the master
to local ;-)
Tobias
On Wed, Jul 16, 2014 at 11:03 AM, Tathagata Das tathagata.das1...@gmail.com
wrote:
Aah, right, copied from the wrong
The problem is not really for local[1] or local. The problem arises when
there are more input streams than there are cores.
But I agree, for people who are just beginning to use it by running it
locally, there should be a check addressing this.
I made a JIRA for this.
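As a rough illustration of what such a check could look like, here is a hypothetical sketch (the object and method names are mine, not from SPARK-2475): parse the master string, and in local mode require strictly more worker threads than receivers, since each receiver permanently occupies one core.

```scala
// Hypothetical sketch of the local-mode check discussed in this thread.
// Each receiver ties up one thread, so local mode needs threads > receivers
// for any processing to happen at all.
object LocalModeCheck {
  // Returns Some(threadCount) for a local master, None for cluster masters
  // (the "general case across all deployment modes" the thread calls hard).
  def localThreads(master: String): Option[Int] = master match {
    case "local" => Some(1)
    case s if s.startsWith("local[") && s.endsWith("]") =>
      val n = s.stripPrefix("local[").stripSuffix("]")
      Some(if (n == "*") Runtime.getRuntime.availableProcessors else n.toInt)
    case _ => None
  }

  // True when the configuration leaves at least one thread for processing;
  // cluster masters are not checked and pass trivially.
  def enoughCores(master: String, numReceivers: Int): Boolean =
    localThreads(master).forall(_ > numReceivers)
}
```

For example, `enoughCores("local", 1)` is false (the single thread is consumed by the receiver), while `enoughCores("local[2]", 1)` is true.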
Try doing DStream.foreachRDD and then printing the RDD count and further
inspecting the RDD.
On Jul 13, 2014 1:03 AM, Walrus theCat walrusthe...@gmail.com wrote:
Hi,
I have a DStream that works just fine when I say:
dstream.print
If I say:
dstream.map(_ -> 1).print
that works, too.
Update on this:
val lines = ssc.socketTextStream("localhost", ...)
lines.print // works
lines.map(_ -> 1).print // works
lines.map(_ -> 1).reduceByKey(_ + _).print // nothing printed to driver console
Just lots of:
14/07/13 11:37:40 INFO receiver.BlockGenerator: Pushed block
input-0-1405276660400
Thanks for your interest.
lines.foreachRDD(x => println(x.count))
And I got 0 every once in a while (which I think is strange, because
lines.print prints the input I'm giving it over the socket.)
When I tried:
lines.map(_ -> 1).reduceByKey(_ + _).foreachRDD(x => println(x.count))
I got no count.
More strange behavior:
lines.foreachRDD(x => println(x.first)) // works
lines.foreachRDD(x => println((x.count, x.first))) // no output is printed
to driver console
On Sun, Jul 13, 2014 at 11:47 AM, Walrus theCat walrusthe...@gmail.com
wrote:
Thanks for your interest.
lines.foreachRDD(x =>
Great success!
I was able to get output to the driver console by changing the construction
of the Streaming Spark Context from:
val ssc = new StreamingContext("local" /** TODO change once a cluster is up
**/,
"AppName", Seconds(1))
to:
val ssc = new StreamingContext("local[2]" /** TODO
This almost had me not using Spark; I couldn't get any output. It is not
at all obvious to the layman what's going on here (and to the best of my
knowledge, it is not documented anywhere), but now you'll be able to
answer this question for the numerous people who will also have it.
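Pulling the thread's fragments together, a minimal end-to-end sketch of the working setup might look like the following. Note that the app name and port 9999 are placeholders of mine; the actual values in the thread were elided.

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingSketch {
  def main(args: Array[String]): Unit = {
    // "local[2]" is the crucial part: one thread runs the socket receiver,
    // the other processes the received batches. With plain "local" or
    // "local[1]", the receiver consumes the only thread, so shuffle-based
    // operations like reduceByKey never produce output.
    val ssc = new StreamingContext("local[2]", "StreamingSketch", Seconds(1))

    // Port 9999 is a placeholder; feed it with e.g. `nc -lk 9999`.
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.map(_ -> 1).reduceByKey(_ + _).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

This is a sketch under the assumption of a Spark Streaming dependency on the classpath; it requires a running socket source, so it is not self-verifying.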
On Sun,
How about a PR that rejects a context configured for local or local[1]? As
I understand it, this is not intended to work, and it has bitten several people.
On Jul 14, 2014 12:24 AM, Michael Campbell michael.campb...@gmail.com
wrote:
This almost had me not using Spark; I couldn't get any output. It is not