I’m trying to run the stateful network word count at
https://github.com/apache/spark/blob/master/examples/src/main/python/streaming/stateful_network_wordcount.py
using the command:

./bin/spark-submit \
  examples/src/main/python/streaming/stateful_network_wordcount.py \
  localhost 9999

I start netcat before running the above command and leave it running:

nc -lk 9999

However, no word counts are printed, even though pprint() is being called.

   1. How do I print the results?
   2. How do I otherwise access the data in real time? Suppose I want to
   have a dashboard showing the data in running_counts.
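
For context, the running count in that example is maintained by a per-key update function passed to updateStateByKey. Below is a minimal sketch of that logic in plain Python, without Spark, assuming it simply adds each batch's new counts to the previous total. The name update_func and the sample batches are illustrative, not copied from the example.

```python
def update_func(new_values, last_sum):
    """Add this batch's counts for one key to its previous running total."""
    # last_sum is None the first time a key is seen, so treat it as 0.
    return sum(new_values) + (last_sum or 0)


# Simulate two consecutive batches for a single word (illustrative data):
# the word appears twice in the first batch and once in the second.
state = None
for batch in ([1, 1], [1]):
    state = update_func(batch, state)

print(state)
```

In the streaming job this function is applied per key on every batch, and the resulting DStream is running_counts.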

Note that
https://github.com/apache/spark/blob/master/examples/src/main/python/streaming/network_wordcount.py
works perfectly fine.

I am running Spark 1.2.0, prebuilt for Hadoop 2.4.x.

Thanks,
Samarth
