Oh, I mean the average data rate per node. When I want to see the input activity on each node (I use a custom receiver instead of Kafka), I usually search the logs for records like this to get a sense: "BlockManagerInfo: Added input ... on [hostname:port] (size: xxx KB)"
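For what it's worth, that log search can be scripted. Below is a minimal sketch, not anything Spark ships: the function name `input_bytes_per_node` is made up for illustration, and the regular expression assumes the log lines look like the fragment quoted above (block name starting with "input", then "on host:port", then "(size: N UNIT"). The exact wording may differ between Spark versions, so treat the pattern as an assumption and adjust it to your logs.

```python
import re
from collections import defaultdict

# Assumed pattern, modelled on the quoted log fragment; adjust to your logs.
LINE_RE = re.compile(
    r"BlockManagerInfo: Added (?P<block>input\S*).*? on (?P<node>\S+) "
    r"\(size: (?P<size>[\d.]+) (?P<unit>B|KB|MB|GB)"
)

# Binary multiples are assumed for the size units.
UNIT_BYTES = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def input_bytes_per_node(lines):
    """Sum the sizes of added input blocks, grouped by hostname:port."""
    totals = defaultdict(float)
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not an "Added input ..." record
        size = float(match.group("size")) * UNIT_BYTES[match.group("unit")]
        totals[match.group("node")] += size
    return dict(totals)


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py < driver.log
    for node, total in sorted(input_bytes_per_node(sys.stdin).items()):
        print(f"{node}\t{total / 1024:.1f} KB")
```

Dividing each total by the length of the time window covered by the log gives a rough input rate per node.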
I also see latency spikes, as I posted earlier: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-achieve-reasonable-performance-on-Spark-Streaming-tp7262.html It's even worse in my case: when the data rate is a little high, the spikes make the latency grow without bound, even though the machines are underutilized. I can't explain it either, and I'm not sure whether the cause is the same as yours.