I'd guess it's because this example is stateless: it outputs counts only for the RDD of the current batch interval, not cumulative counts across batches. For a running total, take a look at the stateful word counter StatefulNetworkWordCount.scala.
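The core of the stateful version is a per-key update function passed to `updateStateByKey`. A minimal sketch of that function (the name `updateFunction` and the wiring comment are my own; the real example passes an equivalent function):

```scala
// Per-key state update: called once per batch for each word.
// newValues:    counts for this word observed in the current batch
// runningCount: cumulative count carried over from earlier batches (None on first sight)
def updateFunction(newValues: Seq[Int], runningCount: Option[Int]): Option[Int] = {
  Some(newValues.sum + runningCount.getOrElse(0))
}

// In the streaming job it would be wired up roughly like:
//   val stateDstream = wordDstream.updateStateByKey[Int](updateFunction _)
// (stateful operations also require ssc.checkpoint(...) to be set)
```

Because the returned `Option` folds each batch's counts into the carried state, the printed output grows to cover every word seen so far, rather than only the words from the latest batch.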
On Wed, Sep 24, 2014 at 4:29 AM, SK <skrishna...@gmail.com> wrote:
>
> I execute it as follows:
>
> $SPARK_HOME/bin/spark-submit --master <master url> --class
>   org.apache.spark.examples.streaming.HdfsWordCount
>   target/scala-2.10/spark_stream_examples-assembly-1.0.jar <hdfsdir>
>
> After I start the job, I add a new test file to hdfsdir. It is a large text
> file which I will not be able to copy here, but it probably has at least
> 100 distinct words. However, the streaming output has only about 5-6 words
> along with their counts, as follows. I then stop the job after some time.
>
> Time ...
>
> (word1, cnt1)
> (word2, cnt2)
> (word3, cnt3)
> (word4, cnt4)
> (word5, cnt5)
>
> Time ...
>
> Time ...
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/HdfsWordCount-only-counts-some-of-the-words-tp14929p14967.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.