Hi Jane,

Try this example:
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/HdfsWordCount.scala

Som

On Tue, 31 Mar 2020, 21:34 jane thorpe, <janethor...@aol.com.invalid> wrote:

> hi,
>
> Are there setup instructions on the website for
> spark-3.0.0-preview2-bin-hadoop2.7?
> I can run the same program for hdfs format:
>
> val textFile = sc.textFile("hdfs://...")
> val counts = textFile.flatMap(line => line.split(" "))
>   .map(word => (word, 1))
>   .reduceByKey(_ + _)
> counts.saveAsTextFile("hdfs://...")
>
> val textFile = sc.textFile("/data/README.md")
> val counts = textFile.flatMap(line => line.split(" "))
>   .map(word => (word, 1))
>   .reduceByKey(_ + _)
> counts.saveAsTextFile("/data/wordcount")
>
> textFile: org.apache.spark.rdd.RDD[String] = /data/README.md
> MapPartitionsRDD[23] at textFile at <console>:28
>
> counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[26] at
> reduceByKey at <console>:31
>
> br
> Jane
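For reference, the flatMap / map / reduceByKey pipeline in the quoted code can be illustrated on plain Scala collections, with no Spark cluster needed. This is only a sketch of what those RDD operations compute; the input lines here are made-up sample data, and `groupMapReduce` (Scala 2.13+) plays the role of `reduceByKey(_ + _)`:

```scala
// Plain-Scala equivalent of the RDD word count from the message above.
// Sample input lines are placeholders, not the contents of README.md.
object WordCountSketch {
  def main(args: Array[String]): Unit = {
    val lines = Seq("to be or", "not to be")
    val counts = lines
      .flatMap(_.split(" "))                // split each line into words
      .map(word => (word, 1))               // pair each word with a count of 1
      .groupMapReduce(_._1)(_._2)(_ + _)    // group by word, sum counts, like reduceByKey(_ + _)
    println(counts)                         // e.g. Map(to -> 2, be -> 2, or -> 1, not -> 1)
  }
}
```

The Spark version runs the same shape of computation, but distributed across partitions, with `reduceByKey` shuffling pairs so counts for the same word end up on the same executor.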