Hi Jane

Try this example:

https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/HdfsWordCount.scala
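
For orientation, here is a minimal sketch of the pattern that example follows: a Spark Streaming word count over text files that appear in a directory. It is not a copy of the linked file. The object name HdfsWordCountSketch, the 2-second batch interval, and passing the directory as the first command-line argument are assumptions made for illustration. It needs the spark-streaming dependency on the classpath.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical object name; the directory to watch is taken from args(0).
object HdfsWordCountSketch {
  def main(args: Array[String]): Unit = {
    if (args.length < 1) {
      System.err.println("Usage: HdfsWordCountSketch <directory>")
      System.exit(1)
    }

    val sparkConf = new SparkConf().setAppName("HdfsWordCountSketch")
    // Batch interval of 2 seconds is an arbitrary choice for this sketch.
    val ssc = new StreamingContext(sparkConf, Seconds(2))

    // Watch the directory and read each newly created file as lines of text.
    val lines = ssc.textFileStream(args(0))

    // Same word-count logic as the batch version: split, pair, reduce.
    val words = lines.flatMap(_.split(" "))
    val wordCounts = words.map(word => (word, 1)).reduceByKey(_ + _)

    // Print the first few counts of every batch to stdout.
    wordCounts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that textFileStream only picks up files created in the directory after the stream has started, so an existing file such as a README will not be counted unless it is copied in afterwards.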


Som

On Tue, 31 Mar 2020, 21:34 jane thorpe, <janethor...@aol.com.invalid> wrote:

> hi,
>
> Are there setup instructions on the website for
> spark-3.0.0-preview2-bin-hadoop2.7?
> I would like to run the same program against HDFS:
>
> val textFile = sc.textFile("hdfs://...")
> val counts = textFile.flatMap(line => line.split(" "))
>                  .map(word => (word, 1))
>                  .reduceByKey(_ + _)
> counts.saveAsTextFile("hdfs://...")
>
>
>
> val textFile = sc.textFile("/data/README.md")
> val counts = textFile.flatMap(line => line.split(" "))
>                  .map(word => (word, 1))
>                  .reduceByKey(_ + _)
> counts.saveAsTextFile("/data/wordcount")
>
> textFile: org.apache.spark.rdd.RDD[String] = /data/README.md
> MapPartitionsRDD[23] at textFile at <console>:28
>
> counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[26] at 
> reduceByKey at <console>:31
>
> br
> Jane
>
