Hi Som,
Did you know that the simple demo program that reads characters from a file didn't work?
Who wrote that simple hello-world-style little program?
 
jane thorpe
janethor...@aol.com
 
 
-----Original Message-----
From: jane thorpe <janethor...@aol.com>
To: somplasticllc <somplastic...@gmail.com>; user <user@spark.apache.org>
Sent: Fri, 3 Apr 2020 2:44
Subject: Re: HDFS file hdfs://127.0.0.1:9000/hdfs/spark/examples/README.txt

 
Thanks darling
I tried this and it worked:

hdfs getconf -confKey fs.defaultFS
hdfs://localhost:9000
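
You can double-check the same setting from inside spark-shell too (a quick sanity check, assuming the shell picked up the same Hadoop configuration):

scala> sc.hadoopConfiguration.get("fs.defaultFS")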


scala> :paste
// Entering paste mode (ctrl-D to finish)

val textFile = sc.textFile("hdfs://127.0.0.1:9000/hdfs/spark/examples/README.txt")
val counts = textFile.flatMap(line => line.split(" "))
                 .map(word => (word, 1))
                 .reduceByKey(_ + _)
counts.saveAsTextFile("hdfs://127.0.0.1:9000/hdfs/spark/examples/README7.out")

// Exiting paste mode, now interpreting.

textFile: org.apache.spark.rdd.RDD[String] = hdfs://127.0.0.1:9000/hdfs/spark/examples/README.txt MapPartitionsRDD[91] at textFile at <pastie>:27
counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[94] at reduceByKey at <pastie>:30

scala> :quit
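
To confirm the counts were actually written, you can read the output directory straight back in the same shell before quitting (saveAsTextFile writes part-* files into that directory):

scala> sc.textFile("hdfs://127.0.0.1:9000/hdfs/spark/examples/README7.out").take(5).foreach(println)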

 
jane thorpe
janethor...@aol.com
 
 
-----Original Message-----
From: Som Lima <somplastic...@gmail.com>
CC: user <user@spark.apache.org>
Sent: Tue, 31 Mar 2020 23:06
Subject: Re: HDFS file

Hi Jane
Try this example 
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/HdfsWordCount.scala
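
In short, that example watches a directory and word-counts any new text files that appear in it. Condensed, it looks roughly like this:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object HdfsWordCount {
  def main(args: Array[String]): Unit = {
    // Batch the stream into 2-second intervals.
    val sparkConf = new SparkConf().setAppName("HdfsWordCount")
    val ssc = new StreamingContext(sparkConf, Seconds(2))

    // Watch the directory given as args(0) (e.g. an hdfs:// path) for new files.
    val lines = ssc.textFileStream(args(0))
    val wordCounts = lines.flatMap(_.split(" ")).map(x => (x, 1)).reduceByKey(_ + _)
    wordCounts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}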

Som
On Tue, 31 Mar 2020, 21:34 jane thorpe, <janethor...@aol.com.invalid> wrote:

hi,
Are there setup instructions on the website for spark-3.0.0-preview2-bin-hadoop2.7, so I can run the same program against HDFS?
val textFile = sc.textFile("hdfs://...")
val counts = textFile.flatMap(line => line.split(" "))
                 .map(word => (word, 1))
                 .reduceByKey(_ + _)
counts.saveAsTextFile("hdfs://...")

The same program works locally:

val textFile = sc.textFile("/data/README.md")
val counts = textFile.flatMap(line => line.split(" "))
                 .map(word => (word, 1))
                 .reduceByKey(_ + _)
counts.saveAsTextFile("/data/wordcount")

textFile: org.apache.spark.rdd.RDD[String] = /data/README.md MapPartitionsRDD[23] at textFile at <console>:28
counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[26] at reduceByKey at <console>:31


br
Jane 
