Re: HDFS file hdfs://127.0.0.1:9000/hdfs/spark/examples/README.txt

2020-04-06 Thread jane thorpe

Hi Som,

The HdfsWordCount program counts words in new files you place in the directory named by its last command-line argument (args(args.length - 1)), and it keeps running in an infinite loop until the user presses Ctrl-C.
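
For reference, the core of that example looks roughly like this (a sketch from memory of HdfsWordCount.scala, not a verbatim copy; the 2-second batch interval is what I believe the example uses):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object HdfsWordCount {
  def main(args: Array[String]): Unit = {
    // Poll for newly created files every 2 seconds.
    val sparkConf = new SparkConf().setAppName("HdfsWordCount")
    val ssc = new StreamingContext(sparkConf, Seconds(2))

    // Watch the directory given as the last argument; each new file
    // dropped there becomes a batch of lines to count.
    val lines = ssc.textFileStream(args(args.length - 1))
    val wordCounts = lines.flatMap(_.split(" ")).map(w => (w, 1)).reduceByKey(_ + _)
    wordCounts.print()

    // Runs until interrupted, e.g. with Ctrl-C.
    ssc.start()
    ssc.awaitTermination()
  }
}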

Why does the program name have the prefix HDFS (Hadoop Distributed File System)?

Is it a program that demonstrates HDFS, or streaming?

I am really confused by this other program, ExceptionHandlingTest.

What exception handling is being tested? Is it the JVM's throw-new-exception syntax when a random value is greater than 0.75, or is it something meant to test the Spark API's exception handling?


spark.sparkContext.parallelize(0 until spark.sparkContext.defaultParallelism).foreach { i =>
  if (math.random > 0.75) {
    throw new Exception("Testing exception handling")
  }
}


package org.apache.spark.examples

import org.apache.spark.sql.SparkSession

object ExceptionHandlingTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .appName("ExceptionHandlingTest")
      .getOrCreate()

    // One element per unit of default parallelism; each task throws
    // with probability ~0.25, exercising Spark's task-failure handling.
    spark.sparkContext.parallelize(0 until spark.sparkContext.defaultParallelism).foreach { i =>
      if (math.random > 0.75) {
        throw new Exception("Testing exception handling")
      }
    }

    spark.stop()
  }
}
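
If you want to see for yourself what it tests, a standard Spark binary distribution should let you launch it with run-example (a sketch of the usual invocation):

  ./bin/run-example ExceptionHandlingTest

Since each task throws with probability about 0.25, a run with several tasks will usually hit at least one exception, and the job should then fail with it once Spark's task retries are exhausted.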


On Monday, 6 April 2020 Som Lima  wrote:
OK, try this one instead (link below).
It has both an EXIT, which we know is rude and abusive rather than graceful structured programming, and half-hearted user input validation.
Do you think millions of Spark users download and test these programs and repeat this rude programming behaviour?
I don't think they have any coding rules like the safety-critical software industry, but they do have strict emailing rules.
Do you think email rules are far more important than programming rules and guidelines?

https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/clickstream/PageViewStream.scala
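
For contrast, here is a sketch of the more graceful pattern being asked for: validate the arguments and raise a descriptive error instead of calling System.exit. The object name and usage string are illustrative, not taken from the Spark example:

import scala.util.Try

object PageViewArgs {
  // Parse and validate arguments, throwing IllegalArgumentException
  // instead of exiting the JVM, so callers can recover and tests can assert.
  def parse(args: Array[String]): (String, Int) = {
    require(args.length == 2, "Usage: PageViewStream <host> <port>")
    val Array(host, portStr) = args
    val port = Try(portStr.toInt).getOrElse(
      throw new IllegalArgumentException(s"port must be a number, got '$portStr'"))
    (host, port)
  }
}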



On Mon, 6 Apr 2020, 07:04 jane thorpe,  wrote:

Hi Som ,
Did you know that the simple demo program for reading characters from a file didn't work?
Who wrote that simple hello-world-type little program?
 
jane thorpe
janethor...@aol.com
 
 
-Original Message-
From: jane thorpe 
To: somplasticllc ; user 
Sent: Fri, 3 Apr 2020 2:44
Subject: Re: HDFS file hdfs://127.0.0.1:9000/hdfs/spark/examples/README.txt

 
Thanks darling
I tried this and it worked:

hdfs getconf -confKey fs.defaultFS
hdfs://localhost:9000


scala> :paste
// Entering paste mode (ctrl-D to finish)

val textFile = sc.textFile("hdfs://127.0.0.1:9000/hdfs/spark/examples/README.txt")
val counts = textFile.flatMap(line => line.split(" "))
                 .map(word => (word, 1))
                 .reduceByKey(_ + _)
counts.saveAsTextFile("hdfs://127.0.0.1:9000/hdfs/spark/examples/README7.out")

// Exiting paste mode, now interpreting.

textFile: org.apache.spark.rdd.RDD[String] = hdfs://127.0.0.1:9000/hdfs/spark/examples/README.txt MapPartitionsRDD[91] at textFile at <console>:27
counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[94] at reduceByKey at <console>:30

scala> :quit
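
A note on checking the result: saveAsTextFile writes a directory of part files rather than a single file, so to inspect the output you would do something like this (assuming the hdfs CLI is configured for the same namenode):

  hdfs dfs -cat hdfs://127.0.0.1:9000/hdfs/spark/examples/README7.out/part-*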

 
jane thorpe
janethor...@aol.com
 
 
-Original Message-
From: Som Lima 
CC: user 
Sent: Tue, 31 Mar 2020 23:06
Subject: Re: HDFS file

Hi Jane
Try this example:
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/HdfsWordCount.scala

Som
On Tue, 31 Mar 2020, 21:34 jane thorpe,  wrote:

hi,
Are there setup instructions on the website for spark-3.0.0-preview2-bin-hadoop2.7? Can I run the same program against HDFS?

val textFile = sc.textFile("hdfs://...")
val counts = textFile.flatMap(line => line.split(" "))
                 .map(word => (word, 1))
                 .reduceByKey(_ + _)
counts.saveAsTextFile("hdfs://...")

The local-filesystem version works:

val textFile = sc.textFile("/data/README.md")
val counts = textFile.flatMap(line => line.split(" "))
                 .map(word => (word, 1))
                 .reduceByKey(_ + _)
counts.saveAsTextFile("/data/wordcount")

textFile: org.apache.spark.rdd.RDD[String] = /data/README.md MapPartitionsRDD[23] at textFile at <console>:28
counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[26] at reduceByKey at <console>:31


br
Jane 



