Re: HDFS file hdfs://127.0.0.1:9000/hdfs/spark/examples/README.txt

2020-04-06 Thread jane thorpe
Hi Som, The HdfsWordCount program counts words from files you place in a directory named by argv[args.length - 1], while the program runs in a for (;;) loop until the user presses CTRL-C. Why does the program name have the prefix HDFS? Hadoop Distributed File System. Is it a
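
For reference, a rough PySpark sketch of the behaviour described above: watch the directory given as the last command-line argument and count words in any new files until the process is interrupted. This is not the actual examples HdfsWordCount (which is written in Scala); the app name and batch interval are illustrative assumptions.

    # Rough PySpark equivalent of the streaming HdfsWordCount behaviour:
    # watch a directory (e.g. an HDFS path) and count words in files that
    # appear there, until the user presses CTRL-C.
    import sys
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    if __name__ == "__main__":
        sc = SparkContext(appName="HdfsWordCountSketch")
        ssc = StreamingContext(sc, 5)  # 5-second batches (illustrative)

        # The directory to watch is the last command-line argument,
        # mirroring the argv[args.length - 1] convention mentioned above.
        lines = ssc.textFileStream(sys.argv[-1])
        counts = (lines.flatMap(lambda line: line.split(" "))
                       .map(lambda word: (word, 1))
                       .reduceByKey(lambda a, b: a + b))
        counts.pprint()

        ssc.start()
        ssc.awaitTermination()  # runs until interrupted (CTRL-C)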

Re: Scala version compatibility

2020-04-06 Thread Andrew Melo
Hello, On Mon, Apr 6, 2020 at 3:31 PM Koert Kuipers wrote: > actually i might be wrong about this. did you declare scala to be a provided dependency? so scala is not in your fat/uber jar? if so then maybe it will work. I declare spark to be a provided dependency, so Scala's not included

Re: Scala version compatibility

2020-04-06 Thread Koert Kuipers
actually i might be wrong about this. did you declare scala to be a provided dependency? so scala is not in your fat/uber jar? if so then maybe it will work. On Mon, Apr 6, 2020 at 4:16 PM Andrew Melo wrote: > On Mon, Apr 6, 2020 at 3:08 PM Koert Kuipers wrote: >> yes it will

Re: Scala version compatibility

2020-04-06 Thread Andrew Melo
On Mon, Apr 6, 2020 at 3:08 PM Koert Kuipers wrote: > yes it will. Ooof, I was hoping that wasn't the case. I guess I need to figure out how to get Maven to compile/publish jars with different dependencies/artifactIDs like sbt does? (or re-implement the functionality in java) Thanks for

Re: Scala version compatibility

2020-04-06 Thread Som Lima
Those who follow best practices in software development would start with a clean environment, i.e. a fresh installation of the operating system, then install development tools while keeping a record of version numbers, so that at deployment time unforeseen errors are avoided by duplicating the development

Re: Scala version compatibility

2020-04-06 Thread Koert Kuipers
yes it will On Mon, Apr 6, 2020 at 3:50 PM Andrew Melo wrote: > Hello all, I'm aware that Scala is not binary compatible between revisions. I have some Java code whose only Scala dependency is the transitive dependency through Spark. This code calls a Spark API which returns a Seq,

Scala version compatibility

2020-04-06 Thread Andrew Melo
Hello all, I'm aware that Scala is not binary compatible between revisions. I have some Java code whose only Scala dependency is the transitive dependency through Spark. This code calls a Spark API which returns a Seq, which I then convert into a List with JavaConverters.seqAsJavaListConverter.

RE: spark-submit exit status on k8s

2020-04-06 Thread Marshall Markham
Thank you, that looks promising as well. - Marshall From: Yinan Li Sent: Sunday, April 5, 2020 3:49 PM To: Marshall Markham Cc: user Subject: Re: spark-submit exit status on k8s Not sure if you are aware of this new feature in Airflow

RE: spark-submit exit status on k8s

2020-04-06 Thread Marshall Markham
This is a great idea, Masood. We are actually managing our Spark jobs with a Kubernetes pod operator, so we may stick something in at that layer to determine success/failure and stay in the same node of the DAG. Thanks again. - Marshall From: Masood Krohy Sent: Sunday, April 5, 2020
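
A hypothetical Airflow sketch of the approach Marshall describes: run spark-submit inside a pod via KubernetesPodOperator so that task success/failure comes from the container's exit code rather than from spark-submit's own status, keeping the check in the same node of the DAG. The import path is for Airflow 1.10; the namespace, image and application path are made-up placeholders, and client deploy mode is assumed so the driver's exit code is what the container reports.

    # Sketch only: run spark-submit inside the operator's pod; Airflow marks
    # the task failed if the container exits non-zero.
    from datetime import datetime
    from airflow import DAG
    from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

    with DAG("spark_job_example", start_date=datetime(2020, 4, 1),
             schedule_interval=None) as dag:
        submit_job = KubernetesPodOperator(
            task_id="spark_submit",
            name="spark-submit",
            namespace="spark",                     # placeholder namespace
            image="my-registry/spark-app:latest",  # placeholder image
            cmds=["/opt/spark/bin/spark-submit"],
            arguments=["--master", "k8s://https://kubernetes.default.svc",
                       "--deploy-mode", "client",
                       "local:///opt/app/job.py"],  # placeholder application
            get_logs=True,
            is_delete_operator_pod=True,
        )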

Re: HDFS file hdfs://127.0.0.1:9000/hdfs/spark/examples/README.txt

2020-04-06 Thread Som Lima
Ok, try this one instead (link below). It has both an EXIT, which we know is rude and abusive rather than graceful structured programming, and half-hearted user input validation. Do you think millions of Spark users download and test these programmes and repeat this rude

Re: pandas_udf is very slow

2020-04-06 Thread Gourav Sengupta
Hi Leon, please refer to this link: https://docs.databricks.com/spark/latest/spark-sql/udf-python-pandas.html I have found using grouped map Pandas UDFs to be a bit tricky; please refer to the statement: "All data for a group is loaded into memory before the function is applied. This can lead to out of memory
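
A minimal grouped map pandas_udf sketch (Spark 2.4-era syntax) illustrating the warning quoted above: each group is handed to the function as a single pandas DataFrame, so all rows of a group must fit in executor memory at once. The example data and the subtract-the-mean function are purely illustrative.

    import pandas as pd
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import pandas_udf, PandasUDFType

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(1, 1.0), (1, 2.0), (2, 3.0)], ["id", "v"])

    @pandas_udf("id long, v double", PandasUDFType.GROUPED_MAP)
    def subtract_mean(pdf):
        # pdf holds *all* rows for one id, loaded into memory at once:
        # a single very large group is what can cause the OOM mentioned above
        return pdf.assign(v=pdf.v - pdf.v.mean())

    df.groupby("id").apply(subtract_mean).show()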

Security vulnerabilities due to Jackson Databind

2020-04-06 Thread simonhampe
My question concerns Spark's dependency on somewhat older versions of jackson-databind (2.6.7 in Spark 2.4.5) and the potential security vulnerabilities that come with that. In my current project, company/project guidelines require that we scan all our dependencies - including transitive

How does spark sql evaluate case statements?

2020-04-06 Thread kant kodali
Hi All, I have the following query and I was wondering whether Spark SQL evaluates the same condition twice in the case statement below. I did .explain(true) and all I get is a table scan, so I'm not sure whether it evaluates the same condition twice. If it does, is there a way to return multiple values
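
The original query isn't shown in full here, so the snippet below is only an illustrative way to investigate this kind of question: build a small CASE expression that repeats a condition and compare the parsed, analyzed, optimized and physical plans printed by .explain(true) (explain(True) in PySpark). The table and predicate are made up for the example.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    spark.range(10).createOrReplaceTempView("t")

    q = spark.sql("""
        SELECT CASE WHEN id % 2 = 0 THEN 'even' ELSE 'odd' END AS parity,
               CASE WHEN id % 2 = 0 THEN id * 10 ELSE -1   END AS scaled
        FROM t
    """)
    # The extended output shows how the repeated predicate id % 2 = 0
    # appears in the optimized and physical plans for this toy query.
    q.explain(True)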

Fwd: HDFS file hdfs://127.0.0.1:9000/hdfs/spark/examples/README.txt

2020-04-06 Thread jane thorpe
Hi Som, Did you know that the simple demo program for reading characters from a file didn't work? Who wrote that simple hello-world-type little program? jane thorpe janethor...@aol.com -Original Message- From: jane thorpe To: somplasticllc ; user Sent: Fri, 3 Apr 2020 2:44 Subject: