Re: AnalysisException - Infer schema for the Parquet path

2020-05-09 Thread Mich Talebzadeh
Have you tried catching the error when you are creating the DataFrame?

import scala.util.{Try, Success, Failure}
val df = Try(spark.read.
  format("com.databricks.spark.xml").
  option("rootTag", "hierarchy").
  option("rowTag", "sms_request").
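A minimal sketch of the suggested pattern, completed so it runs; the load path and the match-based handling are assumptions, since the quoted snippet is truncated:

import scala.util.{Try, Success, Failure}
import org.apache.spark.sql.{DataFrame, SparkSession}

val spark = SparkSession.builder().getOrCreate()

// Wrapping the read in Try turns an AnalysisException into a Failure
// value instead of an uncaught exception.
val result: Try[DataFrame] = Try(
  spark.read
    .format("com.databricks.spark.xml")
    .option("rootTag", "hierarchy")
    .option("rowTag", "sms_request")
    .load("sms.xml") // hypothetical input path
)

result match {
  case Success(df) => df.show()
  case Failure(e)  => println(s"Could not create DataFrame: ${e.getMessage}")
}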

AnalysisException - Infer schema for the Parquet path

2020-05-09 Thread Chetan Khatri
Hi Spark Users, I have a Spark job where I am reading a Parquet path, and that Parquet path's data is generated by other systems; some of the Parquet paths don't contain any data, which is possible. Is there any way to read the Parquet so that, if no data is found, I can create a dummy DataFrame and go
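One way to do what is being asked, sketched under the assumption that the expected schema is known up front (the path and column names here are hypothetical, since the real ones are not shown):

import scala.util.{Failure, Success, Try}
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val spark = SparkSession.builder().getOrCreate()

// Hypothetical schema for the dummy DataFrame.
val expectedSchema = StructType(Seq(
  StructField("id", IntegerType),
  StructField("payload", StringType)
))

// Spark cannot infer a schema from an empty Parquet path, so the read
// throws AnalysisException; fall back to an empty DataFrame instead.
val df: DataFrame = Try(spark.read.parquet("/some/parquet/path")) match {
  case Success(d) => d
  case Failure(_) => spark.createDataFrame(spark.sparkContext.emptyRDD[Row], expectedSchema)
}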

Re: How to deal with Schema Evolution with the Dataset API

2020-05-09 Thread Edgardo Szrajber
If you want to keep the Dataset, maybe you can try to add a constructor to the case class (through the companion object) that receives only the age. Bentzi On Sat, May 9, 2020 at 17:50, Jorge Machado wrote: Ok, I found a way to solve it. Just pass the
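A sketch of what that suggestion might look like; the Person class comes from the thread, and the apply overload is the assumed mechanism:

case class Person(age: Int, name: Option[String])

object Person {
  // Companion-object constructor that receives only the age, so call
  // sites written against the old one-field class keep compiling.
  def apply(age: Int): Person = Person(age, None)
}

val p = Person(42)              // uses the one-argument apply
val q = Person(42, Some("Ada")) // full constructor still available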

Re: How to populate all possible combination values in columns using Spark SQL

2020-05-09 Thread Edgardo Szrajber
Once you pivot you can use CASE WHEN to calculate the True/False, then you can aggregate. Bentzi On Sat, May 9, 2020 at 21:18, Aakash Basu wrote: I know how to pivot, but how to aggregate and pivot and build those True and False combinations is the doubt. On
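A sketch of that pivot-then-CASE WHEN approach on made-up data (all column names here are hypothetical, since the thread's actual data is not shown):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{expr, sum}

val spark = SparkSession.builder().getOrCreate()
import spark.implicits._

// Hypothetical input: one row per (id, feature) observation.
val df = Seq(
  ("u1", "colA", 1),
  ("u1", "colB", 0),
  ("u2", "colA", 2)
).toDF("id", "feature", "value")

// Pivot on feature, aggregate, then derive True/False with CASE WHEN.
val result = df.groupBy("id")
  .pivot("feature")
  .agg(sum("value"))
  .withColumn("colA_seen", expr("CASE WHEN colA > 0 THEN true ELSE false END"))
  .withColumn("colB_seen", expr("CASE WHEN colB > 0 THEN true ELSE false END"))

result.show()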

Re: How to populate all possible combination values in columns using Spark SQL

2020-05-09 Thread Aakash Basu
I know how to pivot, but how to aggregate and pivot and build those True and False combinations is my doubt. On Fri 8 May, 2020, 1:31 PM Edgardo Szrajber wrote: > Have you checked the pivot function? > Bentzi >

dynamic executor scaling spark on kubernetes client mode

2020-05-09 Thread Pradeepta Choudhury
Hi, dynamic executor scaling is working fine for Spark on Kubernetes (latest from the Spark master repository) in cluster mode. Is dynamic executor scaling available for client mode? If yes, where can I find the usage doc for it? If not, is there any PR open for this? Thanks,
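For reference, a sketch of the knobs involved on the cluster-mode side (Spark 3.x; the master URL is a placeholder); whether these take effect in client mode is exactly the open question:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("k8s://https://kubernetes.default.svc") // placeholder API server URL
  .config("spark.dynamicAllocation.enabled", "true")
  // Shuffle tracking lets dynamic allocation work without an external
  // shuffle service, which Kubernetes does not provide.
  .config("spark.dynamicAllocation.shuffleTracking.enabled", "true")
  .config("spark.dynamicAllocation.minExecutors", "1")
  .config("spark.dynamicAllocation.maxExecutors", "10")
  .getOrCreate()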

Re: How to deal with Schema Evolution with the Dataset API

2020-05-09 Thread Jorge Machado
Ok, I found a way to solve it. Just pass the schema like this:

val schema = Encoders.product[Person].schema
spark.read.schema(schema).parquet("input")….

> On 9. May 2020, at 13:28, Jorge Machado wrote: > > Hello everyone, > > One question to the community. > > Imagine I have this > >
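Expanded into a self-contained sketch; the input path is taken from the snippet, and the evolved Person definition is assumed from the original question:

import org.apache.spark.sql.{Encoders, SparkSession}

case class Person(age: Int, name: Option[String] = None)

val spark = SparkSession.builder().getOrCreate()
import spark.implicits._

// Derive the schema from the case class and pass it explicitly, so
// files written before `name` existed are read with name = null/None.
val schema = Encoders.product[Person].schema
val ds = spark.read.schema(schema).parquet("input").as[Person]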

How to deal with Schema Evolution with the Dataset API

2020-05-09 Thread Jorge Machado
Hello everyone, one question to the community. Imagine I have this:

case class Person(age: Int)
spark.read.parquet("inputPath").as[Person]

After a few weeks of coding I change the class to:

case class Person(age: Int, name: Option[String] = None)

Then when I run
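A sketch of the scenario being described, to make the failure mode concrete (the fix via an explicit schema is in the reply above):

import org.apache.spark.sql.SparkSession

// The Parquet files were written while the class was: case class Person(age: Int)
// After the change it becomes:
case class Person(age: Int, name: Option[String] = None)

val spark = SparkSession.builder().getOrCreate()
import spark.implicits._

// Reading the old files as the new class fails, because the encoder
// expects a `name` column that the file schema does not contain.
val ds = spark.read.parquet("inputPath").as[Person]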