Alternative for spark-redshift on scala 2.12

2020-05-05 Thread Jun Zhu
Hello Users, Is there any alternative to https://github.com/databricks/spark-redshift on Scala 2.12.x? Thanks -- Jun Zhu, Sr. Engineer I, Data

Exception handling in Spark

2020-05-05 Thread Mich Talebzadeh
Hi, As I understand it, exception handling in Spark only makes sense if one attempts an action, as opposed to lazy transformations? Let us assume that I am reading an XML file from an HDFS directory and creating a dataframe DF on it: val broadcastValue = "123456789" // I assume this will be sent as
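For illustration, a minimal sketch of the lazy-vs-eager distinction being asked about (the data and UDF are hypothetical, not from the thread): the failing computation is only triggered when an action runs, so that is where the handler must sit.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

val spark = SparkSession.builder().appName("lazy-vs-action").master("local[*]").getOrCreate()
import spark.implicits._

val df = Seq(("a", 1), ("b", 0)).toDF("key", "value")

// A UDF that throws on the second row (integer division by zero)
val tenOver = udf((v: Int) => 10 / v)

// Transformations are lazy: no exception is raised here
val transformed = df.select($"key", tenOver($"value").as("ratio"))

// Only the action forces evaluation, so the handler goes around it
try {
  transformed.show()
} catch {
  case e: Exception => println(s"Action failed: ${e.getMessage}")
}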

Re: Exception handling in Spark

2020-05-05 Thread Mich Talebzadeh
Thanks Todd. This is what I did before creating the DF on top of that file:

var exists = true
exists = xmlDirExists(broadcastStagingConfig.xmlFilePath)
if (!exists) {
  println(s"\nError: The xml file ${broadcastStagingConfig.xmlFilePath} does not exist, aborting!\n")
  sys.exit(1)
}
. . def
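The xmlDirExists helper is cut off in the preview; a plausible sketch of what it might look like, using the Hadoop FileSystem API (spark is assumed to be an active SparkSession):

import org.apache.hadoop.fs.{FileSystem, Path}

def xmlDirExists(hdfsDirectory: String): Boolean = {
  val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
  fs.exists(new Path(hdfsDirectory))
}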

Re: Exception handling in Spark

2020-05-05 Thread Todd Nist
Could you do something like this prior to calling the action?

// Create a FileSystem object from the Hadoop configuration
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
// This method returns a Boolean (true if the file exists, false if it doesn't)
val fileExists = fs.exists(new
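Completing the truncated snippet above into a runnable sketch (the path is a placeholder, not from the thread):

import org.apache.hadoop.fs.{FileSystem, Path}

val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
// Returns true if the path exists, false otherwise
val fileExists = fs.exists(new Path("/tmp/broadcast.xml"))
if (!fileExists) {
  println("Error: input file not found, aborting")
  sys.exit(1)
}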

Pyspark and snowflake Column Mapping

2020-05-05 Thread anbutech
Hi Team, While working on JSON data, we flattened the unstructured data into a structured format, so we have Spark data types like nested Array fields and Array data type columns in the Databricks Delta table. While loading the data from the Databricks Spark connector to Snowflake we

Re: Exception handling in Spark

2020-05-05 Thread Brandon Geise
Read is an action, so you could wrap it in a Try (or whatever you want):

scala> val df = Try(spark.read.csv("test"))
df: scala.util.Try[org.apache.spark.sql.DataFrame] = Failure(org.apache.spark.sql.AnalysisException: Path does not exist: file:/test;)

Re: Exception handling in Spark

2020-05-05 Thread Mich Talebzadeh
Thanks Brandon! I should have remembered that. Basically the code bails out with sys.exit(1) if it cannot find the file. I guess there is no easy way of validating the DF except actioning it with show(1,0) etc. and checking whether it works? Regards, Mich Talebzadeh

Re: Exception handling in Spark

2020-05-05 Thread Brandon Geise
You could use the Hadoop API and check if the file exists.

Re: Exception handling in Spark

2020-05-05 Thread Brandon Geise
This is what I had in mind. Can you give this approach a try?

val df = Try(spark.read.csv("")) match {
  case Success(df) => df
  case Failure(e) => throw new Exception("foo")
}
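The same approach with the imports it needs, as a self-contained sketch (the path is a placeholder):

import scala.util.{Try, Success, Failure}

val df = Try(spark.read.csv("/tmp/input.csv")) match {
  case Success(df) => df
  case Failure(e)  => throw new Exception(s"Could not read input: ${e.getMessage}", e)
}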

Re: Exception handling in Spark

2020-05-05 Thread Mich Talebzadeh
This is what I get:

scala> val df = Try(spark.read.format("com.databricks.spark.xml").option("rootTag", "hierarchy").option("rowTag", "sms_request").load("/tmp/broadcast.xml")) Match {case Success(df) => df case Failure(e) => throw new Exception("foo")}
:47: error: not found: value Try

Re: Exception handling in Spark

2020-05-05 Thread Mich Talebzadeh
Hi Brandon. In dealing with the df, given case Failure(e) => throw new Exception("foo"), can one print the Exception message? Thanks, Mich Talebzadeh

Re: Exception handling in Spark

2020-05-05 Thread Brandon Geise
Sure, just do case Failure(e) => throw e
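So a variant that prints the message before rethrowing might look like this (a sketch, not from the thread; the path is a placeholder):

import scala.util.{Try, Success, Failure}

val df = Try(spark.read.csv("/tmp/input.csv")) match {
  case Success(df) => df
  case Failure(e) =>
    println(s"Read failed: ${e.getMessage}") // print the message
    throw e                                  // then rethrow the original exception
}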

PyArrow Exception in Pandas UDF GROUPEDAGG()

2020-05-05 Thread Gautham Acharya
Hi everyone, I'm running a job that runs a Pandas UDF to GROUP BY a large matrix. The GROUP BY function runs on a wide dataset. The first column of the dataset contains the string labels that are grouped on. The remaining columns are numeric values that are aggregated in the Pandas UDF. The

Unsubscribe

2020-05-05 Thread Zeming Yu
Unsubscribe

Re: Exception handling in Spark

2020-05-05 Thread Mich Talebzadeh
scala> import scala.util.{Try, Success, Failure}
import scala.util.{Try, Success, Failure}

scala> val df = Try(spark.read.format("com.databricks.spark.xml").option("rootTag", "hierarchy").option("rowTag", "sms_request").load("/tmp/broadcast.xml")) Match {case Success(df) => df case Failure(e) =>

Re: Exception handling in Spark

2020-05-05 Thread Brandon Geise
import scala.util.Try
import scala.util.Success
import scala.util.Failure

Re: Exception handling in Spark

2020-05-05 Thread Mich Talebzadeh
I am trying this approach:

val broadcastValue = "123456789" // I assume this will be sent as a constant for the batch
// Create a DF on top of the XML
try {
  val df = spark.read.
    format("com.databricks.spark.xml").
    option("rootTag", "hierarchy").
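Filled out into a compilable sketch (rowTag and path taken from earlier messages in this thread; the show() call is just one way to force an action inside the try):

val broadcastValue = "123456789" // sent as a constant for the batch

// Create a DF on top of the XML, failing fast if the read blows up
try {
  val df = spark.read.
    format("com.databricks.spark.xml").
    option("rootTag", "hierarchy").
    option("rowTag", "sms_request").
    load("/tmp/broadcast.xml")
  df.show(1, false)
} catch {
  case e: Exception =>
    println(s"Error reading the XML file: ${e.getMessage}")
    sys.exit(1)
}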

Re: Exception handling in Spark

2020-05-05 Thread Brandon Geise
Match needs to be lower case "match"

Re: Exception handling in Spark

2020-05-05 Thread Mich Talebzadeh
OK, looking promising, thanks.

scala> import scala.util.{Try, Success, Failure}
import scala.util.{Try, Success, Failure}

scala> val df = Try(spark.read.format("com.databricks.spark.xml").option("rootTag", "hierarchy").option("rowTag", "sms_request").load("/tmp/broadcast.xml")) match {case
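For reference, the working form the thread converges on, assembled into one block (lowercase match, Try in scope):

import scala.util.{Try, Success, Failure}

val df = Try(
  spark.read.format("com.databricks.spark.xml").
    option("rootTag", "hierarchy").
    option("rowTag", "sms_request").
    load("/tmp/broadcast.xml")
) match {
  case Success(df) => df
  case Failure(e)  => throw e
}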

URL what is ? SecureByDesign & Use of LOGIN form not pop box.

2020-05-05 Thread Secure Bydesign
@Sean Owen As you do not know, being the result of an average American education: A URL has three main parts. HEAD contains addresses, origin & destination. BODY contains Index.html. TAIL contains a checksum, the total count of bytes in the BODY, just in case it got lost en route.

Unsubscribe

2020-05-05 Thread Bibudh Lahiri
Unsubscribe

Re: Path style access fs.s3a.path.style.access property is not working in spark code

2020-05-05 Thread Samik Raychaudhuri
I recommend using v2.9.x; there are a lot of optimizations that make life much easier when accessing S3 from Spark. Thanks. -Samik On 05-05-2020 01:55 am, Aniruddha P Tekade wrote: Hello User, I got the solution to this. If you are writing to a custom S3 URL, then use the hadoop-aws-2.8.0.jar as
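For context, the property in the subject line is a standard Hadoop S3A setting; a minimal sketch of setting it for a custom endpoint (the endpoint URL is a placeholder, and spark is assumed to be an active SparkSession):

// Enable path-style access for an S3-compatible store
val hadoopConf = spark.sparkContext.hadoopConfiguration
hadoopConf.set("fs.s3a.path.style.access", "true")
hadoopConf.set("fs.s3a.endpoint", "https://s3.example.com")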