Spark 2.4.7

2023-08-25 Thread Harry Jamison
I am using Python 3.7 and Spark 2.4.7. I am not sure what the best way to do this is. I have a DataFrame with a URL in one of the columns, and I want to download the contents of that URL and put it in a new column. Can someone point me in the right direction on how to do this? I looked at the UDFs
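
One possible approach to the question above, as a minimal sketch rather than a confirmed answer from the thread: wrap a plain Python download function in a UDF and apply it with `withColumn`. The names `fetch_url`, `with_url_content`, and the column names `url` and `content` are hypothetical and chosen only for illustration. The sketch assumes the executors have network access and that each response is small enough to hold in a string column.

```python
import urllib.request


def fetch_url(url, timeout=10):
    """Download a URL and return its body as text, or None on any failure.

    Hypothetical helper: failures are swallowed so that one bad URL
    does not fail the whole Spark job.
    """
    if url is None:
        return None
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except Exception:
        return None


def with_url_content(df, url_col="url", out_col="content"):
    """Return df with a new column holding the downloaded contents.

    pyspark is imported here so that fetch_url can be tried without Spark.
    """
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import StringType

    fetch_udf = udf(fetch_url, StringType())
    return df.withColumn(out_col, fetch_udf(col(url_col)))
```

With a DataFrame `df` that has a `url` column, `with_url_content(df)` would add a `content` column. The UDF runs once per row on the executors, so the downloads are spread across partitions; repartitioning before the call is one way to control how many requests run in parallel.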

Re: mysterious spark.sql.utils.AnalysisException Union in spark 3.3.2, but not seen in 3.4.0+

2023-08-25 Thread Mich Talebzadeh
Hi Srivatsan, Ground investigation: 1. Does this union explicitly exist in your code? If not, where are the 7 and 6 column counts coming from? 2. On 3.3.1, have you looked at the Spark UI and the relevant DAG diagram? 3. Check the query execution plan using the explain() functionality. 4. Can

mysterious spark.sql.utils.AnalysisException Union in spark 3.3.2, but not seen in 3.4.0+

2023-08-25 Thread Srivatsan vn
Hello Users, I have been seeing some weird issues since I upgraded my EMR setup to 6.11 (which uses Spark 3.3.2). The call stack seems to point to a code location where there is no explicit union; also, I have unionByName everywhere in the codebase with allowMissingColumns set

Unsubscribe

2023-08-25 Thread Dipayan Dev

Spark Connect: API mismatch in SparkSession#execute

2023-08-25 Thread Stefan Hagedorn
Hi everyone, I’m trying to use the “extension” feature of the Spark Connect CommandPlugin (Spark 3.4.1). I created a simple protobuf message `MyMessage` that I want to send from the connect client-side to the connect server (where I registered my plugin). The SparkSession class in