Re: Kafka Topic to Parquet HDFS with Structured Streaming

2020-11-19 Thread AlbertoMarq
Hi Chetan, I'm having the exact same issue with Spark Structured Streaming and Kafka when trying to write to HDFS. Can you please tell me how you fixed it? I'm using Spark 3.0.1 and Hadoop 3.3.0. Thanks! -- Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

Re: Spark 3.0.1 new Proleptic Gregorian calendar

2020-11-19 Thread Maxim Gekk
Hello Saurabh, > What config options should we set, > - if we are always going to read old data written from Spark 2.4 using Spark 3.0 You should set *spark.sql.legacy.parquet.datetimeRebaseModeInRead* to *LEGACY* when you read old data. You see this exception because Spark 3.0 cannot
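A minimal PySpark sketch of how the setting named in this reply could be applied. Only the configuration key and the LEGACY value come from the message; the helper names, the application name and the way the path is passed in are assumptions made for illustration.

```python
# Configuration key and value quoted in the reply above.
REBASE_READ_KEY = "spark.sql.legacy.parquet.datetimeRebaseModeInRead"


def legacy_read_conf():
    """Return the setting for reading Parquet data written by Spark 2.4."""
    return {REBASE_READ_KEY: "LEGACY"}


def build_session(app_name="read-legacy-parquet"):  # hypothetical app name
    """Create a SparkSession with the legacy rebase mode applied on read."""
    # Imported lazily so legacy_read_conf() can be used without Spark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in legacy_read_conf().items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def read_legacy_parquet(path):
    """Read a Parquet dataset (path supplied by the caller) in LEGACY mode."""
    return build_session().read.parquet(path)
```

Calling `read_legacy_parquet` with the location of data written by Spark 2.4 would return a DataFrame read with the legacy rebase mode. The same key can also be passed with `--conf` on `spark-submit`.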

Spark 3.0.1 new Proleptic Gregorian calendar

2020-11-19 Thread Saurabh Gulati
Hello, First of all, thanks to you guys for maintaining and improving Spark. We just updated to Spark 3.0.1 and are facing some issues with the new Proleptic Gregorian calendar. We have data from different sources in our platform and we saw there were some date/timestamp columns that go back

Re: Cannot perform operation after producer has been closed

2020-11-19 Thread Eric Beabes
THANK YOU SO MUCH! Will try it out & revert. On Thu, Nov 19, 2020 at 8:18 AM Gabor Somogyi wrote: > "spark.kafka.producer.cache.timeout" is available since 2.2.1 which can be > increased as a temporary workaround. > This is not super elegant but works which gives enough time to migrate to >

Re: Cannot perform operation after producer has been closed

2020-11-19 Thread Gabor Somogyi
"spark.kafka.producer.cache.timeout" has been available since 2.2.1 and can be increased as a temporary workaround. This is not super elegant, but it works and gives enough time to migrate to Spark 3. On Wed, Nov 18, 2020 at 11:12 PM Eric Beabes wrote: > I must say.. *Spark has let me down in this
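A rough sketch of how the workaround in this message might be passed to a job. The configuration key is taken from the message; the value "30m" and both helper names are illustrative assumptions, not recommendations from the thread.

```python
# Configuration key quoted in the message above.
PRODUCER_CACHE_TIMEOUT_KEY = "spark.kafka.producer.cache.timeout"


def producer_cache_conf(timeout="30m"):  # "30m" is an illustrative value only
    """Return the setting that raises the Kafka producer cache timeout."""
    return {PRODUCER_CACHE_TIMEOUT_KEY: timeout}


def submit_args(conf):
    """Render a configuration dict as --conf arguments for spark-submit."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args
```

For example, `submit_args(producer_cache_conf())` yields the two arguments `--conf` and `spark.kafka.producer.cache.timeout=30m`, which can be appended to a `spark-submit` command line.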

Re: Need Unit test complete reference for Pyspark

2020-11-19 Thread Sofia’s World
Hey, they are good libraries to get you started. I have used both of them. Unfortunately, as far as I saw when I started using them, only a few people maintain them. But you can get pointers out of them for writing tests. The code below can get you started. What you'll need is - a method to