Re: [ANNOUNCE] Announcing Apache Spark 3.1.1

2021-03-02 Thread Kent Yao
Congrats, all! Bests, Kent Yao @ Data Science Center, Hangzhou Research Institute, NetEase Corp. a spark

Re: [ANNOUNCE] Announcing Apache Spark 3.1.1

2021-03-02 Thread Bode, Meikel, NMA-CFD
Congrats! From: Hyukjin Kwon Sent: Wednesday, March 3, 2021 02:41 To: user @spark ; dev Subject: [ANNOUNCE] Announcing Apache Spark 3.1.1 We are excited to announce Spark 3.1.1 today. Apache Spark 3.1.1 is the second release of the 3.x line. This release adds Python type annotations and

Re: [ANNOUNCE] Announcing Apache Spark 3.1.1

2021-03-02 Thread Mich Talebzadeh
Great, let us take it for a test drive. LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw Disclaimer: Use it at your own risk. Any and all responsibility for

Re: [ANNOUNCE] Announcing Apache Spark 3.1.1

2021-03-02 Thread Takeshi Yamamuro
Congrats, all! Bests, Takeshi On Wed, Mar 3, 2021 at 2:18 PM Mridul Muralidharan wrote: > > Thanks Hyukjin and congratulations everyone on the release ! > > Regards, > Mridul > > On Tue, Mar 2, 2021 at 8:54 PM Yuming Wang wrote: > >> Great work, Hyukjin! >> >> On Wed, Mar 3, 2021 at 9:50 AM

Re: [ANNOUNCE] Announcing Apache Spark 3.1.1

2021-03-02 Thread Mridul Muralidharan
Thanks Hyukjin, and congratulations everyone on the release! Regards, Mridul On Tue, Mar 2, 2021 at 8:54 PM Yuming Wang wrote: > Great work, Hyukjin! > > On Wed, Mar 3, 2021 at 9:50 AM Hyukjin Kwon wrote: > >> We are excited to announce Spark 3.1.1 today. >> >> Apache Spark 3.1.1 is the

Re: [ANNOUNCE] Announcing Apache Spark 3.1.1

2021-03-02 Thread Yuming Wang
Great work, Hyukjin! On Wed, Mar 3, 2021 at 9:50 AM Hyukjin Kwon wrote: > We are excited to announce Spark 3.1.1 today. > > Apache Spark 3.1.1 is the second release of the 3.x line. This release adds > Python type annotations and Python dependency management support as part > of Project Zen. >

[ANNOUNCE] Announcing Apache Spark 3.1.1

2021-03-02 Thread Hyukjin Kwon
We are excited to announce Spark 3.1.1 today. Apache Spark 3.1.1 is the second release of the 3.x line. This release adds Python type annotations and Python dependency management support as part of Project Zen. Other major updates include improved ANSI SQL compliance support, history server

Re: Structured Streaming With Kafka - processing each event

2021-03-02 Thread Gourav Sengupta
Hi, Are you using Structured Streaming, which Spark and Kafka versions are you on, and where are you fetching the data from? Semantically speaking, if your data in Kafka represents an action to be performed, then it should really go through a queue such as RabbitMQ or SQS. If it is simply data then it

Re: Please update this notification on Spark download Site

2021-03-02 Thread Sean Owen
That statement is still accurate - it is saying the release will be 3.1.1, not 3.1.0. In any event, 3.1.1 is rolling out as we speak - already in Maven and binaries are up and the website changes are being merged. On Tue, Mar 2, 2021 at 9:10 AM Mich Talebzadeh wrote: > > Can someone please

Please update this notification on Spark download Site

2021-03-02 Thread Mich Talebzadeh
Can someone please update the release date of 3.1.1 on the "Downloads | Apache Spark" page - Next official release: Spark 3.1.1

Re: Structured Streaming With Kafka - processing each event

2021-03-02 Thread Sachit Murarka
Hi Mich, Thanks for the reply. I will check this out. Kind Regards, Sachit Murarka On Fri, Feb 26, 2021 at 2:14 AM Mich Talebzadeh wrote: > Hi Sachit, > > I managed to make mine work using the foreachBatch function in > writeStream. > > "foreach" performs custom write logic on each row and
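
A minimal PySpark sketch of the foreachBatch approach mentioned in the quoted reply, for readers following the thread. With foreachBatch, Spark calls the handler once per micro-batch and passes it an ordinary (non-streaming) DataFrame plus a batch id, so per-event logic can run inside the handler. The names handle_event, process_batch, start_query and the checkpoint path are hypothetical and do not come from the thread, and collecting a batch to the driver is only an assumption that suits small batches.

```python
def handle_event(event):
    """Hypothetical per-event action; replace with the real logic."""
    return {"processed": True, **event}


def process_batch(batch_df, batch_id):
    """Invoked once per micro-batch with a regular (non-streaming) DataFrame."""
    # Assumption: each batch is small enough to collect to the driver.
    results = [handle_event(row.asDict()) for row in batch_df.collect()]
    print(f"batch {batch_id}: handled {len(results)} events")
    # Spark ignores the return value; it is returned here only for inspection.
    return results


def start_query(stream_df, checkpoint_dir="/tmp/checkpoints/events"):
    """Attach process_batch to a streaming DataFrame, e.g. one read from Kafka."""
    return (
        stream_df.writeStream
        .foreachBatch(process_batch)
        .option("checkpointLocation", checkpoint_dir)  # hypothetical path
        .start()
    )
```

For large batches, the collect() call would normally be replaced by DataFrame operations or a batch write, so that the work stays on the executors.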

[Spark SQL, intermediate+] possible bug or weird behavior of insertInto

2021-03-02 Thread Oldrich Vlasic
Hi, I have encountered a weird and potentially dangerous behaviour of Spark concerning partial overwrites of partitioned data. Not sure if this is a bug or just an abstraction leak. I have checked the Spark section of Stack Overflow and haven't found any relevant questions or answers. Full minimal

Re: s3a staging committer(directory committer )not writing data to s3 bucket (final output directory) in spark3

2021-03-02 Thread Mich Talebzadeh
Hi Shiva, This works on 3.0.1 on prem but not on Google Dataproc with Spark 3.1.1-RC2. These are the jar files used for Structured Streaming, all added under $SPARK_HOME/jars on all nodes: spark-sql-kafka-0-10_2.12-3.0.1.jar kafka-clients-2.7.0.jar spark-token-provider-kafka-0-10_2.12-3.0.1.jar

Re: s3a staging committer(directory committer )not writing data to s3 bucket (final output directory) in spark3

2021-03-02 Thread shiva
Hi Mich Talebzadeh, Could you please share the Spark configuration used to run the job? You mentioned it works on 3.0.1; I will check whether I am using the same configuration. Regards, Shiva

Re: Spark job crashing - Spark Structured Streaming with Kafka

2021-03-02 Thread Sachit Murarka
Hi Jungtaek, Please find the full logs:
java.io.EOFException
    at okio.RealBufferedSource.require(RealBufferedSource.java:61)
    at okio.RealBufferedSource.readByte(RealBufferedSource.java:74)
    at okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:117)
    at

Re: Spark job crashing - Spark Structured Streaming with Kafka

2021-03-02 Thread Jungtaek Lim
I feel this lacks quite a lot of information. Full stack traces from the driver and executors are essential, at the very least, to determine what was happening. On Tue, Mar 2, 2021 at 5:26 PM Sachit Murarka wrote: > Hi All, > > My spark job is crashing (Structured stream) . Can anyone help please. I > am using spark 3.0.1
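
One way to capture the driver-side trace requested above is to wrap the awaitTermination() call, sketched here under stated assumptions. The helper name await_and_report and the logger name are hypothetical. The sketch also assumes the raised exception may expose the JVM-side trace through a stackTrace attribute, as PySpark's captured exceptions do in 3.0.x, and falls back to the Python traceback alone if it does not. Executor logs are not covered and would have to be collected separately.

```python
import logging
import traceback

log = logging.getLogger("streaming")  # hypothetical logger name


def await_and_report(query):
    """Block on a streaming query; on failure, log the full trace and return it."""
    try:
        query.awaitTermination()
        return None
    except Exception as exc:  # PySpark raises StreamingQueryException here
        report = "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
        # Assumption: the exception may carry the JVM-side trace as `stackTrace`;
        # getattr keeps the helper working if that attribute is absent.
        jvm_trace = getattr(exc, "stackTrace", None)
        if jvm_trace:
            report += "\nJVM stack trace:\n" + str(jvm_trace)
        log.error(report)
        return report
```

The returned string can be pasted into a follow-up mail as-is; it includes both the Python frames and, when available, the JVM frames behind the StreamingQueryException.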

Spark job crashing - Spark Structured Streaming with Kafka

2021-03-02 Thread Sachit Murarka
Hi All, My Spark job (Structured Streaming) is crashing. Can anyone help, please? I am using Spark 3.0.1 with Kubernetes. [ERROR] - StreamingQueryException Exception in query.awaitTermination() File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/streaming.py", line 103, in awaitTermination