Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-29 Thread L. C. Hsieh
+1 Thanks Dongjoon!
On Wed, Nov 29, 2023 at 7:53 PM Mridul Muralidharan wrote:
>
> +1
>
> Signatures, digests, etc check out fine.
> Checked out tag and build/tested with -Phive -Pyarn -Pmesos -Pkubernetes
>
> Regards,
> Mridul
>
> On Wed, Nov 29, 2023 at 5:08 AM Yang Jie wrote:
>>
>>

Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2023-11-29 Thread Anish Shrigondekar
Hi dev,
Addressed the comments that Jungtaek had on the doc. Bumping the thread once again to see if other folks have any feedback on the proposal.
Thanks,
Anish
On Mon, Nov 27, 2023 at 8:15 PM Jungtaek Lim wrote:
> Kindly bump for better reach after the long holiday. Please kindly review
>

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-29 Thread Shiqi Sun
Hi Zhou, Thanks for the reply. For the language choice, since I don't think I've used many k8s components written in Java on k8s, I can't really tell, but at least for the components written in Golang, they are well-organized, easy to read/maintain and run well in general. In addition, goroutines

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-29 Thread Mridul Muralidharan
+1
Signatures, digests, etc check out fine.
Checked out tag and build/tested with -Phive -Pyarn -Pmesos -Pkubernetes
Regards,
Mridul
On Wed, Nov 29, 2023 at 5:08 AM Yang Jie wrote:
> +1(non-binding)
>
> Jie Yang
>
> On 2023/11/29 02:08:04 Kent Yao wrote:
> > +1(non-binding)
> >
> > Kent Yao
>

[sql] how to connect query stage to Spark job/stages?

2023-11-29 Thread Chenghao Lyu
Hi, I am seeking advice on measuring the performance of each QueryStage (QS) when AQE is enabled in Spark SQL. Specifically, I need help to automatically map a QS to its corresponding jobs (or stages) to get the QS runtime metrics. I recorded the QS structure via a customized injected Query

Re: Remove HiveContext from Apache Spark 4.0

2023-11-29 Thread Yang Jie
Thank you very much for the feedback from Dongjoon and Xiao Li. After carefully reading https://lists.apache.org/thread/mrx0y078cf3ozs7czykvv864y6dr55xq, I have decided to abandon the deletion of HiveContext. As Xiao Li said, its maintenance cost is not high, but it will increase the cost of

Re: Remove HiveContext from Apache Spark 4.0

2023-11-29 Thread Xiao Li
Thank you for raising it in the dev list. I do not think we should remove HiveContext, given the cost of the breaking change compared with its maintenance cost. FYI, when releasing Spark 3.0, we had a lot of discussions about the related topics https://lists.apache.org/thread/mrx0y078cf3ozs7czykvv864y6dr55xq Dongjoon Hyun

Re: Remove HiveContext from Apache Spark 4.0

2023-11-29 Thread Dongjoon Hyun
Thank you for the heads-up. I agree with your intention and the fact that it's not useful in Apache Spark 4.0.0. However, as you know, historically, it was removed once and explicitly added back to Apache Spark 3.0 via a vote. SPARK-31088 Add back HiveContext and createExternalTable (As a

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-29 Thread Yang Jie
+1(non-binding)
Jie Yang
On 2023/11/29 02:08:04 Kent Yao wrote:
> +1(non-binding)
>
> Kent Yao
>
> On 2023/11/27 01:12:53 Dongjoon Hyun wrote:
> > Hi, Marc.
> >
> > Given that it exists in 3.4.0 and 3.4.1, I don't think it's a release
> > blocker for Apache Spark 3.4.2.
> >
> > When the

Remove HiveContext from Apache Spark 4.0

2023-11-29 Thread 杨杰
Hi all, In SPARK-46171 (apache/spark#44077 [1]), I’m trying to remove the deprecated HiveContext from Apache Spark 4.0 because HiveContext has been marked as deprecated since Spark 2.0. This is a long-deprecated API; it should now be replaced with SparkSession with enableHiveSupport, so I think