Re: [VOTE][SPIP] Lazy Materialization for Parquet Read Performance Improvement

2023-02-13 Thread Yuming Wang
+1 On Tue, Feb 14, 2023 at 11:27 AM Prem Sahoo wrote: > +1 > > On Mon, Feb 13, 2023 at 8:13 PM L. C. Hsieh wrote: > >> +1 >> >> On Mon, Feb 13, 2023 at 3:49 PM Mich Talebzadeh < >> mich.talebza...@gmail.com> wrote: >> >>> +1 for me >>> >>> >>> >>>view my Linkedin profile >>>

Executor tab missing information

2023-02-13 Thread Prem Sahoo
Hello All, I am executing Spark jobs, but the Executors tab is missing information: I can't see any data/info coming up. Please let me know what I am missing.

Re: [VOTE][SPIP] Lazy Materialization for Parquet Read Performance Improvement

2023-02-13 Thread Prem Sahoo
+1 On Mon, Feb 13, 2023 at 8:13 PM L. C. Hsieh wrote: > +1 > > On Mon, Feb 13, 2023 at 3:49 PM Mich Talebzadeh > wrote: > >> +1 for me >> >> >> >>view my Linkedin profile >> >> >> >> https://en.everybodywiki.com/Mich_Talebzadeh >>

Re: Executor metrics are missing on Prometheus sink

2023-02-13 Thread Qian Sun
Hi Luca, Thanks for your reply, which is very helpful for me :) I am trying other metrics sinks with cAdvisor to see the effect. If it works well, I will share it with the community. On Fri, Feb 10, 2023 at 4:26 PM Luca Canali wrote: > Hi Qian, > > > > Indeed the metrics available with the

Re: [VOTE][SPIP] Lazy Materialization for Parquet Read Performance Improvement

2023-02-13 Thread L. C. Hsieh
+1 On Mon, Feb 13, 2023 at 3:49 PM Mich Talebzadeh wrote: > +1 for me > > > >view my Linkedin profile > > > > https://en.everybodywiki.com/Mich_Talebzadeh > > > > *Disclaimer:* Use it at your own risk. Any and all responsibility

Re: Spark on Kube (virtua) coffee/tea/pop times

2023-02-13 Thread Mich Talebzadeh
Hi All, First thanks to Holden for organising this open discussion and exchange of ideas. I must apologize for problems with my microphone. Hopefully it should not happen again. From my own commercial experience with k8s, mainly Google GKE, there is a main concern that Spark on GKE is work in

Re: [VOTE][SPIP] Lazy Materialization for Parquet Read Performance Improvement

2023-02-13 Thread Mich Talebzadeh
+1 for me

Re: [VOTE][SPIP] Lazy Materialization for Parquet Read Performance Improvement

2023-02-13 Thread huaxin gao
+1 On Mon, Feb 13, 2023 at 3:09 PM Dongjoon Hyun wrote: > +1 > > Dongjoon > > On 2023/02/13 22:52:59 "L. C. Hsieh" wrote: > > Hi all, > > > > I'd like to start the vote for SPIP: Lazy Materialization for Parquet > > Read Performance Improvement. > > > > The high summary of the SPIP is that it

Re: [VOTE][SPIP] Lazy Materialization for Parquet Read Performance Improvement

2023-02-13 Thread Dongjoon Hyun
+1 Dongjoon On 2023/02/13 22:52:59 "L. C. Hsieh" wrote: > Hi all, > > I'd like to start the vote for SPIP: Lazy Materialization for Parquet > Read Performance Improvement. > > The high summary of the SPIP is that it proposes an improvement to the > Parquet reader with lazy materialization

[VOTE][SPIP] Lazy Materialization for Parquet Read Performance Improvement

2023-02-13 Thread L. C. Hsieh
Hi all, I'd like to start the vote for SPIP: Lazy Materialization for Parquet Read Performance Improvement. The high summary of the SPIP is that it proposes an improvement to the Parquet reader with lazy materialization which only materializes (i.e. decompress, de-code, etc...) necessary values.
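
For readers unfamiliar with the term, the idea of lazy materialization described above can be illustrated with a minimal sketch. This is not the SPIP's design or Spark's Parquet reader: `LazyColumn` and `lazy_scan` are hypothetical names, and UTF-8 decoding of bytes merely stands in for the real decompress/decode cost. The point is only that the filter column is evaluated first and the expensive column is decoded for matching rows alone.

```python
from typing import Callable, List, Tuple


class LazyColumn:
    """Hypothetical stand-in for an encoded column chunk.

    decode_at() models the decompress/decode work, and decoded_count
    records how many values were actually materialized.
    """

    def __init__(self, encoded: List[bytes]) -> None:
        self._encoded = encoded
        self.decoded_count = 0

    def decode_at(self, index: int) -> str:
        self.decoded_count += 1
        return self._encoded[index].decode("utf-8")


def lazy_scan(
    filter_values: List[int],
    payload: LazyColumn,
    predicate: Callable[[int], bool],
) -> List[Tuple[int, str]]:
    """Evaluate the predicate first; decode payload values only for rows that pass."""
    rows = []
    for index, value in enumerate(filter_values):
        if predicate(value):
            rows.append((value, payload.decode_at(index)))
    return rows
```

With four rows and a predicate that keeps two of them, only two payload values are decoded; an eager reader would have decoded all four before filtering.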

Re: [DISCUSS] SPIP: Lazy Materialization for Parquet Read Performance Improvement

2023-02-13 Thread L. C. Hsieh
Hi Mich, The title of this thread is "[DISCUSS]". We need to have a public discussion on a SPIP proposal collecting comments before we can move forward to call for a vote on it. On Mon, Feb 13, 2023 at 2:35 PM Mich Talebzadeh wrote: > Hi, > > I thought we already voted to go ahead with this

Re: [DISCUSS] SPIP: Lazy Materialization for Parquet Read Performance Improvement

2023-02-13 Thread Mich Talebzadeh
Hi, I thought we already voted to go ahead with this proposal!

Re: [DISCUSS] SPIP: Lazy Materialization for Parquet Read Performance Improvement

2023-02-13 Thread kazuyuki tanimura
Thank you Liang-Chi! Kazu > On Feb 11, 2023, at 7:12 PM, L. C. Hsieh wrote: > > Thanks all for your feedback. > > Given this positive feedback, if there is no other comments/discussion, I > will go to start a vote in the next few days. > > Thank you again! > > On Thu, Feb 2, 2023 at 10:12

Re: [VOTE] Release Spark 3.3.2 (RC1)

2023-02-13 Thread Holden Karau
That’s legit, if the patch author isn’t comfortable with a backport then let’s leave it be. On Mon, Feb 13, 2023 at 9:59 AM Dongjoon Hyun wrote: > Hi, All. > > As the author of that `Improvement` patch, I strongly disagree with giving > the wrong idea that Python 3.11 is officially supported

Re: [VOTE] Release Spark 3.3.2 (RC1)

2023-02-13 Thread Dongjoon Hyun
Hi, All. As the author of that `Improvement` patch, I strongly disagree with giving the wrong idea that Python 3.11 is officially supported in Spark 3.3. I only developed and delivered it for Apache Spark 3.4.0 specifically as `Improvement`. We may want to backport it to branch-3.3 but it's also

Re: [VOTE] Release Spark 3.3.2 (RC1)

2023-02-13 Thread Chao Sun
+1 On Mon, Feb 13, 2023 at 9:20 AM L. C. Hsieh wrote: > > If it is not supported in Spark 3.3.x, it looks like an improvement at > Spark 3.4. > For such cases we usually do not back port. I think this is also why > the PR did not back port when it was merged. > > I'm okay if there is consensus

Re: [VOTE] Release Spark 3.3.2 (RC1)

2023-02-13 Thread Sean Owen
Agree, just, if it's such a tiny change, and it actually fixes the issue, maybe worth getting that into 3.3.x. I don't feel strongly. On Mon, Feb 13, 2023 at 11:19 AM L. C. Hsieh wrote: > If it is not supported in Spark 3.3.x, it looks like an improvement at > Spark 3.4. > For such cases we

Re: [VOTE] Release Spark 3.3.2 (RC1)

2023-02-13 Thread Holden Karau
I’d be in favor of a backport with the idea it's a bug fix for a language (admittedly not a version we’ve supported before) On Mon, Feb 13, 2023 at 9:19 AM L. C. Hsieh wrote: > If it is not supported in Spark 3.3.x, it looks like an improvement at > Spark 3.4. > For such cases we usually do

Re: [VOTE] Release Spark 3.3.2 (RC1)

2023-02-13 Thread L. C. Hsieh
If it is not supported in Spark 3.3.x, it looks like an improvement at Spark 3.4. For such cases we usually do not back port. I think this is also why the PR did not back port when it was merged. I'm okay if there is consensus to back port it. On Mon, Feb 13, 2023 at 9:08 AM Sean Owen wrote: >

Re: [VOTE] Release Spark 3.3.2 (RC1)

2023-02-13 Thread Sean Owen
Does that change change the result for Spark 3.3.x? It looks like we do not support Python 3.11 in Spark 3.3.x, which is one answer to whether this should be changed now. But if that's the only change that matters for Python 3.11 and makes it work, sure I think we should back-port. It doesn't

Re: [VOTE] Release Spark 3.3.2 (RC1)

2023-02-13 Thread Bjørn Jørgensen
There is a fix for python 3.11 https://github.com/apache/spark/pull/38987 We should have this in more branches. man. 13. feb. 2023 kl. 09:39 skrev Bjørn Jørgensen : > On manjaro it is Python 3.10.9 > > On ubuntu it is Python 3.11.1 > > man. 13. feb. 2023 kl. 03:24 skrev yangjie01 : > >> Which

Re: [VOTE] Release Spark 3.3.2 (RC1)

2023-02-13 Thread Bjørn Jørgensen
On manjaro it is Python 3.10.9 On ubuntu it is Python 3.11.1 man. 13. feb. 2023 kl. 03:24 skrev yangjie01 : > Which Python version do you use for testing? When I use the latest Python > 3.11, I can reproduce similar test failures (43 tests of sql module fail), > but when I use python 3.10, they

Re: Spark on Kube (virtua) coffee/tea/pop times

2023-02-13 Thread Holden Karau
Some general issues we found common ground around:
- Inter-Pod security, istio + mTLS
- Sidecar management
- Docker Images
  - Add links to more related images - Helm links
- Data Locality concerns
- Upgrading Spark Versions
- Performance issues

Thanks to everyone who was able to make the informal coffee chat

Re: [VOTE] Release Spark 3.3.2 (RC1)

2023-02-13 Thread William Hyun
+1 Thank you, William On 2023/02/13 07:32:49 John Zhuge wrote: > +1 (non-binding) > > Rebased internal branch. Passed build with Java 8 and Scala 2.12. Passed > integration tests with Python 3.10. > > On Sun, Feb 12, 2023 at 8:49 PM Yuming Wang wrote: > > > +1. > > > > On Mon, Feb 13, 2023