Re: [DISCUSS] Spark 4.0.0 release

2024-05-01 Thread Xiao Li
+1 for next Monday. We can do more previews when the other features are ready for preview. On Wed, May 1, 2024 at 08:46 Tathagata Das wrote: > Next week sounds great! Thank you Wenchen! > > On Wed, May 1, 2024 at 11:16 AM Wenchen Fan wrote: > >> Yea I think a preview release won't hurt (without a branch

Re: [VOTE] SPARK-44444: Use ANSI SQL mode by default

2024-04-13 Thread Xiao Li
+1 On Sat, Apr 13, 2024 at 17:21 huaxin gao wrote: > +1 > > On Sat, Apr 13, 2024 at 4:36 PM L. C. Hsieh wrote: > >> +1 >> >> On Sat, Apr 13, 2024 at 4:12 PM Hyukjin Kwon >> wrote: >> > >> > +1 >> > >> > On Sun, Apr 14, 2024 at 7:46 AM Chao Sun wrote: >> >> >> >> +1. >> >> >> >> This feature

Re: [VOTE] Add new `Versions` in Apache Spark JIRA for Versioning of Spark Operator

2024-04-12 Thread Xiao Li
+1 On Fri, Apr 12, 2024 at 14:30 bo yang wrote: > +1 > > On Fri, Apr 12, 2024 at 12:34 PM huaxin gao > wrote: > >> +1 >> >> On Fri, Apr 12, 2024 at 9:07 AM Dongjoon Hyun >> wrote: >> >>> +1 >>> >>> Thank you! >>> >>> I hope we can customize `dev/merge_spark_pr.py` script per repository >>>

Re: [VOTE] SPIP: Pure Python Package in PyPI (Spark Connect)

2024-04-01 Thread Xiao Li
+1 On Mon, Apr 1, 2024 at 08:07 Hussein Awala wrote: > +1(non-binding) I'd add that it will also > simplify package maintenance and make it easy to release a bug fix/new feature > without needing to wait for PySpark to release. > > On Mon, Apr 1, 2024 at 4:56 PM Chao Sun wrote: > >> +1

Re: [VOTE] SPIP: Structured Logging Framework for Apache Spark

2024-03-12 Thread Xiao Li
+1 On Tue, Mar 12, 2024 at 6:09 AM Holden Karau wrote: > +1 > > Twitter: https://twitter.com/holdenkarau > Books (Learning Spark, High Performance Spark, etc.): > https://amzn.to/2MaRAG9 > YouTube Live Streams: https://www.youtube.com/user/holdenkarau > > > On Mon,

Re: [VOTE] Release Apache Spark 3.5.1 (RC2)

2024-02-20 Thread Xiao Li
+1 Xiao On Tue, Feb 20, 2024 at 04:59 Cheng Pan wrote: > +1 (non-binding) > > - Build successfully from source code. > - Pass integration tests with Spark ClickHouse Connector[1] > > [1] https://github.com/housepower/spark-clickhouse-connector/pull/299 > > Thanks, > Cheng Pan > > > > On Feb 20, 2024, at

Re: Re: [DISCUSS] Release Spark 3.5.1?

2024-02-04 Thread Xiao Li
+1 On Sun, Feb 4, 2024 at 6:07 AM beliefer wrote: > +1 > > > > At 2024-02-04 15:26:13, "Dongjoon Hyun" wrote: > > +1 > > On Sat, Feb 3, 2024 at 9:18 PM yangjie01 > wrote: > >> +1 >> >> On 2024/2/4 at 13:13, "Kent Yao" <y...@apache.org> wrote: >> >> >> +1 >> >> >> Jungtaek Lim >

Re: Remove HiveContext from Apache Spark 4.0

2023-11-29 Thread Xiao Li
Thank you for raising it on the dev list. I do not think we should remove HiveContext, given the cost of the breaking change compared with the cost of maintenance. FYI, when releasing Spark 3.0, we had a lot of discussions about the related topics: https://lists.apache.org/thread/mrx0y078cf3ozs7czykvv864y6dr55xq Dongjoon Hyun

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Xiao Li
+1 On Wed, Nov 15, 2023 at 05:55 bo yang wrote: > +1 > > On Tue, Nov 14, 2023 at 7:18 PM huaxin gao wrote: > >> +1 >> >> On Tue, Nov 14, 2023 at 10:45 AM Holden Karau >> wrote: >> >>> +1 >>> >>> On Tue, Nov 14, 2023 at 10:21 AM DB Tsai wrote: >>> +1 DB Tsai | https://www.dbtsai.com/ |

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-09 Thread Xiao Li
+1 On Thu, Nov 9, 2023 at 16:53 huaxin gao wrote: > +1 > > On Thu, Nov 9, 2023 at 3:14 PM DB Tsai wrote: > >> +1 >> >> To be completely transparent, I am employed in the same department as >> Zhou at Apple. >> >> I support this proposal, provided that we witness community adoption >> following the release

Welcome to Our New Apache Spark Committer and PMCs

2023-10-02 Thread Xiao Li
Hi all, The Spark PMC is delighted to announce that we have voted to add one new committer and two new PMC members. These individuals have consistently contributed to the project and have clearly demonstrated their expertise. New Committer: - Jiaan Geng (focusing on Spark Connect and Spark SQL)

Re: [VOTE] Release Apache Spark 3.5.0 (RC5)

2023-09-11 Thread Xiao Li
+1 Xiao On Mon, Sep 11, 2023 at 10:53 Yuanjian Li wrote: > @Peter Toth I've looked into the details of this > issue, and it appears that it's neither a regression in version 3.5.0 nor a > correctness issue. It's a bug related to a new feature. I think we can fix > this in 3.5.1 and list it as a known issue

Re: [VOTE] Release Apache Spark 3.5.0 (RC4)

2023-09-06 Thread Xiao Li
+1 Xiao On Wed, Sep 6, 2023 at 22:08 Herman van Hovell wrote: > Tested connect, and everything looks good. > > +1 > > On Wed, Sep 6, 2023 at 8:11 AM Yuanjian Li wrote: > >> Please vote on releasing the following candidate(RC4) as Apache Spark >> version 3.5.0. >> >> The vote is open until 11:59pm Pacific

Re: Welcome two new Apache Spark committers

2023-08-06 Thread Xiao Li
Congratulations, Peter and Xiduo! On Sun, Aug 6, 2023 at 19:08 Debasish Das wrote: > Congratulations Peter and Xiduo. > > On Sun, Aug 6, 2023, 7:05 PM Wenchen Fan wrote: > >> Hi all, >> >> The Spark PMC recently voted to add two new committers. Please join me in >> welcoming them to their new role! >> >>

Re: [VOTE] SPIP: XML data source support

2023-07-28 Thread Xiao Li
+1 On Fri, Jul 28, 2023 at 15:54 Sean Owen wrote: > +1 I think that porting the package 'as is' into Spark is probably > worthwhile. > That's relatively easy; the code is already pretty battle-tested and not > that big and even originally came from Spark code, so is more or less > similar

Re: Spark Docker Official Image is now available

2023-07-20 Thread Xiao Li
Thank you, Yikun! This is great! On Wed, Jul 19, 2023 at 7:55 PM Ruifeng Zheng wrote: > Awesome, thank you YiKun for driving this! > > On Thu, Jul 20, 2023 at 9:12 AM Hyukjin Kwon wrote: > >> This is amazing, finally! >> >> On Thu, 20 Jul 2023 at 10:10, Yikun Jiang wrote: >> >>> The spark

Re: [VOTE][SPIP] Python Data Source API

2023-07-06 Thread Xiao Li
+1 Xiao On Wed, Jul 5, 2023 at 17:28 Hyukjin Kwon wrote: > +1. > > See https://youtu.be/yj7XlTB1Jvc?t=604 :-). > > On Thu, 6 Jul 2023 at 09:15, Allison Wang > wrote: > >> Hi all, >> >> I'd like to start the vote for SPIP: Python Data Source API. >> >> The high-level summary for the SPIP is that it aims to

Re: [VOTE] Release Plan for Apache Spark 4.0.0 (June 2024)

2023-06-15 Thread Xiao Li
> >>>> +1 >>>> >>>> On Mon, Jun 12, 2023 at 12:50 PM kazuyuki tanimura >>>> wrote: >>>> >>>>> +1 (non-binding) >>>>> >>>>> Thank you! >>>>> Kazu >>>>>

Re: [VOTE] Release Plan for Apache Spark 4.0.0 (June 2024)

2023-06-12 Thread Xiao Li
Thanks for starting the vote. I do have a concern about the target release date of Spark 4.0. On Mon, Jun 12, 2023 at 11:09 L. C. Hsieh wrote: > +1 > > On Mon, Jun 12, 2023 at 11:06 AM huaxin gao > wrote: > > > > +1 > > > > On Mon, Jun 12, 2023 at 11:05 AM Dongjoon Hyun > wrote: > >> > >> +1 > >> > >>

Re: Apache Spark 3.4.1 Release?

2023-06-09 Thread Xiao Li
+1 On Fri, Jun 9, 2023 at 08:30 Wenchen Fan wrote: > +1 > > On Fri, Jun 9, 2023 at 8:52 PM Xinrong Meng wrote: > >> +1. Thank you Dongjoon! >> >> Thanks, >> >> Xinrong Meng >> >> On Fri, Jun 9, 2023 at 5:22 AM Mridul Muralidharan wrote: >> >>> >>> +1, thanks Dongjoon ! >>> >>> Regards, >>> Mridul >>> >>> On

Re: [ANNOUNCE] Apache Spark 3.4.0 released

2023-04-14 Thread Xiao Li
> docker pull apache/spark-r:v3.4.0 > > Thanks, > Dongjoon > > > On Fri, Apr 14, 2023 at 2:56 PM Dongjoon Hyun > wrote: > >> Thank you, Xinrong! >> >> Dongjoon. >> >> >> On Fri, Apr 14, 2023 at 1:37 PM Xiao Li wrote: >> >>

Re: [ANNOUNCE] Apache Spark 3.4.0 released

2023-04-14 Thread Xiao Li
Thank you Xinrong! Congratulations everyone! This is a great release with tons of new features! On Fri, Apr 14, 2023 at 13:04 Gengliang Wang wrote: > Congratulations everyone! > Thank you Xinrong for driving the release! > > On Fri, Apr 14, 2023 at 12:47 PM Xinrong Meng > wrote: > >> Hi All, >> >> We are

Re: [VOTE] Release Apache Spark 3.4.0 (RC7)

2023-04-12 Thread Xiao Li
+1 Xiao Li On Wed, Apr 12, 2023 at 12:39 Emil Ejbyfeldt wrote: > +1 (non-binding) > > Ran some tests with the Scala 2.13 build using part of our internal > spark workload. > > On 12/04/2023 19:52, Chris Nauroth wrote: > > +1 (non-binding) > > > > * Verified all chec

Re: [VOTE] Release Apache Spark 3.4.0 (RC7)

2023-04-11 Thread Xiao Li
Thanks for testing it in your environment! > This is a minor issue itself, and only impacts the metrics for push-based > shuffle, but it will essentially completely eliminate the effort > in SPARK-36620. Based on my understanding, this is not a regression. It only affects the new enhancements

Re: [VOTE] Release Apache Spark 3.4.0 (RC5)

2023-04-05 Thread Xiao Li
Hi, Anton, Could you please provide a complete list of exceptions that are being used in the public connector API? Thanks, Xiao On Wed, Apr 5, 2023 at 12:06 Xinrong Meng wrote: > Thank you! > > I created a blocker Jira for that for easier tracking: > https://issues.apache.org/jira/browse/SPARK-43041. > >

Re: Slack for PySpark users

2023-03-30 Thread Xiao Li
g official. > > Bests, > Dongjoon. > > > > On Wed, Mar 29, 2023 at 11:32 PM Xiao Li wrote: > >> +1 >> >> + @dev@spark.apache.org >> >> This is a good idea. The other Apache projects (e.g., Pinot, Druid, >> Flink) have created their own dedicate

Re: Slack for PySpark users

2023-03-30 Thread Xiao Li
+1 + @dev@spark.apache.org This is a good idea. The other Apache projects (e.g., Pinot, Druid, Flink) have created their own dedicated Slack workspaces for faster communication. We can do the same in Apache Spark. The Slack workspace will be maintained by the Apache Spark PMC. I propose to

Re: [ANNOUNCE] Apache Spark 3.3.2 released

2023-02-18 Thread Xiao Li
Thank you, Liang-Chi! Xiao On Sat, Feb 18, 2023 at 1:07 AM beliefer wrote: > Congratulations ! > > > > At 2023-02-17 16:58:22, "L. C. Hsieh" wrote: > >We are happy to announce the availability of Apache Spark 3.3.2! > > > >Spark 3.3.2 is a maintenance release containing stability fixes. This

Re: Welcome Yikun Jiang as a Spark committer

2022-10-09 Thread Xiao Li
Congratulations, Yikun! Xiao On Sun, Oct 9, 2022 at 19:34 Yikun Jiang wrote: > Thank you all! > > Regards, > Yikun > > > On Mon, Oct 10, 2022 at 3:18 AM Chao Sun wrote: > >> Congratulations Yikun! >> >> On Sun, Oct 9, 2022 at 11:14 AM vaquar khan >> wrote: >> >>> Congratulations. >>> >>> Regards, >>>

Re: Dropping Apache Spark Hadoop2 Binary Distribution?

2022-10-05 Thread Xiao Li
+1. Xiao On Wed, Oct 5, 2022 at 12:49 PM Sean Owen wrote: > I'm OK with this. It simplifies maintenance a bit, and specifically may > allow us to finally move off of the ancient version of Guava (?) > > On Mon, Oct 3, 2022 at 10:16 PM Dongjoon Hyun > wrote: > >> Hi, All. >> >> I'm wondering

Re: [DISCUSS] SPIP: Support Docker Official Image for Spark

2022-09-21 Thread Xiao Li
+1 On Wed, Sep 21, 2022 at 07:22 Yikun Jiang wrote: > Thanks for all your inputs! BTW, I also created a JIRA to track related > work: https://issues.apache.org/jira/browse/SPARK-40513 > > > can I be involved in this work? > > @qian Of course! Thanks! > > Regards, > Yikun > > On Wed, Sep 21, 2022 at 7:31 PM

Welcoming three new PMC members

2022-08-09 Thread Xiao Li
Hi all, The Spark PMC recently voted to add three new PMC members. Join me in welcoming them to their new roles! New PMC members: Huaxin Gao, Gengliang Wang and Maxim Gekk The Spark PMC

Re: Apache Spark 3.2.2 Release?

2022-07-06 Thread Xiao Li
+1 Xiao On Wed, Jul 6, 2022 at 19:16 Cheng Su wrote: > +1 (non-binding) > > Thanks, > Cheng Su > > On Wed, Jul 6, 2022 at 6:01 PM Yuming Wang wrote: > >> +1 >> >> On Thu, Jul 7, 2022 at 5:53 AM Maxim Gekk >> wrote: >> >>> +1 >>> >>> On Thu, Jul 7, 2022 at 12:26 AM John Zhuge wrote: >>> +1 Thanks

Re: [PSA] Please rebase and sync your master branch in your forked repository

2022-06-20 Thread Xiao Li
Thank you, Hyukjin! Xiao On Mon, Jun 20, 2022 at 7:01 PM Yi Wu wrote: > Thanks for the work, Hyukjin. > > On Tue, Jun 21, 2022 at 7:59 AM Yuming Wang wrote: > >> Thank you Hyukjin. >> >> On Tue, Jun 21, 2022 at 7:46 AM Hyukjin Kwon wrote: >> >>> After

Re: Re: [VOTE][SPIP] Spark Connect

2022-06-15 Thread Xiao Li
+1 Xiao On Tue, Jun 14, 2022 at 03:35 beliefer wrote: > +1 > Yeah, I tried to use Apache Livy so that we can run interactive queries. > But the Spark Driver in Livy looks heavy. > > The SPIP may resolve the issue. > > > > At 2022-06-14 18:11:21, "Wenchen Fan" wrote: > > +1 > > On Tue, Jun 14, 2022 at 9:38

Stickers and Swag

2022-06-14 Thread Xiao Li
Hi, all, The ASF has an official store at RedBubble that Apache Community Development (ComDev) runs. If you are interested in buying Spark Swag, 70 products featuring the Spark logo are available: https://www.redbubble.com/shop/ap/113203780 Go

Re: Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Xiao Li
+1 Xiao On Mon, Jun 13, 2022 at 20:04 beliefer wrote: > +1 AFAIK, no blocking issues now. > Glad to hear about the 3.3.0 release! > > > At 2022-06-14 09:38:35, "Ruifeng Zheng" wrote: > > +1 (non-binding) > > Maxim, thank you for driving this release! > > thanks, > ruifeng > > > > -- Original Message

Re: SIGMOD System Award for Apache Spark

2022-05-13 Thread Xiao Li
Congratulations to everyone! Xiao On Fri, May 13, 2022 at 9:34 AM Dongjoon Hyun wrote: > Ya, it's really great!. Congratulations to the whole community! > > Dongjoon. > > On Fri, May 13, 2022 at 8:12 AM Chao Sun wrote: > >> Huge congrats to the whole community! >> >> On Fri, May 13, 2022 at

Re: Apache Spark 3.3 Release

2022-03-15 Thread Xiao Li
ocking what you want to > do. > > Please let the community start to ramp down as we agreed before. > > Dongjoon > > > > On Tue, Mar 15, 2022 at 3:07 PM Xiao Li wrote: > >> Please do not get me wrong. If we don't cut a branch, we are allowing all >> patches

Re: Apache Spark 3.3 Release

2022-03-15 Thread Xiao Li
avoid backporting the feature work that are not being well > discussed. > > > > On Tue, Mar 15, 2022 at 12:12 PM Xiao Li wrote: > >> Cutting the branch is simple, but we need to avoid backporting the >> feature work that are not being well discussed. Not all the m

Re: Apache Spark 3.3 Release

2022-03-15 Thread Xiao Li
a branch. > > [SPARK-38335][SQL] Implement parser support for DEFAULT column values > > Let's cut `branch-3.3` Today for Apache Spark 3.3.0 preparation. > > Best, > Dongjoon. > > > On Tue, Mar 15, 2022 at 10:17 AM Chao Sun wrote: > >> Cool, t

Re: Apache Spark 3.3 Release

2022-03-15 Thread Xiao Li
ew minutes ago. So, we can remove > it from the list. > > > > #35819 [SPARK-38524][SPARK-38553][K8S] Bump Volcano to v1.5.1 > > > > Thanks, > > Dongjoon. > > > > On Tue, Mar 15, 2022 at 9:48 AM Xiao Li wrote: > >> > >> Let me clarify my

Re: Apache Spark 3.3 Release

2022-03-15 Thread Xiao Li
Let me clarify my above suggestion. Maybe we can wait 3 more days to collect the list of actively developed PRs that we want to merge to 3.3 after the branch cut? Please do not rush to merge the PRs that are not fully reviewed. We can cut the branch this Friday and continue merging the PRs that

Re: Apache Spark 3.3 Release

2022-03-14 Thread Xiao Li
https://github.com/apache/spark/pull/35395 > - https://github.com/apache/spark/pull/35657 > > are actively being reviewed. It seems there are ongoing PRs for other > SPIPs as well but I'm not involved in those so not quite sure whether > they are intended for 3.3 release. > > Cha

Re: Apache Spark 3.3 Release

2022-03-14 Thread Xiao Li
Could you please list which features we want to finish before the branch cut? How long will they take? Xiao On Mon, Mar 14, 2022 at 13:30 Chao Sun wrote: > Hi Max, > > As there is still some ongoing work for the above listed SPIPs, can we > still merge them after the branch cut? > > Thanks, > Chao > > On

Re: [VOTE] SPIP: Catalog API for view metadata

2022-02-03 Thread Xiao Li
Can we extend the voting window to next Wednesday? This week is a holiday week for the lunar new year. AFAIK, many members in Asia are taking the whole week off. They might not regularly check the emails. Also how about starting a separate email thread starting with [VOTE] ? Happy Lunar New

Re: [Apache Spark Jenkins] build system shutting down Dec 23th, 2021

2021-12-06 Thread Xiao Li
Hi, Shane, Thank you for your work on it! Xiao On Mon, Dec 6, 2021 at 6:20 PM L. C. Hsieh wrote: > Thank you, Shane. > > On Mon, Dec 6, 2021 at 4:27 PM Holden Karau wrote: > > > > Shane you kick ass thank you for everything you’ve done for us :) Keep > on rocking :) > > > > On Mon, Dec 6,

Re: [FYI] Build and run tests on Java 17 for Apache Spark 3.3

2021-11-12 Thread Xiao Li
Thank you! Great job! Xiao On Fri, Nov 12, 2021 at 7:02 PM Mridul Muralidharan wrote: > > Nice job ! > There are some nice API's which should be interesting to explore with JDK > 17 :-) > > Regards. > Mridul > > On Fri, Nov 12, 2021 at 7:08 PM Yuming Wang wrote: > >> Cool, thank you

Re: [ANNOUNCE] Apache Spark 3.2.0

2021-10-19 Thread Xiao Li
Thank you, Gengliang! Congrats to our community and all the contributors! Xiao On Tue, Oct 19, 2021 at 8:26 AM Henrik Peng wrote: > Congrats and thanks! > > > On Tue, Oct 19, 2021 at 10:16 PM Gengliang Wang wrote: > >> Hi all, >> >> Apache Spark 3.2.0 is the third release of the 3.x line. With tremendous >> contribution

Re: [VOTE] Release Spark 3.2.0 (RC7)

2021-10-11 Thread Xiao Li
+1 Xiao Li On Mon, Oct 11, 2021 at 12:08 AM Yi Wu wrote: > +1 (non-binding) > > On Mon, Oct 11, 2021 at 1:57 PM Holden Karau wrote: > >> +1 >> >> On Sun, Oct 10, 2021 at 10:46 PM Wenchen Fan wrote: >> >>> +1 >>> >>> On Sat, Oct

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-31 Thread Xiao Li
Hi, Chao, How long will it take? Normally, in the RC stage, we always revert the upgrade made in the current release. We did the parquet upgrade multiple times in the previous releases to avoid a major delay in our Spark release. Thanks, Xiao On Tue, Aug 31, 2021 at 11:03 AM Chao Sun

Re: [build system] half of the jenkins workers are down

2021-08-09 Thread Xiao Li
Thank you, Shane! Xiao On Mon, Aug 9, 2021 at 1:26 PM shane knapp ☠ wrote: > turns out that minikube/k8s and friends were being oom-killed and this was > causing all sorts of weirdnesses. > > i've upped the ram limits on all of the k8s jobs to 8G (from 6G), and > we'll keep an eye on things and see how they

Re: Flaky build in GitHub Actions

2021-07-26 Thread Xiao Li
Thank you, Liang-chi and Hyukjin! On Sun, Jul 25, 2021 at 6:25 PM Hyukjin Kwon wrote: > This is fixed up via Liang-Chi's PR: > https://github.com/apache/spark/pull/33447. The issue is almost fixed now > and less flaky. > I'm still interacting w/ GitHub Actions: they are still investigating the >

Re: Apache Spark 3.2 Expectation

2021-06-16 Thread Xiao Li
> > To Liang-Chi, I'm -1 for postponing the branch cut because this is a soft > cut and the committers still are able to commit to `branch-3.3` according > to their decisions. First, I think you mean "branch-3.2". Second, the "soft cut" means no "code freeze", although we cut the branch. To

Re: [ANNOUNCE] Apache Spark 3.1.2 released

2021-06-01 Thread Xiao Li
Thank you! Xiao On Tue, Jun 1, 2021 at 9:29 PM Hyukjin Kwon wrote: > awesome! > > On Wed, Jun 2, 2021 at 9:59 AM, Dongjoon Hyun wrote: > >> We are happy to announce the availability of Spark 3.1.2! >> >> Spark 3.1.2 is a maintenance release containing stability fixes. This >> release is based on the

Re: Apache Spark 3.1.2 Release?

2021-05-17 Thread Xiao Li
+1 Thanks, Dongjoon! Xiao On Mon, May 17, 2021 at 8:45 PM Kent Yao wrote: > +1. thanks Dongjoon > > *Kent Yao * > @ Data Science Center, Hangzhou Research Institute, NetEase Corp. > *a spark enthusiast* > *kyuubi is a unified multi-tenant JDBC > interface

Re: Welcoming six new Apache Spark committers

2021-03-27 Thread Xiao Li
Congratulations, everyone! Xiao On Fri, Mar 26, 2021 at 6:30 PM Chao Sun wrote: > Congrats everyone! > > On Fri, Mar 26, 2021 at 6:23 PM Mridul Muralidharan > wrote: > >> >> Congratulations, looking forward to more exciting contributions ! >> >> Regards, >> Mridul >> >> On Fri, Mar 26, 2021 at 8:21 PM

Re: [VOTE] SPIP: Support pandas API layer on PySpark

2021-03-27 Thread Xiao Li
+1 Xiao On Fri, Mar 26, 2021 at 4:14 PM Takeshi Yamamuro wrote: > +1 (non-binding) > > On Sat, Mar 27, 2021 at 4:53 AM Liang-Chi Hsieh wrote: > >> +1 (non-binding) >> >> >> rxin wrote >> > +1. Would open up a huge persona for Spark. >> > >> > On Fri, Mar 26 2021 at 11:30 AM, Bryan Cutler < >> >> > cutlerb@

Re: Apache Spark 3.2 Expectation

2021-03-10 Thread Xiao Li
Below are some nice-to-have features we can work on in Spark 3.2: Lateral Join support, interval data type, timestamp without time zone, un-nesting arbitrary queries, the returned metrics of DSV2, and error message standardization. Spark 3.2 will

Re: Apache Spark 2.4.8 (and EOL of 2.4)

2021-03-04 Thread Xiao Li
Thank you, Liang-Chi! Xiao On Thu, Mar 4, 2021 at 6:25 PM Hyukjin Kwon wrote: > Thanks @Liang-Chi Hsieh for driving this. > > On Fri, Mar 5, 2021 at 5:21 AM, Liang-Chi Hsieh wrote: > >> >> Thanks all for the input. >> >> If there is no objection, I am going to cut the branch next Monday. >> >>

Re: Apache Spark 3.2 Expectation

2021-02-26 Thread Xiao Li
Thank you, Dongjoon, for initiating this discussion. Let us keep it open. It might take 1-2 weeks to collect from the community all the features we plan to build and ship in 3.2 since we just finished the 3.1 voting. > 3. +100 for Apache Spark 3.2.0 in July 2021. Maybe, we need `branch-cut` > in

Re: [VOTE] Release Spark 3.1.1 (RC3)

2021-02-25 Thread Xiao Li
is indeed cause for concern. >>> +1 on extending the voting deadline until we finish investigation of >>> this. >>> >>> Regards, >>> Mridul >>> >>> >>> On Wed, Feb 24, 2021 at 12:55 PM Xiao Li wrote: >>> >>>

Re: [VOTE] Release Spark 3.1.1 (RC3)

2021-02-24 Thread Xiao Li
-1 Could we extend the voting deadline? A few TPC-DS queries (q17, q18, q39a, q39b) are returning different results between Spark 3.0 and Spark 3.1. We need a few more days to understand whether these changes are expected. Xiao On Wed, Feb 24, 2021 at 10:41 AM Mridul Muralidharan wrote: > > Sounds good,

Re: Apache Spark 3.0.2 Release ?

2021-02-12 Thread Xiao Li
+1 Happy Lunar New Year! Xiao On Fri, Feb 12, 2021 at 5:33 PM Hyukjin Kwon wrote: > Yeah, +1 too > > On Sat, Feb 13, 2021 at 4:49 AM, Dongjoon Hyun wrote: > >> Thank you, Sean! >> >> On Fri, Feb 12, 2021 at 11:41 AM Sean Owen wrote: >> >>> Sounds like a fine time to me, sure. >>> >>> On Fri, Feb

Re: [VOTE] Release Spark 3.1.0 (RC1)

2021-01-07 Thread Xiao Li
> I will prepare to upload news in spark-website to explain that 3.1.0 is incompletely published because there was something wrong during the release process, and we go to 3.1.1 right away. +1 On Thu, Jan 7, 2021 at 6:44 AM Sean Owen wrote: > While we can delete the tag, maybe just leave it. As a general

Re: [build system] WE'RE LIVE!

2020-12-01 Thread Xiao Li
Thank you, Shane! Xiao On Tue, Dec 1, 2020 at 5:34 PM Dongjoon Hyun wrote: > Yay! Thanks! > > Bests, > Dongjoon > > On Tue, Dec 1, 2020 at 5:31 PM Takeshi Yamamuro > wrote: > >> Many thanks, guys! >> I've checked I can re-trigger Jenkins tests. >> >> Bests, >> Takeshi >> >> On Wed, Dec 2,

Re: Seeking committers' help to review on SS PR

2020-11-30 Thread Xiao Li
Just want to say thank you to all the active SS contributors. I saw many great features/improvements in Streaming have been merged and will be available in the upcoming 3.1 release. - Cache fetched list of files beyond maxFilesPerTrigger as unread file (SPARK-32568) - Streamline the

Re: jenkins downtime tomorrow evening/weekend

2020-11-23 Thread Xiao Li
Thank you, Shane! On Mon, Nov 23, 2020 at 2:12 PM shane knapp ☠ wrote: > the third most terrifying event in the world, a massive jenkins plugin > update is happening in a couple of hours. i'm going to restart jenkins and > start working out any bugs/issues that pop up. > > this could be short,

Re: Spark 3.1 branch cut 4th Dec?

2020-11-20 Thread Xiao Li
L) has been decided, it's just a matter of normal >> review comments. >> >> On Fri, Nov 20, 2020 at 9:05 AM Dongjoon Hyun >> wrote: >> >>> Thank you for sharing, Xiao. >>> >>> I hope we are able to make some agreement for CREATE TABLE DDLs, t

Re: Spark 3.1 branch cut 4th Dec?

2020-11-20 Thread Xiao Li
our best to address it in 3.1. Thanks, Xiao On Fri, Nov 20, 2020 at 8:52 AM Xiao Li wrote: > Hi, Dongjoon, > > Thank you for your feedback. I think *Early December* does not mean we > will cut the branch on Dec 1st. I do not think Dec 1st and Dec 4th are a > big deal. Normally, it would

Re: Spark 3.1 branch cut 4th Dec?

2020-11-20 Thread Xiao Li
hat > features you are waiting for now. > > We are creating Apache Spark together. > > Bests, > Dongjoon. > > > On Thu, Nov 19, 2020 at 11:38 PM Xiao Li wrote: > >> Correction: >> >> Merging the feature work after the branch cut should not be encouraged

Re: Spark 3.1 branch cut 4th Dec?

2020-11-19 Thread Xiao Li
have two weeks ahead of the proposed branch cut date. I hope each feature owner might hurry up and try to finish it before the branch cut. Xiao On Thu, Nov 19, 2020 at 11:36 PM Xiao Li wrote: > We should try to merge the feature work after the branch cut. This should > not be encouraged in general, al

Re: Spark 3.1 branch cut 4th Dec?

2020-11-19 Thread Xiao Li
We should try to merge the feature work after the branch cut. This should not be encouraged in general, although some committers did make some exceptions based on their own judgement. This email is a good reminder message. At least, we have two weeks ahead of the proposed branch cut date. I hope

Re: [VOTE] Standardize Spark Exception Messages SPIP

2020-11-06 Thread Xiao Li
+1 On Fri, Nov 6, 2020 at 6:23 AM Gengliang Wang wrote: > +1 > > On Nov 6, 2020, at 1:52 PM, Wenchen Fan wrote: > > +1 > > On Fri, Nov 6, 2020 at 12:56 PM kalyan wrote: > >> +1 >> >> On Fri, Nov 6, 2020, 5:58 AM Matei Zaharia >> wrote: >> >>> +1 >>> >>> Matei >>> >>> > On Nov 5, 2020, at

Re: I'm going to be out starting Nov 5th

2020-11-01 Thread Xiao Li
Take care, Holden! Bests, Xiao On Sat, Oct 31, 2020 at 9:53 PM 郑瑞峰 wrote: > Take care, Holden! Best wishes! > > > -- Original Message -- > *From:* "Hyukjin Kwon" ; > *Sent:* Sunday, November 1, 2020, 10:24 AM > *To:* "Denny Lee"; > *Cc:* "Dongjoon Hyun";"Holden Karau"< >

Re: [DISCUSS][SPIP] Standardize Spark Exception Messages

2020-10-29 Thread Xiao Li
+1 This is a great proposal to improve the usability of Spark. Make Spark simple to use! Xiao On Tue, Oct 27, 2020 at 8:25 PM Xinyi Yu wrote: > Hi Chang, > > It is a script that directly analyzes the source code, searching for raw > "throw new" exceptions. :) Hope that gives an intuitive overview of current

Re: [build system] jenkins wedged again

2020-10-14 Thread Xiao Li
Thank you, Shane! Xiao On Wed, Oct 14, 2020 at 12:00 PM shane knapp ☠ wrote: > we're mostly back up, and just waiting for a couple of ubuntu boxes to > finish booting... prb seem to be building now! > > On Wed, Oct 14, 2020 at 11:48 AM shane knapp ☠ > wrote: > >> i'm going to reboot the

Re: [UPDATE] Apache Spark 3.1.0 Release Window

2020-10-12 Thread Xiao Li
Thank you, Dongjoon Xiao On Mon, Oct 12, 2020 at 4:19 PM Dongjoon Hyun wrote: > Hi, All. > > Apache Spark 3.1.0 Release Window is adjusted like the following today. > Please check the latest information on the official website. > > - >

Re: Apache Spark 3.1 Preparation Status (Oct. 2020)

2020-10-04 Thread Xiao Li
ing > to the changed release cadence the code freeze should happen in > mid-November. > > On Sun, Oct 4, 2020 at 6:26 PM Xiao Li wrote: > >> Apache Spark 3.1.0 should be compared with Apache Spark 2.1.0. >> >> >> I think we made a change in release cadence si

Re: Apache Spark 3.1 Preparation Status (Oct. 2020)

2020-10-04 Thread Xiao Li
e: >> >>> >>> +1 on pushing the branch cut for increased dev time to match previous >>> releases. >>> >>> Regards, >>> Mridul >>> >>> On Sat, Oct 3, 2020 at 10:22 PM Xiao Li wrote: >>> >>>>

Re: Apache Spark 3.1 Preparation Status (Oct. 2020)

2020-10-03 Thread Xiao Li
Thank you for your updates. Spark 3.0 got released on Jun 18, 2020. If Nov 1st is the target date of the 3.1 branch cut, the feature development time window is less than 5 months. This is shorter than what we did in Spark 2.3 and 2.4 releases. Below are three highly desirable feature work I am
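The "less than 5 months" figure in the message above can be checked with a short calculation. This is only an illustrative sketch: the two dates come from the message, while the year of the proposed cut (2020) and the average month length of 30.44 days are assumptions added here.

```python
from datetime import date

# Dates taken from the message; the year of the proposed cut is assumed to be 2020.
release_3_0 = date(2020, 6, 18)   # Spark 3.0 release date
proposed_cut = date(2020, 11, 1)  # proposed 3.1 branch cut

# Length of the feature development window in days.
window_days = (proposed_cut - release_3_0).days

# Convert to months using an average month length (an assumption, not from the thread).
window_months = window_days / 30.44

print(window_days, round(window_months, 1))
```

Under these assumptions the window is about four and a half months, which is consistent with the "less than 5 months" statement.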

Re: [VOTE][SPARK-30602] SPIP: Support push-based shuffle to improve shuffle efficiency

2020-09-14 Thread Xiao Li
+1 Xiao On Mon, Sep 14, 2020 at 4:09 PM DB Tsai wrote: > +1 > > On Mon, Sep 14, 2020 at 12:30 PM Chandni Singh wrote: > >> +1 >> >> Chandni >> >> On Mon, Sep 14, 2020 at 11:41 AM Tom Graves >> wrote: >> >>> +1 >>> >>> Tom >>> >>> On Sunday, September 13, 2020, 10:00:05 PM CDT, Mridul Muralidharan < >>>

Re: [VOTE] Release Spark 3.0.1 (RC3)

2020-09-01 Thread Xiao Li
Want to change my vote to 0, because we are unable to produce an end-user query to hit this bug. Xiao On Mon, Aug 31, 2020 at 12:41 PM Xiao Li wrote: > -1 due to a regression introduced by a fix in 3.0.1. > > See https://github.com/apache/spark/pull/29602 > > Xiao > > On M

Re: [VOTE] Release Spark 3.0.1 (RC3)

2020-08-31 Thread Xiao Li
-1 due to a regression introduced by a fix in 3.0.1. See https://github.com/apache/spark/pull/29602 Xiao On Mon, Aug 31, 2020 at 9:26 AM Tom Graves wrote: > +1 > > Tom > > On Friday, August 28, 2020, 09:02:31 AM CDT, 郑瑞峰 > wrote: > > > Please vote on releasing the following candidate as

Re: pip/conda distribution headless mode

2020-08-30 Thread Xiao Li
Hi, Georg, This is being tracked by https://issues.apache.org/jira/browse/SPARK-32017 You can leave comments in the JIRA. Thanks, Xiao On Sun, Aug 30, 2020 at 3:06 PM Georg Heiler wrote: > Hi, > > I want to use pyspark as distributed via conda in headless mode. > It looks like the hadoop

Re: [VOTE] Release Spark 2.4.7 (RC1)

2020-08-17 Thread Xiao Li
https://issues.apache.org/jira/browse/SPARK-32609 got merged. This is to fix a correctness bug in DSV2 of Spark 2.4. Please include it in the upcoming Spark 2.4.7 release. Thanks, Xiao On Sun, Aug 9, 2020 at 10:26 PM Prashant Sharma wrote: > Thanks for letting us know. So this vote is

Re: [SparkSql] Casting of Predicate Literals

2020-08-04 Thread Xiao Li
Hi, Russell, You might hit the other cases in which CAST blocks the predicate pushdown. If the Cast was added by users and it changes the actual type, we are unable to optimize it automatically because it could change the query correctness. If it was added by our type coercion rules
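The point above about a user-written CAST changing query correctness can be illustrated without Spark. The sketch below is plain Python, not Spark code: the row values and the helper names `with_cast` and `pushed_down` are hypothetical, and it assumes a cast from a floating-point column to an integer truncates toward zero.

```python
# Hypothetical values of a floating-point column.
rows = [0.5, 1.0, 1.5, 2.0]

def with_cast(values, literal):
    # Models: WHERE CAST(col AS INT) = literal
    # int() truncates toward zero (the assumed cast behavior).
    return [v for v in values if int(v) == literal]

def pushed_down(values, literal):
    # Models naively dropping the cast and pushing
    # WHERE col = literal down to the data source.
    return [v for v in values if v == literal]

cast_result = with_cast(rows, 1)
pushdown_result = pushed_down(rows, 1)

print(cast_result)      # rows kept when the cast is evaluated
print(pushdown_result)  # rows kept if the cast were silently removed
```

The two filters keep different rows (1.5 matches only with the cast), which is why a cast that changes the actual type cannot simply be removed in order to push the predicate down.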

Re: [VOTE] Update the committer guidelines to clarify when to commit changes.

2020-07-31 Thread Xiao Li
+1 Xiao On Fri, Jul 31, 2020 at 9:32 AM Mridul Muralidharan wrote: > > +1 > > Thanks, > Mridul > > On Thu, Jul 30, 2020 at 4:49 PM Holden Karau wrote: > >> Hi Spark Developers, >> >> After the discussion of the proposal to amend Spark committer guidelines, >> it appears folks are generally in

Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Xiao Li
Welcome, Dilip, Huaxin and Jungtaek! Xiao On Tue, Jul 14, 2020 at 11:02 AM Holden Karau wrote: > So excited to have our committer pool growing with these awesome folks, > welcome y'all! > > On Tue, Jul 14, 2020 at 10:59 AM Driesprong, Fokko > wrote: > >> Welcome! >> >> Op di 14 jul. 2020 om

Re: restarting jenkins build system tomorrow (7/8) ~930am PDT

2020-07-13 Thread Xiao Li
Thank you very much, Shane! Xiao On Mon, Jul 13, 2020 at 10:15 AM shane knapp ☠ wrote: > alright, the system load graphs show that we've had a generally decreasing > load since friday, and have burned through ~3k builds/day since the reboot > last week! i don't see many timeouts, and the PRB

Re: [DISCUSS] Apache Spark 3.0.1 Release

2020-07-01 Thread Xiao Li
+1 on releasing both 3.0.1 and 2.4.7 Great! Three committers volunteer to be a release manager. Ruifeng, Prashant and Holden. Holden just helped release Spark 2.4.6. This time, maybe, Ruifeng and Prashant can be the release manager of 3.0.1 and 2.4.7 respectively. Xiao On Wed, Jul 1, 2020 at

Re: Use Hadoop-3.2 as a default Hadoop profile in 3.0.0?

2020-06-24 Thread Xiao Li
large yet; it's been around far >>> less time. >>> >>> The bigger question indeed is dropping Hadoop 2.x / Hive 1.x etc >>> eventually, not now. >>> But if the question now is build defaults, is it a big deal either way? >>> >>> On Tue, Jun 2

Re: Use Hadoop-3.2 as a default Hadoop profile in 3.0.0?

2020-06-23 Thread Xiao Li
> As a side note, Homebrew is not an official Apache Spark channel, but it's > also a popular distribution channel in the community. And, it's using the Hadoop > 3.2 distribution already. Hadoop 2.7 is too old for Year 2021 (Apache Spark > 3.1), isn't it? > > Bests, > Dongjoon. >

Re: Use Hadoop-3.2 as a default Hadoop profile in 3.0.0?

2020-06-23 Thread Xiao Li
ongjoon. > > > > > On Tue, Jun 23, 2020 at 12:09 AM Xiao Li wrote: > >> >> Our monthly pypi downloads of PySpark have reached 5.4 million. We should >> avoid forcing the current PySpark users to upgrade their Hadoop versions. >> If we change the default, w

Re: Use Hadoop-3.2 as a default Hadoop profile in 3.0.0?

2020-06-23 Thread Xiao Li
There exists some recent discussion on the following PR. Please let us > know your thoughts. > > https://github.com/apache/spark/pull/28897 > > > Bests, > Dongjoon. > > > On Fri, Nov 1, 2019 at 9:41 AM Xiao Li wrote: > >> Hi, Steve, >> >> Tha

Re: Revisiting the idea of a Spark 2.5 transitional release

2020-06-12 Thread Xiao Li
Based on my understanding, DSV2 is not stable yet. It still misses various features. Even our built-in file sources are still unable to fully migrate to DSV2. We plan to enhance it in the next few releases to close the gap. Also, the changes on DSV2 in Spark 3.0 did not break any existing

Re: Revisiting the idea of a Spark 2.5 transitional release

2020-06-12 Thread Xiao Li
Which new functionalities are you referring to? In Spark SQL, most of the major features in Spark 3.0 are difficult/time-consuming to backport. For example, adaptive query execution. Releasing a new version is not hard, but backporting/reviewing/maintaining these features is very time-consuming.

Re: [vote] Apache Spark 3.0 RC3

2020-06-09 Thread Xiao Li
+1 (binding) Xiao On Mon, Jun 8, 2020 at 10:13 PM Xingbo Jiang wrote: > +1(non-binding) > > On Mon, Jun 8, 2020 at 9:50 PM Jiaxin Shan wrote: > >> +1 >> I built the binary using the following command and tested Spark workloads on >> Kubernetes (AWS EKS), and it's working well. >> >> ./dev/make-distribution.sh --name

Re: [VOTE] Release Spark 2.4.6 (RC8)

2020-06-03 Thread Xiao Li
3.0 and there was a decision not to backport > the fix: SPARK-31170 <https://issues.apache.org/jira/browse/SPARK-31170> > > On Wed, Jun 3, 2020 at 1:04 PM Xiao Li wrote: > >> Just downloaded it in my local macbook. Trying to create a table using >> the pre-built

Re: [VOTE] Release Spark 2.4.6 (RC8)

2020-06-03 Thread Xiao Li
Just downloaded it on my local MacBook. Trying to create a table using the pre-built PySpark. It sounds like the conf "spark.sql.warehouse.dir" does not take effect. It is trying to create a directory in "file:/user/hive/warehouse/t1". I have not done any investigation yet. Have any of you hit
