Re: [VOTE] Release Apache Celeborn 0.4.2-rc1

2024-07-25 Thread Keyong Zhou
-bin.tgz.sha512 sha512sum --check apache-celeborn-0.4.2-source.tgz.sha512 ``` - LICENSE looks good. - NOTICE looks good. - build success from source code (macOS). ``` ./build/make-distribution.sh --sbt-enabled --release ``` Regards, Keyong Zhou Fu Chen 于2024年7月22日周一 18:14写道: > Hi Celeborn commun
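
For readers who want to reproduce the checksum step without coreutils, here is a minimal Python sketch of what `sha512sum --check` does. It assumes the `.sha512` file uses the usual `<hex digest>  <file name>` layout, and it resolves file names relative to the checksum file's directory (the real tool resolves them relative to the current working directory). The function names `sha512_of` and `check_sha512_file` are illustrative and not part of any Celeborn tooling.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_sha512_file(checksum_file: Path) -> dict[str, bool]:
    """Verify every '<hex digest>  <file name>' line in a .sha512 file.

    Assumption: file names are resolved relative to the directory that
    contains the checksum file.
    """
    results = {}
    for line in checksum_file.read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split(maxsplit=1)
        # A leading '*' marks binary mode in coreutils-style output.
        name = name.strip().removeprefix("*")
        actual = sha512_of(checksum_file.parent / name)
        results[name] = actual == expected.lower()
    return results
```

Calling `check_sha512_file(Path("apache-celeborn-0.4.2-source.tgz.sha512"))` would return a mapping from each listed file name to whether its digest matched.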

Re: [ANNOUNCE] New Celeborn PMC Member: Nicholas Jiang

2024-07-23 Thread Keyong Zhou
Congrats! Regards, Keyong Zhou angers zhu 于2024年7月23日周二 18:09写道: > Congrats! > > > Thanks > Angerszh > > Cheng Pan 于2024年7月23日周二 18:01写道: > > > Congrats! > > > > Thanks, > > Cheng Pan > > > > On Tue, Jul 23, 2024 a

Re: [ANNOUNCE] New Celeborn Committer: Fei Wang

2024-07-22 Thread Keyong Zhou
Congratulations! Regards, Keyong Zhou angers zhu 于2024年7月23日周二 12:07写道: > Congratulations! > > Shaoyun Chen 于2024年7月23日周二 11:15写道: > > > Congratulations! > > > > Cheng Pan 于2024年7月23日周二 11:05写道: > > > > > > Hi Celeborn Community, > > >

Re: [VOTE] Release Apache Celeborn 0.4.2-rc0

2024-07-20 Thread Keyong Zhou
Hi Fu, I wonder why change all ${project.version} to 0.4.2 in [1] instead of just change the definition of to 0.4.2 like [2] does? Regards, Keyong Zhou Fu Chen 于2024年7月17日周三 00:23写道: > Hi Celeborn community, > > This is a call for a vote to release Apache Celeborn 0.4.2-rc0 > >

Re: [DISCUSS] CIP-10: Introduce Celeborn Chaos Testing Framework

2024-07-14 Thread Keyong Zhou
Thanks for the proposal! The chaos framework is very useful for Celeborn. There are two points I think are important: 1. We need to add a correctness check to the framework; correctness is the most important thing. 2. The framework should not intrude into the common code. Regards, Keyong Zhou

Re: Jira version update

2024-07-04 Thread Keyong Zhou
Thanks Mridul for pointing this out, I just modified 0.5.0 as released :) Regards, Keyong Zhou Mridul Muralidharan 于2024年7月5日周五 03:13写道: > Hi, > > While updating an issue manually, I noticed that 0.5.0 is still mentioned > as an unreleased version in jira. > Given 0.5 rel

Re: [VOTE] CIP-9: Celeborn RESTful API Refine

2024-07-02 Thread Keyong Zhou
+1 Regards, Keyong Zhou Fei Wang 于2024年7月3日周三 02:07写道: > Hi all, > > Thanks for all the feedback about the CIP-9 Celeborn RESTful API Refine > [1]. > The discussion thread is here [2]. > > I'd like to start a vote for it. The vote will be open for at le

Re: Re:[DISCUSS] Celeborn RESTful API Refine Proposal

2024-07-01 Thread Keyong Zhou
and starts a vote on it, like[2][3]. Regards, Keyong Zhou [1] https://cwiki.apache.org/confluence/display/CELEBORN/CIP-7+Celeborn+CLI [2] https://lists.apache.org/thread/xjh8z2kszq0kwj5bdz2bh3b1sotv593p [3] https://lists.apache.org/thread/bx58h25poypq0znolkb8vlhop4bw1x81 Fei Wang 于2024年7月2日周二 04:53写

Re: [VOTE] Release Apache Celeborn 0.5.0-rc3

2024-06-22 Thread Keyong Zhou
-bin.tgz.sha512 sha512sum --check apache-celeborn-0.5.0-source.tgz.sha512 ``` - LICENSE looks good. - NOTICE looks good. - build success from source code (macOS). ``` ./build/make-distribution.sh --sbt-enabled --release ``` BTW, thanks for the rich tests! Regards, Keyong Zhou Ethan Feng 于2024年6月19日周三 12:47写道

Re: Re:Re:[VOTE] CIP-6: Support Flink hybrid shuffle integration with Apache Celeborn

2024-06-15 Thread Keyong Zhou
+1 (binding) Regards, Keyong Zhou Ethan Feng 于2024年6月14日周五 16:10写道: > +1(binding) > > I think this CIP would bring performance benefits to Flink users. > > Thanks, > Ethan > > Nicholas 于2024年6月14日周五 14:50写道: > > > > +1(non-binding).

Re: Re: [VOTE] Contribute Apache Celeborn CLI

2024-06-11 Thread Keyong Zhou
+1 Thanks for the proposal! Regards, Keyong Zhou Nicholas Jiang 于2024年6月12日周三 13:02写道: > +1. Looking forward to Celeborn CLI. > > > > > Regards, > > Nicholas Jiang > > > At 2024-06-12 12:26:34, "Aravind Patnam" wrote: > >Hi all, > > >

Re: [DISCUSSION] CIP-6: Support Flink hybrid shuffle integration with Apache Celeborn

2024-06-07 Thread Keyong Zhou
Shuffle with Celeborn's Reduce Partition as mentioned in the doc in the future, which I believe will benefit more for very large shuffle operators :) Regards, Keyong Zhou Nicholas Jiang 于2024年6月6日周四 13:25写道: > Hi Yuxin, > > Thanks for driving this CIP about integration with Hybrid Shuffl

Re: [DISCUSS] Celeborn CLI Proposal

2024-06-07 Thread Keyong Zhou
Hi Aravind, Thanks for the proposal! The proposal LGTM, I think it's very valuable. Regards, Keyong Zhou Aravind Patnam 于2024年6月7日周五 12:47写道: > Hi, > > Thanks Nicholas for the comments! > > I now got access to put the proposal in Confluence in the form of CIP, here > <htt

Re: [VOTE] Release Apache Celeborn 0.5.0-rc0

2024-06-07 Thread Keyong Zhou
is deleted, so it should also be removed from the LICENSE-binary file Regards, Keyong Zhou Ethan Feng 于2024年6月7日周五 15:56写道: > Hello, Celeborn community, > > This is a call for a vote to release Apache Celeborn > 0.5.0-rc0 > > The git tag to be voted upon: > https://githu

Re: [DRAFT] Celeborn Board Report

2024-06-07 Thread keyong zhou
Thanks Yu for reviewing, I'm going to submit the report today. Regards, Keyong Zhou Yu Li 于2024年6月5日周三 12:53写道: > +1. Thanks for compiling this, Keyong! > > Best Regards, > Yu > > On Tue, 4 Jun 2024 at 09:42, Keyong Zhou wrote: > > > > Hi community, > >

[DRAFT] Celeborn Board Report

2024-06-03 Thread Keyong Zhou
extensive outreach for our users, and encouraging them to contribute back to the project. Also, we are active in making a voice in various conferences to attract more users. Regards, Keyong Zhou

Re: [Discussion] Proposal Management in Celeborn Community

2024-05-29 Thread Keyong Zhou
+1 for me. About the comments by Cheng, IMHO discussing in maillist is also acceptable (and even better) Regards, Keyong Zhou Cheng Pan 于2024年5月29日周三 14:32写道: > +1 for archiving proposals on confluence. > > Does Confluence support inline comments like Google Docs does? I thi

Re: [DISCUSS] Time for 0.5.0

2024-05-24 Thread Keyong Zhou
+1 for releasing 0.5.0. But I think memory file storage is still experimental. Regards, Keyong Zhou Ethan Feng 于2024年5月24日周五 18:15写道: > Hello, Celeborn community, > > It has been 4 months since we released the last major version. Some > new features, such as SSL support and memory

Re: [VOTE] Release Apache Celeborn 0.4.1-rc1

2024-05-21 Thread Keyong Zhou
pure shuffle, 0.4.0 vs. 0.4.1: 10.3min vs. 10.3min Regards, Keyong Zhou Cheng Pan 于2024年5月21日周二 14:16写道: > +1 > > I have rolled out this version to a small cluster for several days, > everything goes well so far. > > I checked the > org.apache.celeborn:celeborn-client-spar

Re: Voice Of Apache interview request

2024-05-15 Thread Keyong Zhou
I'm in China (GMT+8), 12 hours ahead of your zone. 9:00-10:00 a.m. or 9:00-10:00 p.m. works for me. BTW, could you send the questions ahead of time so that I can prepare for them? Regards, Keyong Zhou Bowen, Rich 于2024年5月15日周三 23:34写道: > I do interviews over Google Meet, so that I

Re: Re: [VOTE] Release Apache Celeborn 0.4.1-rc0

2024-05-08 Thread Keyong Zhou
Thanks Nicholas for volunteering for the test! Regards, Keyong Zhou Nicholas Jiang 于2024年5月8日周三 17:34写道: > Hi Keyong, > > > > > If no one takes the third test, I perhaps take random killing test via > chaos testing framework of celeborn in internal testing environm

Re: [DRAFT] Celeborn Board Report

2024-05-07 Thread Keyong Zhou
Thanks for all the feedback! I will submit the report today. Regards, Keyong Zhou Yu Li 于2024年5月8日周三 08:17写道: > LGTM. Thanks for the efforts, Keyong. > > Best Regards, > Yu > > > On Tue, May 7, 2024 at 9:30 AM Keyong Zhou wrote: > > > Thanks Yu for your comm

Re: [VOTE] Release Apache Celeborn 0.4.1-rc0

2024-05-07 Thread Keyong Zhou
Hi Nicholas, Thanks for the work! I think we need to test the following scenarios before publishing: 1. compatibility test 2. perf test: i.e. TPCDS, pure shuffle workload 3. random killing test I'll take the perf test, anyone take the other two? Regards, Keyong Zhou Nicholas Jiang 于2024年5月7日

Re: [DRAFT] Celeborn Board Report

2024-05-06 Thread Keyong Zhou
or making decisions. > > Best Regards, > Yu > > [1] > https://community.apache.org/newbiefaq.html#NewbieFAQ-IsthereaCodeofConductforApacheprojects > > On Sat, 4 May 2024 at 20:31, Mridul Muralidharan wrote: > > > > Ah ! Then it makes sense to not include it :-)

Re: [DRAFT] Celeborn Board Report

2024-05-04 Thread Keyong Zhou
Actually it's the second one. For the first one I didn't send the draft to dev maillist for discussion because of lack of experience... Regards, Keyong Zhou Mridul Muralidharan 于2024年5月3日周五 23:38写道: > Hi, > > I meant call it out as part of the board report, so that it is captured

Re: Voice Of Apache interview request

2024-05-03 Thread Keyong Zhou
Hi Rich, Thanks for reaching out! I'd like to be the volunteer. So, what do I need to do? Regards, Keyong Zhou Rich Bowen 于2024年4月30日周二 22:24写道: > Congratulations on graduating and becoming a top level project at the > Apache Software Foundation. > > As you may know, I produce a p

Re: [DRAFT] Celeborn Board Report

2024-05-03 Thread Keyong Zhou
<https://mp.weixin.qq.com/s/DdoJW-f3BZAvxciDbI3mTw> It'll be great if we can call out louder, do you have any idea? : ) Regards, Keyong Zhou Mridul Muralidharan 于2024年5月3日周五 07:40写道: > Hi, > > Do we want to call out graduation to TLP ? > > Regards, > Mridul > >

[DRAFT] Celeborn Board Report

2024-05-02 Thread Keyong Zhou
. Also, we are active in making a voice in various conferences to attract more users. Regards, Keyong Zhou

[ANNOUNCE] Add Mridul Muralidharan as new committer

2024-04-28 Thread Keyong Zhou
submission process. This should enable better productivity. A PMC member helps manage and guide the direction of the project. Please join me in congratulating Mridul Muralidharan! Regards, Keyong Zhou

Re: [DISCUSS] Time for 0.4.1

2024-04-12 Thread Keyong Zhou
+1, thanks Nicholas for volunteering! Regards, Keyong Zhou Shaoyun Chen 于2024年4月12日周五 22:03写道: > +1 > > Cheng Pan 于2024年4月12日周五 20:04写道: > > > > +1, we do need a patch release for 0.4 > > > > Thanks, > > Cheng Pan > > > > >

[ANNOUNCE] Add Chandni Singh as new committer

2024-03-21 Thread Keyong Zhou
submission process. This should enable better productivity. A (P)PMC member helps manage and guide the direction of the project. Please join me in congratulating Chandni Singh! Thanks, Keyong Zhou

Re: [VOTE] Graduate Apache Celeborn (incubating) as a TLP - Community

2024-03-01 Thread keyong zhou
+1 Regards, Keyong Zhou Mridul Muralidharan 于2024年3月1日周五 19:58写道: > +1 > > Regards, > Mridul > > > On Fri, Mar 1, 2024 at 4:35 AM Nicholas wrote: > > > > > +1. > > > > > > Regards, > > Nicholas Jiang > > > >

Re: [DISCUSS] Graduate Celeborn as TLP

2024-02-27 Thread Keyong Zhou
Thanks Willem for the information. As Cheng said, we have not started the registration process :) Best, Keyong Zhou Willem Jiang 于2024年2月27日周二 18:48写道: > It's OK if we don't register any trademark of Celeborn. > If we already registered the trademark of Celeborn, we need t

Re: [DISCUSS] Graduate Celeborn as TLP

2024-02-27 Thread Keyong Zhou
Thanks @Gabriel, hope Celeborn can be useful in your environment someday :) Best, Keyong Zhou Gabriel Lee 于2024年2月27日周二 11:43写道: > Hi Yu, > > Very glad to witness Celeborn's growth. Now Celeborn has already become a > leading and mature shuffle service project after a yea

[ANNOUNCE] New PPMC member: Fu Chen

2024-02-19 Thread Keyong Zhou
Hi Celeborn Community, The Podling Project Management Committee (PPMC) for Apache Celeborn has invited Fu Chen to become our PPMC member and we are pleased to announce that he has accepted. Fu Chen has been actively contributing to the Celeborn community for more than one year[1], including SBT

Re: Large number of incubator-celeb...@noreply.github.com emails

2024-02-06 Thread Keyong Zhou
hy I asked is what Nicholas clarified about - saw a > nontrivial number of github issue related mails, and was not sure if we > were moving to using that ! > > Thanks, > Mridul > > > On Wed, Feb 7, 2024 at 12:52 AM Keyong Zhou wrote: > > > Hi Mridul, > > > >

Re: Large number of incubator-celeb...@noreply.github.com emails

2024-02-06 Thread Keyong Zhou
. To me, I'm actually fine with both. JIRA works well so far, will using Github be more beneficial? Glad to hear about your opinion. Thanks, Keyong Zhou Mridul Muralidharan 于2024年2月7日周三 14:03写道: > Looks like I am wrong, github issues can be used [1]. > Is Celeborn planning to use github

Re: [VOTE] Release Apache Celeborn(Incubating) 0.4.0-incubating-rc6

2024-01-31 Thread Keyong Zhou
bt-enabled --release ``` Thanks, Keyong Zhou Fu Chen 于2024年1月29日周一 21:46写道: > Hi Celeborn community, > > This is a call for a vote to release Apache Celeborn (Incubating) > 0.4.0-incubating-rc6 > > > The git tag to be voted upon: > > https://github.com/apache/incubator-celebo

Re: [VOTE] Release Apache Celeborn(Incubating) 0.4.0-incubating-rc4

2024-01-18 Thread Keyong Zhou
bt-enabled --release ``` Thanks, Keyong Zhou Fu Chen 于2024年1月18日周四 21:40写道: > Hi Celeborn community, > > This is a call for a vote to release Apache Celeborn (Incubating) > 0.4.0-incubating-rc4 > > > The git tag to be voted upon: > > https://github.com/apache/incubator-celebo

[ANNOUNCE] Add Xiaofeng Jiang as new committer

2024-01-11 Thread Keyong Zhou
submission process. This should enable better productivity. A (P)PMC member helps manage and guide the direction of the project. Please join me in congratulating Xiaofeng Jiang! Thanks, Keyong Zhou

Re: [VOTE] Release Apache Celeborn(Incubating) 0.4.0-incubating-rc3

2024-01-01 Thread Keyong Zhou
--release ``` Thanks, Keyong Zhou Fu Chen 于2024年1月1日周一 19:42写道: > Hi Celeborn community, > > This is a call for a vote to release Apache Celeborn (Incubating) > 0.4.0-incubating-rc3 > > > The git tag to be voted upon: > > https://github.com/apache/incubator-celebo

Re: [VOTE] Release Apache Celeborn(Incubating) 0.3.2-incubating-rc2

2023-12-31 Thread Keyong Zhou
--release ``` Thanks, Keyong Zhou Keyong Zhou 于2024年1月1日周一 09:44写道: > I checked > - git commit hash is correct. > - links are valid. > - "incubating" is in the name. > - PGP keys are good. > - hashes are correct. > - LICENSE looks good. > - NOTICE looks good. >

Re: [VOTE] Release Apache Celeborn(Incubating) 0.3.2-incubating-rc2

2023-12-31 Thread Keyong Zhou
`` Thanks, Keyong Zhou Nicholas Jiang 于2023年12月29日周五 19:50写道: > Hi Celeborn community, > > This is a call for a vote to release Apache Celeborn (Incubating) > > 0.3.2-incubating-rc2 > > > The git tag to be voted upon: > > > https://github.com/apache/incubator-

Re: [VOTE] Release Apache Celeborn(Incubating) 0.4.0-incubating-rc0

2023-12-21 Thread Keyong Zhou
--release ``` Thanks, Keyong Zhou Fu Chen 于2023年12月21日周四 21:41写道: > Hi Celeborn community, > > This is a call for a vote to release Apache Celeborn (Incubating) > 0.4.0-incubating-rc0 > > > The git tag to be voted upon: > > https://github.com/apache/incubator-celebo

Re: [VOTE] Release Apache Celeborn(Incubating) 0.3.2-incubating-rc1

2023-12-21 Thread Keyong Zhou
--release ``` Thanks, Keyong Zhou Nicholas Jiang 于2023年12月21日周四 14:06写道: > Hi Celeborn community, > > > This is a call for a vote to release Apache Celeborn (Incubating) > 0.3.2-incubating-rc1 > > > The git tag to be voted upon: > > https://github.com/apache/incubator

Re: [DISCUSS] Time for 0.4.0

2023-12-13 Thread keyong zhou
+1, thanks Fu! Fu Chen 于2023年12月14日周四 10:26写道: > Hi, Celeborn community, > > It has been a while since the 0.3.0 release, and I think it’s time to > prepare > for the next feature release 0.4.0. And I’m volunteering to be the release > manager if no others were applied. > > If no objections, I

Re: [DISCUSS] Time for 0.3.2

2023-12-07 Thread Keyong Zhou
+1 on 0.3.2, thanks Nicholas for volunteering! angers zhu 于2023年12月7日周四 17:00写道: > +1 on 0.3.2 > > Yihe Li 于2023年12月7日周四 16:41写道: > > > +1, thanks Nicholas! > > > > On 2023/12/07 07:43:09 Shaoyun Chen wrote: > > > +1 thanks Nicholas. > > > > > > Mridul Muralidharan 于2023年12月7日周四 15:03写道: > >

[ANNOUNCE] Add Yihe Li as new committer

2023-11-16 Thread Keyong Zhou
submission process. This should enable better productivity. A (P)PMC member helps manage and guide the direction of the project. Please join me in congratulating Yihe Li! Thanks, Keyong Zhou

Re: [ANNOUNCE] New Committer: Shaoyun Chen

2023-11-07 Thread Keyong Zhou
Congrats to Shaoyun Chen! Cheng Pan 于2023年11月7日周二 19:12写道: > Hi Celeborn Community, > > The Podling Project Management Committee (PPMC) for Apache Celeborn > has invited Shaoyun Chen to become a committer and we are pleased > to announce that he has accepted. > > Being a committer enables

Re: [PROPOSAL] Spark stage resubmission for shuffle fetch failure

2023-11-03 Thread Keyong Zhou
I checked RDD#getOutputDeterministicLevel and found that if an RDD's upstream is INDETERMINATE, then it's also INDETERMINATE. Thanks, Keyong Zhou Keyong Zhou 于2023年11月3日周五 19:57写道: > Hi Mridul, > > I still have a question. DAGScheduler#submitMissingTasks wi
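
The propagation rule described in this reply can be illustrated with a small toy model. This is only a sketch of the idea that an INDETERMINATE upstream makes everything downstream INDETERMINATE. It is not Spark's implementation: Spark's real method has additional rules (for example around shuffle dependencies) that are omitted here, and the `Node` class and `output_level` function are hypothetical names for illustration.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Level(IntEnum):
    # Ordered from best to worst so that max() picks the worst level.
    DETERMINATE = 0
    UNORDERED = 1
    INDETERMINATE = 2


@dataclass
class Node:
    """Hypothetical stand-in for an RDD in a lineage graph."""
    name: str
    own_level: Level = Level.DETERMINATE
    parents: list["Node"] = field(default_factory=list)


def output_level(node: Node) -> Level:
    """Worst level among the node itself and all of its ancestors."""
    return max([node.own_level, *(output_level(p) for p in node.parents)])


# A determinate map and reduce downstream of an indeterminate source.
source = Node("source", Level.INDETERMINATE)
mapped = Node("map", parents=[source])
reduced = Node("reduce", parents=[mapped])
print(output_level(reduced).name)
```

Under this model, a stage that is determinate on its own still reports INDETERMINATE when any ancestor does, which is the situation raised in the question quoted above.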

Re: [PROPOSAL] Spark stage resubmission for shuffle fetch failure

2023-11-03 Thread Keyong Zhou
Hi Mridul, I still have a question. DAGScheduler#submitMissingTasks will only unregisterAllMapAndMergeOutput if the current ShuffleMapStage is Indeterminate. What if the current stage is determinate, but its upstream stage is Indeterminate, and its upstream stage is rerun? Thanks, Keyong Zhou

Re: [PROPOSAL] Spark stage resubmission for shuffle fetch failure

2023-10-19 Thread Keyong Zhou
> > " > > org.apache.spark.SparkException: Job aborted due to stage failure: A > shuffle map stage with indeterminate output was failed and retried. > However, Spark cannot rollback the ResultStage 1 to re-process the input > data, and has to fail this job. Please eliminate

Re: [PROPOSAL] Spark stage resubmission for shuffle fetch failure

2023-10-19 Thread Keyong Zhou
In fact, I'm wondering if Spark will rerun the whole reduce ShuffleMapStage if its upstream ShuffleMapStage is INDETERMINATE and rerun. Keyong Zhou 于2023年10月19日周四 23:00写道: > Thanks Erik for bringing up this question, I'm also curious about the > answer, any feedback is appreciated. >

Re: [PROPOSAL] Spark stage resubmission for shuffle fetch failure

2023-10-19 Thread Keyong Zhou
Thanks Erik for bringing up this question, I'm also curious about the answer, any feedback is appreciated. Thanks, Keyong Zhou Erik fang 于2023年10月19日周四 22:16写道: > Mridul, > > sure, I totally agree SPARK-25299 is a much better solution, as long as we > can get it from spark comm

Re: Question on Celeborn workers,

2023-10-16 Thread Keyong Zhou
mapper's output is stored in one file (possibly multiple files if a split happens), very similar to how ESS stores shuffle data. Combining MapPartition with ReducePartition (aggregate partition data) in Celeborn, the same way Magnet does, may be an interesting idea. Thanks, Keyong Zhou Mridul Muralidharan

Re: Question on Celeborn workers,

2023-10-16 Thread Keyong Zhou
that out! Thanks, Keyong Zhou Sungwoo Park 于2023年10月13日周五 02:22写道: > I have a question on how Celeborn distributes shuffle data among Celeborn > workers. > > From our observation, it seems that whenever a Celeborn worker fails or > gets killed (in a small cluster of less than 25

Re: [VOTE] Release Apache Celeborn(Incubating) 0.3.1-incubating-rc3

2023-10-06 Thread Keyong Zhou
--release ``` Thanks, Keyong Zhou Fu Chen 于2023年10月7日周六 10:27写道: > +1 > > I checked > - download links are valid. > - git commit hash is correct. > - no binary files in the source release. > - signatures are good. > ``` > gpg --import KEYS > gpg --verify apache-celebo

Re: [PROPOSAL] Spark stage resubmission for shuffle fetch failure

2023-10-06 Thread Keyong Zhou
of workers. If we want to reuse the shuffleId, we need to redesign the whole picture. Thanks, Keyong Zhou Sungwoo Park 于2023年10月2日周一 13:23写道: > Hi Keyong, > > Instead of picking up a new shuffleId, can we reuse an existing shuffleId > after unregistering it? If the following plan work

Re: [PROPOSAL] Spark stage resubmission for shuffle fetch failure

2023-09-30 Thread Keyong Zhou
Hi Sungwoo, I think your approach works with current architecture of Celeborn, and interpreting IOException when reading as read failure makes sense. Currently only when CommitFiles fails will LifecycleManager announce data lost. Thanks, Keyong Zhou Sungwoo Park 于2023年9月29日周五 22:05写道

Re: [PROPOSAL] Spark stage resubmission for shuffle fetch failure

2023-09-29 Thread Keyong Zhou
' also needs to refactor Worker's logic, which currently assumes that the succeeded attempts will not be changed after final committing files. Thanks, Keyong Zhou Keyong Zhou 于2023年9月29日周五 19:43写道: > Hi Sungwoo, > > Thanks for your reply. For the required two features you mentio

Re: [PROPOSAL] Spark stage resubmission for shuffle fetch failure

2023-09-29 Thread Keyong Zhou
, I'm happy to reference it in our website. [1] https://docs.google.com/document/d/1SM-oOM0JHEIoRHTYhE9PYH60_1D3NMxDR50LZIM7uW0/edit#heading=h.fudf3s3zacpr [2] https://celeborn.apache.org/docs/latest/developers/overview/#compute-engine-integration Thanks, Keyong Zhou 于2023年9月28日周四 11:12写道

Re: [DISCUSS] Support authentication in Celeborn

2023-09-16 Thread Keyong Zhou
Hi Chandni, Thanks for the explanation, I agree that ensuring the security of the client jar and its distribution falls outside the scope of adding authentication to Celeborn. I'm OK with the design doc, thanks! Let's see if other developers have further feedback. Thanks, Keyong Zhou Chandni

Re: [DISCUSS] Support authentication in Celeborn

2023-09-16 Thread Keyong Zhou
that the client jar is malicious. Maybe we don't need to consider such situation for now, I'm just thinking about the possibility. Thanks, Keyong Zhou Chandni Singh 于2023年9月16日周六 14:03写道: > Hi Keyong, > Thanks for reviewing the proposal. > 1. Should we store the shard secrets in Ratis among mast

Re: [DISCUSS] Support authentication in Celeborn

2023-09-15 Thread Keyong Zhou
t? 4. The doc says TTL is out of scope, is there a plan to support TTL in the future? Thanks, Keyong Zhou Chandni Singh 于2023年9月15日周五 06:34写道: > Hello Celeborn community, > > We have a proposal to add authentication to Celeborn: > > https://docs.google.com/document/d/1D1U2COYhS3ob7

Re: [VOTE] Release Apache Celeborn(Incubating) 0.3.1-incubating-rc2

2023-09-11 Thread Keyong Zhou
--release ``` Thanks, Keyong Zhou rexxiong 于2023年9月11日周一 22:08写道: > +1 (binding) > I checked > - Download links are valid. > - git commit hash is correct > - Checksums and signatures are valid. > - No binary files in the source release > - Files have the word incubating in the

Re: [VOTE] Release Apache Celeborn(Incubating) 0.3.1-incubating-rc1

2023-09-07 Thread Keyong Zhou
Seems the bugfix[1] is critical for supporting Flink, so I suggest preparing rc2. [1] https://github.com/apache/incubator-celeborn/pull/1881 Thanks, Keyong Zhou Zhongqiang Chen 于2023年9月6日周三 21:13写道: > -1I am so sorry. There is a bugfix for MapPartition Split.For more > Information, plea

Re: [VOTE] Release Apache Celeborn(Incubating) 0.3.1-incubating-rc0

2023-09-04 Thread Keyong Zhou
I found a perf degradation when spark's partition coalesce takes effect. This PR[1] fixes it, and I tested the following with [1]: 1. 1T TPCDS with row shuffle/columnar shuffle/columnar shuffle + codegen, it's OK 2. 0.3.0-incubating client with 0.3.1 server, it's OK 3. graceful shutdown, it's OK

Re: [VOTE] Release Apache Celeborn(Incubating) 0.3.1-incubating-rc0

2023-09-01 Thread Keyong Zhou
I will have a thorough test on spark for this rc Thanks, Keyong Zhou Ethan Feng 于2023年9月1日周五 15:48写道: > The Jira ticket is CELEBORN-941. I fixed it yesterday but I didn't > realize that merge_pr.sh failed to copy it into branch-0.3. > > Regards, > Ethan > > Cheng Pan

Re: [DISCUSS] Time for 0.3.1

2023-08-29 Thread Keyong Zhou
Zhou Cheng Pan 于2023年8月29日周二 14:26写道: > Thanks, I plan to cut the 0.3.1-rc0 on Fri Aug 30. > > Please let me know if you have any PRs that want to be shipped. > > Thanks, > Cheng Pan > > On Mon, Aug 28, 2023 at 2:08 PM Keyong Zhou wrote: > > > > Thanks Pan for

Re: [DISCUSS] Time for 0.3.1

2023-08-28 Thread Keyong Zhou
Thanks Pan for volunteering! I also think it's time to release 0.3.1. Thanks, Keyong Zhou Binjie Yang 于2023年8月28日周一 10:53写道: > +1, thanks for driving this. > > Thanks, > Binjie Yang > > On 2023/08/28 02:48:02 Cheng Pan wrote: > > Hi, Celeborn community, > >

Re: Question on speculative execution,

2023-08-20 Thread Keyong Zhou
Hi Sungwoo, Are there any other exceptions when 'Premature EOF from inputStream' occurs? Could you send the log file of the reduce task? Thanks, Keyong Zhou Sungwoo Park 于2023年8月21日周一 12:32写道: > Hi Keyong. > > Thanks for your reply. We call mapperEnd() in attempt #2 (which is &

Re: Question on speculative execution,

2023-08-20 Thread Keyong Zhou
succeeds. 2. Since speculation execution is allowed, we can safely kill a task attempt when another attempt succeeds. Thanks, Keyong Zhou 于2023年8月19日周六 22:38写道: > Hello Celeborn team, > > We are quite close to completing our Celeborn-MR3 client, and I have a > question on speculati

Re: Question on ShuffleClient.readPartition()

2023-08-14 Thread Keyong Zhou
, Keyong Zhou 于2023年8月14日周一 15:09写道: > Hi Celeborn team, > > We are implementing a Celeborn-MR3 client, and have a question on the > order of chunks returned by ShuffleClient.readPartition(). > > --- Setup > > With shuffleId, mapId, attemptId, partitionId all fixed,

Re: Q. How to interrupt ShuffleClient and avoid revive requests due to HARD_SPLIT

2023-07-31 Thread Keyong Zhou
will re-push the failed data to the new worker. So, IMO, it's normal behavior that the driver receives lots of Revive requests. But if we add the shuffle id in ShuffleClientImpl's stageEndShuffleSet, then ShuffleClient should not send the requests any more. Thanks, Keyong Zhou Keyong Zhou 于2023

Re: Q. How to interrupt ShuffleClient and avoid revive requests due to HARD_SPLIT

2023-07-31 Thread Keyong Zhou
recommend you to patch the following PR: https://github.com/apache/incubator-celeborn/pull/1755 It's related to StageEnd logic. Thanks, Keyong Zhou 于2023年7月31日周一 10:54写道: > Hi Celeborn team, > > We are implementing a Celeborn-MR3 client, and have a question on how to > properly unregist

Re: Question of fetching mapper output

2023-07-24 Thread Keyong Zhou
Hi Sungwoo, Thanks for your update. Yes this mailing list is the right place to discuss Celeborn, any questions please feel free to ask. Thanks, Keyong Zhou 于2023年7月21日周五 13:54写道: > Hi Keyong, > > Unlike Spark/Flink clients, we had to directly modify the MR3 runtime code >

Re: [VOTE] Release Apache Celeborn(Incubating) 0.3.0-incubating-rc2

2023-07-19 Thread Keyong Zhou
--release ``` Thanks, Keyong Zhou Zhongqiang Chen 于2023年7月18日周二 18:34写道: > Hi Celeborn community, > > This is a call for a vote to release Apache Celeborn (Incubating) > 0.3.0-incubating-rc2 > > The git tag to be voted upon: > > https://github.com/apache/incubator-celeborn/

Re: [VOTE] Release Apache Celeborn(Incubating) 0.3.0-incubating-rc1

2023-07-17 Thread Keyong Zhou
/apache/incubator-celeborn/pull/1719 Thanks, Keyong Zhou

Re: Question of fetching mapper output

2023-07-16 Thread Keyong Zhou
MR3 (and Hive). Thanks, Keyong Zhou 于2023年7月16日周日 22:33写道: > We have extended the implementation of MR3 so that all partition > inputs can be fetched with a single call, e.g.: > >rssShuffleClient.readPartition(..., 0, 100) > > Now, Hive-MR3 with Celeborn runs as fast as Hiv

Re: Question on implementing Celeborn client,

2023-07-13 Thread Keyong Zhou
Hi, If you call endpoint.ask[CommitFilesResponse](message), you should wait for the response. If the response is successful, you can be sure that committing files succeeded. Please refer to CommitHandler.requestCommitFilesWithRetry. Thanks, Keyong Zhou 于2023年7月13日周四 15:54写道: > > Following are the main
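
The "ask, then wait for the response" pattern in this reply can be sketched generically. The example below uses Python's standard `concurrent.futures` and is not Celeborn's API: `CommitResponse`, `ask`, and `commit_and_confirm` are hypothetical names, and the handler is a stand-in for whatever actually serves the request.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class CommitResponse:
    """Hypothetical stand-in for a commit reply carrying a status flag."""
    success: bool


def ask(executor: ThreadPoolExecutor, handler, message) -> Future:
    """Hypothetical asynchronous request: returns immediately with a Future."""
    return executor.submit(handler, message)


def commit_and_confirm(executor, handler, message, timeout_s: float = 5.0) -> bool:
    """Send the request, block until the reply arrives, then check its status."""
    future = ask(executor, handler, message)
    # Blocks here; raises concurrent.futures.TimeoutError if no reply in time.
    response = future.result(timeout=timeout_s)
    return response.success


with ThreadPoolExecutor(max_workers=1) as pool:
    ok = commit_and_confirm(pool, lambda msg: CommitResponse(True), "commit")
    print(ok)
```

The point of the sketch is only that sending the request is not enough: the caller treats the commit as done once the reply has arrived and reports success.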

Re: Question on implementing Celeborn client,

2023-07-12 Thread Keyong Zhou
, Keyong Zhou 于2023年7月12日周三 21:19写道: > Hi Keyong, > > Thanks for your quick reply. We thought that Celeborn API was clean and > very intuitive, and have not encountered serious problems yet for getting > our system up and running. We are not sure about just a few points that > a

Re: Question on implementing Celeborn client,

2023-07-12 Thread Keyong Zhou
, the developer API is not that clean. It would be very helpful if you could send PRs to improve Celeborn during your integration with MR3. Thanks, Keyong Zhou 于2023年7月12日周三 14:53写道: > Hi Team, > > We are currently implementing a Celeborn client for our application > (called MR3 which is si

Re: [DISCUSSION] Release Apache Celeborn(Incubating) 0.3.0-incubating-rc0

2023-07-02 Thread Keyong Zhou
Thanks Zhongqiang Chen for being our release manager for 0.3.0! I have no problem with releasing this version. I agree with Cheng Pan, we can prepare the release note first. Thanks, Keyong Zhou Cheng Pan 于2023年6月30日周五 15:18写道: > Thanks Zhongqiang for driving this release. > > I t

Re: [DISCUSS] Allow external contributors to run CI without approval

2023-06-16 Thread Keyong Zhou
+1 Thanks, Keyong Zhou Ethan Feng 于2023年6月16日周五 16:27写道: > Recent moves by Apache Infra have changed the policy on GitHub Actions from > "Only requires approval first time" to "Requires approval every time". > > I think this is not friendly for getting

Re: Committers: Please use `dev/merge_pr.py` to merge new PRs

2023-06-02 Thread Keyong Zhou
Thanks @Cheng Pan for introducing this nice tool! Keyong Zhou Cheng Pan 于2023年6月2日周五 22:38写道: > Hi Celeborn Committers, > > A PR merge tool `dev/merge_pr.py` was added to the Celeborn git > repo[1][2], it aims to simplify the PR merge and backport process, and > improve the git

Re: [VOTE] Release Apache Celeborn(Incubating) 0.2.1-incubating-rc0

2023-03-20 Thread Keyong Zhou
--release ``` Thanks, Keyong Zhou On 2023/03/17 09:17:25 rexxiong wrote: > Hi Celeborn community, > > This is a call for a vote to release Apache Celeborn (Incubating) > 0.2.1-incubating-rc0 > > The git tag to be voted upon: > https://github.com/apache/incubator-celeborn/rel

[ANNOUNCE] Add zhongqiangchen(Zhongqiang Chen) as new committer

2023-03-14 Thread Keyong Zhou
contributing to the project, pushing Celeborn to the next level together with all contributors of the community! Also, we are looking forward to adding more and more committers to our project :) Thanks! Keyong Zhou

[ANNOUNCE] Add rexxiong(Jiashu Xiong) as new committer

2023-03-14 Thread Keyong Zhou
to the project, pushing Celeborn to the next level together with all contributors of the community! Also, we are looking forward to adding more and more committers to our project :) Thanks! Keyong Zhou

Re: [NOTICE] Fix solution about rare data loss in release 0.2.0.

2023-03-08 Thread keyong zhou
Hi Yu, We do have a plan for a quick fix. Before that, we'd like to do more tests and collect more feedback for about a week. Thanks, Keyong Zhou Yu Li 于2023年3月9日周四 13:48写道: > Thanks for the note Ethan. > > I'm not sure but maybe it is worth a quick bug fix release, i.e. 0.2.1? A

Re: [Question] LimitedInputStream license issue in Spark source.

2023-03-03 Thread Keyong Zhou
Hi Yu, Thanks for the reminder, we have already fixed it :) https://github.com/apache/incubator-celeborn/commit/9aabb43699225d47c1470027b98a42210df914e8 https://github.com/apache/incubator-celeborn/commit/dcf1e018f6352a64250c64d64e21e3eae1f8fa14 Thanks, Keyong Zhou Yu Li 于2023年3月3日周五 18:03写道

Re: [VOTE] Release Apache Celeborn(Incubating) 0.2.0-incubating-rc5

2023-02-21 Thread Keyong Zhou
+1 (binding) I checked - links are valid. - "incubating" is in the name. - LICENSE looks good. - NOTICE looks good. - DISCLAIMER exists. - signatures are good. ``` gpg --verify apache-celeborn-0.2.0-incubating-source.tgz.asc gpg --verify apache-celeborn-0.2.0-incubating-bin.tgz.asc ``` -

Re: [VOTE] Release Apache Celeborn(Incubating) 0.2.0-incubating-rc4

2023-02-07 Thread Keyong Zhou
--release ``` Thanks, Keyong Zhou Ethan Feng 于2023年2月8日周三 10:37写道: > Hello Incubator Community, > > This is a call for a vote to release Apache Celeborn(Incubating) > version 0.2.0-incubating-rc4 > > The Apache Celeborn community has voted on and approved a proposal to

Re: [VOTE] Release Apache Celeborn(Incubating) 0.2.0-incubating-rc4

2023-02-06 Thread Keyong Zhou
--release ``` Thanks, Keyong Zhou Ethan Feng 于2023年2月4日周六 21:29写道: > Hi Celeborn community, > > This is a call for a vote to release Apache Celeborn (Incubating) > 0.2.0-incubating-rc4 > > The git tag to be voted upon: > > https://github.com/apache/incubator-celeborn/

Re: [VOTE] Release Apache Celeborn(Incubating) 0.2.0-incubating-rc3

2023-01-18 Thread Keyong Zhou
--release ``` Thanks, Keyong Zhou Ethan Feng 于2023年1月18日周三 22:01写道: > Hi Celeborn community, > > This is a call for the vote to release Apache Celeborn (Incubating) > 0.2.0-incubating-rc3 > > The git tag to be voted upon: > > https://github.com/apache/incubator-celeborn/

Call for UT

2023-01-11 Thread Keyong Zhou
Hi community, Currently the code coverage is quite low, I think it's time to boost the UT coverage, any effort will be appreciated, thanks! Thanks, Keyong Zhou

Re: [VOTE] Release Apache Celeborn(Incubating) 0.2.0-incubating-rc2

2023-01-08 Thread keyong zhou
+1 (non-binding) I checked - git commit hash is correct. - links are valid. - "incubating" is in the name. - LICENSE looks good. - NOTICE looks good. - DISCLAIMER exists. - build success from source code (macOS). ``` ./build/make-distribution.sh --release ``` Thanks, Keyong Zhou Eth

Re: [VOTE] Release Apache Celeborn(Incubating) 0.2.0-incubating-rc2

2023-01-08 Thread keyong zhou
+1 (non-binding) I checked - git commit hash is correct. - links are valid. - "incubating" is in the name. - LICENSE looks good. - NOTICE looks good. - DISCLAIMER exists. - build success from source code (macOS). ``` ./build/make-distribution.sh --release ``` Thanks, Keyong Zhou Eth

Re: [VOTE] Release Apache Celeborn(Incubating) 0.2.0-incubating-rc2

2023-01-05 Thread Keyong Zhou
+1 (binding) I checked - git commit hash is correct. - links are valid. - "incubating" is in the name. - LICENSE looks good. - NOTICE looks good. - DISCLAIMER exists. - build success from source code (macOS). ``` ./build/make-distribution.sh --release ``` Thanks, Keyong Zhou Cheng P

Re: [VOTE] Release Apache Celeborn(Incubating) 0.2.0-incubating-rc1

2023-01-03 Thread Keyong Zhou
Also, I think we should not include images in source tarball. Thanks, Keyong Zhou Keyong Zhou 于2023年1月4日周三 11:30写道: > Hi Feng, > > When I tried to decompress the apache-celeborn-0.2.0-incubating-bin.tgz on > CentOS I got the following error: > > ._apache-celeborn-0.2.0-incuba

Re: [VOTE] Release Apache Celeborn(Incubating) 0.2.0-incubating-rc1

2023-01-03 Thread Keyong Zhou
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance' apache-celeborn-0.2.0-incubating-bin/jars/ apache-celeborn-0.2.0-incubating-bin/._docker tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance' I think we should re-build the tarball and restart the vote. Thanks, Keyong Zhou Ethan Feng 于
