UPDATE: After resolving a few issues in the release scripts, I can finally build the release packages. However, I can't upload them to the staging SVN repo because of a transmission error, which looks like a server-side limitation. I tried from both my local laptop and a remote AWS instance, and neither works. The package binaries are around 300-400 MB, and we did a release just last month, so I'm not sure whether this is a new limitation introduced to save costs.
While I'm looking for help to get unblocked, I'm wondering if we can upload the release packages to a public git repo instead, under the Apache account?

On Thu, May 9, 2024 at 12:39 AM Holden Karau <holden.ka...@gmail.com> wrote:

> That looks cool, maybe let’s split off a thread on how to improve our
> release processes?
>
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>
> On Wed, May 8, 2024 at 9:31 AM Erik Krogen <xkro...@apache.org> wrote:
>
>> On that note, GitHub recently released (in public preview) a new feature
>> called Artifact Attestations, which may be relevant/useful here:
>> Introducing Artifact Attestations–now in public beta - The GitHub Blog
>> <https://github.blog/2024-05-02-introducing-artifact-attestations-now-in-public-beta/>
>>
>> On Wed, May 8, 2024 at 9:06 AM Nimrod Ofek <ofek.nim...@gmail.com> wrote:
>>
>>> I have no permissions so I can't do it, but I'm happy to help (although I
>>> am more familiar with GitLab CI/CD than GitHub Actions).
>>> Is there a point of contact who can provide me with the needed context
>>> and permissions?
>>> I'd also love to understand why the costs are high and see how we can
>>> reduce them...
>>>
>>> Thanks,
>>> Nimrod
>>>
>>> On Wed, May 8, 2024 at 8:26 AM Holden Karau <holden.ka...@gmail.com>
>>> wrote:
>>>
>>>> I think signing the artifacts produced from a secure CI sounds like a
>>>> good idea. I know we’ve been asked to reduce our GitHub Actions usage,
>>>> but perhaps someone interested could volunteer to set that up.
>>>>
>>>> On Tue, May 7, 2024 at 9:43 PM Nimrod Ofek <ofek.nim...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>> Thanks for the reply.
>>>>>
>>>>> From my experience, a build on a build server is much more
>>>>> predictable and less error-prone than a build on some laptop, and of
>>>>> course it makes it much faster to produce builds, snapshots, early
>>>>> preview releases, release candidates, or final releases.
>>>>> It would let us publish a preview (snapshot) version with the current
>>>>> changes, either automatically every day or, if we need to save costs
>>>>> (although a build is really not expensive), with the click of a button.
>>>>>
>>>>> Regarding keys for signing: that's what vaults are for. All across
>>>>> the industry we use vaults (such as HashiCorp Vault). But if the
>>>>> build is automated and the only manual step is signing the release,
>>>>> for security reasons, that would be reasonable.
>>>>>
>>>>> Thanks,
>>>>> Nimrod
>>>>>
>>>>> On Wed, May 8, 2024, 00:54, Holden Karau <holden.ka...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Indeed. We could conceivably build the release in CI/CD, but the final
>>>>>> verification / signing should be done locally to keep the keys safe
>>>>>> (there was some concern from earlier release processes).
>>>>>>
>>>>>> On Tue, May 7, 2024 at 10:55 AM Nimrod Ofek <ofek.nim...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Sorry for the novice question, Wenchen: is the release done
>>>>>>> manually from a laptop, not through a CI/CD process on a build server?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Nimrod
>>>>>>>
>>>>>>> On Tue, May 7, 2024 at 8:50 PM Wenchen Fan <cloud0...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> UPDATE:
>>>>>>>>
>>>>>>>> Unfortunately, it took me quite some time to set up my laptop and
>>>>>>>> get it ready for the release process (Docker Desktop doesn't work
>>>>>>>> anymore, my PGP key is lost, etc.). I'll start the RC process
>>>>>>>> tomorrow, my time. Thanks for your patience!
>>>>>>>>
>>>>>>>> Wenchen
>>>>>>>>
>>>>>>>> On Fri, May 3, 2024 at 7:47 AM yangjie01 <yangji...@baidu.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> +1
>>>>>>>>>
>>>>>>>>> From: Jungtaek Lim <kabhwan.opensou...@gmail.com>
>>>>>>>>> Date: Thursday, May 2, 2024, 10:21
>>>>>>>>> To: Holden Karau <holden.ka...@gmail.com>
>>>>>>>>> Cc: Chao Sun <sunc...@apache.org>, Xiao Li <gatorsm...@gmail.com>,
>>>>>>>>> Tathagata Das <tathagata.das1...@gmail.com>,
>>>>>>>>> Wenchen Fan <cloud0...@gmail.com>, Cheng Pan <pan3...@gmail.com>,
>>>>>>>>> Nicholas Chammas <nicholas.cham...@gmail.com>,
>>>>>>>>> Dongjoon Hyun <dongjoon.h...@gmail.com>,
>>>>>>>>> Cheng Pan <cheng...@apache.org>,
>>>>>>>>> Spark dev list <dev@spark.apache.org>,
>>>>>>>>> Anish Shrigondekar <anish.shrigonde...@databricks.com>
>>>>>>>>> Subject: Re: [DISCUSS] Spark 4.0.0 release
>>>>>>>>>
>>>>>>>>> +1 love to see it!
>>>>>>>>>
>>>>>>>>> On Thu, May 2, 2024 at 10:08 AM Holden Karau
>>>>>>>>> <holden.ka...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> +1 :) yay previews
>>>>>>>>>
>>>>>>>>> On Wed, May 1, 2024 at 5:36 PM Chao Sun <sunc...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> +1
>>>>>>>>>
>>>>>>>>> On Wed, May 1, 2024 at 5:23 PM Xiao Li <gatorsm...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> +1 for next Monday.
>>>>>>>>>
>>>>>>>>> We can do more previews when the other features are ready for
>>>>>>>>> preview.
>>>>>>>>>
>>>>>>>>> On Wed, May 1, 2024 at 08:46, Tathagata Das
>>>>>>>>> <tathagata.das1...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Next week sounds great! Thank you Wenchen!
>>>>>>>>>
>>>>>>>>> On Wed, May 1, 2024 at 11:16 AM Wenchen Fan <cloud0...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Yea I think a preview release won't hurt (without a branch cut).
>>>>>>>>> We don't need to wait for all the ongoing projects to be ready.
>>>>>>>>> How about we do a 4.0 preview release based on the current master
>>>>>>>>> branch next Monday?
>>>>>>>>>
>>>>>>>>> On Wed, May 1, 2024 at 11:06 PM Tathagata Das
>>>>>>>>> <tathagata.das1...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hey all,
>>>>>>>>>
>>>>>>>>> Reviving this thread: Spark master has already accumulated a
>>>>>>>>> huge number of changes. As a downstream project maintainer, I
>>>>>>>>> really want to start testing the new features and other breaking
>>>>>>>>> changes, and it's hard to do that without a Preview release. So
>>>>>>>>> the sooner we make a Preview release, the faster we can start
>>>>>>>>> getting feedback for fixing things for a great Spark 4.0 final
>>>>>>>>> release.
>>>>>>>>>
>>>>>>>>> So I urge the community to produce a Spark 4.0 Preview soon even
>>>>>>>>> if certain features targeting the Delta 4.0 release are still
>>>>>>>>> incomplete.
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> On Wed, Apr 17, 2024 at 8:35 AM Wenchen Fan <cloud0...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Thank you all for the replies!
>>>>>>>>>
>>>>>>>>> To @Nicholas Chammas <nicholas.cham...@gmail.com> : Thanks for
>>>>>>>>> cleaning up the error terminology and documentation! I've merged
>>>>>>>>> the first PR and let's finish the others before the 4.0 release.
>>>>>>>>>
>>>>>>>>> To @Dongjoon Hyun <dongjoon.h...@gmail.com> : Thanks for driving
>>>>>>>>> the ANSI on by default effort! Now that the vote has passed, let's
>>>>>>>>> flip the config and finish the DataFrame error context feature
>>>>>>>>> before 4.0.
>>>>>>>>>
>>>>>>>>> To @Jungtaek Lim <kabhwan.opensou...@gmail.com> : Ack. We can
>>>>>>>>> treat the Streaming state store data source as completed for 4.0
>>>>>>>>> then.
>>>>>>>>>
>>>>>>>>> To @Cheng Pan <cheng...@apache.org> : Yea we definitely should
>>>>>>>>> have a preview release. Let's collect more feedback on the ongoing
>>>>>>>>> projects and then we can propose a date for the preview release.
>>>>>>>>>
>>>>>>>>> On Wed, Apr 17, 2024 at 1:22 PM Cheng Pan <pan3...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Will we have a preview release for 4.0.0 like we did for 2.0.0 and
>>>>>>>>> 3.0.0?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Cheng Pan
>>>>>>>>>
>>>>>>>>> > On Apr 15, 2024, at 09:58, Jungtaek Lim
>>>>>>>>> > <kabhwan.opensou...@gmail.com> wrote:
>>>>>>>>> >
>>>>>>>>> > W.r.t. state data source - reader (SPARK-45511), there are
>>>>>>>>> > several follow-up tickets, but we don't plan to address them soon.
>>>>>>>>> > The current implementation is the final shape for Spark 4.0.0,
>>>>>>>>> > unless there are demands on the follow-up tickets.
>>>>>>>>> >
>>>>>>>>> > We may want to check the plan for transformWithState - my
>>>>>>>>> > understanding is that we want to release the feature in 4.0.0,
>>>>>>>>> > but there is still work remaining. While the tentative timeline
>>>>>>>>> > for the release is June 2024, what would be the tentative
>>>>>>>>> > timeline for the RC cut?
>>>>>>>>> > (cc. Anish to add more context on the plan for
>>>>>>>>> > transformWithState)
>>>>>>>>> >
>>>>>>>>> > On Sat, Apr 13, 2024 at 3:15 AM Wenchen Fan <cloud0...@gmail.com>
>>>>>>>>> > wrote:
>>>>>>>>> > Hi all,
>>>>>>>>> >
>>>>>>>>> > It's close to the previously proposed 4.0.0 release date (June
>>>>>>>>> > 2024), and I think it's time to prepare for it and discuss the
>>>>>>>>> > ongoing projects:
>>>>>>>>> > • ANSI by default
>>>>>>>>> > • Spark Connect GA
>>>>>>>>> > • Structured Logging
>>>>>>>>> > • Streaming state store data source
>>>>>>>>> > • new data type VARIANT
>>>>>>>>> > • STRING collation support
>>>>>>>>> > • Spark k8s operator versioning
>>>>>>>>> > Please help add any items that are missing from this list.
>>>>>>>>> > I would like to volunteer as the release manager for Apache Spark
>>>>>>>>> > 4.0.0 if there is no objection. Thank you all for the great work
>>>>>>>>> > that fills Spark 4.0!
>>>>>>>>> >
>>>>>>>>> > Wenchen Fan