Re: [VOTE] Release Apache YuniKorn 1.5.0 RC1

2024-03-05 Thread Wilfred Spiegelenburg
Yes, I think we need to spin a new RC: -1 for RC1.

Go 1.21.8 delivers a total of 5 CVE fixes, with another CVE in the
protobuf code.
We should fix the two memory leaks discovered. Both are simple and
non-invasive fixes.

We should remove the reproducible build details from the README until
we figure out what is happening.

Wilfred

On Wed, 6 Mar 2024 at 10:15, Craig Condit  wrote:
>
> All of the below-mentioned issues have been resolved in branch-1.5.0 in 
> preparation for a possible 1.5.0-rc2. Assuming we move forward with rc2, we 
> should build with go 1.21.8 to ensure the latest fixes in the go standard 
> library are included as well.
>
> Craig
>
>
> > On Mar 5, 2024, at 3:12 PM, Craig Condit  wrote:
> >
> > -1 (binding).
> >
> > All,
> >
> > We have a few issues in rc1 that I believe we should address before 
> > shipping 1.5.0:
> >
> > CVEs:
> >
> > - CVE-2024-24783 (requires rebuild with go 1.21.8)
> > - CVE-2023-45290 (requires rebuild with go 1.21.8)
> > - CVE-2023-45289 (requires rebuild with go 1.21.8)
> > - CVE-2024-24786 (requires updates to google.golang.org/protobuf and
> >   possibly github.com/golang/protobuf)
> >
> > Broken functionality:
> >
> > - Reproducible builds (unknown why this has failed, but we will need to 
> > remove the content from the README.md that claims reproducible status)
> >
> > Critical bugs (both memory leaks):
> >
> > - https://issues.apache.org/jira/browse/YUNIKORN-2465 - Remove Task objects 
> > from the shim upon pod completion (fix merged to master and to branch-1.5)
> > - https://issues.apache.org/jira/browse/YUNIKORN-2467 - Remove 
> > AllocationAsk from the core when a pod is completed (PR available; needs 
> > review to determine if this is a 1.5 blocker).
> >
> > I think we should address each of these and cut an rc2. Thoughts?
> >
> > Craig Condit
> >
> >> On Mar 2, 2024, at 10:38 AM, TingYao  wrote:
> >>
> >> Hello everyone,
> >>
> >> I would like to call a vote for releasing Apache YuniKorn 1.5.0 RC1.
> >>
> >> The release artefacts have been uploaded here:
> >> https://dist.apache.org/repos/dist/dev/yunikorn/1.5.0-RC1
> >>
> >> My public key is located in the KEYS file:
> >> https://downloads.apache.org//yunikorn/KEYS
> >>
> >> JIRA issues that have been resolved in this release:
> >> https://issues.apache.org/jira/issues/?filter=12352958
> >>
> >> Git tags for each component are as follows:
> >> yunikorn-scheduler-interface: v1.5.0-1
> >> yunikorn-core: v1.5.0-2
> >> yunikorn-k8shim: v1.5.0-2
> >> yunikorn-web: v1.5.0-1
> >> yunikorn-release: v1.5.0-2
> >>
> >> Once the release is voted on and approved, all repos will be tagged
> >> 1.5.0 for consistency.
> >>
> >> Please review and vote. The vote will be open for at least 72 hours
> >> and closes on Wednesday 5 March 2024, 17:00:00 UTC
> >>
> >> [ ] +1 Approve
> >> [ ] +0 No opinion
> >> [ ] -1 Disapprove (and the reason why)
> >>
> >> Thank you,
> >> Tingyao
> >
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org
> For additional commands, e-mail: dev-h...@yunikorn.apache.org
>




Re: [VOTE] Release Apache YuniKorn 1.5.0 RC1

2024-03-05 Thread Craig Condit
All of the below-mentioned issues have been resolved in branch-1.5.0 in 
preparation for a possible 1.5.0-rc2. Assuming we move forward with rc2, we 
should build with go 1.21.8 to ensure the latest fixes in the go standard 
library are included as well.

Craig


> On Mar 5, 2024, at 3:12 PM, Craig Condit  wrote:
> 
> -1 (binding).
> 
> All,
> 
> We have a few issues in rc1 that I believe we should address before shipping 
> 1.5.0:
> 
> CVEs:
> 
> - CVE-2024-24783 (requires rebuild with go 1.21.8)
> - CVE-2023-45290 (requires rebuild with go 1.21.8)
> - CVE-2023-45289 (requires rebuild with go 1.21.8)
> - CVE-2024-24786 (requires updates to google.golang.org/protobuf and
>   possibly github.com/golang/protobuf)
> 
> Broken functionality:
> 
> - Reproducible builds (unknown why this has failed, but we will need to 
> remove the content from the README.md that claims reproducible status)
> 
> Critical bugs (both memory leaks):
> 
> - https://issues.apache.org/jira/browse/YUNIKORN-2465 - Remove Task objects 
> from the shim upon pod completion (fix merged to master and to branch-1.5)
> - https://issues.apache.org/jira/browse/YUNIKORN-2467 - Remove AllocationAsk 
> from the core when a pod is completed (PR available; needs review to 
> determine if this is a 1.5 blocker).
> 
> I think we should address each of these and cut an rc2. Thoughts?
> 
> Craig Condit
> 
>> On Mar 2, 2024, at 10:38 AM, TingYao  wrote:
>> 
>> Hello everyone,
>> 
>> I would like to call a vote for releasing Apache YuniKorn 1.5.0 RC1.
>> 
>> The release artefacts have been uploaded here:
>> https://dist.apache.org/repos/dist/dev/yunikorn/1.5.0-RC1
>> 
>> My public key is located in the KEYS file:
>> https://downloads.apache.org//yunikorn/KEYS
>> 
>> JIRA issues that have been resolved in this release:
>> https://issues.apache.org/jira/issues/?filter=12352958
>> 
>> Git tags for each component are as follows:
>> yunikorn-scheduler-interface: v1.5.0-1
>> yunikorn-core: v1.5.0-2
>> yunikorn-k8shim: v1.5.0-2
>> yunikorn-web: v1.5.0-1
>> yunikorn-release: v1.5.0-2
>> 
>> Once the release is voted on and approved, all repos will be tagged
>> 1.5.0 for consistency.
>> 
>> Please review and vote. The vote will be open for at least 72 hours
>> and closes on Wednesday 5 March 2024, 17:00:00 UTC
>> 
>> [ ] +1 Approve
>> [ ] +0 No opinion
>> [ ] -1 Disapprove (and the reason why)
>> 
>> Thank you,
>> Tingyao
> 
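The version requirements discussed above (rebuild with go 1.21.8, protobuf at v1.33.0 per YUNIKORN-2469) can be checked mechanically. Below is a minimal sketch, not part of the YuniKorn release tooling: `parseVersion` and `atLeast` are hypothetical helpers, and the "have" values are typed in by hand (go1.21.6 is the toolchain reported by one voter in this thread). In practice they could be read from the output of `go version -m <binary>`. Pre-release suffixes such as `rc1` are deliberately rejected rather than ordered.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns strings such as "go1.21.8" or "v1.33.0" into numeric
// parts. Hypothetical helper: pre-release suffixes are reported as errors.
func parseVersion(v string) ([]int, error) {
	trimmed := strings.TrimPrefix(strings.TrimPrefix(v, "go"), "v")
	fields := strings.Split(trimmed, ".")
	parts := make([]int, 0, len(fields))
	for _, f := range fields {
		n, err := strconv.Atoi(f)
		if err != nil {
			return nil, fmt.Errorf("unsupported version %q: %w", v, err)
		}
		parts = append(parts, n)
	}
	return parts, nil
}

// atLeast reports whether version have is greater than or equal to want.
// Missing trailing components are treated as zero ("go1.22" == "go1.22.0").
func atLeast(have, want string) (bool, error) {
	h, err := parseVersion(have)
	if err != nil {
		return false, err
	}
	w, err := parseVersion(want)
	if err != nil {
		return false, err
	}
	for i := 0; i < len(h) || i < len(w); i++ {
		var a, b int
		if i < len(h) {
			a = h[i]
		}
		if i < len(w) {
			b = w[i]
		}
		if a != b {
			return a > b, nil
		}
	}
	return true, nil
}

func main() {
	// The "have" values are examples only; substitute what the build reports.
	checks := []struct{ name, have, want string }{
		{"go toolchain", "go1.21.6", "go1.21.8"},
		{"google.golang.org/protobuf", "v1.33.0", "v1.33.0"},
	}
	for _, c := range checks {
		ok, err := atLeast(c.have, c.want)
		if err != nil {
			fmt.Println(c.name, "error:", err)
			continue
		}
		fmt.Printf("%-28s have=%s want>=%s ok=%t\n", c.name, c.have, c.want, ok)
	}
}
```

A check like this only confirms version numbers; it does not prove that a given CVE is unreachable or fixed in the shipped binaries.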





[jira] [Resolved] (YUNIKORN-2469) Upgrade google.golang.org/protobuf to v1.33.0

2024-03-05 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2469.

Fix Version/s: 1.5.0
   Resolution: Fixed

Merged all PRs to master and cherry-picked to branch-1.5.

> Upgrade google.golang.org/protobuf to v1.33.0
> -
>
> Key: YUNIKORN-2469
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2469
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: core - common, release, shim - kubernetes
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.5.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)




[jira] [Resolved] (YUNIKORN-2468) Remove language around reproducible builds from README

2024-03-05 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2468.

Fix Version/s: 1.5.0
   Resolution: Fixed

> Remove language around reproducible builds from README
> --
>
> Key: YUNIKORN-2468
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2468
> Project: Apache YuniKorn
>  Issue Type: Task
>  Components: release
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.5.0
>
>
> The reproducible builds feature is currently not functioning properly in the 
> 1.5.0 release. We should remove references to it from the README.md file.






[jira] [Resolved] (YUNIKORN-2467) Remove AllocationAsk from the core when a pod is completed

2024-03-05 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit resolved YUNIKORN-2467.

Fix Version/s: 1.5.0
   Resolution: Fixed

Merged to master and cherry-picked to branch-1.5.0.

> Remove AllocationAsk from the core when a pod is completed
> --
>
> Key: YUNIKORN-2467
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2467
> Project: Apache YuniKorn
>  Issue Type: Bug
>  Components: shim - kubernetes
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.5.0
>
>
> A new issue was discovered while fixing YUNIKORN-2465. This also results in 
> growing memory usage in case of long running applications.
> When a pod reaches a terminal state (Success / Failed), we send an update 
> request from the shim to the core ({{Task.releaseAllocation()}}). However, we 
> only discard the allocation itself and we don't do anything about the ask. It 
> is kept inside the Application object until it becomes Completed.
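The leak described in this issue can be illustrated with a small model. This is a sketch under stated assumptions, not the actual yunikorn-core code or the merged fix: the `application` type, its two maps, and both release methods are hypothetical stand-ins for the real Application object and its handling of asks and allocations.

```go
package main

import "fmt"

// application is a hypothetical stand-in for the core's Application object.
// It is not the real yunikorn-core type.
type application struct {
	asks        map[string]struct{} // requests, keyed by task ID
	allocations map[string]struct{} // granted allocations, keyed by task ID
}

func newApplication() *application {
	return &application{
		asks:        make(map[string]struct{}),
		allocations: make(map[string]struct{}),
	}
}

// schedule records an ask and the allocation that satisfies it.
func (a *application) schedule(taskID string) {
	a.asks[taskID] = struct{}{}
	a.allocations[taskID] = struct{}{}
}

// releaseAllocationOnly models the behaviour described in the issue: the
// allocation is discarded, but the ask stays until the application completes.
func (a *application) releaseAllocationOnly(taskID string) {
	delete(a.allocations, taskID)
}

// releaseAllocationAndAsk models the intended behaviour: both entries are
// dropped once the pod reaches a terminal state.
func (a *application) releaseAllocationAndAsk(taskID string) {
	delete(a.allocations, taskID)
	delete(a.asks, taskID)
}

func main() {
	leaky, fixed := newApplication(), newApplication()
	// A long-running application that keeps starting and finishing pods.
	for i := 0; i < 1000; i++ {
		id := fmt.Sprintf("pod-%d", i)
		leaky.schedule(id)
		leaky.releaseAllocationOnly(id)
		fixed.schedule(id)
		fixed.releaseAllocationAndAsk(id)
	}
	fmt.Println("asks retained without the fix:", len(leaky.asks))
	fmt.Println("asks retained with the fix:", len(fixed.asks))
}
```

In the leaky variant the ask map grows with every finished pod for as long as the application stays alive, which matches the growing memory usage reported for long running applications.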






[jira] [Reopened] (YUNIKORN-2419) [UMBRELLA] Generate reproducible binaries

2024-03-05 Thread Craig Condit (Jira)


 [ 
https://issues.apache.org/jira/browse/YUNIKORN-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit reopened YUNIKORN-2419:


> [UMBRELLA] Generate reproducible binaries
> -
>
> Key: YUNIKORN-2419
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2419
> Project: Apache YuniKorn
>  Issue Type: Improvement
>  Components: shim - kubernetes, webapp
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Major
>  Labels: release-notes
> Fix For: 1.5.0
>
>
> Currently, the binaries we build for YuniKorn differ from one build to the 
> next. We should attempt to standardize our build output so that independently 
> built binaries from the same source code can be validated.






[jira] [Created] (YUNIKORN-2469) Upgrade google.golang.org/protobuf to v1.33.0

2024-03-05 Thread Craig Condit (Jira)
Craig Condit created YUNIKORN-2469:
--

 Summary: Upgrade google.golang.org/protobuf to v1.33.0
 Key: YUNIKORN-2469
 URL: https://issues.apache.org/jira/browse/YUNIKORN-2469
 Project: Apache YuniKorn
  Issue Type: Task
  Components: core - common, release, scheduler-interface, shim - 
kubernetes
Reporter: Craig Condit
Assignee: Craig Condit









[jira] [Created] (YUNIKORN-2468) Remove language around reproducible builds from README

2024-03-05 Thread Craig Condit (Jira)
Craig Condit created YUNIKORN-2468:
--

 Summary: Remove language around reproducible builds from README
 Key: YUNIKORN-2468
 URL: https://issues.apache.org/jira/browse/YUNIKORN-2468
 Project: Apache YuniKorn
  Issue Type: Task
  Components: release
Reporter: Craig Condit
Assignee: Craig Condit


The reproducible builds feature is currently not functioning properly in the 
1.5.0 release. We should remove references to it from the README.md file.






Re: [VOTE] Release Apache YuniKorn 1.5.0 RC1

2024-03-05 Thread Craig Condit
-1 (binding).

All,

We have a few issues in rc1 that I believe we should address before shipping 
1.5.0:

CVEs:

- CVE-2024-24783 (requires rebuild with go 1.21.8)
- CVE-2023-45290 (requires rebuild with go 1.21.8)
- CVE-2023-45289 (requires rebuild with go 1.21.8)
- CVE-2024-24786 (requires updates to google.golang.org/protobuf and
  possibly github.com/golang/protobuf)

Broken functionality:

- Reproducible builds (unknown why this has failed, but we will need to remove 
the content from the README.md that claims reproducible status)

Critical bugs (both memory leaks):

- https://issues.apache.org/jira/browse/YUNIKORN-2465 - Remove Task objects 
from the shim upon pod completion (fix merged to master and to branch-1.5)
- https://issues.apache.org/jira/browse/YUNIKORN-2467 - Remove AllocationAsk 
from the core when a pod is completed (PR available; needs review to determine 
if this is a 1.5 blocker).

I think we should address each of these and cut an rc2. Thoughts?

Craig Condit

> On Mar 2, 2024, at 10:38 AM, TingYao  wrote:
> 
> Hello everyone,
> 
> I would like to call a vote for releasing Apache YuniKorn 1.5.0 RC1.
> 
> The release artefacts have been uploaded here:
>  https://dist.apache.org/repos/dist/dev/yunikorn/1.5.0-RC1
> 
> My public key is located in the KEYS file:
>  https://downloads.apache.org//yunikorn/KEYS
> 
> JIRA issues that have been resolved in this release:
>  https://issues.apache.org/jira/issues/?filter=12352958
> 
> Git tags for each component are as follows:
> yunikorn-scheduler-interface: v1.5.0-1
> yunikorn-core: v1.5.0-2
> yunikorn-k8shim: v1.5.0-2
> yunikorn-web: v1.5.0-1
> yunikorn-release: v1.5.0-2
> 
> Once the release is voted on and approved, all repos will be tagged
> 1.5.0 for consistency.
> 
> Please review and vote. The vote will be open for at least 72 hours
> and closes on Wednesday 5 March 2024, 17:00:00 UTC
> 
> [ ] +1 Approve
> [ ] +0 No opinion
> [ ] -1 Disapprove (and the reason why)
> 
> Thank you,
> Tingyao



[jira] [Created] (YUNIKORN-2467) Remove AllocationAsk from the core when a pod is completed

2024-03-05 Thread Peter Bacsko (Jira)
Peter Bacsko created YUNIKORN-2467:
--

 Summary: Remove AllocationAsk from the core when a pod is completed
 Key: YUNIKORN-2467
 URL: https://issues.apache.org/jira/browse/YUNIKORN-2467
 Project: Apache YuniKorn
  Issue Type: Bug
  Components: shim - kubernetes
Reporter: Peter Bacsko
Assignee: Peter Bacsko


A new issue was discovered while fixing YUNIKORN-2465. This also results in 
growing memory usage in case of long running applications.

When a pod reaches a terminal state (Success / Failed), we send an update 
request from the shim to the core ({{Task.releaseAllocation()}}). However, we 
only discard the allocation itself and we don't do anything about the ask. It 
is kept inside the Application object until it becomes Completed.








Re: [VOTE] Release Apache YuniKorn 1.5.0 RC1

2024-03-05 Thread 陳昱霖
+1

   - Verified signatures and checksums (the ones provided in this mail
   thread)
   - Built on Ubuntu 23.04 (amd64) with go1.21.6 linux/amd64
   - Checked the Web UI; the new node utilization chart and the presentation
   of resource units look good
   - E2E tests passed
   - Ran a simple preemption test from the shim examples
   - Ran SparkPi/SparkTC successfully


Yu-Lin Chen

On Mon, 4 Mar 2024 at 17:38, Wilfred Spiegelenburg wrote:

> The binaries generated for me are really different. I copied out the
> files generated during the building of the release and compared them
> with files I generated based on the release. The sha-512 sums that are
> part of the README in my release are again different from any that
> have been shown here or in the README.md of the release artefacts.
>
> from the release process: (linux/arm64) run on my local machine:
>   59811940  4 Mar 14:26 yunikorn-admission-controller
>   65162249  4 Mar 14:25 yunikorn-scheduler
>   83047917  4 Mar 14:26 yunikorn-scheduler-plugin
> from the build after (linux/arm64):
>  59811916  4 Mar 14:43 yunikorn-admission-controller
>  65162321  4 Mar 14:42 yunikorn-scheduler
>  83047941  4 Mar 14:42 yunikorn-scheduler-plugin
>
> That is running on the same machine with the same go compiler. Seems
> like something fundamentally is broken. We tested this a number of
> times before it was committed. Not sure why it worked back then and
> not any more.
>
> Wilfred
>
>
> On Mon, 4 Mar 2024 at 08:23, Craig Condit  wrote:
> >
> > I’m trying to validate the binaries produced using the new reproducible
> > builds feature, but I’m getting different checksums than what the README
> > indicates I should. Was the release tarball created from a fresh checkout
> > of the release repository with no uncommitted changes?
> >
> > README.md shows these checksums for amd64:
> >
> > 74646cecfb0ec1bd171ea58ee28e12466939841ac4f6a4a56b482a3d336388c4cee707eba393bd4214750860bc7d2fd3dd877097bf5cee1e495e9b8b14004bc7  yunikorn-admission-controller
> > 1508297773eb2ef7910abd39b15221b09ee6ce48f29c4d5903f42f46e65a4f583048a8483243845af4a43ceeba911d03659798e8316995bf6cd87c9fcf86f02d  yunikorn-scheduler
> > 67cdfb99f50eb271f932205bd45fa2bf4e9108e815f7d51fcd0c9cf15747eed3dcf7e2f46a1f2eb2ae7d3d43ac88330e21bbf380a5e8b19128f14707c2777f9f  yunikorn-scheduler-plugin
> > 1eaa7485480f6430cd58e85ec6fd1b4c11d1abe08c509e53e6cb6772c188dd75c5f9f2c8d79fc334d68a3b3c8260ccdf5631409897346759cb636c4098efdf94  yunikorn-web
> >
> > My results:
> >
> > c47192a5f0b8b1afe6244b31b1fd31668c664ea8fbc9476c4678e5d2e2c2c4543908af95a960d6fbece36f1ae7ee34ebdeda56cc40989fe10fa51e56360a8c97  yunikorn-admission-controller
> > 71c4531b5d8a38c60196393d5bcbe053e5e24068c0b083dea21de93ae5891909df5d6a6ea2173c526978f23c6368555e5d079112207c1e8bed3a9ec19b69f186  yunikorn-scheduler
> > 54e60d6f9deb834e1fc33b5a065ee9f5db7a2a67374245075da889caf3182d17fe327a4036aed60fda6c9a301f488de917c17bafbb654ec806211297d6fc6ba3  yunikorn-scheduler-plugin
> > 518c70006448426eda6a533b816fc3e8251a92065009f20619d9cd1ca21e80906749d83194d12fd771a44e9f328f8aac4c8af8f76028f17a8c1570f663e25606  yunikorn-web
> >
> > If we can’t reproduce the results, then the README content is invalid.
> > I’ve tested the release process locally by generating the release tarball
> > and then rebuilding again from the resulting tarball.
> >
> > Craig
> >
> > > On Mar 2, 2024, at 10:38 AM, TingYao  wrote:
> > >
> > > Hello everyone,
> > >
> > > I would like to call a vote for releasing Apache YuniKorn 1.5.0 RC1.
> > >
> > > The release artefacts have been uploaded here:
> > >  https://dist.apache.org/repos/dist/dev/yunikorn/1.5.0-RC1
> > >
> > > My public key is located in the KEYS file:
> > >  https://downloads.apache.org//yunikorn/KEYS
> > >
> > > JIRA issues that have been resolved in this release:
> > >  https://issues.apache.org/jira/issues/?filter=12352958
> > >
> > > Git tags for each component are as follows:
> > > yunikorn-scheduler-interface: v1.5.0-1
> > > yunikorn-core: v1.5.0-2
> > > yunikorn-k8shim: v1.5.0-2
> > > yunikorn-web: v1.5.0-1
> > > yunikorn-release: v1.5.0-2
> > >
> > > Once the release is voted on and approved, all repos will be tagged
> > > 1.5.0 for consistency.
> > >
> > > Please review and vote. The vote will be open for at least 72 hours
> > > and closes on Wednesday 5 March 2024, 17:00:00 UTC
> > >
> > > [ ] +1 Approve
> > > [ ] +0 No opinion
> > > [ ] -1 Disapprove (and the reason why)
> > >
> > > Thank you,
> > > Tingyao
> >
> >
>