Hi William and Ashish,

For RATIS-2104, Duong found out that the problem was the test itself, which shut down the cluster twice.
RATIS-2105 is for NettyRpc. It seems okay since we have not changed NettyRpc. I will rerun the tests one more time.

Tsz-Wo

On Mon, Jun 3, 2024 at 7:43 PM Ashish <[email protected]> wrote:

> These are test failures, so we can create JIRAs and monitor them.
>
> Are we planning to hold the vote for these failures, since we have already
> passed the 72-hour window?
>
> On Mon, Jun 3, 2024 at 10:31 AM Tsz Wo Sze <[email protected]> wrote:
>
> > Hi William,
> >
> > I am checking the failures in more detail to see if they are real
> > problems.
> >
> > Tsz-Wo
> >
> > On Mon, Jun 3, 2024 at 4:19 PM William Song <[email protected]> wrote:
> >
> > > Hi Tsz-Wo,
> > >
> > > Thanks for verifying the release candidate, will keep an eye on
> > > RATIS-2104, 2105.
> > >
> > > I’m trying to reproduce the test failure on my local PC (Mac M chip,
> > > JDK 8/JDK 20) but haven’t seen those failures. For RATIS-2104, during
> > > the development of TestLeaderInstallSnapshot I've also seen the
> > > java.lang.IllegalStateException (allLeaks.size=5), suggesting
> > > AppendEntriesProtos are leaked, if I forget to let the isolated leader
> > > join back at the end.
> > >
> > > I’ll keep on reproducing RATIS-2104, 2105.
> > >
> > > William
> > >
> > > > On Jun 3, 2024, at 07:57, Tsz Wo Sze <[email protected]> wrote:
> > > >
> > > > Similar to Ashish, I also saw some test failures. The following two
> > > > tests failed all the time.
> > > > - RATIS-2104 TestLeaderInstallSnapshot may fail with
> > > >   java.lang.IllegalStateException: allLeaks.size = 4
> > > > - RATIS-2105 TestRetryCacheWithNettyRpc may fail with TimeoutException
> > > >
> > > > Will check the details.
> > > >
> > > > Tsz-Wo
> > > >
> > > > On Sat, Jun 1, 2024 at 11:06 PM Tsz Wo Sze <[email protected]> wrote:
> > > >
> > > >> I am still running tests. Will vote soon.
> > > >>
> > > >> @Ashish, thanks for verifying the release and vote!
> > > >>
> > > >>> Side note, on the 1st try I did get test failure due to timeout.
> > > >>
> > > >> Occasionally, tests may fail with a bind exception which leads to
> > > >> timeout.
> > > >>
> > > >>
> > > >> Tsz-Wo
> > > >>
> > > >>
> > > >> On Fri, May 31, 2024 at 8:55 AM Ashish <[email protected]> wrote:
> > > >>
> > > >>> +1 (non-binding)
> > > >>>
> > > >>> - Ran the build on Mac/JDK 11, builds fine and all test cases pass
> > > >>>
> > > >>> Side note, on the 1st try I did get test failure due to timeout.
> > > >>> However, it was not reproduced; shall keep an eye on the failure,
> > > >>> in case it happens again.
> > > >>>
> > > >>> Thanks
> > > >>> Ashish
> > > >>>
> > > >>> On Thu, May 30, 2024 at 12:38 PM Attila Doroszlai <[email protected]>
> > > >>> wrote:
> > > >>>
> > > >>>>> I’m calling a vote for Apache Ratis Release 3.1.0 rc0.
> > > >>>>
> > > >>>> +1
> > > >>>>
> > > >>>> * Verified checksum, signature
> > > >>>> * Compared source tarball to repo at the given tag
> > > >>>> * Ran Rat check on source
> > > >>>> * Built from source
> > > >>>> * Ran tests locally
> > > >>>> * Ran regular Ozone CI using the staged jars [1] (passed, one unit
> > > >>>>   test needs to be tweaked to pass)
> > > >>>>
> > > >>>>> The git commit hash:
> > > >>>>> 0a34940149aef2ed6597564ca1eb5176ccd493c7
> > > >>>>
> > > >>>> Note: this commit hash is for 2.5.0-rc0. The right hash for
> > > >>>> 3.1.0-rc0 is bfd029cf45bf8229ef12d1d84e3692115231fb1b.
> > > >>>>
> > > >>>> Thanks William for preparing rc0.
> > > >>>>
> > > >>>> -Attila
> > > >>>>
> > > >>>> [1] https://github.com/adoroszlai/ozone/actions/runs/9306740993
> > > >>>>
> > > >>>
> > > >>>
> > > >>> --
> > > >>> thanks
> > > >>> ashish
> > > >>>
> > > >>
> > > >
> > >
> >
>
> --
> thanks
> ashish
>
