Re: Github build queue

2024-02-09 Thread David Arthur
I tried to enable the merge queue on my public fork, but the option is not
available. I did a little searching and it looks like ASF does not allow
this feature to be used. I've filed an INFRA ticket to ask again
https://issues.apache.org/jira/browse/INFRA-25485

-David

On Fri, Feb 9, 2024 at 7:18 PM Ismael Juma  wrote:

> Also, on the mockito stubbings point, we did upgrade to Mockito 5.8 for the
> Java 11 and newer builds:
>
> https://github.com/apache/kafka/blob/trunk/gradle/dependencies.gradle#L64
>
> So, we should be good when it comes to that too.
>
> Ismael
>
> On Fri, Feb 9, 2024 at 4:15 PM Ismael Juma  wrote:
>
> > Nice!
> >
> > Ismael
> >
> > On Fri, Feb 9, 2024 at 3:43 PM Greg Harris  >
> > wrote:
> >
> >> Hey all,
> >>
> >> I implemented a fairly aggressive PR [1] to demote flaky tests to
> >> integration tests, and the end result is a much faster (10m locally,
> >> 1h on Jenkins) build which is also very reliable.
> >>
> >> I believe this would make unitTest suitable for use in the merge
> >> queue, with the caveat that it doesn't run 25k integration tests, and
> >> doesn't perform the mockito strict stubbing verification.
> >> This would still be a drastic improvement, as we would then be running
> >> the build and 87k unit tests that we aren't running today.
> >>
> >> Thanks!
> >> Greg
> >>
> >> [1] https://github.com/apache/kafka/pull/15349
> >>
> >> On Fri, Feb 9, 2024 at 9:25 AM Ismael Juma  wrote:
> >> >
> >> > Please check https://github.com/apache/kafka/pull/14186 before making
> >> the
> >> > `unitTest` and `integrationTest` split.
> >> >
> >> > Ismael
> >> >
> >> > On Fri, Feb 9, 2024 at 9:16 AM Josep Prat  >
> >> > wrote:
> >> >
> >> > > Regarding "Split our CI "test" job into unit and integration so we
> can
> >> > > start collecting data on those suites", can we run these 2 tasks in
> >> the
> >> > > same machine? So they won't need to compile classes twice for the
> same
> >> > > exact code?
> >> > >
> >> > > On Fri, Feb 9, 2024 at 6:05 PM Ismael Juma 
> wrote:
> >> > >
> >> > > > Why can't we add @Tag("integration") for all of those tests? Seems
> >> like
> >> > > > that would not be too hard.
> >> > > >
> >> > > > Ismael
> >> > > >
> >> > > > On Fri, Feb 9, 2024 at 9:03 AM Greg Harris
> >>  >> > > >
> >> > > > wrote:
> >> > > >
> >> > > > > Hi David,
> >> > > > >
> >> > > > > +1 on that strategy.
> >> > > > >
> >> > > > > I see several flaky tests that aren't marked with
> >> @Tag("integration")
> >> > > > > or @IntegrationTest, and I think those would make using the
> >> unitTest
> >> > > > > target ineffective here. We could also start a new tag
> >> @Tag("flaky")
> >> > > > > and exclude that.
> >> > > > >
> >> > > > > Thanks,
> >> > > > > Greg
> >> > > > >
> >> > > > > On Fri, Feb 9, 2024 at 8:57 AM David Arthur 
> >> wrote:
> >> > > > > >
> >> > > > > > I do think we can add a PR to the merge queue while bypassing
> >> branch
> >> > > > > > potections (like we do for the Merge button today), but I'm
> not
> >> 100%
> >> > > > > sure.
> >> > > > > > I like the idea of running unit tests, though I don't think we
> >> have
> >> > > > data
> >> > > > > on
> >> > > > > > how long just the unit tests run on Jenkins (since we run the
> >> "test"
> >> > > > > target
> >> > > > > > which includes all tests). I'm also not sure how flaky the
> unit
> >> test
> >> > > > > suite
> >> > > > > > is alone.
> >> > > > > >
> >> > > > > > Since we already bypass the PR checks when merging, it seems
> >> that
> >> > > > adding
> >> > > > > a
> >> > > > > > required compile/check step before landing on trunk is
> strictly
> >> an
> >> > > > > > improvement.
> >> > > > > >
> >> > > > > > What about this as a short term plan:
> >> > > > > >
> >> > > > > > 1) Add the merge queue, only run compile/check
> >> > > > > > 2) Split our CI "test" job into unit and integration so we can
> >> start
> >> > > > > > collecting data on those suites
> >> > > > > > 3) Add "unitTest" to merge queue job once we're satisfied it
> >> won't
> >> > > > cause
> >> > > > > > disruption
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > On Fri, Feb 9, 2024 at 11:43 AM Josep Prat
> >> > >  >> > > > >
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > Hi David,
> >> > > > > > > I like the idea, it will solve the problem we've seen a
> >> couple of
> >> > > > > times in
> >> > > > > > > the last 2 weeks where compilation for some Scala version
> >> failed,
> >> > > it
> >> > > > > was
> >> > > > > > > probably overlooked during the PR build because of the
> >> flakiness of
> >> > > > > tests
> >> > > > > > > and the compilation failure was buried among the amount of
> >> failed
> >> > > > > tests.
> >> > > > > > >
> >> > > > > > > Regarding the type of check, I'm not sure what's best, have
> a
> >> real
> >> > > > > quick
> >> > > > > > > check or a longer one including unit tests. A full test
> suite
> >> will
> >> > > > run
> >> > > > > per
> >> > > > > > > each commit in each PR (these we 

Re: Github build queue

2024-02-09 Thread Ismael Juma
Also, on the mockito stubbings point, we did upgrade to Mockito 5.8 for the
Java 11 and newer builds:

https://github.com/apache/kafka/blob/trunk/gradle/dependencies.gradle#L64

So, we should be good when it comes to that too.

Ismael

On Fri, Feb 9, 2024 at 4:15 PM Ismael Juma  wrote:

> Nice!
>
> Ismael
>
> On Fri, Feb 9, 2024 at 3:43 PM Greg Harris 
> wrote:
>
>> Hey all,
>>
>> I implemented a fairly aggressive PR [1] to demote flaky tests to
>> integration tests, and the end result is a much faster (10m locally,
>> 1h on Jenkins) build which is also very reliable.
>>
>> I believe this would make unitTest suitable for use in the merge
>> queue, with the caveat that it doesn't run 25k integration tests, and
>> doesn't perform the mockito strict stubbing verification.
>> This would still be a drastic improvement, as we would then be running
>> the build and 87k unit tests that we aren't running today.
>>
>> Thanks!
>> Greg
>>
>> [1] https://github.com/apache/kafka/pull/15349
>>
>> On Fri, Feb 9, 2024 at 9:25 AM Ismael Juma  wrote:
>> >
>> > Please check https://github.com/apache/kafka/pull/14186 before making
>> the
>> > `unitTest` and `integrationTest` split.
>> >
>> > Ismael
>> >
>> > On Fri, Feb 9, 2024 at 9:16 AM Josep Prat 
>> > wrote:
>> >
>> > > Regarding "Split our CI "test" job into unit and integration so we can
>> > > start collecting data on those suites", can we run these 2 tasks in
>> the
>> > > same machine? So they won't need to compile classes twice for the same
>> > > exact code?
>> > >
>> > > On Fri, Feb 9, 2024 at 6:05 PM Ismael Juma  wrote:
>> > >
>> > > > Why can't we add @Tag("integration") for all of those tests? Seems
>> like
>> > > > that would not be too hard.
>> > > >
>> > > > Ismael
>> > > >
>> > > > On Fri, Feb 9, 2024 at 9:03 AM Greg Harris
>> > > > >
>> > > > wrote:
>> > > >
>> > > > > Hi David,
>> > > > >
>> > > > > +1 on that strategy.
>> > > > >
>> > > > > I see several flaky tests that aren't marked with
>> @Tag("integration")
>> > > > > or @IntegrationTest, and I think those would make using the
>> unitTest
>> > > > > target ineffective here. We could also start a new tag
>> @Tag("flaky")
>> > > > > and exclude that.
>> > > > >
>> > > > > Thanks,
>> > > > > Greg
>> > > > >
>> > > > > On Fri, Feb 9, 2024 at 8:57 AM David Arthur 
>> wrote:
>> > > > > >
>> > > > > > I do think we can add a PR to the merge queue while bypassing
>> branch
>> > > > > > potections (like we do for the Merge button today), but I'm not
>> 100%
>> > > > > sure.
>> > > > > > I like the idea of running unit tests, though I don't think we
>> have
>> > > > data
>> > > > > on
>> > > > > > how long just the unit tests run on Jenkins (since we run the
>> "test"
>> > > > > target
>> > > > > > which includes all tests). I'm also not sure how flaky the unit
>> test
>> > > > > suite
>> > > > > > is alone.
>> > > > > >
>> > > > > > Since we already bypass the PR checks when merging, it seems
>> that
>> > > > adding
>> > > > > a
>> > > > > > required compile/check step before landing on trunk is strictly
>> an
>> > > > > > improvement.
>> > > > > >
>> > > > > > What about this as a short term plan:
>> > > > > >
>> > > > > > 1) Add the merge queue, only run compile/check
>> > > > > > 2) Split our CI "test" job into unit and integration so we can
>> start
>> > > > > > collecting data on those suites
>> > > > > > 3) Add "unitTest" to merge queue job once we're satisfied it
>> won't
>> > > > cause
>> > > > > > disruption
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > On Fri, Feb 9, 2024 at 11:43 AM Josep Prat
>> > > > > > > >
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Hi David,
>> > > > > > > I like the idea, it will solve the problem we've seen a
>> couple of
>> > > > > times in
>> > > > > > > the last 2 weeks where compilation for some Scala version
>> failed,
>> > > it
>> > > > > was
>> > > > > > > probably overlooked during the PR build because of the
>> flakiness of
>> > > > > tests
>> > > > > > > and the compilation failure was buried among the amount of
>> failed
>> > > > > tests.
>> > > > > > >
>> > > > > > > Regarding the type of check, I'm not sure what's best, have a
>> real
>> > > > > quick
>> > > > > > > check or a longer one including unit tests. A full test suite
>> will
>> > > > run
>> > > > > per
>> > > > > > > each commit in each PR (these we have definitely more than 8
>> per
>> > > day)
>> > > > > and
>> > > > > > > this should be used to ensure changes are safe and sound. I'm
>> not
>> > > > sure
>> > > > > if
>> > > > > > > having unit tests run as well before the merge itself would
>> cause
>> > > too
>> > > > > much
>> > > > > > > of an extra load on the CI machines.
>> > > > > > > We can go with `gradlew unitTest` and see if this takes too
>> long or
>> > > > > causes
>> > > > > > > too many delays with the normal pipeline.
>> > > > > > >
>> > > > > > > Best,
>> > > > > > >
>> > > > > > > On Fri, Feb 9, 2024 at 4:16 PM Ismael Juma > >
>> > > > wrote:
>> > > > > > >

Re: Github build queue

2024-02-09 Thread Ismael Juma
Nice!

Ismael

On Fri, Feb 9, 2024 at 3:43 PM Greg Harris 
wrote:

> Hey all,
>
> I implemented a fairly aggressive PR [1] to demote flaky tests to
> integration tests, and the end result is a much faster (10m locally,
> 1h on Jenkins) build which is also very reliable.
>
> I believe this would make unitTest suitable for use in the merge
> queue, with the caveat that it doesn't run 25k integration tests, and
> doesn't perform the mockito strict stubbing verification.
> This would still be a drastic improvement, as we would then be running
> the build and 87k unit tests that we aren't running today.
>
> Thanks!
> Greg
>
> [1] https://github.com/apache/kafka/pull/15349
>
> On Fri, Feb 9, 2024 at 9:25 AM Ismael Juma  wrote:
> >
> > Please check https://github.com/apache/kafka/pull/14186 before making
> the
> > `unitTest` and `integrationTest` split.
> >
> > Ismael
> >
> > On Fri, Feb 9, 2024 at 9:16 AM Josep Prat 
> > wrote:
> >
> > > Regarding "Split our CI "test" job into unit and integration so we can
> > > start collecting data on those suites", can we run these 2 tasks in the
> > > same machine? So they won't need to compile classes twice for the same
> > > exact code?
> > >
> > > On Fri, Feb 9, 2024 at 6:05 PM Ismael Juma  wrote:
> > >
> > > > Why can't we add @Tag("integration") for all of those tests? Seems
> like
> > > > that would not be too hard.
> > > >
> > > > Ismael
> > > >
> > > > On Fri, Feb 9, 2024 at 9:03 AM Greg Harris
>  > > >
> > > > wrote:
> > > >
> > > > > Hi David,
> > > > >
> > > > > +1 on that strategy.
> > > > >
> > > > > I see several flaky tests that aren't marked with
> @Tag("integration")
> > > > > or @IntegrationTest, and I think those would make using the
> unitTest
> > > > > target ineffective here. We could also start a new tag
> @Tag("flaky")
> > > > > and exclude that.
> > > > >
> > > > > Thanks,
> > > > > Greg
> > > > >
> > > > > On Fri, Feb 9, 2024 at 8:57 AM David Arthur 
> wrote:
> > > > > >
> > > > > > I do think we can add a PR to the merge queue while bypassing
> branch
> > > > > > potections (like we do for the Merge button today), but I'm not
> 100%
> > > > > sure.
> > > > > > I like the idea of running unit tests, though I don't think we
> have
> > > > data
> > > > > on
> > > > > > how long just the unit tests run on Jenkins (since we run the
> "test"
> > > > > target
> > > > > > which includes all tests). I'm also not sure how flaky the unit
> test
> > > > > suite
> > > > > > is alone.
> > > > > >
> > > > > > Since we already bypass the PR checks when merging, it seems that
> > > > adding
> > > > > a
> > > > > > required compile/check step before landing on trunk is strictly
> an
> > > > > > improvement.
> > > > > >
> > > > > > What about this as a short term plan:
> > > > > >
> > > > > > 1) Add the merge queue, only run compile/check
> > > > > > 2) Split our CI "test" job into unit and integration so we can
> start
> > > > > > collecting data on those suites
> > > > > > 3) Add "unitTest" to merge queue job once we're satisfied it
> won't
> > > > cause
> > > > > > disruption
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Fri, Feb 9, 2024 at 11:43 AM Josep Prat
> > >  > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi David,
> > > > > > > I like the idea, it will solve the problem we've seen a couple
> of
> > > > > times in
> > > > > > > the last 2 weeks where compilation for some Scala version
> failed,
> > > it
> > > > > was
> > > > > > > probably overlooked during the PR build because of the
> flakiness of
> > > > > tests
> > > > > > > and the compilation failure was buried among the amount of
> failed
> > > > > tests.
> > > > > > >
> > > > > > > Regarding the type of check, I'm not sure what's best, have a
> real
> > > > > quick
> > > > > > > check or a longer one including unit tests. A full test suite
> will
> > > > run
> > > > > per
> > > > > > > each commit in each PR (these we have definitely more than 8
> per
> > > day)
> > > > > and
> > > > > > > this should be used to ensure changes are safe and sound. I'm
> not
> > > > sure
> > > > > if
> > > > > > > having unit tests run as well before the merge itself would
> cause
> > > too
> > > > > much
> > > > > > > of an extra load on the CI machines.
> > > > > > > We can go with `gradlew unitTest` and see if this takes too
> long or
> > > > > causes
> > > > > > > too many delays with the normal pipeline.
> > > > > > >
> > > > > > > Best,
> > > > > > >
> > > > > > > On Fri, Feb 9, 2024 at 4:16 PM Ismael Juma 
> > > > wrote:
> > > > > > >
> > > > > > > > Hi David,
> > > > > > > >
> > > > > > > > I think this is a helpful thing (and something I hoped we
> would
> > > use
> > > > > when
> > > > > > > I
> > > > > > > > learned about it), but it does require the validation checks
> to
> > > be
> > > > > > > reliable
> > > > > > > > (or else the PR won't be merged). Sounds like you are
> suggesting
> > > to
> > > > > skip
> > > > > > > > the tests for the merge queue validation. Could we 

Re: Github build queue

2024-02-09 Thread Greg Harris
Hey all,

I implemented a fairly aggressive PR [1] to demote flaky tests to
integration tests, and the end result is a much faster (10m locally,
1h on Jenkins) build which is also very reliable.

I believe this would make unitTest suitable for use in the merge
queue, with the caveat that it doesn't run 25k integration tests, and
doesn't perform the mockito strict stubbing verification.
This would still be a drastic improvement, as we would then be running
the build and 87k unit tests that we aren't running today.

Thanks!
Greg

[1] https://github.com/apache/kafka/pull/15349

On Fri, Feb 9, 2024 at 9:25 AM Ismael Juma  wrote:
>
> Please check https://github.com/apache/kafka/pull/14186 before making the
> `unitTest` and `integrationTest` split.
>
> Ismael
>
> On Fri, Feb 9, 2024 at 9:16 AM Josep Prat 
> wrote:
>
> > Regarding "Split our CI "test" job into unit and integration so we can
> > start collecting data on those suites", can we run these 2 tasks in the
> > same machine? So they won't need to compile classes twice for the same
> > exact code?
> >
> > On Fri, Feb 9, 2024 at 6:05 PM Ismael Juma  wrote:
> >
> > > Why can't we add @Tag("integration") for all of those tests? Seems like
> > > that would not be too hard.
> > >
> > > Ismael
> > >
> > > On Fri, Feb 9, 2024 at 9:03 AM Greg Harris  > >
> > > wrote:
> > >
> > > > Hi David,
> > > >
> > > > +1 on that strategy.
> > > >
> > > > I see several flaky tests that aren't marked with @Tag("integration")
> > > > or @IntegrationTest, and I think those would make using the unitTest
> > > > target ineffective here. We could also start a new tag @Tag("flaky")
> > > > and exclude that.
> > > >
> > > > Thanks,
> > > > Greg
> > > >
> > > > On Fri, Feb 9, 2024 at 8:57 AM David Arthur  wrote:
> > > > >
> > > > > I do think we can add a PR to the merge queue while bypassing branch
> > > > > potections (like we do for the Merge button today), but I'm not 100%
> > > > sure.
> > > > > I like the idea of running unit tests, though I don't think we have
> > > data
> > > > on
> > > > > how long just the unit tests run on Jenkins (since we run the "test"
> > > > target
> > > > > which includes all tests). I'm also not sure how flaky the unit test
> > > > suite
> > > > > is alone.
> > > > >
> > > > > Since we already bypass the PR checks when merging, it seems that
> > > adding
> > > > a
> > > > > required compile/check step before landing on trunk is strictly an
> > > > > improvement.
> > > > >
> > > > > What about this as a short term plan:
> > > > >
> > > > > 1) Add the merge queue, only run compile/check
> > > > > 2) Split our CI "test" job into unit and integration so we can start
> > > > > collecting data on those suites
> > > > > 3) Add "unitTest" to merge queue job once we're satisfied it won't
> > > cause
> > > > > disruption
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Fri, Feb 9, 2024 at 11:43 AM Josep Prat
> >  > > >
> > > > > wrote:
> > > > >
> > > > > > Hi David,
> > > > > > I like the idea, it will solve the problem we've seen a couple of
> > > > times in
> > > > > > the last 2 weeks where compilation for some Scala version failed,
> > it
> > > > was
> > > > > > probably overlooked during the PR build because of the flakiness of
> > > > tests
> > > > > > and the compilation failure was buried among the amount of failed
> > > > tests.
> > > > > >
> > > > > > Regarding the type of check, I'm not sure what's best, have a real
> > > > quick
> > > > > > check or a longer one including unit tests. A full test suite will
> > > run
> > > > per
> > > > > > each commit in each PR (these we have definitely more than 8 per
> > day)
> > > > and
> > > > > > this should be used to ensure changes are safe and sound. I'm not
> > > sure
> > > > if
> > > > > > having unit tests run as well before the merge itself would cause
> > too
> > > > much
> > > > > > of an extra load on the CI machines.
> > > > > > We can go with `gradlew unitTest` and see if this takes too long or
> > > > causes
> > > > > > too many delays with the normal pipeline.
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > On Fri, Feb 9, 2024 at 4:16 PM Ismael Juma 
> > > wrote:
> > > > > >
> > > > > > > Hi David,
> > > > > > >
> > > > > > > I think this is a helpful thing (and something I hoped we would
> > use
> > > > when
> > > > > > I
> > > > > > > learned about it), but it does require the validation checks to
> > be
> > > > > > reliable
> > > > > > > (or else the PR won't be merged). Sounds like you are suggesting
> > to
> > > > skip
> > > > > > > the tests for the merge queue validation. Could we perhaps
> > include
> > > > the
> > > > > > unit
> > > > > > > tests as well? That would incentivize us to ensure the unit tests
> > > are
> > > > > > fast
> > > > > > > and reliable. Getting the integration tests to the same state
> > will
> > > > be a
> > > > > > > longer journey.
> > > > > > >
> > > > > > > Ismael
> > > > > > >
> > > > > > > On Fri, Feb 9, 2024 at 7:04 AM David Arthur 
> > > > 

Re: Github build queue

2024-02-09 Thread Ismael Juma
Please check https://github.com/apache/kafka/pull/14186 before making the
`unitTest` and `integrationTest` split.

Ismael

On Fri, Feb 9, 2024 at 9:16 AM Josep Prat 
wrote:

> Regarding "Split our CI "test" job into unit and integration so we can
> start collecting data on those suites", can we run these 2 tasks in the
> same machine? So they won't need to compile classes twice for the same
> exact code?
>
> On Fri, Feb 9, 2024 at 6:05 PM Ismael Juma  wrote:
>
> > Why can't we add @Tag("integration") for all of those tests? Seems like
> > that would not be too hard.
> >
> > Ismael
> >
> > On Fri, Feb 9, 2024 at 9:03 AM Greg Harris  >
> > wrote:
> >
> > > Hi David,
> > >
> > > +1 on that strategy.
> > >
> > > I see several flaky tests that aren't marked with @Tag("integration")
> > > or @IntegrationTest, and I think those would make using the unitTest
> > > target ineffective here. We could also start a new tag @Tag("flaky")
> > > and exclude that.
> > >
> > > Thanks,
> > > Greg
> > >
> > > On Fri, Feb 9, 2024 at 8:57 AM David Arthur  wrote:
> > > >
> > > > I do think we can add a PR to the merge queue while bypassing branch
> > > > potections (like we do for the Merge button today), but I'm not 100%
> > > sure.
> > > > I like the idea of running unit tests, though I don't think we have
> > data
> > > on
> > > > how long just the unit tests run on Jenkins (since we run the "test"
> > > target
> > > > which includes all tests). I'm also not sure how flaky the unit test
> > > suite
> > > > is alone.
> > > >
> > > > Since we already bypass the PR checks when merging, it seems that
> > adding
> > > a
> > > > required compile/check step before landing on trunk is strictly an
> > > > improvement.
> > > >
> > > > What about this as a short term plan:
> > > >
> > > > 1) Add the merge queue, only run compile/check
> > > > 2) Split our CI "test" job into unit and integration so we can start
> > > > collecting data on those suites
> > > > 3) Add "unitTest" to merge queue job once we're satisfied it won't
> > cause
> > > > disruption
> > > >
> > > >
> > > >
> > > >
> > > > On Fri, Feb 9, 2024 at 11:43 AM Josep Prat
>  > >
> > > > wrote:
> > > >
> > > > > Hi David,
> > > > > I like the idea, it will solve the problem we've seen a couple of
> > > times in
> > > > > the last 2 weeks where compilation for some Scala version failed,
> it
> > > was
> > > > > probably overlooked during the PR build because of the flakiness of
> > > tests
> > > > > and the compilation failure was buried among the amount of failed
> > > tests.
> > > > >
> > > > > Regarding the type of check, I'm not sure what's best, have a real
> > > quick
> > > > > check or a longer one including unit tests. A full test suite will
> > run
> > > per
> > > > > each commit in each PR (these we have definitely more than 8 per
> day)
> > > and
> > > > > this should be used to ensure changes are safe and sound. I'm not
> > sure
> > > if
> > > > > having unit tests run as well before the merge itself would cause
> too
> > > much
> > > > > of an extra load on the CI machines.
> > > > > We can go with `gradlew unitTest` and see if this takes too long or
> > > causes
> > > > > too many delays with the normal pipeline.
> > > > >
> > > > > Best,
> > > > >
> > > > > On Fri, Feb 9, 2024 at 4:16 PM Ismael Juma 
> > wrote:
> > > > >
> > > > > > Hi David,
> > > > > >
> > > > > > I think this is a helpful thing (and something I hoped we would
> use
> > > when
> > > > > I
> > > > > > learned about it), but it does require the validation checks to
> be
> > > > > reliable
> > > > > > (or else the PR won't be merged). Sounds like you are suggesting
> to
> > > skip
> > > > > > the tests for the merge queue validation. Could we perhaps
> include
> > > the
> > > > > unit
> > > > > > tests as well? That would incentivize us to ensure the unit tests
> > are
> > > > > fast
> > > > > > and reliable. Getting the integration tests to the same state
> will
> > > be a
> > > > > > longer journey.
> > > > > >
> > > > > > Ismael
> > > > > >
> > > > > > On Fri, Feb 9, 2024 at 7:04 AM David Arthur 
> > > wrote:
> > > > > >
> > > > > > > Hey folks,
> > > > > > >
> > > > > > > I recently learned about Github's Merge Queue feature, and I
> > think
> > > it
> > > > > > could
> > > > > > > help us out.
> > > > > > >
> > > > > > > Essentially, when you hit the Merge button on a PR, it will add
> > > the PR
> > > > > > to a
> > > > > > > queue and let you run a CI job before merging. Just something
> > > simple
> > > > > like
> > > > > > > compile + static analysis would probably save us from a lot of
> > > > > headaches
> > > > > > on
> > > > > > > trunk.
> > > > > > >
> > > > > > > I can think of two situations this would help us avoid:
> > > > > > > * Two valid PRs are merged near one another, but they create a
> > code
> > > > > > > breakage (rare)
> > > > > > > * A quick little "fixup" commit on a PR actually breaks
> something
> > > (less
> > > > > > > rare)
> > > > > > >
> > > > > > > Looking at 

Re: Github build queue

2024-02-09 Thread Greg Harris
Hey Ismael,

I took David's statement here:

> Split our CI "test" job into unit and integration so we can start collecting 
> data on those suites

to include moving all the flaky tests to the integrationTest target by
adding the annotation. We can do that while the merge queue is just
running the compile/jar target, and then tweak the merge queue to run
the unitTest target once we're satisfied that it isn't flaky.

Thanks,
Greg

On Fri, Feb 9, 2024 at 9:17 AM Josep Prat  wrote:
>
> Regarding "Split our CI "test" job into unit and integration so we can
> start collecting data on those suites", can we run these 2 tasks in the
> same machine? So they won't need to compile classes twice for the same
> exact code?
>
> On Fri, Feb 9, 2024 at 6:05 PM Ismael Juma  wrote:
>
> > Why can't we add @Tag("integration") for all of those tests? Seems like
> > that would not be too hard.
> >
> > Ismael
> >
> > On Fri, Feb 9, 2024 at 9:03 AM Greg Harris 
> > wrote:
> >
> > > Hi David,
> > >
> > > +1 on that strategy.
> > >
> > > I see several flaky tests that aren't marked with @Tag("integration")
> > > or @IntegrationTest, and I think those would make using the unitTest
> > > target ineffective here. We could also start a new tag @Tag("flaky")
> > > and exclude that.
> > >
> > > Thanks,
> > > Greg
> > >
> > > On Fri, Feb 9, 2024 at 8:57 AM David Arthur  wrote:
> > > >
> > > > I do think we can add a PR to the merge queue while bypassing branch
> > > > potections (like we do for the Merge button today), but I'm not 100%
> > > sure.
> > > > I like the idea of running unit tests, though I don't think we have
> > data
> > > on
> > > > how long just the unit tests run on Jenkins (since we run the "test"
> > > target
> > > > which includes all tests). I'm also not sure how flaky the unit test
> > > suite
> > > > is alone.
> > > >
> > > > Since we already bypass the PR checks when merging, it seems that
> > adding
> > > a
> > > > required compile/check step before landing on trunk is strictly an
> > > > improvement.
> > > >
> > > > What about this as a short term plan:
> > > >
> > > > 1) Add the merge queue, only run compile/check
> > > > 2) Split our CI "test" job into unit and integration so we can start
> > > > collecting data on those suites
> > > > 3) Add "unitTest" to merge queue job once we're satisfied it won't
> > cause
> > > > disruption
> > > >
> > > >
> > > >
> > > >
> > > > On Fri, Feb 9, 2024 at 11:43 AM Josep Prat  > >
> > > > wrote:
> > > >
> > > > > Hi David,
> > > > > I like the idea, it will solve the problem we've seen a couple of
> > > times in
> > > > > the last 2 weeks where compilation for some Scala version failed, it
> > > was
> > > > > probably overlooked during the PR build because of the flakiness of
> > > tests
> > > > > and the compilation failure was buried among the amount of failed
> > > tests.
> > > > >
> > > > > Regarding the type of check, I'm not sure what's best, have a real
> > > quick
> > > > > check or a longer one including unit tests. A full test suite will
> > run
> > > per
> > > > > each commit in each PR (these we have definitely more than 8 per day)
> > > and
> > > > > this should be used to ensure changes are safe and sound. I'm not
> > sure
> > > if
> > > > > having unit tests run as well before the merge itself would cause too
> > > much
> > > > > of an extra load on the CI machines.
> > > > > We can go with `gradlew unitTest` and see if this takes too long or
> > > causes
> > > > > too many delays with the normal pipeline.
> > > > >
> > > > > Best,
> > > > >
> > > > > On Fri, Feb 9, 2024 at 4:16 PM Ismael Juma 
> > wrote:
> > > > >
> > > > > > Hi David,
> > > > > >
> > > > > > I think this is a helpful thing (and something I hoped we would use
> > > when
> > > > > I
> > > > > > learned about it), but it does require the validation checks to be
> > > > > reliable
> > > > > > (or else the PR won't be merged). Sounds like you are suggesting to
> > > skip
> > > > > > the tests for the merge queue validation. Could we perhaps include
> > > the
> > > > > unit
> > > > > > tests as well? That would incentivize us to ensure the unit tests
> > are
> > > > > fast
> > > > > > and reliable. Getting the integration tests to the same state will
> > > be a
> > > > > > longer journey.
> > > > > >
> > > > > > Ismael
> > > > > >
> > > > > > On Fri, Feb 9, 2024 at 7:04 AM David Arthur 
> > > wrote:
> > > > > >
> > > > > > > Hey folks,
> > > > > > >
> > > > > > > I recently learned about Github's Merge Queue feature, and I
> > think
> > > it
> > > > > > could
> > > > > > > help us out.
> > > > > > >
> > > > > > > Essentially, when you hit the Merge button on a PR, it will add
> > > the PR
> > > > > > to a
> > > > > > > queue and let you run a CI job before merging. Just something
> > > simple
> > > > > like
> > > > > > > compile + static analysis would probably save us from a lot of
> > > > > headaches
> > > > > > on
> > > > > > > trunk.
> > > > > > >
> > > > > > > I can think of two situations 

Re: Github build queue

2024-02-09 Thread David Arthur
> Regarding "Split our CI "test" job into unit and integration

I believe all of the "steps" inside the "stage" directive are run on the
same node sequentially. I think we could do something like

steps {
  doValidation()
  doUnitTest()
  doIntegrationTest()
  tryStreamsArchetype()
}

and it shouldn't affect the overall runtime much.


+1 to sticking with @Tag("integration") rather than adding a new tag. It
would be good to keep track of any unit tests we "downgrade" to integration
with a JIRA.


On Fri, Feb 9, 2024 at 12:18 PM Josep Prat 
wrote:

> Regarding "Split our CI "test" job into unit and integration so we can
> start collecting data on those suites", can we run these 2 tasks in the
> same machine? So they won't need to compile classes twice for the same
> exact code?
>
> On Fri, Feb 9, 2024 at 6:05 PM Ismael Juma  wrote:
>
> > Why can't we add @Tag("integration") for all of those tests? Seems like
> > that would not be too hard.
> >
> > Ismael
> >
> > On Fri, Feb 9, 2024 at 9:03 AM Greg Harris  >
> > wrote:
> >
> > > Hi David,
> > >
> > > +1 on that strategy.
> > >
> > > I see several flaky tests that aren't marked with @Tag("integration")
> > > or @IntegrationTest, and I think those would make using the unitTest
> > > target ineffective here. We could also start a new tag @Tag("flaky")
> > > and exclude that.
> > >
> > > Thanks,
> > > Greg
> > >
> > > On Fri, Feb 9, 2024 at 8:57 AM David Arthur  wrote:
> > > >
> > > > I do think we can add a PR to the merge queue while bypassing branch
> > > > potections (like we do for the Merge button today), but I'm not 100%
> > > sure.
> > > > I like the idea of running unit tests, though I don't think we have
> > data
> > > on
> > > > how long just the unit tests run on Jenkins (since we run the "test"
> > > target
> > > > which includes all tests). I'm also not sure how flaky the unit test
> > > suite
> > > > is alone.
> > > >
> > > > Since we already bypass the PR checks when merging, it seems that
> > adding
> > > a
> > > > required compile/check step before landing on trunk is strictly an
> > > > improvement.
> > > >
> > > > What about this as a short term plan:
> > > >
> > > > 1) Add the merge queue, only run compile/check
> > > > 2) Split our CI "test" job into unit and integration so we can start
> > > > collecting data on those suites
> > > > 3) Add "unitTest" to merge queue job once we're satisfied it won't
> > cause
> > > > disruption
> > > >
> > > >
> > > >
> > > >
> > > > On Fri, Feb 9, 2024 at 11:43 AM Josep Prat
>  > >
> > > > wrote:
> > > >
> > > > > Hi David,
> > > > > I like the idea, it will solve the problem we've seen a couple of
> > > times in
> > > > > the last 2 weeks where compilation for some Scala version failed,
> it
> > > was
> > > > > probably overlooked during the PR build because of the flakiness of
> > > tests
> > > > > and the compilation failure was buried among the amount of failed
> > > tests.
> > > > >
> > > > > Regarding the type of check, I'm not sure what's best, have a real
> > > quick
> > > > > check or a longer one including unit tests. A full test suite will
> > run
> > > per
> > > > > each commit in each PR (these we have definitely more than 8 per
> day)
> > > and
> > > > > this should be used to ensure changes are safe and sound. I'm not
> > sure
> > > if
> > > > > having unit tests run as well before the merge itself would cause
> too
> > > much
> > > > > of an extra load on the CI machines.
> > > > > We can go with `gradlew unitTest` and see if this takes too long or
> > > causes
> > > > > too many delays with the normal pipeline.
> > > > >
> > > > > Best,
> > > > >
> > > > > On Fri, Feb 9, 2024 at 4:16 PM Ismael Juma 
> > wrote:
> > > > >
> > > > > > Hi David,
> > > > > >
> > > > > > I think this is a helpful thing (and something I hoped we would
> use
> > > when
> > > > > I
> > > > > > learned about it), but it does require the validation checks to
> be
> > > > > reliable
> > > > > > (or else the PR won't be merged). Sounds like you are suggesting
> to
> > > skip
> > > > > > the tests for the merge queue validation. Could we perhaps
> include
> > > the
> > > > > unit
> > > > > > tests as well? That would incentivize us to ensure the unit tests
> > are
> > > > > fast
> > > > > > and reliable. Getting the integration tests to the same state
> will
> > > be a
> > > > > > longer journey.
> > > > > >
> > > > > > Ismael
> > > > > >
> > > > > > On Fri, Feb 9, 2024 at 7:04 AM David Arthur 
> > > wrote:
> > > > > >
> > > > > > > Hey folks,
> > > > > > >
> > > > > > > I recently learned about Github's Merge Queue feature, and I
> > think
> > > it
> > > > > > could
> > > > > > > help us out.
> > > > > > >
> > > > > > > Essentially, when you hit the Merge button on a PR, it will add
> > > the PR
> > > > > > to a
> > > > > > > queue and let you run a CI job before merging. Just something
> > > simple
> > > > > like
> > > > > > > compile + static analysis would probably save us from a lot of
> > > > > headaches
> > > > 

Re: Github build queue

2024-02-09 Thread Josep Prat
Regarding "Split our CI "test" job into unit and integration so we can
start collecting data on those suites", can we run these 2 tasks in the
same machine? So they won't need to compile classes twice for the same
exact code?

On Fri, Feb 9, 2024 at 6:05 PM Ismael Juma  wrote:

> Why can't we add @Tag("integration") for all of those tests? Seems like
> that would not be too hard.
>
> Ismael
>
> On Fri, Feb 9, 2024 at 9:03 AM Greg Harris 
> wrote:
>
> > Hi David,
> >
> > +1 on that strategy.
> >
> > I see several flaky tests that aren't marked with @Tag("integration")
> > or @IntegrationTest, and I think those would make using the unitTest
> > target ineffective here. We could also start a new tag @Tag("flaky")
> > and exclude that.
> >
> > Thanks,
> > Greg
> >
> > On Fri, Feb 9, 2024 at 8:57 AM David Arthur  wrote:
> > >
> > > I do think we can add a PR to the merge queue while bypassing branch
> > > potections (like we do for the Merge button today), but I'm not 100%
> > sure.
> > > I like the idea of running unit tests, though I don't think we have
> data
> > on
> > > how long just the unit tests run on Jenkins (since we run the "test"
> > target
> > > which includes all tests). I'm also not sure how flaky the unit test
> > suite
> > > is alone.
> > >
> > > Since we already bypass the PR checks when merging, it seems that
> adding
> > a
> > > required compile/check step before landing on trunk is strictly an
> > > improvement.
> > >
> > > What about this as a short term plan:
> > >
> > > 1) Add the merge queue, only run compile/check
> > > 2) Split our CI "test" job into unit and integration so we can start
> > > collecting data on those suites
> > > 3) Add "unitTest" to merge queue job once we're satisfied it won't
> cause
> > > disruption
> > >
> > >
> > >
> > >
> > > On Fri, Feb 9, 2024 at 11:43 AM Josep Prat  >
> > > wrote:
> > >
> > > > Hi David,
> > > > I like the idea, it will solve the problem we've seen a couple of
> > times in
> > > > the last 2 weeks where compilation for some Scala version failed, it
> > was
> > > > probably overlooked during the PR build because of the flakiness of
> > tests
> > > > and the compilation failure was buried among the amount of failed
> > tests.
> > > >
> > > > Regarding the type of check, I'm not sure what's best, have a real
> > quick
> > > > check or a longer one including unit tests. A full test suite will
> run
> > per
> > > > each commit in each PR (these we have definitely more than 8 per day)
> > and
> > > > this should be used to ensure changes are safe and sound. I'm not
> sure
> > if
> > > > having unit tests run as well before the merge itself would cause too
> > much
> > > > of an extra load on the CI machines.
> > > > We can go with `gradlew unitTest` and see if this takes too long or
> > causes
> > > > too many delays with the normal pipeline.
> > > >
> > > > Best,
> > > >
> > > > On Fri, Feb 9, 2024 at 4:16 PM Ismael Juma 
> wrote:
> > > >
> > > > > Hi David,
> > > > >
> > > > > I think this is a helpful thing (and something I hoped we would use
> > when
> > > > I
> > > > > learned about it), but it does require the validation checks to be
> > > > reliable
> > > > > (or else the PR won't be merged). Sounds like you are suggesting to
> > skip
> > > > > the tests for the merge queue validation. Could we perhaps include
> > the
> > > > unit
> > > > > tests as well? That would incentivize us to ensure the unit tests
> are
> > > > fast
> > > > > and reliable. Getting the integration tests to the same state will
> > be a
> > > > > longer journey.
> > > > >
> > > > > Ismael
> > > > >
> > > > > On Fri, Feb 9, 2024 at 7:04 AM David Arthur 
> > wrote:
> > > > >
> > > > > > Hey folks,
> > > > > >
> > > > > > I recently learned about Github's Merge Queue feature, and I
> think
> > it
> > > > > could
> > > > > > help us out.
> > > > > >
> > > > > > Essentially, when you hit the Merge button on a PR, it will add
> > the PR
> > > > > to a
> > > > > > queue and let you run a CI job before merging. Just something
> > simple
> > > > like
> > > > > > compile + static analysis would probably save us from a lot of
> > > > headaches
> > > > > on
> > > > > > trunk.
> > > > > >
> > > > > > I can think of two situations this would help us avoid:
> > > > > > * Two valid PRs are merged near one another, but they create a
> code
> > > > > > breakage (rare)
> > > > > > * A quick little "fixup" commit on a PR actually breaks something
> > (less
> > > > > > rare)
> > > > > >
> > > > > > Looking at our Github stats, we are averaging under 40 commits
> per
> > > > week.
> > > > > > Assuming those primarily come in on weekdays, that's 8 commits
> per
> > day.
> > > > > If
> > > > > > we just run "gradlew check -x tests" for the merge queue job, I
> > don't
> > > > > think
> > > > > > we'd get backlogged.
> > > > > >
> > > > > > Thoughts?
> > > > > > David
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > David Arthur
> > > > > >
> > > > >
> > > >
> > > >

Re: Github build queue

2024-02-09 Thread Ismael Juma
Why can't we add @Tag("integration") for all of those tests? Seems like
that would not be too hard.

Ismael

On Fri, Feb 9, 2024 at 9:03 AM Greg Harris 
wrote:

> Hi David,
>
> +1 on that strategy.
>
> I see several flaky tests that aren't marked with @Tag("integration")
> or @IntegrationTest, and I think those would make using the unitTest
> target ineffective here. We could also start a new tag @Tag("flaky")
> and exclude that.
>
> Thanks,
> Greg
>
> On Fri, Feb 9, 2024 at 8:57 AM David Arthur  wrote:
> >
> > I do think we can add a PR to the merge queue while bypassing branch
> > potections (like we do for the Merge button today), but I'm not 100%
> sure.
> > I like the idea of running unit tests, though I don't think we have data
> on
> > how long just the unit tests run on Jenkins (since we run the "test"
> target
> > which includes all tests). I'm also not sure how flaky the unit test
> suite
> > is alone.
> >
> > Since we already bypass the PR checks when merging, it seems that adding
> a
> > required compile/check step before landing on trunk is strictly an
> > improvement.
> >
> > What about this as a short term plan:
> >
> > 1) Add the merge queue, only run compile/check
> > 2) Split our CI "test" job into unit and integration so we can start
> > collecting data on those suites
> > 3) Add "unitTest" to merge queue job once we're satisfied it won't cause
> > disruption
> >
> >
> >
> >
> > On Fri, Feb 9, 2024 at 11:43 AM Josep Prat 
> > wrote:
> >
> > > Hi David,
> > > I like the idea, it will solve the problem we've seen a couple of
> times in
> > > the last 2 weeks where compilation for some Scala version failed, it
> was
> > > probably overlooked during the PR build because of the flakiness of
> tests
> > > and the compilation failure was buried among the amount of failed
> tests.
> > >
> > > Regarding the type of check, I'm not sure what's best, have a real
> quick
> > > check or a longer one including unit tests. A full test suite will run
> per
> > > each commit in each PR (these we have definitely more than 8 per day)
> and
> > > this should be used to ensure changes are safe and sound. I'm not sure
> if
> > > having unit tests run as well before the merge itself would cause too
> much
> > > of an extra load on the CI machines.
> > > We can go with `gradlew unitTest` and see if this takes too long or
> causes
> > > too many delays with the normal pipeline.
> > >
> > > Best,
> > >
> > > On Fri, Feb 9, 2024 at 4:16 PM Ismael Juma  wrote:
> > >
> > > > Hi David,
> > > >
> > > > I think this is a helpful thing (and something I hoped we would use
> when
> > > I
> > > > learned about it), but it does require the validation checks to be
> > > reliable
> > > > (or else the PR won't be merged). Sounds like you are suggesting to
> skip
> > > > the tests for the merge queue validation. Could we perhaps include
> the
> > > unit
> > > > tests as well? That would incentivize us to ensure the unit tests are
> > > fast
> > > > and reliable. Getting the integration tests to the same state will
> be a
> > > > longer journey.
> > > >
> > > > Ismael
> > > >
> > > > On Fri, Feb 9, 2024 at 7:04 AM David Arthur 
> wrote:
> > > >
> > > > > Hey folks,
> > > > >
> > > > > I recently learned about Github's Merge Queue feature, and I think
> it
> > > > could
> > > > > help us out.
> > > > >
> > > > > Essentially, when you hit the Merge button on a PR, it will add
> the PR
> > > > to a
> > > > > queue and let you run a CI job before merging. Just something
> simple
> > > like
> > > > > compile + static analysis would probably save us from a lot of
> > > headaches
> > > > on
> > > > > trunk.
> > > > >
> > > > > I can think of two situations this would help us avoid:
> > > > > * Two valid PRs are merged near one another, but they create a code
> > > > > breakage (rare)
> > > > > * A quick little "fixup" commit on a PR actually breaks something
> (less
> > > > > rare)
> > > > >
> > > > > Looking at our Github stats, we are averaging under 40 commits per
> > > week.
> > > > > Assuming those primarily come in on weekdays, that's 8 commits per
> day.
> > > > If
> > > > > we just run "gradlew check -x tests" for the merge queue job, I
> don't
> > > > think
> > > > > we'd get backlogged.
> > > > >
> > > > > Thoughts?
> > > > > David
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > David Arthur
> > > > >
> > > >
> > >
> > >
> > > --
> > > [image: Aiven] 
> > >
> > > *Josep Prat*
> > > Open Source Engineering Director, *Aiven*
> > > josep.p...@aiven.io   |   +491715557497
> > > aiven.io    |   <
> https://www.facebook.com/aivencloud
> > > >
> > >      <
> > > https://twitter.com/aiven_io>
> > > *Aiven Deutschland GmbH*
> > > Alexanderufer 3-7, 10117 Berlin
> > > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > > Amtsgericht Charlottenburg, HRB 209739 B
> > >
> >
> >
> > --
> > David Arthur
>


Re: Github build queue

2024-02-09 Thread Greg Harris
Hi David,

+1 on that strategy.

I see several flaky tests that aren't marked with @Tag("integration")
or @IntegrationTest, and I think those would make using the unitTest
target ineffective here. We could also start a new tag @Tag("flaky")
and exclude that.

Thanks,
Greg

On Fri, Feb 9, 2024 at 8:57 AM David Arthur  wrote:
>
> I do think we can add a PR to the merge queue while bypassing branch
> potections (like we do for the Merge button today), but I'm not 100% sure.
> I like the idea of running unit tests, though I don't think we have data on
> how long just the unit tests run on Jenkins (since we run the "test" target
> which includes all tests). I'm also not sure how flaky the unit test suite
> is alone.
>
> Since we already bypass the PR checks when merging, it seems that adding a
> required compile/check step before landing on trunk is strictly an
> improvement.
>
> What about this as a short term plan:
>
> 1) Add the merge queue, only run compile/check
> 2) Split our CI "test" job into unit and integration so we can start
> collecting data on those suites
> 3) Add "unitTest" to merge queue job once we're satisfied it won't cause
> disruption
>
>
>
>
> On Fri, Feb 9, 2024 at 11:43 AM Josep Prat 
> wrote:
>
> > Hi David,
> > I like the idea, it will solve the problem we've seen a couple of times in
> > the last 2 weeks where compilation for some Scala version failed, it was
> > probably overlooked during the PR build because of the flakiness of tests
> > and the compilation failure was buried among the amount of failed tests.
> >
> > Regarding the type of check, I'm not sure what's best, have a real quick
> > check or a longer one including unit tests. A full test suite will run per
> > each commit in each PR (these we have definitely more than 8 per day) and
> > this should be used to ensure changes are safe and sound. I'm not sure if
> > having unit tests run as well before the merge itself would cause too much
> > of an extra load on the CI machines.
> > We can go with `gradlew unitTest` and see if this takes too long or causes
> > too many delays with the normal pipeline.
> >
> > Best,
> >
> > On Fri, Feb 9, 2024 at 4:16 PM Ismael Juma  wrote:
> >
> > > Hi David,
> > >
> > > I think this is a helpful thing (and something I hoped we would use when
> > I
> > > learned about it), but it does require the validation checks to be
> > reliable
> > > (or else the PR won't be merged). Sounds like you are suggesting to skip
> > > the tests for the merge queue validation. Could we perhaps include the
> > unit
> > > tests as well? That would incentivize us to ensure the unit tests are
> > fast
> > > and reliable. Getting the integration tests to the same state will be a
> > > longer journey.
> > >
> > > Ismael
> > >
> > > On Fri, Feb 9, 2024 at 7:04 AM David Arthur  wrote:
> > >
> > > > Hey folks,
> > > >
> > > > I recently learned about Github's Merge Queue feature, and I think it
> > > could
> > > > help us out.
> > > >
> > > > Essentially, when you hit the Merge button on a PR, it will add the PR
> > > to a
> > > > queue and let you run a CI job before merging. Just something simple
> > like
> > > > compile + static analysis would probably save us from a lot of
> > headaches
> > > on
> > > > trunk.
> > > >
> > > > I can think of two situations this would help us avoid:
> > > > * Two valid PRs are merged near one another, but they create a code
> > > > breakage (rare)
> > > > * A quick little "fixup" commit on a PR actually breaks something (less
> > > > rare)
> > > >
> > > > Looking at our Github stats, we are averaging under 40 commits per
> > week.
> > > > Assuming those primarily come in on weekdays, that's 8 commits per day.
> > > If
> > > > we just run "gradlew check -x tests" for the merge queue job, I don't
> > > think
> > > > we'd get backlogged.
> > > >
> > > > Thoughts?
> > > > David
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > > David Arthur
> > > >
> > >
> >
> >
> > --
> > [image: Aiven] 
> >
> > *Josep Prat*
> > Open Source Engineering Director, *Aiven*
> > josep.p...@aiven.io   |   +491715557497
> > aiven.io    |    > >
> >      <
> > https://twitter.com/aiven_io>
> > *Aiven Deutschland GmbH*
> > Alexanderufer 3-7, 10117 Berlin
> > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > Amtsgericht Charlottenburg, HRB 209739 B
> >
>
>
> --
> David Arthur


Re: Github build queue

2024-02-09 Thread David Arthur
I do think we can add a PR to the merge queue while bypassing branch
potections (like we do for the Merge button today), but I'm not 100% sure.
I like the idea of running unit tests, though I don't think we have data on
how long just the unit tests run on Jenkins (since we run the "test" target
which includes all tests). I'm also not sure how flaky the unit test suite
is alone.

Since we already bypass the PR checks when merging, it seems that adding a
required compile/check step before landing on trunk is strictly an
improvement.

What about this as a short term plan:

1) Add the merge queue, only run compile/check
2) Split our CI "test" job into unit and integration so we can start
collecting data on those suites
3) Add "unitTest" to merge queue job once we're satisfied it won't cause
disruption




On Fri, Feb 9, 2024 at 11:43 AM Josep Prat 
wrote:

> Hi David,
> I like the idea, it will solve the problem we've seen a couple of times in
> the last 2 weeks where compilation for some Scala version failed, it was
> probably overlooked during the PR build because of the flakiness of tests
> and the compilation failure was buried among the amount of failed tests.
>
> Regarding the type of check, I'm not sure what's best, have a real quick
> check or a longer one including unit tests. A full test suite will run per
> each commit in each PR (these we have definitely more than 8 per day) and
> this should be used to ensure changes are safe and sound. I'm not sure if
> having unit tests run as well before the merge itself would cause too much
> of an extra load on the CI machines.
> We can go with `gradlew unitTest` and see if this takes too long or causes
> too many delays with the normal pipeline.
>
> Best,
>
> On Fri, Feb 9, 2024 at 4:16 PM Ismael Juma  wrote:
>
> > Hi David,
> >
> > I think this is a helpful thing (and something I hoped we would use when
> I
> > learned about it), but it does require the validation checks to be
> reliable
> > (or else the PR won't be merged). Sounds like you are suggesting to skip
> > the tests for the merge queue validation. Could we perhaps include the
> unit
> > tests as well? That would incentivize us to ensure the unit tests are
> fast
> > and reliable. Getting the integration tests to the same state will be a
> > longer journey.
> >
> > Ismael
> >
> > On Fri, Feb 9, 2024 at 7:04 AM David Arthur  wrote:
> >
> > > Hey folks,
> > >
> > > I recently learned about Github's Merge Queue feature, and I think it
> > could
> > > help us out.
> > >
> > > Essentially, when you hit the Merge button on a PR, it will add the PR
> > to a
> > > queue and let you run a CI job before merging. Just something simple
> like
> > > compile + static analysis would probably save us from a lot of
> headaches
> > on
> > > trunk.
> > >
> > > I can think of two situations this would help us avoid:
> > > * Two valid PRs are merged near one another, but they create a code
> > > breakage (rare)
> > > * A quick little "fixup" commit on a PR actually breaks something (less
> > > rare)
> > >
> > > Looking at our Github stats, we are averaging under 40 commits per
> week.
> > > Assuming those primarily come in on weekdays, that's 8 commits per day.
> > If
> > > we just run "gradlew check -x tests" for the merge queue job, I don't
> > think
> > > we'd get backlogged.
> > >
> > > Thoughts?
> > > David
> > >
> > >
> > >
> > >
> > > --
> > > David Arthur
> > >
> >
>
>
> --
> [image: Aiven] 
>
> *Josep Prat*
> Open Source Engineering Director, *Aiven*
> josep.p...@aiven.io   |   +491715557497
> aiven.io    |    >
>      <
> https://twitter.com/aiven_io>
> *Aiven Deutschland GmbH*
> Alexanderufer 3-7, 10117 Berlin
> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> Amtsgericht Charlottenburg, HRB 209739 B
>


-- 
David Arthur


Re: Github build queue

2024-02-09 Thread Josep Prat
Hi David,
I like the idea, it will solve the problem we've seen a couple of times in
the last 2 weeks where compilation for some Scala version failed, it was
probably overlooked during the PR build because of the flakiness of tests
and the compilation failure was buried among the amount of failed tests.

Regarding the type of check, I'm not sure what's best, have a real quick
check or a longer one including unit tests. A full test suite will run per
each commit in each PR (these we have definitely more than 8 per day) and
this should be used to ensure changes are safe and sound. I'm not sure if
having unit tests run as well before the merge itself would cause too much
of an extra load on the CI machines.
We can go with `gradlew unitTest` and see if this takes too long or causes
too many delays with the normal pipeline.

Best,

On Fri, Feb 9, 2024 at 4:16 PM Ismael Juma  wrote:

> Hi David,
>
> I think this is a helpful thing (and something I hoped we would use when I
> learned about it), but it does require the validation checks to be reliable
> (or else the PR won't be merged). Sounds like you are suggesting to skip
> the tests for the merge queue validation. Could we perhaps include the unit
> tests as well? That would incentivize us to ensure the unit tests are fast
> and reliable. Getting the integration tests to the same state will be a
> longer journey.
>
> Ismael
>
> On Fri, Feb 9, 2024 at 7:04 AM David Arthur  wrote:
>
> > Hey folks,
> >
> > I recently learned about Github's Merge Queue feature, and I think it
> could
> > help us out.
> >
> > Essentially, when you hit the Merge button on a PR, it will add the PR
> to a
> > queue and let you run a CI job before merging. Just something simple like
> > compile + static analysis would probably save us from a lot of headaches
> on
> > trunk.
> >
> > I can think of two situations this would help us avoid:
> > * Two valid PRs are merged near one another, but they create a code
> > breakage (rare)
> > * A quick little "fixup" commit on a PR actually breaks something (less
> > rare)
> >
> > Looking at our Github stats, we are averaging under 40 commits per week.
> > Assuming those primarily come in on weekdays, that's 8 commits per day.
> If
> > we just run "gradlew check -x tests" for the merge queue job, I don't
> think
> > we'd get backlogged.
> >
> > Thoughts?
> > David
> >
> >
> >
> >
> > --
> > David Arthur
> >
>


-- 
[image: Aiven] 

*Josep Prat*
Open Source Engineering Director, *Aiven*
josep.p...@aiven.io   |   +491715557497
aiven.io    |   
     
*Aiven Deutschland GmbH*
Alexanderufer 3-7, 10117 Berlin
Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
Amtsgericht Charlottenburg, HRB 209739 B


Re: Github build queue

2024-02-09 Thread Ismael Juma
Hi David,

I think this is a helpful thing (and something I hoped we would use when I
learned about it), but it does require the validation checks to be reliable
(or else the PR won't be merged). Sounds like you are suggesting to skip
the tests for the merge queue validation. Could we perhaps include the unit
tests as well? That would incentivize us to ensure the unit tests are fast
and reliable. Getting the integration tests to the same state will be a
longer journey.

Ismael

On Fri, Feb 9, 2024 at 7:04 AM David Arthur  wrote:

> Hey folks,
>
> I recently learned about Github's Merge Queue feature, and I think it could
> help us out.
>
> Essentially, when you hit the Merge button on a PR, it will add the PR to a
> queue and let you run a CI job before merging. Just something simple like
> compile + static analysis would probably save us from a lot of headaches on
> trunk.
>
> I can think of two situations this would help us avoid:
> * Two valid PRs are merged near one another, but they create a code
> breakage (rare)
> * A quick little "fixup" commit on a PR actually breaks something (less
> rare)
>
> Looking at our Github stats, we are averaging under 40 commits per week.
> Assuming those primarily come in on weekdays, that's 8 commits per day. If
> we just run "gradlew check -x tests" for the merge queue job, I don't think
> we'd get backlogged.
>
> Thoughts?
> David
>
>
>
>
> --
> David Arthur
>


Re: Github build queue

2024-02-09 Thread Chris Egerton
+1, would love this.

On Fri, Feb 9, 2024, 10:04 David Arthur  wrote:

> Hey folks,
>
> I recently learned about Github's Merge Queue feature, and I think it could
> help us out.
>
> Essentially, when you hit the Merge button on a PR, it will add the PR to a
> queue and let you run a CI job before merging. Just something simple like
> compile + static analysis would probably save us from a lot of headaches on
> trunk.
>
> I can think of two situations this would help us avoid:
> * Two valid PRs are merged near one another, but they create a code
> breakage (rare)
> * A quick little "fixup" commit on a PR actually breaks something (less
> rare)
>
> Looking at our Github stats, we are averaging under 40 commits per week.
> Assuming those primarily come in on weekdays, that's 8 commits per day. If
> we just run "gradlew check -x tests" for the merge queue job, I don't think
> we'd get backlogged.
>
> Thoughts?
> David
>
>
>
>
> --
> David Arthur
>