I second Robert idea of ‘inteligently’ running only the affected tests,
probably
there is no need to run Java for a go fix (and eventually if any issue it
can be
catched in postcommit), same for a dev who just fixed something in KafkaIO
and has
to wait for other IO tests to pass. I suppose that languages, IOs and
extensions
are ‘easy’ to isolate so maybe we can start with those.

Earlier signals are also definitely great to have too, but not sure how we
can
have those with the current infra.

 From a quicklook the biggest time is consumed by the examples module
probably
because they run in Dataflow with real IOs no?, that module alone takes ~35
minutes, so maybe moving it to postcommit will gain us some quick
improvement.
On the other hand we should probably not dismiss the consequences of moving
more
stuff to postcommit given that our current postcommit is not the most
stable, or
the quickest, only the Dataflow suite takes 1h30!


On Tue, May 22, 2018 at 12:01 AM Mikhail Gryzykhin <mig...@google.com>
wrote:

> What we can do here is estimate how much effort we want to put in and set
remote target.
> Such as:
> Third quarter 2018 -- 1hr SLO
> Forth quarter 2018 -- 30min SLO,
> etc.

> Combined with policy for newly added tests, this can give us some goal to
aim for.

> --Mikhail

> Have feedback?


> On Mon, May 21, 2018 at 2:06 PM Scott Wegner <sweg...@google.com> wrote:

>> Thanks for the proposal, I left comments in the doc. Overall I think
it's a great idea.

>> I've seen other projects with much faster pre-commits, and it requires
strict guidelines on unit test design and keeping tests isolated in-memory
as much as possible. That's not currently the case in Java; we have
pre-commits which submit pipelines to Dataflow service.

>> I don't know if it's feasible to get Java down to 15-20 mins in the
short term, but a good starting point would be to document the requirements
for a test to run as pre-commit, and start enforcing it for new tests.


>> On Fri, May 18, 2018 at 3:25 PM Henning Rohde <hero...@google.com> wrote:

>>> Good proposal. I think it should be considered in tandem with the "No
commit on red post-commit" proposal and could be far more ambitious than 2
hours. For example, something in the <15-20 mins range, say, would be much
less of an inconvenience to the development effort. Go takes ~3 mins, which
means that it is practical to wait until a PR is green before asking anyone
to look at it. If I need to wait for a Java or Python pre-commit, I task
switch and come back later. If the post-commits are enforced to be green,
we could possibly gain a much more productive flow at the cost of the
occasional post-commit break, compared to now. Maybe IOs can be less
extensively tested pre-commit, for example, or only if actually changed?

>>> I also like Robert's suggestion of spitting up pre-commits into
something more fine-grained to get a clear partial signal quicker. If we
have an adequate number of Jenkins slots, it might also speed things up
overall.

>>> Thanks,
>>>    Henning

>>> On Fri, May 18, 2018 at 12:30 PM Scott Wegner <sweg...@google.com>
wrote:

>>>> re: intelligently skipping tests for code that doesn't change (i.e.
Java tests on Python PR): this should be possible. We already have
build-caching enabled in Gradle, but I believe it is local to the git
workspace and doesn't persist between Jenkins runs.

>>>> With a quick search, I see there is a Jenkins Build Cacher Plugin [1]
that hooks into Gradle build cache and does exactly what we need. Does
anybody know whether we could get this enabled on our Jenkins?

>>>> [1] https://wiki.jenkins.io/display/JENKINS/Job+Cacher+Plugin

>>>> On Fri, May 18, 2018 at 12:08 PM Robert Bradshaw <rober...@google.com>
wrote:

>>>>> [somehow  my email got garbled...]

>>>>> Now that we're using gradle, perhaps we could be more intelligent
about only running the affected tests? E.g. when you touch Python (or Go)
you shouldn't need to run the Java precommit at all, which would reduce the
latency for those PRs and also the time spent in queue. Presumably this
could even be applied per-module for the Java tests. (Maybe a large, shared
build cache could help here as well...)

>>>>> I also wouldn't be opposed to a quicker immediate signal, plus more
extensive tests before actually merging. It's also nice to not have to wait
an hour to see that you have a lint error; quick stuff like that could be
signaled quickly before a contributor looses context.

>>>>> - Robert



>>>>> On Fri, May 18, 2018 at 5:55 AM Kenneth Knowles <k...@google.com>
wrote:

>>>>>> I like the idea. I think it is a good time for the project to start
tracking this and keeping it usable.

>>>>>> Certainly 2 hours is more than enough, is that not so? The Java
precommit seems to take <=40 minutes while Python takes ~20 and Go is so
fast it doesn't matter. Do we have enough stragglers that we don't make it
in the 95th percentile? Is the time spent in the Jenkins queue?

>>>>>> For our current coverage, I'd be willing to go for:

>>>>>>    - 1 hr hard cap (someone better at stats could choose %ile)
>>>>>>    - roll back or remove test from precommit if fix looks like more
than 1 week (roll back if it is perf degradation, remove test from
precommit if it is additional coverage that just doesn't fit in the time)

>>>>>> There's a longer-term issue that doing a full build each time is
expected to linearly scale up with the size of our repo (it is the monorepo
problem but for a minirepo) so there is no cap that is feasible until we
have effective cross-build caching. And my long-term goal would be <30
minutes. At the latency of opening a pull request and then checking your
email that's not burdensome, but an hour is.

>>>>>> Kenn

>>>>>> On Thu, May 17, 2018 at 6:54 PM Udi Meiri <eh...@google.com> wrote:

>>>>>>> HI,
>>>>>>> I have a proposal to improve contributor experience by keeping
precommit times low.

>>>>>>> I'm looking to get community consensus and approval about:
>>>>>>> 1. How long should precommits take. 2 hours @95th percentile over
the past 4 weeks is the current proposal.
>>>>>>> 2. The process for dealing with slowness. Do we: fix, roll back,
remove a test from precommit?
>>>>>>> Rolling back if a fix is estimated to take longer than 2 weeks is
the current proposal.


https://docs.google.com/document/d/1udtvggmS2LTMmdwjEtZCcUQy6aQAiYTI3OrTP8CLfJM/edit?usp=sharing

Reply via email to