On Wed, Oct 25, 2017 at 3:59 PM, Lukasz Cwik <lc...@google.com.invalid>
wrote:

> Another suggestion would be to break apart the project into multiple maven
> projects which are released on independent schedules.
>

I think at some point this makes a lot of sense. Libraries of transforms
built against a stable SDK like Java should be OK, and this is a big chunk
of the build time. Fine to move to postcommit / selective precommit too.
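
A minimal sketch of what "selective precommit" could look like, assuming the job knows which files a PR touched and which directories are build modules. The function name and the example paths are illustrative assumptions, not part of the existing Jenkins setup:

```python
def affected_modules(changed_files, module_dirs):
    """Map each changed file to the deepest module directory that contains it.

    Files that do not live under any listed module (for example a top-level
    README) select nothing here; a real job would need a policy for them.
    """
    selected = set()
    for path in changed_files:
        # A module contains a file if the file path starts with "<module>/".
        matches = [m for m in module_dirs if path.startswith(m + "/")]
        if matches:
            # Prefer the most specific (longest) enclosing module.
            selected.add(max(matches, key=len))
    return sorted(selected)


# Hypothetical inputs, for illustration only.
modules = ["sdks/java/core", "sdks/java/io", "sdks/java/io/kafka"]
changed = ["sdks/java/io/kafka/src/main/java/Foo.java", "README.md"]
print(affected_modules(changed, modules))
```

The resulting list could then decide which module tests run in precommit (for example through Maven's `-pl` option), with everything else left to postcommit.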

Kenn



> Does anyone have any data points for what has worked for other open source
> communities in the past?
> I only have anecdotal evidence that Apache Ant+Ivy has worked well as it
> was much faster than Apache Maven.
>
> On Wed, Oct 25, 2017 at 1:03 PM, Kenneth Knowles <k...@google.com.invalid>
> wrote:
>
> > Hi all,
> >
> > I wanted to circle back on this. We've had continual quasi-outages, where
> > Jenkins occasionally schedules jobs but not reliably. Meanwhile, many of
> > our most important jobs continued to hit their timeouts when they did get
> > scheduled. On top of that, I've run the wrong command once before merging
> > a PR, resulting in a couple broken precommit runs.
> >
> > Just so everyone knows the actions that have been taken and what we still
> > need to do:
> >
> >  - Java postcommit and precommit timeouts bumped to 240 minutes (!!)
> >  - Seed job made independent of other jobs, so we can experiment safely
> >  - RAT config fixed
> >
> > The latency of precommit is still prohibitive for effective use. Not only
> > are we waiting too long, but the number of workers needed to avoid backlog
> > is excessive. Here are some ideas for things we can do:
> >
> >  - Proceed further on the pipeline job, which is what we want to do long
> > term
> >  - Precommit: run only on demand, so workers are not congested by automatic
> > builds; we could also pick and choose what we want to run
> >  - Consider non-Maven build orchestrators that can do dependency-driven
> > builds
> >  - Precommit: run fewer tests; I think this would mean leaving out some
> > modules where we are OK finding issues in the postcommit, or making them
> > on demand only
> >  - ValidatesRunner tests: Instead of running a single mavenJob, run
> > multiple: one which just installs while skipping everything else, then
> > follow it with just running the tests we care about
> >  - Examples integration tests: separate from the maven precommit; also run
> > as sequenced invocations
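
A rough sketch of the "sequenced invocations" idea, assuming the first pass only needs to skip test execution and the second pass is limited to chosen modules. The helper name and the module path in the example are hypothetical; `-DskipTests`, `-pl`, and `--batch-mode` are standard Maven options, but the exact goals and profiles a real job needs are not settled here:

```python
def sequenced_invocations(test_modules):
    """Return two Maven command lines: an install-only pass, then a test pass."""
    if not test_modules:
        raise ValueError("at least one module is required for the test pass")
    # Pass 1: build and install every artifact, but do not run any tests.
    install_only = ["mvn", "--batch-mode", "install", "-DskipTests"]
    # Pass 2: run the build, tests included, only for the selected modules.
    run_tests = ["mvn", "--batch-mode", "verify", "-pl", ",".join(test_modules)]
    return [install_only, run_tests]


# Hypothetical module path, for illustration only.
for command in sequenced_invocations(["runners/direct-java"]):
    print(" ".join(command))
```

Each command list could be handed to `subprocess.run(command, check=True)` in order, so the test pass only starts if the install pass succeeded.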
> >
> > These are just some ideas; I honestly don't know. I think my upper limit
> > for default precommit feedback is probably 30 minutes, and even there I am
> > not very happy.
> >
> > Any other suggestions?
> >
> > Kenn
> >
> > On Mon, Oct 23, 2017 at 3:29 PM, Kenneth Knowles <k...@google.com> wrote:
> >
> > > I want to wait and get some green from Jenkins running against the HEAD
> > > groovy scripts to confirm. I haven't sat at my desk long enough to see a
> > > full `mvn -P release clean verify` yet.
> > >
> > > On Mon, Oct 23, 2017 at 3:27 PM, Kenneth Knowles <k...@google.com> wrote:
> > >
> > >> I was easily able to reproduce all the sorts of failures (and some more!)
> > >>
> > >> Here are some things that now work that didn't work, or didn't work
> > >> correctly, before
> > >>
> > >>  - mvn apache-rat:check
> > >>  - mvn -P release apache-rat:check
> > >>  - mvn -P release -f somewhere/else/pom.xml apache-rat:check
> > >>
> > >> It turns out we had these issues:
> > >>
> > >>  - items in .gitignore that our RAT config did not ignore
> > >>  - sub-modules actually *did* inherit the RAT config, depending on the
> > >> command you run
> > >>  - paths in the RAT exclude were relative to current dir
> > >>  - paths in our RAT exclude that aren't even part of our codebase or
> > >> generated targets, but just things that Jenkins dropped in the workspace,
> > >> and we were cloning directly into the workspace
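
To illustrate the "relative to current dir" problem, here is a small sketch of how the same exclude pattern can match or miss a file depending on which directory it is resolved against. This is plain glob matching written for illustration, not Apache RAT's actual matching logic, and the paths and pattern are made up:

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_excluded(file_path, patterns, base_dir):
    """Return True if file_path, taken relative to base_dir, matches a pattern."""
    try:
        relative = PurePosixPath(file_path).relative_to(base_dir)
    except ValueError:
        # The file is not under base_dir, so no relative pattern can match it.
        return False
    return any(fnmatchcase(str(relative), pattern) for pattern in patterns)


# Hypothetical layout and exclude pattern, for illustration only.
patterns = ["sdks/python/build/*"]
generated = "/repo/sdks/python/build/out.txt"

# Resolved against the repository root, the pattern matches.
print(is_excluded(generated, patterns, "/repo"))
# Resolved against a sub-module directory, the same pattern no longer matches.
print(is_excluded(generated, patterns, "/repo/sdks/python"))
```

Anchoring every exclude to one fixed base directory is what makes the result independent of where the command happens to run.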
> > >>
> > >> I think with the new Jenkins config and the new pom.xml things should
> > >> work well.
> > >>
> > >> Incidentally, Valentyn, changes to Jenkins job DSL groovy scripts do not
> > >> take effect from being merged but only when the seed job next runs. (TODO:
> > >> fix this). I am still trying to get the relevant Jenkins UI pages to load
> > >> to get your change incorporated into the live scripts, which were
> > >> regressed by being run from an old PR (TODO: fix the fact that this can
> > >> happen).
> > >>
> > >> Kenn
> > >>
> > >> On Mon, Oct 23, 2017 at 2:21 PM, Valentyn Tymofieiev <
> > >> valen...@google.com.invalid> wrote:
> > >>
> > >>> Thanks a lot!
> > >>>
> > >>> Kenn, were you able to reproduce RAT failures and test the fix locally? I
> > >>> think "mvn clean verify -Prelease" still passes for me.
> > >>>
> > >>> Timeout for the test suite has been increased in
> > >>> https://github.com/apache/beam/pull/4028.
> > >>>
> > >>> On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <k...@google.com.invalid>
> > >>> wrote:
> > >>>
> > >>> > Wrong link - https://github.com/apache/beam/pull/4027
> > >>> >
> > >>> > On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <k...@google.com> wrote:
> > >>> >
> > >>> > > Yea, root cause is the config bug I described. Proposed fix at
> > >>> > > https://github.com/apache/beam/pull/4019/files. I'm working with infra
> > >>> > > to sort out other build issues that are probably not related.
> > >>> > >
> > >>> > > On Mon, Oct 23, 2017 at 1:53 PM, Lukasz Cwik <lc...@google.com.invalid>
> > >>> > > wrote:
> > >>> > >
> > >>> > >> The build breakage I outlined is being tracked in
> > >>> > >> https://issues.apache.org/jira/browse/BEAM-3092
> > >>> > >>
> > >>> > >> On Mon, Oct 23, 2017 at 11:54 AM, Lukasz Cwik <lc...@google.com> wrote:
> > >>> > >>
> > >>> > >> > Another breaking change was caused by
> > >>> > >> > https://github.com/apache/beam/commit/241d3cedd5a8fd3f360b8ec2f3a8ef5001cbca98
> > >>> > >> > because it changed the build layout on the Jenkins server and our RAT
> > >>> > >> > rules expected to be applied from a root directory. I pinged Kenneth
> > >>> > >> > Knowles about it and he said he was taking a look.
> > >>> > >> >
> > >>> > >> > On Mon, Oct 23, 2017 at 11:53 AM, Raghu Angadi <rang...@google.com.invalid>
> > >>> > >> > wrote:
> > >>> > >> >
> > >>> > >> >> Regarding (1):
> > >>> > >> >>
> > >>> > >> >> [4] did have a file without the Apache License. It was fixed the next
> > >>> > >> >> day (commit
> > >>> > >> >> <https://github.com/apache/beam/commit/249da9b8a1e86d0fe4c3dc7b83032ad38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b767a>),
> > >>> > >> >> thanks to Ken Knowles who pinged me about it.
> > >>> > >> >>
> > >>> > >> >> On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <valen...@google.com>
> > >>> > >> >> wrote:
> > >>> > >> >>
> > >>> > >> >> > Hi Beam-Dev,
> > >>> > >> >> >
> > >>> > >> >> > It's been >5 days since the last successful run of a
> > >>> > >> >> > beam_PreCommit_Java_MavenInstall build [1] and >4 days since the last
> > >>> > >> >> > successful run of beam_PostCommit_Java_MavenInstall [2].
> > >>> > >> >> >
> > >>> > >> >> > Looking at the build logs, I see the following problems.
> > >>> > >> >> >
> > >>> > >> >> > 1. After October 17, postcommit builds started to fail with
> > >>> > >> >> >
> > >>> > >> >> > Failed to execute goal org.apache.rat:apache-rat-plugin:0.12:check
> > >>> > >> >> > (default) on project beam-parent: Too many files with unapproved
> > >>> > >> >> > license: 1 See RAT report in:
> > >>> > >> >> > /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_MavenInstall/target/beam-parent-2.3.0-SNAPSHOT.rat
> > >>> > >> >> >
> > >>> > >> >> > The earliest build in which I see this error is Postcommit #5052 [3].
> > >>> > >> >> >
> > >>> > >> >> > This makes me suspect [4] or [5] as a breaking change, since they
> > >>> > >> >> > change pom files.
> > >>> > >> >> >
> > >>> > >> >> > Questions:
> > >>> > >> >> > - Is there a way we can reproduce this failure locally? mvn clean
> > >>> > >> >> > verify passes locally for me.
> > >>> > >> >> > - Is there a way we can see the RAT report mentioned in the error
> > >>> > >> >> > log?
> > >>> > >> >> >
> > >>> > >> >> > 2. Even before the onset of #1, Java Precommit builds were no longer
> > >>> > >> >> > completing within the allotted 150 minutes. Looking at [6-8] it seems
> > >>> > >> >> > the build makes consistent progress, but just does not finish on time.
> > >>> > >> >> > We can also see several recent successful builds with execution time
> > >>> > >> >> > very close to the timeout [9-11].
> > >>> > >> >> >
> > >>> > >> >> > I'd like to propose increasing the time limit for the Java precommit
> > >>> > >> >> > test suite from 2.5 to 4 hours. 4 hours is a long time. I agree that we
> > >>> > >> >> > should definitely try to reduce the test execution time, and reduce
> > >>> > >> >> > flakiness. However, we need the tests to at least pass for now. If we
> > >>> > >> >> > write off failed test suites as 'flakes' and merge PRs without having a
> > >>> > >> >> > green test signal, we will have to spend more time tracing breakages
> > >>> > >> >> > such as #1.
> > >>> > >> >> >
> > >>> > >> >> > Thoughts?
> > >>> > >> >> >
> > >>> > >> >> > Thanks,
> > >>> > >> >> > Valentyn
> > >>> > >> >> >
> > >>> > >> >> > [1] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/
> > >>> > >> >> > [2] https://builds.apache.org/job/beam_PostCommit_Java_MavenInstall/
> > >>> > >> >> > [3] https://builds.apache.org/job/beam_PostCommit_Java_MavenInstall/5052/changes
> > >>> > >> >> >
> > >>> > >> >> > [4] https://github.com/apache/beam/commit/d745cc9d8cc1735d3bc3c67ba3e2617cb7f11a8c
> > >>> > >> >> > [5] https://github.com/apache/beam/commit/0d8ab6cbbc762dd9f9be1b3e9a26b6c9d0bb6dc3
> > >>> > >> >> >
> > >>> > >> >> > [6] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15222/
> > >>> > >> >> > [7] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15195/
> > >>> > >> >> > [8] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15220/
> > >>> > >> >> >
> > >>> > >> >> > [9] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15009/
> > >>> > >> >> > [10] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15068/
> > >>> > >> >> > [11] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15016/
> > >>> > >> >> >
> > >>> > >> >> >
> > >>> > >> >>
> > >>> > >> >
> > >>> > >> >
> > >>> > >>
> > >>> > >
> > >>> > >
> > >>> >
> > >>>
> > >>
> > >>
> > >
> >
>
