On Thu, Oct 26, 2017 at 9:14 AM, Kenneth Knowles <k...@google.com> wrote:

> On Wed, Oct 25, 2017 at 6:51 PM, Robert Bradshaw <
> rober...@google.com.invalid> wrote:
>>
>> These are just some ideas; I honestly don't know. I think my upper limit
>> for default precommit feedback is probably 30 minutes, and even there I am
>> not very happy.
>>
>>
>> We should be able to get a good signal in less than 30 minutes. Lint,
>> smoke
>> tests, etc. I think there is a more extensive set of tests that should
>> pass
>> before merging.
>>
>
> Yea, definitely. Digging into this in particular because I think it is
> quite promising as a way to reduce build congestion.
>
> It looks like `mvn clean test -Dfindbugs.skip` would finish in 10 minutes
> if we fixed it. Currently it looks like some modules don't work under this
> invocation, probably requiring some state to be established by prior bits
> run by `verify`. If we iterated on that until review was done, then 30-90
> minutes to prevent breakage would be fine, especially since a failure in
> that longer slower build is probably a signal to add a quicker repro unit
> test.
>

Looked into this, and some IOs explicitly use Guava 19, so presumably that
is important, while the core Java SDK requires Guava 21. This is shaded
away during `mvn verify` or `mvn install` but for `mvn test` there is no
packaging or shading, so when the IO tests run with Guava 19 taking
precedence, checkArguments within the Java SDK fail. This one might be
fixable, but my 5 second takeaway is that our codebase cannot build against
itself without some shading (or equivalent).

Kenn





>
>>
>> Also it might not hurt to audit our tests and see if there are a small
>> number that take a disproportionate amount of time.
>>
>> Any other suggestions?
>>
>> Kenn
>>
>> On Mon, Oct 23, 2017 at 3:29 PM, Kenneth Knowles <k...@google.com> wrote:
>>
>> > I want to wait and get some green from Jenkins running against the HEAD
>> > groovy scripts to confirm. I haven't sat at my desk long enough to see a
>> > full `mvn -P release clean verify` yet.
>> >
>> > On Mon, Oct 23, 2017 at 3:27 PM, Kenneth Knowles <k...@google.com>
>> wrote:
>> >
>> >> I was easily able to reproduce all the sorts of failures (and some
>> more!)
>> >>
>> >> Here are some things that now work that didn't work, or didn't work
>> >> correctly, before
>> >>
>> >>  - mvn apache-rat:check
>> >>  - mvn -P release apache-rat:check
>> >>  - mvn -P release -f somewhere/else/pom.xml apache-rat:check
>> >>
>> >> It turns out we had these issues:
>> >>
>> >>  - items in .gitignore that our RAT config did not ignore
>> >>  - sub-modules actually *did* inherit the RAT config, depending on the
>> >> command you run
>> >>  - paths in the RAT exclude were relative to current dir
>> >>  - paths in our RAT exclude that aren't even part of our codebase or
>> >> generated targets, but just things that Jenkins dropped in the
>> workspace,
>> >> and we were cloning directly into the workspace
>> >>
>> >> I think with the new Jenkins config and the new pom.xml things should
>> >> work well.
>> >>
>> >> Incidentally, Valentyn, changes to Jenkins job DSL groovy scripts do
>> not
>> >> take effect from being merged but only when the seed job next runs.
>> (TODO:
>> >> fix this). I am still trying to get the relevant Jenkins UI pages to
>> load
>> >> to get your change incorporated into the live scripts, which were
>> regressed
>> >> by being run from an old PR (TODO: fix the fact that this can happen).
>> >>
>> >> Kenn
>> >>
>> >> On Mon, Oct 23, 2017 at 2:21 PM, Valentyn Tymofieiev <
>> >> valen...@google.com.invalid> wrote:
>> >>
>> >>> Thanks a lot!
>> >>>
>> >>> Kenn, Were you able to reproduce RAT failures and test the fix
>> locally?
>> I
>> >>> think "mvn clean verify -Prelease" still passes for me.
>> >>>
>> >>> Timeout for the test suite has been increased in
>> >>> https://github.com/apache/
>> >>> beam/pull/4028 <https://github.com/apache/beam/pull/4028>.
>> >>>
>> >>> On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles
>> <k...@google.com.invalid
>> >>> >
>> >>> wrote:
>> >>>
>> >>> > Wrong link - https://github.com/apache/beam/pull/4027
>> >>> >
>> >>> > On Mon, Oct 23, 2017 at 2:10 PM, Kenneth Knowles <k...@google.com>
>> >>> wrote:
>> >>> >
>> >>> > > Yea, root cause is the config bug I described. Proposed fix at
>> >>> > > https://github.com/apache/beam/pull/4019/files. I'm working with
>> >>> infra
>> >>> > to
>> >>> > > sort out other build issues that are probably not related.
>> >>> > >
>> >>> > > On Mon, Oct 23, 2017 at 1:53 PM, Lukasz Cwik
>> >>> <lc...@google.com.invalid>
>> >>> > > wrote:
>> >>> > >
>> >>> > >> The build breakage I outlined is being tracked in
>> >>> > >> https://issues.apache.org/jira/browse/BEAM-3092
>> >>> > >>
>> >>> > >> On Mon, Oct 23, 2017 at 11:54 AM, Lukasz Cwik <lc...@google.com>
>> >>> wrote:
>> >>> > >>
>> >>> > >> > Another breaking change was caused by
>> https://github.com/apache/
>> >>> > >> > beam/commit/241d3cedd5a8fd3f360b8ec2f3a8ef5001cbca98 because
>> it
>> >>> > changed
>> >>> > >> > the build layout on the Jenkins server and our RAT rules
>> expected
>> >>> to
>> >>> > >> apply
>> >>> > >> > from a root directory. I pinged Kenneth Knowles about it and he
>> >>> said
>> >>> > he
>> >>> > >> was
>> >>> > >> > taking a look.
>> >>> > >> >
>> >>> > >> > On Mon, Oct 23, 2017 at 11:53 AM, Raghu Angadi
>> >>> > >> <rang...@google.com.invalid
>> >>> > >> > > wrote:
>> >>> > >> >
>> >>> > >> >> Regd (1) :
>> >>> > >> >>
>> >>> > >> >> [4] did have have a file without Apache Licence. It was fixed
>> the
>> >>> > next
>> >>> > >> >> day (
>> >>> > >> >> commit
>> >>> > >> >> <https://github.com/apache/beam/commit/249da9b8a1e86d0fe4c3d
>> >>> > >> >> c7b83032ad38c3dcac0#diff-26b77e086ff8292ef54f12b22b7b767a>),
>> >>> > >> >> thanks to Ken Knowles who pinged me about it.
>> >>> > >> >>
>> >>> > >> >> On Mon, Oct 23, 2017 at 11:45 AM, Valentyn Tymofieiev <
>> >>> > >> >> valen...@google.com>
>> >>> > >> >> wrote:
>> >>> > >> >>
>> >>> > >> >> > Hi Beam-Dev,
>> >>> > >> >> >
>> >>> > >> >> > It's been >5 days since the last successful run of a
>> >>> > >> >> > beam_PreCommit_Java_MavenInstall build[1]  and >4 days
>> since
>> >>> last
>> >>> > >> >> > successful run of beam_PreCommit_Java_MavenInstall[2].
>> >>> > >> >> >
>> >>> > >> >> > Looking at build logs I see following problems.
>> >>> > >> >> >
>> >>> > >> >> > 1. After October 17, postcommit builds started to fail with
>> >>> > >> >> >
>> >>> > >> >> > Failed to execute goal org.apache.rat:apache-rat-plug
>> >>> in:0.12:check
>> >>> > >> >> > (default) on project beam-parent: Too many files with
>> >>> unapproved
>> >>> > >> >> license: 1
>> >>> > >> >> > See RAT report in: /home/jenkins/jenkins-slave/wo
>> >>> > >> >> > rkspace/beam_PostCommit_Java_MavenInstall/target/beam-paren
>> >>> > >> >> > t-2.3.0-SNAPSHOT.rat
>> >>> > >> >> >
>> >>> > >> >> > The earliest build that I see this error is Postcommit #5052
>> >>> [3].
>> >>> > >> >> >
>> >>> > >> >> > This makes me suspect [4] or [5] as a breaking change, since
>> >>> they
>> >>> > >> change
>> >>> > >> >> > pom files.
>> >>> > >> >> >
>> >>> > >> >> > Questions:
>> >>> > >> >> > - Is there a way we can reproduce this failure locally? mvn
>> >>> clean
>> >>> > >> verify
>> >>> > >> >> > passes locally for me.
>> >>> > >> >> > - Is there a way we can see the See RAT report mentioned in
>> the
>> >>> > error
>> >>> > >> >> > log?
>> >>> > >> >> >
>> >>> > >> >> > 2. Prior to onset of #1 Java Precommit builds no longer
>> >>> complete
>> >>> > >> within
>> >>> > >> >> > allotted 150 min time. Looking at [6-8] it seems the build
>> >>> makes
>> >>> > >> >> consistent
>> >>> > >> >> > progress, but just does not finish on time. We can also see
>> >>> several
>> >>> > >> >> recent
>> >>> > >> >> > successful builds with execution time very close to time out
>> >>> > [9-11].
>> >>> > >> >> >
>> >>> > >> >> > I'd like to propose to increase time limit for Java
>> precommit
>> >>> test
>> >>> > >> suite
>> >>> > >> >> > from 2.5 to 4 hours. 4 hours is long time. I agree that we
>> >>> should
>> >>> > >> >> > definitely try to reduce the test execution time, and reduce
>> >>> > >> flakiness.
>> >>> > >> >> > However we need the tests at least pass for now. If we write
>> >>> off
>> >>> > >> failed
>> >>> > >> >> > test suites as 'flakes' and merge PRs without having a green
>> >>> test
>> >>> > >> >> signal,
>> >>> > >> >> > we will have to spend more time tracing breakages such as
>> #1.
>> >>> > >> >> >
>> >>> > >> >> > Thoughts?
>> >>> > >> >> >
>> >>> > >> >> > Thanks,
>> >>> > >> >> > Valentyn
>> >>> > >> >> >
>> >>> > >> >> > [1] https://builds.apache.org/job/
>> >>> beam_PreCommit_Java_MavenInsta
>> >>> > ll/
>> >>> > >> >> > [2] https://builds.apache.org/job/
>> >>> beam_PostCommit_Java_MavenInst
>> >>> > all/
>> >>> > >> >> > [3] https://builds.apache.org/job/
>> >>> beam_PostCommit_Java_MavenInst
>> >>> > >> >> > all/5052/changes
>> >>> > >> >> >
>> >>> > >> >> > [4] https://github.com/apache/beam
>> /commit/d745cc9d8cc1735d3b
>> >>> > >> >> > c3c67ba3e2617cb7f11a8c
>> >>> > >> >> > [5] https://github.com/apache/beam
>> >>> /commit/0d8ab6cbbc762dd9f9be1b
>> >>> > >> >> > 3e9a26b6c9d0bb6dc3
>> >>> > >> >> >
>> >>> > >> >> > [6] https://builds.apache.org/job/
>> >>> beam_PreCommit_Java_MavenInsta
>> >>> > >> >> ll/15222/
>> >>> > >> >> > [7] https://builds.apache.org/job/
>> >>> beam_PreCommit_Java_MavenInsta
>> >>> > >> >> ll/15195/
>> >>> > >> >> > [8] https://builds.apache.org/job/
>> >>> beam_PreCommit_Java_MavenInsta
>> >>> > >> >> ll/15220/
>> >>> > >> >> >
>> >>> > >> >> > [9] https://builds.apache.org/job/
>> >>> beam_PreCommit_Java_MavenInsta
>> >>> > >> >> ll/15009/
>> >>> > >> >> > [10] https://builds.apache.org/job/
>> >>> beam_PreCommit_Java_MavenInsta
>> >>> > >> >> ll/15068/
>> >>> > >> >> > [11] https://builds.apache.org/job/
>> >>> beam_PreCommit_Java_MavenInsta
>> >>> > >> >> ll/15016/
>> >>> > >> >> >
>> >>> > >> >> >
>> >>> > >> >>
>> >>> > >> >
>> >>> > >> >
>> >>> > >>
>> >>> > >
>> >>> > >
>> >>> >
>> >>>
>> >>
>> >>
>> >
>>
>
>

Reply via email to