Agreed. I think having a formal vote before Luke had numbers and results
would have been too early. However, now that we have such numbers, we should
think about having a vote.
Also, while I disagree with Romain that Gradle is not "enterprise ready" (it's
heavily used by Netflix, LinkedIn, Siemens, and is the default build framework
for Android apps), it would be interesting to see if any other ASF projects are
using it. I don't think that should make or break the decision - we should
do what's best for the Beam project, and "everyone else is doing something" is
rarely a good argument - but it will provide good data points for us to evaluate.
Reuven
On Mon, Nov 27, 2017 at 10:23 PM, Jean-Baptiste Onofré <[email protected]> wrote:
Hi Luke,
just curious (and maybe I missed it): did we do a formal vote to merge the
Gradle build?
Gradle is now on master, and we have some Jira issues to update the release
guide with Gradle. That's fine, but I remember only a discussion, not a vote.
To embrace the community and avoid leaving some contributors "frustrated"
(feeling that "this project doesn't care about contributors, they just do
whatever they want"), I would have loved to see a formal vote about Gradle
rather than just a discussion.
My $0.01
Regards
JB
On 11/27/2017 07:46 PM, Lukasz Cwik wrote:
I have collected data by running several builds against master using Gradle
and Maven without using Gradle's support for incremental builds.
Gradle (mins)
min: 25.04
max: 160.14
median: 45.78
average: 52.19
stdev: 30.80
Maven (mins)
min: 56.86
max: 216.55 (actually > 240 mins because this data does not include timeouts)
median: 87.93
average: 109.10
stdev: 48.01
I excluded a few timeouts (240 mins) that happened during the Maven build from
its numbers, but we can see conclusively that Gradle is roughly twice as fast
as Maven for the build when run using Jenkins.
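As a quick sanity check on the "twice as fast" reading, the ratios can be
recomputed from the summary figures quoted above. This is only a sketch: the
dictionary layout and the `speedup` helper are illustrative names, not part of
the shared spreadsheet, and the Maven figures still exclude the 240-minute
timeouts.

```python
# Summary statistics (minutes) copied from the figures above.
gradle = {"min": 25.04, "max": 160.14, "median": 45.78, "average": 52.19}
maven = {"min": 56.86, "max": 216.55, "median": 87.93, "average": 109.10}


def speedup(slow, fast):
    """Return how many times longer `slow` takes than `fast`, per statistic."""
    return {key: slow[key] / fast[key] for key in fast}


ratios = speedup(maven, gradle)
for name, ratio in ratios.items():
    print(f"{name}: Maven takes {ratio:.2f}x as long as Gradle")
```

On these figures the median and average ratios come out at roughly 1.9 and 2.1,
which is consistent with the summary above.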
On my desktop, I have enabled incremental builds and have seen a major
improvement on the above numbers, but it doesn't yet work correctly because of
incorrectly specified inputs/outputs for certain tasks.
The data is available here:
https://docs.google.com/spreadsheets/d/1MHVjF-xoI49_NJqEQakUgnNIQ7Qbjzu8Y1q_h3dbF1M/edit?usp=sharing
With this data, I feel confident that we should swap and have opened the
following issue https://issues.apache.org/jira/browse/BEAM-3249 and related
sub-tasks.
On Sun, Nov 19, 2017 at 11:23 AM, Jean-Baptiste Onofré <[email protected]> wrote:
Thanks for the update Luke.
I'm updating my local working copy to do new tests.
Regards
JB
On 11/19/2017 08:21 PM, Lukasz Cwik wrote:
The Gradle build rules have been merged. I'm adding a precommit[1] to start
collecting data about the build times. It currently only mirrors the Java
mvn install precommit. I'll gather data over the next two weeks and provide
a summary here.
You can rerun the precommit by issuing "Run Java Gradle PreCommit"
1: https://github.com/apache/beam/pull/4146
On Mon, Nov 13, 2017 at 9:08 AM, Lukasz Cwik <[email protected]> wrote:
There has been plenty of time for comments on the PR and the approach.
So far Ken Knowles has provided the most feedback on the PR. Ken, would you
like to finish the review?
On Fri, Nov 10, 2017 at 1:22 PM, Romain Manni-Bucau <[email protected]> wrote:
This is only a setup thing, and it is better not to break the master history
for PoC/tests, in particular when the change is not very localized. An
alternative would be to ask infra for another temporary repo and keep the two
in sync, but personally I don't think it is worth it.
On Nov 10, 2017, 18:57, "Lukasz Cwik" <[email protected]> wrote:
The reason to get it on master is that master is where all the PRs are. An
upstream branch without any development means no data.
Also, our Jenkins setup via job-dsl doesn't honor the Jenkins configuration on
the branch because the seed job always runs against master.
On Thu, Nov 9, 2017 at 9:59 PM, Romain Manni-Bucau <[email protected]> wrote:
What about pushing it to an "upstream" branch and testing it for 1 week in
parallel with the Maven reference build? If Gradle is always 50% faster on
Jenkins then it could become the master setup without much discussion, I guess.
We can even have 2 Jenkins jobs: one with the daemon etc. and one without.
I also noticed yesterday that the Gradle build is killing my machine (all 8
cores are at 100%) during the first minutes, whereas the Maven build lets me do
something else. Then all the time consumed that makes Gradle not that fast
is spent on Python. I will try to send figures later today.
On Nov 10, 2017, 00:10, "Lukasz Cwik" <[email protected]> wrote:
I wouldn't mind merging this change in so I could set up those Gradle Jenkins
precommits.
As per our contribution guidelines, is any committer willing to sign off on
the PR?
On Thu, Nov 9, 2017 at 2:12 PM, Romain Manni-Bucau <[email protected]> wrote:
On Nov 9, 2017, 21:31, "Kenneth Knowles" <[email protected]> wrote:
Keep in mind that a clean build is unusual during development (it is common
for mvn use and that is a bug) and also not necessary for precommits if the
build tool is correct enough that caching is safe. So while this number
matters, it is not the most important.
Not sure; in dev you bypass the build tool most of the time anyway - thanks to
the IDE or other shortcuts - but not on PR and CI. Keep in mind that skipping
the clean and not killing the Gradle daemon makes the build not reproducible
and therefore not useful :(. Starting the build from a subpart of the reactor -
with the mentioned mvn plugin for instance - can be nice on some CI like
Travis if the caching is well configured, but it is still no guarantee that
the build is "green".
My trade-off is to favor an easy build and a relevant result over the time
criterion. Do you share it as well, or do you prefer time over other
criteria - which indeed leads to other conclusions and options and may be why
we are not understanding each other?
On Thu, Nov 9, 2017 at 11:30 AM, Romain Manni-Bucau <[email protected]> wrote:
I will try next week, yes, but the 2 runs I did were 28 min vs 32 min from
memory - after having downloaded all deps once.
On Nov 9, 2017, 19:45, "Lukasz Cwik" <[email protected]> wrote:
If Gradle was slow, do you mind running the build with --profile and sharing
that, and also sharing the Maven build log?
On Thu, Nov 9, 2017 at 10:43 AM, Lukasz Cwik <[email protected]> wrote:
Romain, I don't understand your last comment. Were you trying to say that you
had the same Gradle build times as I did and it was an improvement over Maven,
or that you did not and you experienced build times that were equivalent to
Maven?
On Thu, Nov 9, 2017 at 9:51 AM, Romain Manni-Bucau <[email protected]> wrote:
2017-11-09 18:38 GMT+01:00 Kenneth Knowles <[email protected]>:
On Thu, Nov 9, 2017 at 9:11 AM, Romain Manni-Bucau <[email protected]> wrote:
(This is another topic, so we can maybe open another thread.) The issue is not
so much about Python but rather the fact that the build is not self-contained.
It is a Maven build, and Maven should be sufficient without having to install
Python + dependencies.
Let's leave out the topic of whether our build should install things like
JDKs, Python, Golang, Docker, protoc, findbugs, RAT, etc. That issue is
somewhat independent of build tool, and the new build isn't worse than the
old one as far as it goes.
Yep, globally the same time with clean and killing the daemon.
Kenn
I don't see any technical blockers to doing it (except time ;)) but it is
always a bit annoying to git clone and then not be able to build.
Romain Manni-Bucau
@rmannibucau | Blog | Old Blog | Github | LinkedIn
2017-11-09 18:07 GMT+01:00 Lukasz Cwik <[email protected]>:
Hmm, I have had good luck when following the Python quick start setup
<https://beam.apache.org/get-started/quickstart-py/> on multiple machines by
ensuring the installed versions of setuptools, virtualenv and pip are new
enough.
You can always skip the Python portion of the build by excluding the build
task like so:
./gradlew build -x ":beam-sdks-parent:beam-sdks-python:build"
On Thu, Nov 9, 2017 at 8:58 AM, Romain Manni-Bucau <[email protected]> wrote:
The 1.3.5 file dates from when I installed the Python dependencies manually to
make the build pass (the pip command never passed on my computer, and therefore
the build had always been broken until I installed them manually -
independently of the build tool).
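One plausible origin for a file literally named `=1.3.5` (an assumption; the
thread does not confirm it) is an unquoted version specifier such as
`pkg>=1.3.5` on a shell command line: the shell treats `>` as an output
redirection and writes to a file called `=1.3.5`. The sketch below reproduces
that with a harmless `echo` in a temporary directory. `demo-package` is a
made-up name, `files_created_by` is a hypothetical helper, and a POSIX shell is
assumed.

```python
import os
import subprocess
import tempfile


def files_created_by(command):
    """Run `command` via the shell in an empty temp dir; list files it leaves."""
    with tempfile.TemporaryDirectory() as workdir:
        subprocess.run(command, shell=True, cwd=workdir, check=True)
        return sorted(os.listdir(workdir))


# Unquoted: the shell parses ">" as a redirect, so "=1.3.5" becomes a file name.
unquoted = files_created_by("echo demo-package>=1.3.5")
# Quoted: the whole specifier is passed through as a single argument.
quoted = files_created_by("echo 'demo-package>=1.3.5'")
print(unquoted, quoted)
```

If that is what happened here, quoting the requirement when installing by hand
would avoid creating the stray file.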
Romain Manni-Bucau
@rmannibucau | Blog | Old Blog | Github | LinkedIn
2017-11-09 17:51 GMT+01:00 Lukasz Cwik <[email protected]>:
It turns out that the Apache Rat Ant task and the Apache Rat Maven plugin
differ in that the plugin automatically excludes certain files by default
while the Ant task does not. See:
http://creadur.apache.org/rat/apache-rat-plugin/check-mojo.html#useDefaultExcludes
I fixed the list to exclude ".idea/" instead of "idea/" since there was a typo.
I have no idea what the file "=1.3.5" is. Can you take a look at the contents?
On Thu, Nov 9, 2017 at 12:03 AM, Romain Manni-Bucau <[email protected]> wrote:
Ok, the rat issues I got were:
== File: /home/rmannibucau/1_dev/beam/.idea/*
== File: /home/rmannibucau/1_dev/beam/sdks/python/=1.3.5
The first one could be in my default exclude - even if eclipse/idea files
should be in the default exclude set of the Beam rat config IMHO. The last one
is more of a "?" and can probably be excluded as well if it is created by the
build at some point.
Romain Manni-Bucau
@rmannibucau | Blog | Old Blog | Github | LinkedIn
2017-11-08 19:17 GMT+01:00 Jean-Baptiste Onofré <[email protected]>:
Thanks for the update. I was swamped with some meetings. I'm back to test the
latest changes.
Regards
JB
On Nov 8, 2017, 18:56, Lukasz Cwik <[email protected]> wrote:
Thanks everyone for trying this build out in different workspaces /
configurations. This will help make sure the build works for more people and
will get rid of any rough edges.
Performance (All):
Maven performs parallelization at the module level: an entire module needs to
complete before any dependent modules can start, which means all the checks
like findbugs, checkstyle, and tests need to finish first. Gradle has
task-level parallelism between subprojects, which means that as soon as the
compile and shade steps are done for a project, any dependent subprojects can
typically start. This means that we get increased parallelism due to not
needing to wait for findbugs, checkstyle, and tests to run. I typically see
~20 tasks (at peak) running on my desktop in parallel.
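To make the difference concrete, here is a toy model of the two scheduling
strategies described above. It is a sketch under loud assumptions: two
hypothetical modules `a` and `b` (with `b` depending on `a`), made-up durations
in minutes, unlimited workers, and checks that depend only on their own
module's compile step. It does not model either tool's real scheduler.

```python
# Hypothetical per-module durations in minutes (illustrative numbers only).
durations = {
    "a": {"compile": 2, "checks": 10},
    "b": {"compile": 3, "checks": 4},
}
# b depends on a; entries are listed in dependency order.
depends_on = {"a": [], "b": ["a"]}


def module_level_finish(durations, depends_on):
    """Module-level model: a module starts only after its dependencies have
    finished everything, including their checks."""
    finish = {}
    for name in depends_on:  # dicts preserve insertion (dependency) order
        start = max((finish[dep] for dep in depends_on[name]), default=0)
        finish[name] = start + durations[name]["compile"] + durations[name]["checks"]
    return max(finish.values())


def task_level_finish(durations, depends_on):
    """Task-level model: a module's compile waits only for the compile step of
    its dependencies; checks run alongside downstream work."""
    compiled, finish = {}, {}
    for name in depends_on:
        start = max((compiled[dep] for dep in depends_on[name]), default=0)
        compiled[name] = start + durations[name]["compile"]
        finish[name] = compiled[name] + durations[name]["checks"]
    return max(finish.values())


module_total = module_level_finish(durations, depends_on)
task_total = task_level_finish(durations, depends_on)
print(module_total, task_total)
```

With these made-up numbers the module-level model finishes at 19 minutes and
the task-level model at 12, because `b` no longer waits for `a`'s checks.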
Apache Rat (JB / Romain):
What files are in the rat report that fail (it's likely that I'm missing some
exclusion for a build-time artifact)? Also, please try the build again after
running `git clean -fdx` in your workspace.
Python (JB):
As for the Python SDK, you'll need to share more details about the failure.
Gradle 4.3:
I would like to defer the swap to Gradle 4.3 until after this PR since it will
be a much smaller set of changes.
On Wed, Nov 8, 2017 at 12:54 AM, Jean-Baptiste Onofré <[email protected]> wrote:
Same for me for rat and python build too:
FAILURE: Build completed with 2 failures.
1: Task failed with an exception.
-----------
* What went wrong:
Execution failed for task ':rat'.
Found 905 files with unapproved/unknown licenses. See
file:/home/jbonofre/Workspace/beam/build/reports/rat/rat-report.txt
* Try:
Run with --stacktrace option to get the stack trace. Run with