Re: Java compiler OOMs on Jenkins/Gradle

2018-05-01 Thread Scott Wegner
Sorry about the instability. We need to get the Gradle jobs tuned for our
Jenkins machines, and there's no way to test my configuration changes
without affecting all jobs :-/

The changes I'm making are here: https://github.com/apache/beam/pull/5218

It seems that they're still not quite right: the intent is to allocate half
the memory to each job, and then divide it up by worker. But for 16 workers
each is getting ~1GB, even though the machines have 100GB total. I suspect
I'm just calling the wrong API.
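
For reference, the intended arithmetic can be sketched as below. This is only an illustration, not the code in the PR: the function names are hypothetical, the 100GB and 16-worker figures come from the numbers above, and the 50% share per job is the stated intent.

```python
def per_worker_heap_mb(total_memory_gb: float, num_workers: int,
                       job_share: float = 0.5) -> int:
    """Heap budget per Gradle worker, in MB (hypothetical helper).

    The job gets `job_share` of the machine's memory, and that budget is
    split evenly across the workers.
    """
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    return int(total_memory_gb * 1024 * job_share / num_workers)


def xmx_flag(total_memory_gb: float, num_workers: int) -> str:
    """Format the per-worker budget as a JVM -Xmx argument."""
    return f"-Xmx{per_worker_heap_mb(total_memory_gb, num_workers)}m"


if __name__ == "__main__":
    # Figures from the message: ~100GB machines, 16 workers, half per job.
    print(xmx_flag(100, 16))
```

With these inputs each worker should get about 3GB, which is why the observed ~1GB per worker suggests the total-memory value is being read incorrectly.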

On Tue, May 1, 2018 at 1:41 PM Eugene Kirpichov 
wrote:

> Thanks! FWIW seems that my other Jenkins build is about to fail with the
> same issue
> https://builds.apache.org/job/beam_PreCommit_Java_GradleBuild/4806/ -
> "Expiring Daemon because JVM Tenured space is exhausted"
>
> On Tue, May 1, 2018 at 1:36 PM Lukasz Cwik  wrote:
>
>> +sweg...@google.com who is currently messing around with tuning some
>> Gradle flags related to the JVM and its memory usage.
>>
>> On Tue, May 1, 2018 at 1:34 PM Eugene Kirpichov 
>> wrote:
>>
>>> Hi,
>>>
>>> I've seen the same issue twice in a row on PR
>>> https://github.com/apache/beam/pull/4264 : the Java precommit fails
>>> with messages like:
>>>
>>> > Task :beam-sdks-java-core:compileTestJava
>>> An exception has occurred in the compiler ((version info not
>>> available)). Please file a bug against the Java compiler via the Java bug
>>> reporting page (http://bugreport.java.com) after checking the Bug
>>> Database (http://bugs.java.com) for duplicates. Include your program
>>> and the following diagnostic in your report. Thank you.
>>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>
>>> Full build link:
>>> https://builds.apache.org/job/beam_PreCommit_Java_GradleBuild/4803/consoleFull
>>>
>>> Anybody know what's up with that? I thought we got new powerful Jenkins
>>> executors and we shouldn't be running out of memory? However, I see that
>>> the build specifies *-Dorg.gradle.jvmargs=-Xmx512m* - this seems too
>>> small. Should we increase this?
>>>
>>> Thanks.
>>>


Re: Java compiler OOMs on Jenkins/Gradle

2018-05-01 Thread Eugene Kirpichov
Thanks! FWIW seems that my other Jenkins build is about to fail with the
same issue
https://builds.apache.org/job/beam_PreCommit_Java_GradleBuild/4806/ -
"Expiring Daemon because JVM Tenured space is exhausted"

On Tue, May 1, 2018 at 1:36 PM Lukasz Cwik  wrote:

> +sweg...@google.com who is currently messing around with tuning some
> Gradle flags related to the JVM and its memory usage.
>
> On Tue, May 1, 2018 at 1:34 PM Eugene Kirpichov 
> wrote:
>
>> Hi,
>>
>> I've seen the same issue twice in a row on PR
>> https://github.com/apache/beam/pull/4264 : the Java precommit fails with
>> messages like:
>>
>> > Task :beam-sdks-java-core:compileTestJava
>> An exception has occurred in the compiler ((version info not available)).
>> Please file a bug against the Java compiler via the Java bug reporting page
>> (http://bugreport.java.com) after checking the Bug Database (
>> http://bugs.java.com) for duplicates. Include your program and the
>> following diagnostic in your report. Thank you.
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>
>> Full build link:
>> https://builds.apache.org/job/beam_PreCommit_Java_GradleBuild/4803/consoleFull
>>
>> Anybody know what's up with that? I thought we got new powerful Jenkins
>> executors and we shouldn't be running out of memory? However, I see that
>> the build specifies *-Dorg.gradle.jvmargs=-Xmx512m* - this seems too
>> small. Should we increase this?
>>
>> Thanks.
>>
>


Re: Java compiler OOMs on Jenkins/Gradle

2018-05-01 Thread Lukasz Cwik
+sweg...@google.com who is currently messing around with tuning some Gradle
flags related to the JVM and its memory usage.

On Tue, May 1, 2018 at 1:34 PM Eugene Kirpichov 
wrote:

> Hi,
>
> I've seen the same issue twice in a row on PR
> https://github.com/apache/beam/pull/4264 : the Java precommit fails with
> messages like:
>
> > Task :beam-sdks-java-core:compileTestJava
> An exception has occurred in the compiler ((version info not available)).
> Please file a bug against the Java compiler via the Java bug reporting page
> (http://bugreport.java.com) after checking the Bug Database (
> http://bugs.java.com) for duplicates. Include your program and the
> following diagnostic in your report. Thank you.
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>
> Full build link:
> https://builds.apache.org/job/beam_PreCommit_Java_GradleBuild/4803/consoleFull
>
> Anybody know what's up with that? I thought we got new powerful Jenkins
> executors and we shouldn't be running out of memory? However, I see that
> the build specifies *-Dorg.gradle.jvmargs=-Xmx512m* - this seems too
> small. Should we increase this?
>
> Thanks.
>


Java compiler OOMs on Jenkins/Gradle

2018-05-01 Thread Eugene Kirpichov
Hi,

I've seen the same issue twice in a row on PR
https://github.com/apache/beam/pull/4264 : the Java precommit fails with
messages like:

> Task :beam-sdks-java-core:compileTestJava
An exception has occurred in the compiler ((version info not available)).
Please file a bug against the Java compiler via the Java bug reporting page
(http://bugreport.java.com) after checking the Bug Database (
http://bugs.java.com) for duplicates. Include your program and the
following diagnostic in your report. Thank you.
java.lang.OutOfMemoryError: GC overhead limit exceeded

Full build link:
https://builds.apache.org/job/beam_PreCommit_Java_GradleBuild/4803/consoleFull

Anybody know what's up with that? I thought we got new powerful Jenkins
executors and we shouldn't be running out of memory? However, I see that
the build specifies *-Dorg.gradle.jvmargs=-Xmx512m* - this seems too small.
Should we increase this?

Thanks.


Jenkins build is back to normal : beam_SeedJob #1611

2018-05-01 Thread Apache Jenkins Server



Build failed in Jenkins: beam_SeedJob #1610

2018-05-01 Thread Apache Jenkins Server

--
GitHub pull request #5218 of commit 614d205083bbb5a2123977a74b57a48fff177979, 
no merge conflicts.
Setting status of 614d205083bbb5a2123977a74b57a48fff177979 to PENDING with url 
https://builds.apache.org/job/beam_SeedJob/1610/ and message: 'Build started 
sha1 is merged.'
Using context: Jenkins: Seed Job
[EnvInject] - Loading node environment variables.
Building remotely on beam13 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/5218/*:refs/remotes/origin/pr/5218/*
 > git rev-parse refs/remotes/origin/pr/5218/merge^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/5218/merge^{commit} # timeout=10
Checking out Revision 6c9142625d1083f340e6a02b33fcb1ac91004f47 
(refs/remotes/origin/pr/5218/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 6c9142625d1083f340e6a02b33fcb1ac91004f47
Commit message: "Merge 614d205083bbb5a2123977a74b57a48fff177979 into 
92fd475afca09da7da1224775342bd668b53d83a"
 > git rev-list --no-walk ce1361c53eeb6c22c19b1cad7cba739eec96b583 # timeout=10
First time build. Skipping changelog.
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
Processing DSL script job_00_seed.groovy
Processing DSL script job_beam_Inventory.groovy
Processing DSL script job_beam_PerformanceTests_Dataflow.groovy
Processing DSL script job_beam_PerformanceTests_FileBasedIO_IT.groovy
Processing DSL script job_beam_PerformanceTests_FileBasedIO_IT_HDFS.groovy
Processing DSL script job_beam_PerformanceTests_HadoopInputFormat.groovy
Processing DSL script job_beam_PerformanceTests_JDBC.groovy
Processing DSL script job_beam_PerformanceTests_MongoDBIO_IT.groovy
Processing DSL script job_beam_PerformanceTests_Python.groovy
Processing DSL script job_beam_PerformanceTests_Spark.groovy
Processing DSL script job_beam_PostCommit_Go_GradleBuild.groovy
Processing DSL script job_beam_PostCommit_Java_GradleBuild.groovy
Processing DSL script job_beam_PostCommit_Java_ValidatesRunner_Apex.groovy
Processing DSL script job_beam_PostCommit_Java_ValidatesRunner_Dataflow.groovy
ERROR: (common_job_properties.groovy, line 177) No signature of method: 
java.util.LinkedHashMap.switches() is applicable for argument types: 
(java.lang.String) values: [--info]
Possible solutions: with(groovy.lang.Closure)



Jenkins build is back to normal : beam_SeedJob #1609

2018-05-01 Thread Apache Jenkins Server




Build failed in Jenkins: beam_SeedJob #1608

2018-05-01 Thread Apache Jenkins Server

--
GitHub pull request #5218 of commit 1f36f495690b868ccc63d6f0898cfb49847e68cf, 
no merge conflicts.
Setting status of 1f36f495690b868ccc63d6f0898cfb49847e68cf to PENDING with url 
https://builds.apache.org/job/beam_SeedJob/1608/ and message: 'Build started 
sha1 is merged.'
Using context: Jenkins: Seed Job
[EnvInject] - Loading node environment variables.
Building remotely on beam12 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/5218/*:refs/remotes/origin/pr/5218/*
 > git rev-parse refs/remotes/origin/pr/5218/merge^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/5218/merge^{commit} # timeout=10
Checking out Revision ce1361c53eeb6c22c19b1cad7cba739eec96b583 
(refs/remotes/origin/pr/5218/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f ce1361c53eeb6c22c19b1cad7cba739eec96b583
Commit message: "Merge 1f36f495690b868ccc63d6f0898cfb49847e68cf into 
92fd475afca09da7da1224775342bd668b53d83a"
First time build. Skipping changelog.
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
Processing DSL script job_00_seed.groovy
Processing DSL script job_beam_Inventory.groovy
Processing DSL script job_beam_PerformanceTests_Dataflow.groovy
Processing DSL script job_beam_PerformanceTests_FileBasedIO_IT.groovy
Processing DSL script job_beam_PerformanceTests_FileBasedIO_IT_HDFS.groovy
Processing DSL script job_beam_PerformanceTests_HadoopInputFormat.groovy
Processing DSL script job_beam_PerformanceTests_JDBC.groovy
Processing DSL script job_beam_PerformanceTests_MongoDBIO_IT.groovy
Processing DSL script job_beam_PerformanceTests_Python.groovy
Processing DSL script job_beam_PerformanceTests_Spark.groovy
Processing DSL script job_beam_PostCommit_Go_GradleBuild.groovy
Processing DSL script job_beam_PostCommit_Java_GradleBuild.groovy
Processing DSL script job_beam_PostCommit_Java_ValidatesRunner_Apex.groovy
Processing DSL script job_beam_PostCommit_Java_ValidatesRunner_Dataflow.groovy
ERROR: (common_job_properties.groovy, line 177) No signature of method: 
java.util.LinkedHashMap.switches() is applicable for argument types: 
(java.lang.String) values: [--info]
Possible solutions: with(groovy.lang.Closure)



Re: [SQL] Reconciling Beam SQL Environments with Calcite Schema

2018-05-01 Thread Andrew Pilloud
I'm just starting to move forward on this. Looking at my team's short-term
needs for SQL, option one would be good enough; however, I agree with Kenn
that we want something like option two eventually. I also don't want to
break existing users, and it sounds like there is at least one custom
MetaStore outside of Beam. So my plan is to go with option two and simplify
the interface wherever that causes no loss of functionality.

There is a common set of operations between the MetaStore and the
TableProvider, so I'd like to make MetaStore inherit the interface of
TableProvider. Most operations we need (createTable, dropTable, listTables)
are already identical between the two, so this will have no impact on
custom implementations. The buildBeamSqlTable operation does differ: the
MetaStore takes a table name, while the TableProvider takes a table object.
However, everything calling this API already has the full table object, so I
would like to simplify this interface by passing the table object in both
cases. Objections?
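
To make the shape of the proposal concrete, here is a rough sketch in Python rather than Beam's Java. Everything in it is illustrative: the `Table` fields, the snake_case method names, the argument to `drop_table`, and the string returned by `build_beam_sql_table` are assumptions, not the actual Beam signatures.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Table:
    """Hypothetical stand-in for the full table object."""
    name: str
    table_type: str
    location: str = ""


class TableProvider(ABC):
    """The operations shared by both interfaces."""

    @abstractmethod
    def create_table(self, table: Table) -> None: ...

    @abstractmethod
    def drop_table(self, name: str) -> None: ...

    @abstractmethod
    def list_tables(self) -> List[Table]: ...

    @abstractmethod
    def build_beam_sql_table(self, table: Table) -> str:
        """Takes the table object in both cases, never just a name."""


class MetaStore(TableProvider):
    """Inherits the TableProvider interface unchanged."""


class InMemoryMetaStore(MetaStore):
    """Toy implementation backed by a dict, for illustration only."""

    def __init__(self) -> None:
        self._tables: Dict[str, Table] = {}

    def create_table(self, table: Table) -> None:
        if table.name in self._tables:
            raise ValueError(f"table already exists: {table.name}")
        self._tables[table.name] = table

    def drop_table(self, name: str) -> None:
        del self._tables[name]

    def list_tables(self) -> List[Table]:
        return list(self._tables.values())

    def build_beam_sql_table(self, table: Table) -> str:
        # A real implementation would return a Beam SQL table; a string
        # placeholder keeps the sketch self-contained.
        return f"{table.table_type}:{table.name}"
```

Because callers already hold the full table object, a custom MetaStore only has to change the parameter type of its build method to fit this shape.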

Andrew

On Tue, Apr 24, 2018 at 9:27 AM James  wrote:

> Kenn: yes, MetaStore is user-facing, Users can choose to implement their
> own MetaStore, currently only an InMemory implementation in Beam CodeBase.
>
> Andrew: I like the second option, since it "retains the ability for DDL
> operations to be processed by a custom MetaStore.", IMO we should have the
> DDL ability as a fully functional SQL.
>
> On Tue, Apr 24, 2018 at 10:28 PM Kenneth Knowles  wrote:
>
>> Can you say more about how the metastore is used? I presume it is or will
>> be user-facing, so are Beam SQL users already providing their own?
>>
>> I'm sure we want something like that eventually to support things like
>> Apache Atlas and HCatalog, IIUC for the "create if needed" logic when using
>> Beam SQL to create a derived data set. But I don't think we should build
>> out those code paths until we have at least one non-in-memory
>> implementation.
>>
>> Just a really high level $0.02.
>>
>> Kenn
>>
>> On Mon, Apr 23, 2018 at 4:56 PM Andrew Pilloud 
>> wrote:
>>
>>> I'm working on updating our Beam DDL code to use the DDL execution
>>> functionality that recently merged into core Calcite. This enables us to
>>> take advantage of Calcite JDBC as a way to use Beam SQL. As part of that I
>>> need to reconcile the Beam SQL Environments with the Calcite Schema (which
>>> is Calcite's environment). We currently have copies of our tables in the
>>> Beam meta/store, Calcite Schema, BeamSqlEnv, and BeamQueryPlanner. I have a
>>> pending PR which merges the latter two to just use the Calcite Schema copy.
>>> Merging the Beam MetaStore and Calcite Schema isn't as simple. I have
>>> two options I'm looking for feedback on:
>>>
>>> 1. Make Calcite Schema authoritative and demote MetaStore to be
>>> something more like a Calcite TableFactory. Calcite Schema already
>>> implements the semantics of our InMemoryMetaStore. If the Store interface
>>> is just over built, this approach would result in a significant reduction
>>> in code. This would however eliminate the CRUD part of the interface
>>> leaving just the buildBeamSqlTable function.
>>>
>>> 2. Pass the Beam MetaStore into Calcite wrapped with a class translating
>>> to Calcite Schema (like we do already with tables). Instead of copying
>>> tables into the Calcite Schema we would pass in Beam meta/store as the
>>> source of truth and Calcite would manipulate tables directly in the Beam
>>> meta/store. This is a bit more complicated but retains the ability for DDL
>>> operations to be processed by a custom MetaStore.
>>>
>>> Thoughts?
>>>
>>> Andrew
>>>
>>


Build failed in Jenkins: beam_SeedJob #1607

2018-05-01 Thread Apache Jenkins Server

--
GitHub pull request #5218 of commit 71cdb8f168bea0b1cb45822450dadf7380e08b5a, 
no merge conflicts.
Setting status of 71cdb8f168bea0b1cb45822450dadf7380e08b5a to PENDING with url 
https://builds.apache.org/job/beam_SeedJob/1607/ and message: 'Build started 
sha1 is merged.'
Using context: Jenkins: Seed Job
[EnvInject] - Loading node environment variables.
Building remotely on beam14 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/5218/*:refs/remotes/origin/pr/5218/*
 > git rev-parse refs/remotes/origin/pr/5218/merge^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/5218/merge^{commit} # timeout=10
Checking out Revision e5f9edc4a79a641ed4910b54904e93c0352964fa 
(refs/remotes/origin/pr/5218/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f e5f9edc4a79a641ed4910b54904e93c0352964fa
Commit message: "Merge 71cdb8f168bea0b1cb45822450dadf7380e08b5a into 
92fd475afca09da7da1224775342bd668b53d83a"
First time build. Skipping changelog.
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
Processing DSL script job_00_seed.groovy
Processing DSL script job_beam_Inventory.groovy
ERROR: startup failed:
workspace:/.test-infra/jenkins/common_job_properties.groovy: 168: unexpected 
token: String @ line 168, column 8.
 void String setGradleSwitches(context, maxWorkers: 
Runtime.getRuntime().availableProcessors()) {
  ^

1 error




Re: [PROPOSAL] Preparing 2.5.0 release next week

2018-05-01 Thread Scott Wegner
Sounds good, thanks J.B. Feel free to ping if you need anything.

On Mon, Apr 30, 2018 at 10:12 PM Jean-Baptiste Onofré 
wrote:

> That's a good idea ! I think using Slack to ping/ask is a good way as it's
> async.
>
> Regards
> JB
>
> On 05/01/2018 06:51 AM, Reuven Lax wrote:
> > I think it makes sense to have someone who hadn't done the Gradle
> > migration to run the release. However would it make sense for someone who
> > did work on the migration to partner with you JB? There may be issues that
> > are simply due to things that were not documented well. In that case the
> > partner can quickly help resolve, and can then be the one who makes sure
> > that the documentation is updated.
> >
> > Reuven
> >
> > On Mon, Apr 30, 2018 at 9:36 PM Jean-Baptiste Onofré wrote:
> >
> > Hi Scott,
> >
> > Thanks for the update. The Gradle build crashed on my machine (not related
> > to Gradle). I launched a new one.
> >
> > I'm volunteer to cut the release: I think I know Gradle decently, and even
> > if I didn't work on the gradle "migration" during the last two weeks, I
> > think it's actually better: I have an "external" view on the latest changes.
> >
> > Thoughts ?
> >
> > Regards
> > JB
> >
> > On 05/01/2018 02:05 AM, Scott Wegner wrote:
> > > Welcome back JB!
> > >
> > > I just sent a separate update about Gradle [1] -- the build migration is
> > > complete and the release documentation has been updated.
> > >
> > > I recommend we produce the 2.5.0 release using Gradle. Having a successful
> > > release should be the final validation before declaring the Gradle
> > > migration complete. So the sooner we can have a Gradle release, the sooner
> > > we can get back to a single build system :)
> > >
> > > If it would be helpful, I suggest that somebody who's been working on the
> > > Gradle migration manage the 2.5.0 release. That way if we encounter any
> > > issues from the build system, they should have sufficient expertise to
> > > fix it.
> > >
> > > [1]
> > > https://lists.apache.org/thread.html/e543b3850bfc4950d57bc18624e1d4131324c6cf691fd10034947cad@%3Cdev.beam.apache.org%3E
> > >
> > > On Mon, Apr 30, 2018 at 11:38 AM Romain Manni-Bucau
> > > <rmannibu...@gmail.com> wrote:
> > >
> > > On Apr 30, 2018 at 19:39, "Jean-Baptiste Onofré" wrote:
> > >
> > > > Hi guys,
> > > >
> > > > now that I'm back from vacations, I bring back 2.5.0 release on the
> > > > table ;)
> > > >
> > > > This is also related to the current status of build (Maven/Gradle).
> > > >
> > > > FYI, I gonna start the Jira triage tomorrow and I launched couple of
> > > > build on my machine (both Maven and Gradle) to get an update on the
> > > > current status.
> > > >
> > > > Please, let me know if you have an opinion about Gradle vs Maven for
> > > > the release.
> > >
> > > Produced artifacts are still too different to use gradle IMHO. Jira were
> > > created but not yet fixed last time i tried gradle so clearly maven IMHO.
> > >
> > > > Thanks !
> > > > Regards
> > > > JB
> > >
> > > On 04/06/2018 10:48 AM, Jean-Baptiste Onofré wrote:
> > > > Hi guys,
> > > >
> > > > Apache Beam 2.4.0 has been released on March 20th.
> > > >
> > > > > According to our cycle of release (roughly 6 weeks), we should think
> > > > > about 2.5.0.
> > > > >
> > > > > I'm volunteer to tackle this release.
> > > > >
> > > > > I'm proposing the following items:
> > > > >
> > > > > 1. We start the Jira triage now, up to Tuesday
> > > > > 2. I would like to cut the release on Tuesday night (Europe time)
> > > > > 2bis. I think it's wiser to still use Maven for this release. Do you
> > > > > think we will be ready to try a release with Gradle ?
> > > > >
> > > > > After this release, I would like a discussion about:
> > > > > 1. Gradle release (if we release 2.5.0 with Maven)
> > > > > 2. Isolate release cycle per Beam part. I think it would be
> > > > > interesting to have different release cycle: SDKs, DSLs, Runners,
> > > > > IOs. That's another discussion, I will start a thread about

Re: Gradle Status: Migrated!

2018-05-01 Thread Henning Rohde
JB - for your comparison, please also omit cross-compiling all the Go
examples because they are only built using Gradle.




On Tue, May 1, 2018 at 8:59 AM Jean-Baptiste Onofré  wrote:

> Thanks for the update Kenn, that makes sense.
>
> I'm checking the artifacts generated by Gradle right now.
>
> Regards
> JB
>
> On 01/05/2018 17:42, Kenneth Knowles wrote:
> > Raw execution time for tasks from clean is not the only thing to test. I
> > would say it is not even important. Try these from clean:
> >
> >   - Gradle: ./gradlew :beam-sdks-java-io-mongodb:test && ./gradlew :beam-sdks-java-io-mongodb:test
> >   - Maven: mvn -pl sdks/java/io/mongodb test -am && mvn -pl sdks/java/io/mongodb test -am
> >
> > Quick run on my laptop:
> >
> >   - Gradle: 66s (65s then 1s)
> >   - Maven: 317s (173s then 144s)
> >
> > Of course, the mvn command runs a bunch of useless executions AND it is
> > incorrect because it isn't using built jars. That's part of the point -
> > there is no way to do what you want with mvn. Let's try to make a
> > command that avoids useless work and builds the jars:
> >
> >   - Maven:  (mvn -pl sdks/java/io/mongodb install -DskipTests -am && mvn -pl sdks/java/io/mongodb test) && (each time)
> >
> > That takes 102s the first time and 64s the second time. And that is
> > about the realistic workflow for someone trying to get something done.
> > Even if we touch a file Gradle finishes in 20s. So the best case for mvn
> > is this head-to-head:
> >
> >   - Gradle: 65s + 20s + 20s + 20s + 20s + ...
> >   - Maven: 102s + 64s + 64s + 64s + 64s + ...
> >
> > Kenn
> >
> >
> > On Tue, May 1, 2018 at 8:09 AM Jean-Baptiste Onofré wrote:
> >
> > Thanks, for me, Maven 3.5.2 takes quite the same time than Gradle (using
> > the wrapper). It's maybe related to my environment.
> >
> > Anyway, I'm doing a complete build review both in term of building time,
> > and equivalence (artifacts publishing, test, plugin execution).
> >
> > I will provide an update soon.
> >
> > Regards
> > JB
> >
> > On 01/05/2018 16:57, Reuven Lax wrote:
> >  > Luke did gather data which showed that on our Jenkins executors the
> >  > Gradle build was much faster than the Maven build. Also right now we
> >  > have incremental builds turned off, but once we're confident enough to
> >  > enable them (at least for local development) that will often drop build
> >  > times a lot.
> >  >
> >  > On Tue, May 1, 2018 at 4:01 AM Jean-Baptiste Onofré wrote:
> >  >
> >  > By the way, I'm curious: did someone evaluate the build time gap
> >  > between Maven and Gradle ? One of the main reason to migrate to Gradle
> >  > was the inc build and build time. The builds I have launched are quite
> >  > the same in duration. I will do deeper tests to evaluate the gap.
> >  >
> >  > Regards
> >  > JB
> >  >
> >  > On 05/01/2018 12:48 PM, Łukasz Gajowy wrote:
> >  >  > Hi Scott,
> >  >  >
> >  >  > thanks for the update! Just a clarification about IO performance
> >  >  > tests: those were fully migrated in Beam and all task necessary for
> >  >  > running them are there but Jenkins jobs still run mvn commands. This
> >  >  > is due the fact that PerfkitBenchmarker code (which is invoked by
> >  >  > Jenkins and constructs the commands by itself) was not updated yet.
> >  >  > This should be finished before fully dropping mvn.
> >  >  >
> >  >  > More on that topic here, in
> >  >  > comments: https://issues.apache.org/jira/browse/BEAM-3942
> >  >  > PR changing the commands to gradle is waiting for PerfKit devs review
> >  >  > here:
> >  >  > https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/pull/1648
> >  >  >
> >  >  > Best regards,
> >  >  >
> >  >  > 2018-05-01 9:17 GMT+02:00 Romain Manni-Bucau:
> >  >  >
> >  >  > Hi Scott
> >  >  >
> >  >  > While
> >  >  > https://issues.apache.org/jira/plugins/servlet/mobile#issue/BEAM-4057
> >  >  > is open, gradle is a concurrent

Re: Gradle Status: Migrated!

2018-05-01 Thread Jean-Baptiste Onofré

Thanks for the update Kenn, that makes sense.

I'm checking the artifacts generated by Gradle right now.

Regards
JB

On 01/05/2018 17:42, Kenneth Knowles wrote:
Raw execution time for tasks from clean is not the only thing to test. I 
would say it is not even important. Try these from clean:


  - Gradle: ./gradlew :beam-sdks-java-io-mongodb:test && ./gradlew :beam-sdks-java-io-mongodb:test
  - Maven: mvn -pl sdks/java/io/mongodb test -am && mvn -pl sdks/java/io/mongodb test -am


Quick run on my laptop:

  - Gradle: 66s (65s then 1s)
  - Maven: 317s (173s then 144s)

Of course, the mvn command runs a bunch of useless executions AND it is 
incorrect because it isn't using built jars. That's part of the point - 
there is no way to do what you want with mvn. Let's try to make a 
command that avoids useless work and builds the jars:


  - Maven:  (mvn -pl sdks/java/io/mongodb install -DskipTests -am && mvn -pl sdks/java/io/mongodb test) && (each time)


That takes 102s the first time and 64s the second time. And that is 
about the realistic workflow for someone trying to get something done. 
Even if we touch a file Gradle finishes in 20s. So the best case for mvn 
is this head-to-head:


  - Gradle: 65s + 20s + 20s + 20s + 20s + ...
  - Maven: 102s + 64s + 64s + 64s + 64s + ...

Kenn


On Tue, May 1, 2018 at 8:09 AM Jean-Baptiste Onofré wrote:

Thanks, for me, Maven 3.5.2 takes quite the same time than Gradle (using
the wrapper). It's maybe related to my environment.

Anyway, I'm doing a complete build review both in term of building time,
and equivalence (artifacts publishing, test, plugin execution).

I will provide an update soon.

Regards
JB

On 01/05/2018 16:57, Reuven Lax wrote:
 > Luke did gather data which showed that on our Jenkins executors the
 > Gradle build was much faster than the Maven build. Also right now we
 > have incremental builds turned off, but once we're confident enough to
 > enable them (at least for local development) that will often drop build
 > times a lot.
 >
 > On Tue, May 1, 2018 at 4:01 AM Jean-Baptiste Onofré wrote:
 >
 >     By the way, I'm curious: did someone evaluate the build time gap
 >     between Maven and Gradle ? One of the main reason to migrate to Gradle
 >     was the inc build and build time. The builds I have launched are quite
 >     the same in duration. I will do deeper tests to evaluate the gap.
 >
 >     Regards
 >     JB
 >
 >     On 05/01/2018 12:48 PM, Łukasz Gajowy wrote:
 >      > Hi Scott,
 >      >
 >      > thanks for the update! Just a clarification about IO performance
 >      > tests: those were fully migrated in Beam and all task necessary for
 >      > running them are there but Jenkins jobs still run mvn commands. This
 >      > is due the fact that PerfkitBenchmarker code (which is invoked by
 >      > Jenkins and constructs the commands by itself) was not updated yet.
 >      > This should be finished before fully dropping mvn.
 >      >
 >      > More on that topic here, in
 >      > comments: https://issues.apache.org/jira/browse/BEAM-3942
 >      > PR changing the commands to gradle is waiting for PerfKit devs review
 >      > here:
 >      > https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/pull/1648
 >      >
 >      > Best regards,
 >      >
 >      > 2018-05-01 9:17 GMT+02:00 Romain Manni-Bucau:
 >      >
 >      >     Hi Scott
 >      >
 >      >     While
 >      >     https://issues.apache.org/jira/plugins/servlet/mobile#issue/BEAM-4057
 >      >     is open, gradle is a concurrent of maven but maven must stay the
 >      >     default build tool cause gradle breaks users.
 >      >
 >      >
 >      >     On May 1, 2018 at 01:59, "Scott Wegner" wrote:
 >      >
 >      >         Many many of you have been hacking diligently on the Gradle
 >      >         build, and I'm happy to announce that we now

Re: Gradle Status: Migrated!

2018-05-01 Thread Kenneth Knowles
Raw execution time for tasks from clean is not the only thing to test. I
would say it is not even important. Try these from clean:

 - Gradle: ./gradlew :beam-sdks-java-io-mongodb:test && ./gradlew :beam-sdks-java-io-mongodb:test
 - Maven: mvn -pl sdks/java/io/mongodb test -am && mvn -pl sdks/java/io/mongodb test -am

Quick run on my laptop:

 - Gradle: 66s (65s then 1s)
 - Maven: 317s (173s then 144s)

Of course, the mvn command runs a bunch of useless executions AND it is
incorrect because it isn't using built jars. That's part of the point -
there is no way to do what you want with mvn. Let's try to make a command
that avoids useless work and builds the jars:

 - Maven:  (mvn -pl sdks/java/io/mongodb install -DskipTests -am && mvn -pl sdks/java/io/mongodb test) && (each time)

That takes 102s the first time and 64s the second time. And that is about
the realistic workflow for someone trying to get something done. Even if we
touch a file Gradle finishes in 20s. So the best case for mvn is this
head-to-head:

 - Gradle: 65s + 20s + 20s + 20s + 20s + ...
 - Maven: 102s + 64s + 64s + 64s + 64s + ...
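
The head-to-head above is a simple linear cost model, sketched here for convenience. The timings are the laptop measurements quoted above; the function name and the choice of four edit cycles are only for illustration.

```python
def cumulative_seconds(first_run: int, rerun: int, reruns: int) -> int:
    """Total wall time for one initial build plus `reruns` rebuilds."""
    if reruns < 0:
        raise ValueError("reruns must be non-negative")
    return first_run + rerun * reruns


if __name__ == "__main__":
    # Laptop timings from the message: first run, then each rerun after
    # touching a file.
    for reruns in range(5):
        gradle = cumulative_seconds(65, 20, reruns)
        maven = cumulative_seconds(102, 64, reruns)
        print(f"{reruns} reruns: Gradle {gradle}s, Maven {maven}s")
```

After four edit-and-rerun cycles this comes to 145s for Gradle against 358s for Maven, and the gap grows by 44s with every further cycle.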

Kenn


On Tue, May 1, 2018 at 8:09 AM Jean-Baptiste Onofré  wrote:

> Thanks, for me, Maven 3.5.2 takes quite the same time than Gradle (using
> the wrapper). It's maybe related to my environment.
>
> Anyway, I'm doing a complete build review both in term of building time,
> and equivalence (artifacts publishing, test, plugin execution).
>
> I will provide an update soon.
>
> Regards
> JB
>
> On 01/05/2018 16:57, Reuven Lax wrote:
> > Luke did gather data which showed that on our Jenkins executors the
> > Gradle build was much faster than the Maven build. Also right now we
> > have incremental builds turned off, but once we're confident enough to
> > enable them (at least for local development) that will often drop build
> > times a lot.
> >
> > On Tue, May 1, 2018 at 4:01 AM Jean-Baptiste Onofré wrote:
> >
> > By the way, I'm curious: did someone evaluate the build time gap between
> > Maven and Gradle ? One of the main reason to migrate to Gradle was the inc
> > build and build time. The builds I have launched are quite the same in
> > duration. I will do deeper tests to evaluate the gap.
> >
> > Regards
> > JB
> >
> > On 05/01/2018 12:48 PM, Łukasz Gajowy wrote:
> >  > Hi Scott,
> >  >
> >  > thanks for the update! Just a clarification about IO performance tests:
> >  > those were fully migrated in Beam and all tasks necessary for running
> >  > them are there, but Jenkins jobs still run mvn commands. This is due to
> >  > the fact that the PerfKitBenchmarker code (which is invoked by Jenkins
> >  > and constructs the commands by itself) has not been updated yet. This
> >  > should be finished before fully dropping mvn.
> >  >
> >  > More on that topic here, in comments:
> >  > https://issues.apache.org/jira/browse/BEAM-3942
> >  > The PR changing the commands to Gradle is waiting for PerfKit devs' review here:
> >  > https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/pull/1648
> >  >
> >  > Best regards,
> >  >
> >  > 2018-05-01 9:17 GMT+02:00 Romain Manni-Bucau:
> >  >
> >  > Hi Scott
> >  >
> >  > While https://issues.apache.org/jira/plugins/servlet/mobile#issue/BEAM-4057
> >  > is open, Gradle is a competitor to Maven, but Maven must stay the default
> >  > build tool because Gradle breaks users.
> >  >
> >  >
> >  > On May 1, 2018 at 01:59, "Scott Wegner" wrote:
> >  >
> >  > Many many of you have been hacking diligently on the Gradle build, and
> >  > I'm happy to announce that we now have a fully-functioning Gradle build!
> >  > There's been a ton of progress since our last update [1]:
> >  >
> >  > * Improved nightly snapshot release [2]
> >  > * Improve runner quickstarts [5] [11]
> >  > * Python post-commit ported to Gradle [3]
> >  > * Update performance testing framework for Gradle [4] [12]
> >  > * Generate javadocs from Gradle [6]
> >  > * Update to latest Gradle version [7] [21]
> >  > * Updated documentation [8] [22]
> >  > * Tune CI build resource usage for Jenkins [9] [19]
> >  > * Improve shading of test jars [10] [13] [14]
> >  > * Add 'errorprone' and 'spotless' static analysis 

Re: Gradle Status: Migrated!

2018-05-01 Thread Jean-Baptiste Onofré
Thanks. For me, Maven 3.5.2 takes about the same time as Gradle (using
the wrapper). It may be related to my environment.

Anyway, I'm doing a complete build review, both in terms of build time
and equivalence (artifact publishing, tests, plugin execution).


I will provide an update soon.

Regards
JB

On 01/05/2018 16:57, Reuven Lax wrote:
Luke did gather data which showed that on our Jenkins executors the 
Gradle build was much faster than the Maven build. Also right now we 
have incremental builds turned off, but once we're confident enough to 
enable them (at least for local development) that will often drop build 
times a lot.


On Tue, May 1, 2018 at 4:01 AM Jean-Baptiste Onofré wrote:


By the way, I'm curious: did anyone evaluate the build-time gap between
Maven and Gradle? One of the main reasons to migrate to Gradle was
incremental builds and build time. The builds I have launched are about
the same in duration. I will run deeper tests to evaluate the gap.

Regards
JB

On 05/01/2018 12:48 PM, Łukasz Gajowy wrote:
 > Hi Scott,
 >
 > thanks for the update! Just a clarification about IO performance tests:
 > those were fully migrated in Beam and all tasks necessary for running
 > them are there, but Jenkins jobs still run mvn commands. This is due to
 > the fact that the PerfKitBenchmarker code (which is invoked by Jenkins
 > and constructs the commands by itself) has not been updated yet. This
 > should be finished before fully dropping mvn.
 >
 > More on that topic here, in comments:
 > https://issues.apache.org/jira/browse/BEAM-3942
 > The PR changing the commands to Gradle is waiting for PerfKit devs' review here:
 > https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/pull/1648
 >
 > Best regards,
 >
 > 2018-05-01 9:17 GMT+02:00 Romain Manni-Bucau:
 >
 >     Hi Scott
 >
 >     While https://issues.apache.org/jira/plugins/servlet/mobile#issue/BEAM-4057
 >     is open, Gradle is a competitor to Maven, but Maven must stay the default
 >     build tool because Gradle breaks users.
 >
 >
 >     On May 1, 2018 at 01:59, "Scott Wegner" wrote:
 >
 >         Many many of you have been hacking diligently on the Gradle build, and
 >         I'm happy to announce that we now have a fully-functioning Gradle build!
 >         There's been a ton of progress since our last update [1]:
 >
 >         * Improved nightly snapshot release [2]
 >         * Improve runner quickstarts [5] [11]
 >         * Python post-commit ported to Gradle [3]
 >         * Update performance testing framework for Gradle [4] [12]
 >         * Generate javadocs from Gradle [6]
 >         * Update to latest Gradle version [7] [21]
 >         * Updated documentation [8] [22]
 >         * Tune CI build resource usage for Jenkins [9] [19]
 >         * Improve shading of test jars [10] [13] [14]
 >         * Add 'errorprone' and 'spotless' static analysis [15] [24]
 >         * Improve IntelliJ project generation [16] [17]
 >         * Reduce number of ValidatesRunner tests [18]
 >         * Update release documentation for Gradle [20]
 >         * Update docker build scripts for Gradle [23]
 >
 >         The build process and Jenkins environment have stabilized and we've
 >         resolved migration blockers. The final step is to use Gradle to produce
 >         an official release. The release documentation has been updated for
 >         Gradle and I recommend we use these docs for the 2.5.0 release. Assuming
 >         the release goes well, we can declare the migration fully validated and
 >         stop supporting dual build systems.
 >
 >         During the migration we identified a number of opportunities to improve
 >         the build even further. Feel free to grab one of the items off of the
 >         JIRA: BEAM-4045 [25]
 >
 >         Thanks again to all those that contributed. This has truly been a
 >         community effort!
 >
 >         [1] https://lists.apache.org/thread.html/5f6bae323acc1b050962e68ec310613e0121b05bc5c42915c536fb59@%3Cdev.beam.apache.org%3E
 >         [2] https://github.com/apache/beam/pull/5142

Re: Gradle Status: Migrated!

2018-05-01 Thread Reuven Lax
Luke did gather data which showed that on our Jenkins executors the Gradle
build was much faster than the Maven build. Also right now we have
incremental builds turned off, but once we're confident enough to enable
them (at least for local development) that will often drop build times a
lot.

On Tue, May 1, 2018 at 4:01 AM Jean-Baptiste Onofré  wrote:

> By the way, I'm curious: did anyone evaluate the build-time gap between
> Maven and Gradle? One of the main reasons to migrate to Gradle was
> incremental builds and build time. The builds I have launched are about
> the same in duration. I will run deeper tests to evaluate the gap.
>
> Regards
> JB
>
> On 05/01/2018 12:48 PM, Łukasz Gajowy wrote:
> > Hi Scott,
> >
> > thanks for the update! Just a clarification about IO performance tests:
> > those were fully migrated in Beam and all tasks necessary for running
> > them are there, but Jenkins jobs still run mvn commands. This is due to
> > the fact that the PerfKitBenchmarker code (which is invoked by Jenkins
> > and constructs the commands by itself) has not been updated yet. This
> > should be finished before fully dropping mvn.
> >
> > More on that topic here, in comments:
> > https://issues.apache.org/jira/browse/BEAM-3942
> > The PR changing the commands to Gradle is waiting for PerfKit devs' review here:
> > https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/pull/1648
> >
> > Best regards,
> >
> > 2018-05-01 9:17 GMT+02:00 Romain Manni-Bucau:
> >
> > Hi Scott
> >
> > While https://issues.apache.org/jira/plugins/servlet/mobile#issue/BEAM-4057
> > is open, Gradle is a competitor to Maven, but Maven must stay the default
> > build tool because Gradle breaks users.
> >
> >
> > On May 1, 2018 at 01:59, "Scott Wegner" wrote:
> >
> > Many many of you have been hacking diligently on the Gradle build, and
> > I'm happy to announce that we now have a fully-functioning Gradle build!
> > There's been a ton of progress since our last update [1]:
> >
> > * Improved nightly snapshot release [2]
> > * Improve runner quickstarts [5] [11]
> > * Python post-commit ported to Gradle [3]
> > * Update performance testing framework for Gradle [4] [12]
> > * Generate javadocs from Gradle [6]
> > * Update to latest Gradle version [7] [21]
> > * Updated documentation [8] [22]
> > * Tune CI build resource usage for Jenkins [9] [19]
> > * Improve shading of test jars [10] [13] [14]
> > * Add 'errorprone' and 'spotless' static analysis [15] [24]
> > * Improve IntelliJ project generation [16] [17]
> > * Reduce number of ValidatesRunner tests [18]
> > * Update release documentation for Gradle [20]
> > * Update docker build scripts for Gradle [23]
> >
> > The build process and Jenkins environment have stabilized and we've
> > resolved migration blockers. The final step is to use Gradle to produce
> > an official release. The release documentation has been updated for
> > Gradle and I recommend we use these docs for the 2.5.0 release. Assuming
> > the release goes well, we can declare the migration fully validated and
> > stop supporting dual build systems.
> >
> > During the migration we identified a number of opportunities to improve
> > the build even further. Feel free to grab one of the items off of the
> > JIRA: BEAM-4045 [25]
> >
> > Thanks again to all those that contributed. This has truly been a
> > community effort!
> >
> > [1] https://lists.apache.org/thread.html/5f6bae323acc1b050962e68ec310613e0121b05bc5c42915c536fb59@%3Cdev.beam.apache.org%3E
> > [2] https://github.com/apache/beam/pull/5142
> > [3] https://github.com/apache/beam/pull/5146
> > [4] https://github.com/apache/beam/pull/5003
> > [5] https://github.com/apache/beam/pull/5151
> > [6] https://github.com/apache/beam/pull/5121
> > [7] https://github.com/apache/beam/pull/5104
> > [8] https://github.com/apache/beam/pull/5183
> > [9] https://github.com/apache/beam/pull/5171

Re: Gradle Status: Migrated!

2018-05-01 Thread Jean-Baptiste Onofré
By the way, I'm curious: did anyone evaluate the build-time gap between Maven
and Gradle? One of the main reasons to migrate to Gradle was incremental builds
and build time. The builds I have launched are about the same in duration. I
will run deeper tests to evaluate the gap.

Regards
JB

On 05/01/2018 12:48 PM, Łukasz Gajowy wrote:
> Hi Scott, 
> 
> thanks for the update! Just a clarification about IO performance tests:
> those were fully migrated in Beam and all tasks necessary for running
> them are there, but Jenkins jobs still run mvn commands. This is due to
> the fact that the PerfKitBenchmarker code (which is invoked by Jenkins
> and constructs the commands by itself) has not been updated yet. This
> should be finished before fully dropping mvn.
>
> More on that topic here, in comments:
> https://issues.apache.org/jira/browse/BEAM-3942
> The PR changing the commands to Gradle is waiting for PerfKit devs' review here:
> https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/pull/1648
> 
> Best regards,
> 
> 2018-05-01 9:17 GMT+02:00 Romain Manni-Bucau:
>
> Hi Scott
>
> While https://issues.apache.org/jira/plugins/servlet/mobile#issue/BEAM-4057
> is open, Gradle is a competitor to Maven, but Maven must stay the default
> build tool because Gradle breaks users.
>
>
> On May 1, 2018 at 01:59, "Scott Wegner" wrote:
> 
> Many many of you have been hacking diligently on the Gradle build, and
> I'm happy to announce that we now have a fully-functioning Gradle build!
> There's been a ton of progress since our last update [1]:
> 
> * Improved nightly snapshot release [2]
> * Improve runner quickstarts [5] [11]
> * Python post-commit ported to Gradle [3]
> * Update performance testing framework for Gradle [4] [12]
> * Generate javadocs from Gradle [6]
> * Update to latest Gradle version [7] [21]
> * Updated documentation [8] [22]
> * Tune CI build resource usage for Jenkins [9] [19]
> * Improve shading of test jars [10] [13] [14]
> * Add 'errorprone' and 'spotless' static analysis [15] [24]
> * Improve IntelliJ project generation [16] [17]
> * Reduce number of ValidatesRunner tests [18]
> * Update release documentation for Gradle [20]
> * Update docker build scripts for Gradle [23]
> 
> The build process and Jenkins environment have stabilized and we've
> resolved migration blockers. The final step is to use Gradle to produce
> an official release. The release documentation has been updated for
> Gradle and I recommend we use these docs for the 2.5.0 release. Assuming
> the release goes well, we can declare the migration fully validated and
> stop supporting dual build systems.
>
> During the migration we identified a number of opportunities to improve
> the build even further. Feel free to grab one of the items off of the
> JIRA: BEAM-4045 [25]
> 
> Thanks again to all those that contributed. This has truly been a
> community effort!
> 
> [1] https://lists.apache.org/thread.html/5f6bae323acc1b050962e68ec310613e0121b05bc5c42915c536fb59@%3Cdev.beam.apache.org%3E
> [2] https://github.com/apache/beam/pull/5142
> [3] https://github.com/apache/beam/pull/5146
> [4] https://github.com/apache/beam/pull/5003
> [5] https://github.com/apache/beam/pull/5151
> [6] https://github.com/apache/beam/pull/5121
> [7] https://github.com/apache/beam/pull/5104
> [8] https://github.com/apache/beam/pull/5183
> [9] https://github.com/apache/beam/pull/5171
> [10] https://github.com/apache/beam/pull/5117
> [11] https://github.com/apache/beam/pull/5200
> [12] https://github.com/apache/beam/pull/5051
> [13] https://github.com/apache/beam/pull/4740
> [14] https://github.com/apache/beam/pull/4702

Re: Gradle Status: Migrated!

2018-05-01 Thread Łukasz Gajowy
Hi Scott,

thanks for the update! Just a clarification about IO performance tests:
those were fully migrated in Beam and all tasks necessary for running them
are there, but Jenkins jobs still run mvn commands. This is due to the fact
that the PerfKitBenchmarker code (which is invoked by Jenkins and constructs
the commands by itself) has not been updated yet. This should be finished
before fully dropping mvn.

More on that topic here, in comments:
https://issues.apache.org/jira/browse/BEAM-3942
PR changing the commands to gradle is waiting for PerfKit devs review here:
https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/pull/1648

Best regards,

2018-05-01 9:17 GMT+02:00 Romain Manni-Bucau :

> Hi Scott
>
> While https://issues.apache.org/jira/plugins/servlet/mobile#issue/BEAM-4057
> is open, Gradle is a competitor to Maven, but Maven must stay the default
> build tool because Gradle breaks users.
>
>
> On May 1, 2018 at 01:59, "Scott Wegner" wrote:
>
>> Many many of you have been hacking diligently on the Gradle build, and
>> I'm happy to announce that we now have a fully-functioning Gradle build!
>> There's been a ton of progress since our last update [1]:
>>
>> * Improved nightly snapshot release [2]
>> * Improve runner quickstarts [5] [11]
>> * Python post-commit ported to Gradle [3]
>> * Update performance testing framework for Gradle [4] [12]
>> * Generate javadocs from Gradle [6]
>> * Update to latest Gradle version [7] [21]
>> * Updated documentation [8] [22]
>> * Tune CI build resource usage for Jenkins [9] [19]
>> * Improve shading of test jars [10] [13] [14]
>> * Add 'errorprone' and 'spotless' static analysis [15] [24]
>> * Improve IntelliJ project generation [16] [17]
>> * Reduce number of ValidatesRunner tests [18]
>> * Update release documentation for Gradle [20]
>> * Update docker build scripts for Gradle [23]
>>
>> The build process and Jenkins environment have stabilized and we've
>> resolved migration blockers. The final step is to use Gradle to produce an
>> official release. The release documentation has been updated for Gradle and
>> I recommend we use these docs for the 2.5.0 release. Assuming the release
>> goes well, we can declare the migration fully validated and stop supporting
>> dual build systems.
>>
>> During the migration we identified a number of opportunities to improve
>> the build even further. Feel free to grab one of the items off of the JIRA:
>> BEAM-4045 [25]
>>
>> Thanks again to all those that contributed. This has truly been a
>> community effort!
>>
>> [1] https://lists.apache.org/thread.html/5f6bae323acc1b050962e68ec310613e0121b05bc5c42915c536fb59@%3Cdev.beam.apache.org%3E
>> [2] https://github.com/apache/beam/pull/5142
>> [3] https://github.com/apache/beam/pull/5146
>> [4] https://github.com/apache/beam/pull/5003
>> [5] https://github.com/apache/beam/pull/5151
>> [6] https://github.com/apache/beam/pull/5121
>> [7] https://github.com/apache/beam/pull/5104
>> [8] https://github.com/apache/beam/pull/5183
>> [9] https://github.com/apache/beam/pull/5171
>> [10] https://github.com/apache/beam/pull/5117
>> [11] https://github.com/apache/beam/pull/5200
>> [12] https://github.com/apache/beam/pull/5051
>> [13] https://github.com/apache/beam/pull/4740
>> [14] https://github.com/apache/beam/pull/4702
>> [15] https://github.com/apache/beam/pull/4701
>> [16] https://github.com/apache/beam/pull/4626
>> [17] https://github.com/apache/beam/pull/4625
>> [18] https://github.com/apache/beam/pull/5193
>> [19] https://github.com/apache/beam/pull/5222
>> [20] https://github.com/apache/beam/pull/5187
>> [21] https://github.com/apache/beam/pull/5217
>> [22] https://github.com/apache/beam/pull/5115
>> [23] https://github.com/apache/beam/pull/5252
>> [24] https://github.com/apache/beam/pull/5161
>> [25] https://issues.apache.org/jira/browse/BEAM-4045
>> --
>>
>>
>> Got feedback? http://go/swegner-feedback
>>
>


Build failed in Jenkins: beam_Release_Gradle_NightlySnapshot #25

2018-05-01 Thread Apache Jenkins Server
See 


Changes:

[ankurgoenka] fixing generic errors

[echauchot] [BEAM-3119] ensure metrics thread is shutdown

[echauchot] [BEAM-4088] Rebase on master, fix conflicts and adapt test to master

[echauchot] [BEAM-4088] Add a test with a real pipeline that has a DoFn with 
metrics

[thw] [BEAM-4131] Include SDK into Python SDK harness container.

[herohde] [BEAM-4175] Fix direct ouput pardo issue

[aljoscha.krettek] [BEAM-3909] Add tests for Flink DoFnOperator side-input 
checkpointing

[echauchot] [BEAM-4088] Blocks until the threads in metricsExecutorService is 
done,

[pgerver] Enhance awsCredentialsProvider option description

[kenn] Add errorprone to basic Gradle config

[kenn] Fix suppressed failure in ResourceIdTester

[kenn] Fix erroneous tests in ViewTest

[kenn] Add missing @Test in CollectionCoderTest

[kenn] Sickbay broken LocalResourceIdTest case

[kenn] Remove inaccurate GuardedBy in both copies of DirectMetrics

[kenn] Fix suppressed failure in RetryHttpRequestInitializerTest

[kenn] Remove unnecessary type arguments in ReduceFnRunnerTest

[kenn] Add missing @Test in SplittableParDoProcessFnTest

[kenn] Remove unnecessary type argument in PipelineTranslationTest

[kenn] Fix compile-type constant int overflow in S3FileSystem

[kenn] Disable findbugs for JdbcIO where it gave false positives, now that we

[kenn] Sickbay broken GcsResourceIdTest case

[kenn] Sickbay broken HadoopResourceIdTest case

[kenn] Sickbay broken SplittableParDoProcessFnTest case

[kenn] Add missing @Test in S3ResourceIdTest

[kenn] Fix typo in CassandraIOIT

[kenn] Fix nonsensical @RunWith in ExampleEchoPipelineTest

[kenn] Fix missing @Test in both copies of WatermarkManagerTest

[kenn] Fix misuse of Arrays.asList in KinesisMockWriteTest

[kenn] Remove extraneous type variables in MongoDBGridFSIOTest

[kenn] Sickbay broken S3ResourceIdTest cases

[kenn] Sickbay broken WatermarkManagerTest case

[apilloud] [BEAM-3983][SQL] Add BigQuery table provider

[iemejia] Fix gradle script to build the docker image of 'sdks/java/container'

[thw] cleanup

[Pablo] Creation of utils.py whit CountingSource class on it.

[Pablo] Changed import from examples.snippets to io.utils

--
[...truncated 2.39 MB...]
(see http://errorprone.info/bugpattern/NullablePrimitive)
  Did you mean to remove this line?
:186:
 warning: [NullablePrimitive] @Nullable should not be used for primitive types 
since they cannot be null
  @Nullable
  ^
(see http://errorprone.info/bugpattern/NullablePrimitive)
  Did you mean to remove this line?
:284:
 warning: [NullablePrimitive] @Nullable should not be used for primitive types 
since they cannot be null
  @Nullable
  ^
(see http://errorprone.info/bugpattern/NullablePrimitive)
  Did you mean to remove this line?
:92:
 warning: [ImmutableEnumChecker] enums should be immutable: 'NexmarkSuite' has 
field 'configurations' of type 
'java.util.List', 'List' is 
mutable
  private final List configurations;
   ^
(see http://errorprone.info/bugpattern/ImmutableEnumChecker)
:40:
 warning: [ImmutableEnumChecker] enums should be immutable: 'Tag' has non-final 
field 'value'
private int value = -1;
^
(see http://errorprone.info/bugpattern/ImmutableEnumChecker)
  Did you mean 'private final int value = -1;'?
:
 warning: Cannot find annotation method 'value()' in type 'DefaultAnnotation'
:41:
 warning: [MutableConstantField] Constant field declarations should use the 
immutable type (such as ImmutableList) instead of the general collection 
interface type (such as List)
  public static final Map ADAPTERS =
 ^
(see http://errorprone.info/bugpattern/MutableConstantField)
  Did you mean 'public static final ImmutableMap 
ADAPTERS ='?

Re: Gradle Status: Migrated!

2018-05-01 Thread Romain Manni-Bucau
Hi Scott

While https://issues.apache.org/jira/plugins/servlet/mobile#issue/BEAM-4057
is open, Gradle is a competitor to Maven, but Maven must stay the default
build tool because Gradle breaks users.


On May 1, 2018 at 01:59, "Scott Wegner" wrote:

> Many many of you have been hacking diligently on the Gradle build, and I'm
> happy to announce that we now have a fully-functioning Gradle build!
> There's been a ton of progress since our last update [1]:
>
> * Improved nightly snapshot release [2]
> * Improve runner quickstarts [5] [11]
> * Python post-commit ported to Gradle [3]
> * Update performance testing framework for Gradle [4] [12]
> * Generate javadocs from Gradle [6]
> * Update to latest Gradle version [7] [21]
> * Updated documentation [8] [22]
> * Tune CI build resource usage for Jenkins [9] [19]
> * Improve shading of test jars [10] [13] [14]
> * Add 'errorprone' and 'spotless' static analysis [15] [24]
> * Improve IntelliJ project generation [16] [17]
> * Reduce number of ValidatesRunner tests [18]
> * Update release documentation for Gradle [20]
> * Update docker build scripts for Gradle [23]
>
> The build process and Jenkins environment have stabilized and we've
> resolved migration blockers. The final step is to use Gradle to produce an
> official release. The release documentation has been updated for Gradle and
> I recommend we use these docs for the 2.5.0 release. Assuming the release
> goes well, we can declare the migration fully validated and stop supporting
> dual build systems.
>
> During the migration we identified a number of opportunities to improve
> the build even further. Feel free to grab one of the items off of the JIRA:
> BEAM-4045 [25]
>
> Thanks again to all those that contributed. This has truly been a
> community effort!
>
> [1] https://lists.apache.org/thread.html/5f6bae323acc1b050962e68ec310613e0121b05bc5c42915c536fb59@%3Cdev.beam.apache.org%3E
> [2] https://github.com/apache/beam/pull/5142
> [3] https://github.com/apache/beam/pull/5146
> [4] https://github.com/apache/beam/pull/5003
> [5] https://github.com/apache/beam/pull/5151
> [6] https://github.com/apache/beam/pull/5121
> [7] https://github.com/apache/beam/pull/5104
> [8] https://github.com/apache/beam/pull/5183
> [9] https://github.com/apache/beam/pull/5171
> [10] https://github.com/apache/beam/pull/5117
> [11] https://github.com/apache/beam/pull/5200
> [12] https://github.com/apache/beam/pull/5051
> [13] https://github.com/apache/beam/pull/4740
> [14] https://github.com/apache/beam/pull/4702
> [15] https://github.com/apache/beam/pull/4701
> [16] https://github.com/apache/beam/pull/4626
> [17] https://github.com/apache/beam/pull/4625
> [18] https://github.com/apache/beam/pull/5193
> [19] https://github.com/apache/beam/pull/5222
> [20] https://github.com/apache/beam/pull/5187
> [21] https://github.com/apache/beam/pull/5217
> [22] https://github.com/apache/beam/pull/5115
> [23] https://github.com/apache/beam/pull/5252
> [24] https://github.com/apache/beam/pull/5161
> [25] https://issues.apache.org/jira/browse/BEAM-4045
> --
>
>
> Got feedback? http://go/swegner-feedback
>


Re: Kafka connector for Beam Python SDK

2018-05-01 Thread Chamikara Jayalath
Thanks all for the comments. Based on the discussion so far, it looks like we
have to flesh out the cross-language transforms feature quite a bit before
we can use some of the existing Java IOs in other SDKs. This might
involve redesigning some of the existing Java IOs to allow expressing
second-order APIs in other languages without significantly affecting
execution performance. Also, several people agreed that adding an SDF-based
Kafka source to the Python SDK will allow us to better iron out the SDF API.
I'll start prototyping the Kafka connector. Please follow
https://issues.apache.org/jira/browse/BEAM-3788 for details.

Thanks,
Cham

On Mon, Apr 30, 2018 at 4:08 PM Kenneth Knowles  wrote:

> The numbers on that PR are not really what end-to-end means to me - it
> normally means you have a fully represented productionized use case and the
> metric you are looking at is the actual impact on the full system (like
> latency from a tap on mobile to a dashboard being updated, or monthly
> compute cost for a system). Also FWIW when I say "TableRow" I don't mean
> the JSON wire format for them. I also believe that Luke's proposal has
> never been measured.
>
> But it is an obvious & fair point that encoding the table repeatedly will
> bloat compared to a second order transform. However in an end to end test
> you'd want to compare against ways of shuffling compact proxies and joining
> back to the full value, depending on the portability overhead. But, I
> agree, as you say, there's probably a decent design space of blending a
> portable first-order transform with little second-order helpers. Or perhaps
> with a shared memory layout we can have reasonable performance with a
> cross-language higher-order situation.
>
> Kenn
>
>
>
> On Mon, Apr 30, 2018 at 10:54 AM Eugene Kirpichov 
> wrote:
>
>> I think we've discussed this before... It is true that all of our
>> second-order APIs can be re-expressed as first-order APIs, but that would
>> come at a very serious performance cost - e.g. significant increase in
>> amount of data shuffled / materialized. The second-order APIs (most
>> importantly, Dynamic Destinations in BigQueryIO and FileIO.write) were
>> deliberately designed this way to minimize amount of data shuffled (grouped
>> by destination) and postpone the redundant representation (record,
>> destination) as late as possible in the graph.
>>
>> Experience from https://github.com/apache/beam/pull/3894 shows that
>> getting rid of TableRow in the representation alone gives 2x-3x end-to-end
>> performance gains. Gains from avoiding redundancy are likely still higher.
>> Some back-of-the-envelope calculations:
>> - Imagine we're writing purchases to BigQuery tables (store_id,
>> customer_id, product_id, quantity), one table per store, e.g.
>> "stores_dataset:purchases_$storeid".
>> - With the second-order representation, we group records (store_id,
>> customer_id, product_id, quantity) by store_id - both key and value are a
>> handful of bytes.
>> - With the first-order representation, we group records of the form
>> {table_name:"stores_dataset:purchases_$storeid", row: {"store_id":"...",
>> "customer_id":"...", "product_id":"...", "quantity":"..."}} which is many
>> times more in encoded form, both key and value. Schemas, perhaps, might
>> help with encoding the TableRow less redundantly, but a) I don't know if
>> that's actually the case b) the schemas work will also take more time to
>> land in non-Java SDKs c) this doesn't get rid of the redundancy in keys (at
>> a minimum, in a cross-language world, we'll definitely need to *encode* all
>> such keys redundantly)
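
The back-of-the-envelope comparison above can be sketched roughly as follows.
This is only an illustration under loud assumptions: compact JSON stands in for
whatever coders a real pipeline would use, the function names and the sample
purchase are hypothetical, and nothing here calls Beam or BigQuery. It simply
counts the key and value bytes that would be grouped in each representation.

```python
import json


def second_order_size(record):
    """Bytes grouped when the compact record is keyed by store_id alone."""
    key = str(record["store_id"]).encode("utf-8")
    value = json.dumps(
        [record["customer_id"], record["product_id"], record["quantity"]],
        separators=(",", ":"),
    ).encode("utf-8")
    return len(key) + len(value)


def first_order_size(record):
    """Bytes grouped when every element carries its table name and a self-describing row."""
    table_name = f"stores_dataset:purchases_{record['store_id']}"
    key = table_name.encode("utf-8")
    value = json.dumps(
        {"table_name": table_name, "row": {k: str(v) for k, v in record.items()}},
        separators=(",", ":"),
    ).encode("utf-8")
    return len(key) + len(value)


# Hypothetical sample purchase, matching the (store_id, customer_id, product_id, quantity) shape.
purchase = {"store_id": 42, "customer_id": 1001, "product_id": 77, "quantity": 3}
compact = second_order_size(purchase)
verbose = first_order_size(purchase)
print(f"second-order: {compact} bytes, first-order: {verbose} bytes, "
      f"ratio {verbose / compact:.1f}x")
```

The exact ratio depends entirely on the encoding chosen, so treat the printed
numbers as a shape of the argument rather than a measurement.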
>>
>> I.e., yes, first-order APIs are possible, and likely even desirable for
>> the first version of cross-language IO, but I do not believe that staying
>> first-order-only is an acceptable long-term state.
>>
>> Note also that second-order doesn't necessarily require ability to invoke
>> cross-language lambdas - there are other approaches, e.g. rewriting in the
>> SDK language only the sub-parts of the transforms that actually use the
>> lambdas.
>>
>> On Mon, Apr 30, 2018 at 10:34 AM Lukasz Cwik  wrote:
>>
>>> I believe that most (all?) of these cases of executing a lambda could be
>>> avoided if we passed along structured records like:
>>> {
>>>   table_name:
>>>   row: { ... }
>>> }
>>>
>>>
>>> On Mon, Apr 30, 2018 at 10:24 AM Chamikara Jayalath <
>>> chamik...@google.com> wrote:
>>>


 On Mon, Apr 30, 2018 at 9:54 AM Kenneth Knowles  wrote:

> I agree with Cham's motivations as far as "we need it now" and getting
> Python SDF up and running and exercised on a real connector.
>
> But I do find the current API of BigQueryIO to be a poor example. That
> particular functionality on BigQueryIO seems extraneous and goes against
> our own style guide [1]. The recommended way to write it would be for
> BigQueryIO to output a