Re: Master broken

2019-01-18 Thread Michael Luckey
Something weird is going on here. A 'clean check' does not re-execute spotlessJava here after execution of spotlessApply... a --rerun-tasks triggers that failure repeatedly. Regarding the license header, could it be that you executed spotlessApply once and now it is fetched from the build cache

Re: Master broken

2019-01-18 Thread Michael Luckey
As far as I understand, Kenn is right. Spotless by default formats all sourceSets. As antlr adds its generated code as a sourceSet, this will probably trigger here. Dunno if other generators do differently, though, i.e. not adding as sources... On Fri, Jan 18, 2019 at 9:04 PM Reuven Lax wrote: > No

Re: Master broken

2019-01-18 Thread Michael Luckey
It is on running spotlessJava, not spotlessApply On Fri, Jan 18, 2019 at 5:02 PM Reuven Lax wrote: > I don't get those errors when I run spotlessApply, and I don't see those > errors happening on Jenkins. Are you doing anything special to run > spotless? In general, I don't think spotless was ru

Re: gradle clean causes long-running python installs

2019-01-18 Thread Michael Luckey
What does `setup.py clean` do anyway? Only removing a 'build' output folder? Or something more sophisticated? Cause those pyc files still remain... if looking into differences between a 'Gradle cleaned' sdks/python and 'git cleaned' one... On Fri, Jan 18, 2019 at 8:43 PM Kenneth Knowles wrote:

Re: org.apache.beam.sdk.io.FileIOTest.testMatchWatchForNewFiles failing

2019-01-18 Thread Boyuan Zhang
The BEAM-6352 is tracking this failure. On Fri, Jan 18, 2019 at 8:48 PM Reuven Lax wrote: > This test is consistently failing for me. Has anyone else seen this? > > Reuven >

org.apache.beam.sdk.io.FileIOTest.testMatchWatchForNewFiles failing

2019-01-18 Thread Reuven Lax
This test is consistently failing for me. Has anyone else seen this? Reuven

Re: Master broken

2019-01-18 Thread Reuven Lax
FYI I've been trying to merge pr/7545 to fix things. However a number of flaky Beam tests are making it slow to get a green run. On Fri, Jan 18, 2019 at 12:56 PM Kenneth Knowles wrote: > What I mean is that every Gradle project by virtue of having the Java (or > other) plugin has explicit lists

Re: [Proposal] Requesting PMC approval to start planning for Beam Summits 2019

2019-01-18 Thread Ahmet Altay
Thank you Joana. Kenn and PMC members could you comment on what needs to be done to move this forward? On Thu, Jan 17, 2019 at 3:40 PM joanafil...@google.com < joanafil...@google.com> wrote: > Dear Project Management Committee, > > > The Beam Summits are community events funded by a Sponsoring C

Re: Beam Contribution

2019-01-18 Thread Kenneth Knowles
Done & welcome! On Fri, Jan 18, 2019 at 1:53 PM Daniel Chen wrote: > Hello, > > I'm Daniel Chen from the Samza team at LinkedIn. For my upcoming work on > the Beam Samza runner, I would like to be added as a contributor in order > to facilitate ticket tracking. My ASF jira username is dchen. > >

Confluence wiki edit access request

2019-01-18 Thread Udi Meiri
username: udim Thanks!

Beam Contribution

2019-01-18 Thread Daniel Chen
Hello, I'm Daniel Chen from the Samza team at LinkedIn. For my upcoming work on the Beam Samza runner, I would like to be added as a contributor in order to facilitate ticket tracking. My ASF jira username is dchen. Thanks, Daniel

Re: Master broken

2019-01-18 Thread Kenneth Knowles
What I mean is that every Gradle project by virtue of having the Java (or other) plugin has explicit lists of sources. FWIW I traced it to here: https://github.com/diffplug/spotless/blob/master/plugin-gradle/src/main/java/com/diffplug/gradle/spotless/SpotlessTask.java#L194 which eventually gets it

Re: Master broken

2019-01-18 Thread Reuven Lax
No includes _or_ excludes are specified in our spotless config, which implies that spotless should be scraping everything. On Fri, Jan 18, 2019 at 12:00 PM Kenneth Knowles wrote: > We have many other generated sources that would not pass spotless. It > sounds like the Antlr-generated sources are

Re: Master broken

2019-01-18 Thread Kenneth Knowles
We have many other generated sources that would not pass spotless. It sounds like the Antlr-generated sources are ending up in a source set that spotless runs over. I would assume the Gradle plugin uses those, not a filesystem scrape, to find the files it should process. Worth checking. Kenn On F

Re: gradle clean causes long-running python installs

2019-01-18 Thread Udi Meiri
grpcio-tools could probably be moved under the "test" tag in setup.py. Not sure why it has to be specified in gradle configs. On Fri, Jan 18, 2019 at 11:43 AM Kenneth Knowles wrote: > Can you `setupVirtualEnv` just enough to run `setup.py clean` without > installing grpcio-tools, etc? > > Kenn >

Re: Master broken

2019-01-18 Thread Alan Myrvold
The Jenkins test does a fresh clone of the repo, without generating code before the spotless test. On Fri, Jan 18, 2019 at 11:41 AM Kenneth Knowles wrote: > Those are the paths that cause the Jenkins job to be run. It doesn't > affect the Gradle task. > > Kenn > > On Fri, Jan 18, 2019 at 11:34 A

Re: gradle clean causes long-running python installs

2019-01-18 Thread Kenneth Knowles
Can you `setupVirtualEnv` just enough to run `setup.py clean` without installing grpcio-tools, etc? Kenn On Fri, Jan 18, 2019 at 11:20 AM Udi Meiri wrote: > setup.py has requirements like setuptools, which are installed in the > virtual environment. > So even running the clean command requires t

Re: flink portable runner usage

2019-01-18 Thread Thomas Weise
Hello Hai, Yes, we are working on a use case for Python/Flink that should go to production soon. It's using the Flink runner in *streaming* mode. The source is Kinesis, but we implemented support for Kafka also. You can find that in our Beam fork [1] The Flink runner supports multiple element bun

Re: Master broken

2019-01-18 Thread Kenneth Knowles
Those are the paths that cause the Jenkins job to be run. It doesn't affect the Gradle task. Kenn On Fri, Jan 18, 2019 at 11:34 AM Reuven Lax wrote: > FYI, Jenkins works because it explicitly specifies which paths to run > spotless on, as below. As a result, Jenkins (correctly) does not run >

Re: Master broken

2019-01-18 Thread Reuven Lax
FYI, Jenkins works because it explicitly specifies which paths to run spotless on, as below. As a result, Jenkins (correctly) does not run spotless on generated src. PrecommitJobBuilder builder = new PrecommitJobBuilder( scope: this, nameBase: 'Spotless', gradleTask: 'spotlessCheck',

Re: Confusing sentence in Windowing section in Beam programming guide

2019-01-18 Thread Kenneth Knowles
That is correct. For the global window there is no such thing as late data. Kenn On Fri, Jan 18, 2019, 11:13 Ruoyun Huang wrote: > Very helpful discussion (and the fixing PR). > > To make sure my takeaway is correct. The status quo is a) "for a Global > Window, then there is *no possible scenario* where data

Re: gradle clean causes long-running python installs

2019-01-18 Thread Udi Meiri
setup.py has requirements like setuptools, which are installed in the virtual environment. So even running the clean command requires the virtualenv to be set up. A possible fix could be to skip :beam-sdks-python:cleanPython if setupVirtualenv has not been run. (perhaps by checking for the existen

Re: Confusing sentence in Windowing section in Beam programming guide

2019-01-18 Thread Ruoyun Huang
Very helpful discussion (and the fixing PR). To make sure my takeaway is correct. The status quo is a) "for a Global Window, then there is *no possible scenario* where data is identified as late". Rather than b) "for a global window we *no longer* compare watermark to identify late data, but *the

Re: The full list of proposals / prototype documents

2019-01-18 Thread Alex Van Boxel
Typically me... I just clicked on 2 of the 3 that were not shared. I went over all of the proposals to see where I needed to get access; here is the list: SQL / Schema - Pubsub to Beam SQL [doc] - Calcite/

Re: Master broken

2019-01-18 Thread Reuven Lax
Thanks, working on a PR now to exclude generated code. I wonder if this is why spotless has always been so slow. On Fri, Jan 18, 2019 at 8:28 AM Ismaël Mejía wrote: > What command are you running to build? > This issue was reported also by other users in the slack channel. > Agree the fix should

Re: Adding KMS support to generic filesystem interface

2019-01-18 Thread Udi Meiri
Hi Ismaël, I'd like your feedback, especially from the AWS perspective. I wasn't aware of BEAM-3821, but I did create a JIRA for Cloud KMS support on GCS: https://issues.apache.org/jira/browse/BEAM-5959 Some details of my plan for KMS support: 1. Add KMS settings to sources and sinks. 2. Add a --k

Re: [spark runner based on dataset POC] your opinion

2019-01-18 Thread Gleb Kanterov
Agree with Kenn. It should be possible: Spark has a similar concept called ExpressionEncoder, and I was doing a similar derivation using a Scala macro in typelevel/frameless. Most of the code in Beam is a blackbox function in ParDo, and the only way to translate it

Re: The full list of proposals / prototype documents

2019-01-18 Thread Alexey Romanenko
Hi Alex, Hmm, afaik, these are mostly Google Docs files shared with anyone who has the link. Could you send here the names of the proposals that required access approval? Thanks. > On 18 Jan 2019, at 16:58, Alex Van Boxel wrote: > > Hey Alexey, > > I see that a lot (well, I tried 2) pro

Re: [PROPOSAL] Prepare Beam 2.10.0 release

2019-01-18 Thread Scott Wegner
For BEAM-6352, I have a rollback ready for review: https://github.com/apache/beam/pull/7540 Conversation about the decision to rollback vs. roll-forward for this change is on the JIRA issue. On Fri, Jan 18, 2019 at 8:22 AM Maximilian Michels wrote: > I've created the revert for the pipeline opti

Re: Master broken

2019-01-18 Thread Ismaël Mejía
What command are you running to build? This issue was reported also by other users in the slack channel. Agree the fix should be trivial On Fri, Jan 18, 2019 at 5:13 PM Reuven Lax wrote: > > Does this only happen on fresh clones? I created a fresh branch synced to > origin/master, and I can't re

Re: Adding KMS support to generic filesystem interface

2019-01-18 Thread Ismaël Mejía
Hello Udi, I implemented the support for KMS in Amazon and I am really interested in checking your PR. However, I won't have time to do it until next Monday. I hope waiting a bit is ok with you if you want some feedback from me. I am curious if you considered or are aware of this issue: BEAM-3821 Sup

Re: Master broken

2019-01-18 Thread Reuven Lax
It's failing because the first line of your generated file does not match the license header (the first line is "Generated from "). Interestingly, when I look at my generated files, the first line _is_ the license file, which is why spotless doesn't fail for me. What's more, I assume the same

Re: [PROPOSAL] Prepare Beam 2.10.0 release

2019-01-18 Thread Maximilian Michels
I've created the revert for the pipeline options parsing which we agreed on: https://github.com/apache/beam/pull/7564 On 17.01.19 15:16, Maximilian Michels wrote: An issue with the Flink Runner when restarting streaming pipelines: https://jira.apache.org/jira/browse/BEAM-6460 Looks like it wil

Re: Master broken

2019-01-18 Thread Reuven Lax
Does this only happen on fresh clones? I created a fresh branch synced to origin/master, and I can't reproduce this still. If spotless is running against generated code, that seems like a bug in our spotless setup. Should be trivial to fix by creating a target block in our spotless config. Reuven
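
For readers following the thread, a minimal sketch of what such a target block could look like in a Gradle build script. It assumes the Spotless Gradle plugin is already applied to the project; the include and exclude patterns are illustrative guesses, not taken from the thread or from the eventual fix.

```groovy
spotless {
  java {
    // Limit Spotless to checked-in sources instead of every file in the
    // Java source sets. Both patterns below are hypothetical examples.
    target project.fileTree(project.projectDir) {
      include 'src/*/java/**/*.java'   // hand-written sources only
      exclude '**/build/**'            // skip anything generated under build/
    }
  }
}
```

With an explicit target, generated ANTLR output that lands in a build directory would no longer be picked up, whichever source set it is registered in.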

Re: The full list of proposals / prototype documents

2019-01-18 Thread Ismaël Mejía
We should make it mandatory to add the document to the design documents page before sending it to the mailing list, to avoid missing information. Or even better, that we have a more formal improvement proposal so we also know the status of the proposals, to know if they are accepted, refused, etc (curren

Re: Master broken

2019-01-18 Thread Ismaël Mejía
Just make a fresh clone and run `./gradlew check -p sdks/java/core`; it should break. If you add 'spotlessApply' the build passes, but this should not be the default, no? The default is 'spotlessCheck'. On Fri, Jan 18, 2019 at 5:02 PM Reuven Lax wrote: > > I don't get those errors when I run spotle

Re: Master broken

2019-01-18 Thread Reuven Lax
I don't get those errors when I run spotlessApply, and I don't see those errors happening on Jenkins. Are you doing anything special to run spotless? In general, I don't think spotless was running on generated code before. On Fri, Jan 18, 2019 at 6:06 AM Ismaël Mejía wrote: > When running the bu

Re: The full list of proposals / prototype documents

2019-01-18 Thread Alex Van Boxel
Hey Alexey, I see that a lot of proposals (well, I tried 2) require access approval. Should that be the case? Alex Van Boxel On Fri, Jan 18, 2019 at 4:51 PM Alexey Romanenko wrote: > I’m sorry but I forgot to mention that the whole list could be found here: > https://beam.apache.org/contr

Re: The full list of proposals / prototype documents

2019-01-18 Thread Alexey Romanenko
I’m sorry but I forgot to mention that the whole list could be found here: https://beam.apache.org/contribute/design-documents/ > On 18 Jan 2019, at 16:49, Alexey Romanenko wrote: > > FYI: I updated the list of design documents to make it

Re: The full list of proposals / prototype documents

2019-01-18 Thread Alexey Romanenko
FYI: I updated the list of design documents to make it up-to-date. PR: https://github.com/apache/beam/pull/7560 Please, feel free to add new ones if I missed something. Also, I’d like to remind that it would be very helpful to add design document to thi

Re: [spark runner based on dataset POC] your opinion

2019-01-18 Thread Kenneth Knowles
I wonder if this could tie in with Reuven's recent work. He's basically making it so every type with an "obvious" schema automatically converts to/from Row whenever needed. Sounds like a similar need, superficially. Kenn On Fri, Jan 18, 2019, 02:36 Manu Zhang wrote: > Hi Etienne, > > I see your point. I'

Master broken

2019-01-18 Thread Ismaël Mejía
When running the build on master we got an error message. Looks related to the recent inclusion/generation of stuff with ANTLR. Can Reuven or someone else involved in these specific changes please take a look? > Task :beam-sdks-java-core:spotlessJava FAILED FAILURE: Build failed with an exception.

Re: [spark runner based on dataset POC] your opinion

2019-01-18 Thread Manu Zhang
Hi Etienne, I see your point. I'm a bit worried that every ParDo has to be wrapped in a `mapPartition` which introduces cost of serde and forgoes the benefits of Dataset API. Maybe Dataset is not the best idea to integrate Beam with Spark. Just my $0.02. Manu On Thu, Jan 17, 2019 at 10:44 PM Et