Re: SNAPSHOTS have not been updated since february

2019-03-26 Thread Daniel Oliveira
I made a bug for this specific issue (artifacts not publishing to the
Apache Maven repo): https://issues.apache.org/jira/browse/BEAM-6919

While I was gathering info for the bug report I also noticed +Yifan Zou
 has an experimental PR testing a fix:
https://github.com/apache/beam/pull/8148

On Tue, Mar 26, 2019 at 11:42 AM Boyuan Zhang  wrote:

> +Daniel Oliveira 
>
> On Tue, Mar 26, 2019 at 9:57 AM Boyuan Zhang  wrote:
>
>> Sorry for the typo. Ideally, the snapshot publish is *independent* from
>> postrelease_snapshot.
>>
>> On Tue, Mar 26, 2019 at 9:55 AM Boyuan Zhang  wrote:
>>
>>> Hey,
>>>
>>> I'm trying to publish the artifacts by commenting "Run Gradle Publish"
>>> in my PR, but there are several errors saying "cannot write artifacts
>>> into dir". Does anyone have an idea about this? Ideally, the snapshot
>>> publish is dependent from postrelease_snapshot. The publish task is to
>>> build and publish artifacts and the postrelease_snapshot is to verify
>>> whether the snapshot works.
>>>
>>> On Tue, Mar 26, 2019 at 8:45 AM Ahmet Altay  wrote:
>>>
>>>> I believe this is related to
>>>> https://issues.apache.org/jira/browse/BEAM-6840 and +Boyuan Zhang
>>>> has a fix in progress
>>>> https://github.com/apache/beam/pull/8132
>>>>
>>>> On Tue, Mar 26, 2019 at 7:09 AM Ismaël Mejía  wrote:
>>>>
>>>>> I was trying to validate a fix on the Spark runner and realized that
>>>>> Beam SNAPSHOTS have not been updated since February 24!
>>>>>
>>>>> https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-sdks-java-core/2.12.0-SNAPSHOT/
>>>>>
>>>>> Can somebody please take a look at why this has not been updated?
>>>>>
>>>>> Thanks,
>>>>> Ismaël
>>>>>



Re: Build blocking on

2019-03-26 Thread Kenneth Knowles
Exactly. What you have said is what we should move towards IMO.

On Tue, Mar 26, 2019 at 4:02 PM Michael Luckey  wrote:

>
>
> On Tue, Mar 26, 2019 at 11:40 PM Kenneth Knowles  wrote:
>
>> +1 to separating integration tests from "build". It should be able to
>> succeed without internet access (if deps are cached).
>>
>
> Big +1 here.
>
> I'd even suggest the following:
> - ./gradlew build must succeed 'offline' (if deps cached)
> - no flaky tests on build target. Flaky tests should be 'offloaded' to
> preCommit (probably)
>
>   Main point here from my side would be user experience. If someone
> downloads released source code, she must be able to just do a './gradlew
> build publishToMavenLocal' without failing tests. Not fully understood
> python/go in this context, though.
>
> - no docker tests on build target
>
>   imho just too much required infrastructure for a 'build'
>
>
>
>
>>
>> On Tue, Mar 26, 2019 at 3:18 PM Michael Luckey 
>> wrote:
>>
>>> Of course, we could implement something here. But I am worried about the
>>> consequences. As gogradle writes into (user) global state this would have
>>> unexpected side effects.
>>>
>>> Consider a developer running Project A - which happens to also use
>>> gogradle - and during build of that project issues an innocent beam clean.
>>> Would be no fun to track all that failures arising out of those now deleted
>>> state.
>>>
>>> Unfortunately, I do not have a clear understanding yet, what's going on
>>> here, why that cache is put there and how it gets inconsistent. Need to
>>> look into that. (But might be obsoleted anyway by Roberts plans to drop
>>> gogradle for something different.)
>>>
>>> On Tue, Mar 26, 2019 at 7:28 PM Thomas Weise  wrote:
>>>
 Can this be addressed by having "clean" remove all state that gogradle
 leaves behind? This staleness issue has bitten me a few times also and it
 would be good to have a reliable way to deal with it, even if it involves
 an extra clean.


 On Tue, Mar 26, 2019 at 11:14 AM Michael Luckey 
 wrote:

> @Udi
> Did you try to just delete the
> '/usr/local/google/home/ehudm/.gradle/go/repo/cloud.google.com'
> folder?
>
> @Robert
> As said before, I am a bit scared about the implications. Shelling out
> is done by python, and from build perspective, this does not work very
> well, unfortunately. I.e. no caching, up-to-date checks etc...
>
> But of course, we need to play with this a bit more.
>
> On Tue, Mar 26, 2019 at 6:24 PM Robert Burke 
> wrote:
>
>> Reading the error from the gradle scan, it largely looks like some
>> part of the GCP dependencies for the build depends on a package where
>> the commit version is no longer around. The main issue with gogradle is
>> that it's entirely distinct from the usual Go workflow, which means the
>> deps users use are likely to be different from what's in the lock file.
>>
>> This work will be tracked in
>> https://issues.apache.org/jira/browse/BEAM-5379
>> GoGradle hasn't moved to support the new Go way of handling deps, so
>> my inclination is to simplify to simple scripts for Gradle that shell out
>> to the Go tool for handling Go dep management, over trying to fix
>> GoGradle.
>>
>> On Tue, 26 Mar 2019 at 09:43, Udi Meiri  wrote:
>>
>>> Robert, from what I recall it's not flaky for me - it consistently
>>> fails. Let me know if there's a way to get more logging about this
>>> error.
>>>
>>> On Mon, Mar 25, 2019, 19:50 Robert Burke  wrote:
>>>
 It's concerning to me that 1) the Go dependency resolution via
 gogradle is flaky, and 2) that it can block other languages.

 I suppose 2) makes sense since it's part of the container
 bootstrapping code, but that makes 1) a serious problem, of which I 
 wasn't
 aware.
 I should have time to investigate this in the next two weeks.

 On Mon, 25 Mar 2019 at 18:08, Michael Luckey 
 wrote:

> Just for the record,
>
> using a vm here, because did not yet get all task running on my
> mac, and did not want to mess with my setup.
>
> So installed vanilla ubuntu-18.04 LTS on virtual box, 26GB ram, 6
> cores and further
>
> sudo apt update
>
> sudo apt install gcc
>
> sudo apt install make
>
> sudo apt install perl
>
> sudo apt install curl
>
> sudo apt install openjdk-8-jdk
>
> sudo apt install python
>
> sudo apt install -y software-properties-common
>
> sudo add-apt-repository ppa:deadsnakes/ppa
>
> sudo apt update
>
> sudo apt install python3.5
>
> sudo apt-get install apt-transport-https ca-certificates curl

Re: Build blocking on

2019-03-26 Thread Michael Luckey
On Tue, Mar 26, 2019 at 11:40 PM Kenneth Knowles  wrote:

> +1 to separating integration tests from "build". It should be able to
> succeed without internet access (if deps are cached).
>

Big +1 here.

I'd even suggest the following:
- ./gradlew build must succeed 'offline' (if deps cached)
- no flaky tests on build target. Flaky tests should be 'offloaded' to
preCommit (probably)

  Main point here from my side would be user experience. If someone
downloads released source code, she must be able to just do a './gradlew
build publishToMavenLocal' without failing tests. Not fully understood
python/go in this context, though.

- no docker tests on build target

  imho just too much required infrastructure for a 'build'
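
The 'offline' point can be tried out with Gradle's --offline switch, which
tells Gradle not to use the network for dependency resolution. A minimal
sketch, assuming the deps are already in the local Gradle cache; the function
name and the GRADLEW override variable are made up for illustration and are
not part of the Beam build:

```shell
# Hypothetical helper: run the build roughly the way a user of a released
# source archive would, with network access for dependency resolution disabled.
# GRADLEW is an assumed override so a different wrapper can be substituted.
offline_build() {
  gradle_cmd="${GRADLEW:-./gradlew}"
  # --offline is a standard Gradle command-line option.
  "$gradle_cmd" --offline build publishToMavenLocal
}
```

Note that --offline only covers Gradle's own dependency resolution. Tests that
open network connections themselves are not stopped by it, so they would still
have to be moved off the build target.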




>
> On Tue, Mar 26, 2019 at 3:18 PM Michael Luckey 
> wrote:
>
>> Of course, we could implement something here. But I am worried about the
>> consequences. As gogradle writes into (user) global state this would have
>> unexpected side effects.
>>
>> Consider a developer running Project A - which happens to also use
>> gogradle - and during build of that project issues an innocent beam clean.
>> Would be no fun to track all that failures arising out of those now deleted
>> state.
>>
>> Unfortunately, I do not have a clear understanding yet, what's going on
>> here, why that cache is put there and how it gets inconsistent. Need to
>> look into that. (But might be obsoleted anyway by Roberts plans to drop
>> gogradle for something different.)
>>
>> On Tue, Mar 26, 2019 at 7:28 PM Thomas Weise  wrote:
>>
>>> Can this be addressed by having "clean" remove all state that gogradle
>>> leaves behind? This staleness issue has bitten me a few times also and it
>>> would be good to have a reliable way to deal with it, even if it involves
>>> an extra clean.
>>>
>>>
>>> On Tue, Mar 26, 2019 at 11:14 AM Michael Luckey 
>>> wrote:
>>>
 @Udi
 Did you try to just delete the
 '/usr/local/google/home/ehudm/.gradle/go/repo/cloud.google.com' folder?

 @Robert
 As said before, I am a bit scared about the implications. Shelling out
 is done by python, and from build perspective, this does not work very
 well, unfortunately. I.e. no caching, up-to-date checks etc...

 But of course, we need to play with this a bit more.

 On Tue, Mar 26, 2019 at 6:24 PM Robert Burke 
 wrote:

> Reading the error from the gradle scan, it largely looks like some
> part of the GCP dependencies for the build depends on a package, where the
> commit version is no longer around. The main issue with gogradle is that
> it's entirely distinct from the usual Go workflow, which means deps users
> use are likely to be different to what's in the lock file.
>
> This work will be tracked in
> https://issues.apache.org/jira/browse/BEAM-5379
> GoGradle hasn't moved to support the new-go way of handling deps, so
> my inclination is to simplify to simple scripts for Gradle that shell out
> the to Go tool for handling Go dep management, over trying to fix 
> GoGradle.
>
> On Tue, 26 Mar 2019 at 09:43, Udi Meiri  wrote:
>
>> Robert, from what I recall it's not flaky for me - it consistently
>> fails. Let me know if there's a way to get more logging about this error.
>>
>> On Mon, Mar 25, 2019, 19:50 Robert Burke  wrote:
>>
>>> It's concerning to me that 1) the Go dependency resolution via
>>> gogradle is flaky, and 2) that it can block other languages.
>>>
>>> I suppose 2) makes sense since it's part of the container
>>> bootstrapping code, but that makes 1) a serious problem, of which I 
>>> wasn't
>>> aware.
>>> I should have time to investigate this in the next two weeks.
>>>
>>> On Mon, 25 Mar 2019 at 18:08, Michael Luckey 
>>> wrote:
>>>
 Just for the record,

 using a vm here, because did not yet get all task running on my
 mac, and did not want to mess with my setup.

 So installed vanilla ubuntu-18.04 LTS on virtual box, 26GB ram, 6
 cores and further

 sudo apt update

 sudo apt install gcc

 sudo apt install make

 sudo apt install perl

 sudo apt install curl

 sudo apt install openjdk-8-jdk

 sudo apt install python

 sudo apt install -y software-properties-common

 sudo add-apt-repository ppa:deadsnakes/ppa

 sudo apt update

 sudo apt install python3.5

 sudo apt-get install apt-transport-https ca-certificates curl
 gnupg-agent software-properties-common

 curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo
 apt-key add -

 sudo apt-key fingerprint 0EBFCD88

 sudo add-apt-repository "deb [arch=amd64]
 

Re: Build blocking on

2019-03-26 Thread Michael Luckey
On Tue, Mar 26, 2019 at 11:18 PM Mikhail Gryzykhin 
wrote:

> I believe what happens is that testPy2Gcp actually runs integration tests
> that try to connect to GCP.
>

Actually I was hoping for an explanation like this. Any suggestion how I
could confirm that on my end?


> Without having GCP cluster and configuration on your machine I'd expect
> these tests to fail.
>

Hmm... here I am actually unsure what the best way to handle such cases
would be.

If I understand correctly, we currently skip some tests which do not meet
expectations, kind of 'can not run on your arch' thingies... So I am
undecided whether I'd prefer those tests to be skipped if gcp
configuration is missing.

pro
* dev is still able to run the tests (whichever task they are associated
with) without having to separate the failures out. For instance, this
'testPy2Gcp' task does actually execute 'some tests' - which might already
be covered by some other calls... But I definitely do not like the idea of
putting the burden on the developer to track which tasks/tests can be
executed on a local machine. Unless this distinction is really coarse - and
pre/postcommit is something I really would like to be able to run locally...


con
* we definitely need to make sure, those tests are not accidentally skipped
on CI servers.
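
One way to get the 'pro' without the 'con' would be a guard that skips
locally but refuses to skip on CI. A rough sketch only: it assumes
credentials are signalled through GOOGLE_APPLICATION_CREDENTIALS and that CI
servers export CI=true. Neither assumption is taken from the Beam build, and
the function name is hypothetical:

```shell
# Hypothetical guard deciding what to do with GCP-dependent tests.
# Assumptions: credentials are signalled via GOOGLE_APPLICATION_CREDENTIALS,
# and CI servers export CI=true.
gcp_test_mode() {
  if [ -n "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
    echo run    # configuration present: run the tests
  elif [ "${CI:-}" = "true" ]; then
    echo fail   # never skip silently on a CI server
  else
    echo skip   # local developer without gcp setup
  fi
}
```

A VM running inside gcp may obtain credentials by other means than this
variable, so the check above would have to be extended for that case.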


>
> I'd say we should remove testPy2Gcp task from "build" task and explicitly
> keep it as integration test.
>
> --Mikhail
>
>
> On Tue, Mar 26, 2019 at 3:12 PM Michael Luckey 
> wrote:
>
>>
>>
>> On Tue, Mar 26, 2019 at 10:29 PM Udi Meiri  wrote:
>>
>>> Luckey, I couldn't recreate your issue, but I still haven't done a full
>>> build.
>>> I created a new GCE VM with using the ubuntu-1804-bionic-v20190212a
>>> image (n1-standard-4 machine type).
>>>
>>> Ran the following:
>>> sudo apt-get update
>>> sudo apt-get install python-pip
>>> sudo apt-get install python-virtualenv
>>> git clone https://github.com/apache/beam.git
>>> cd beam
>>> ./gradlew :beam-sdks-python:testPy2Gcp
>>> [failed: no JAVA_HOME]
>>> sudo apt-get install openjdk-8-jdk
>>> ./gradlew :beam-sdks-python:testPy2Gcp
>>>
>>> Got: BUILD SUCCESSFUL in 7m 52s
>>>
>>
>> Nice. Thanks a lot for your help here.
>>
>> If I understand correctly, this VM is already located within gcp. Could
>> it already have some setup, which needs to be done on 'my' VM? For instance
>> I was contemplating about that test trying 'to call home', but as I am
>> (unfortunately ;) no googler and do not have any gcp specific setup, fails
>> here but misses to timeout? This is just some weird assumption, did not yet
>> look into the actual implementation.
>>
>> Which I seemingly need to do here :(
>>
>>
>>> Then I tried:
>>> ./gradlew build
>>>
>>> And ran out of disk space. :) (beam/ is taking 4.5G and the VM boot disk
>>> is 10G total)
>>>
>>
>> Ouch :D
>>
>>
>>>
>>> On Tue, Mar 26, 2019 at 1:35 PM Robert Burke  wrote:
>>>
 Michael, your concern is reasonable, especially with the experience
 with python, though that does help me bootstrap this work. :)

 The go tools provide caching and avoid redoing work if the source files
 haven't changed. This applies most particularly for `go build` and `go
 test`. As long as the go code isn't changing at every invocation, this
 should be fine. I'm not aware of the same being the case for the usual
 python tools.

  The real trick is ensuring a valid and consistent environment for the
 go code.

 The environment question becomes easier for everyone by moving to go
 modules, which were designed to provide these kinds of consistent builds.
 It also avoids needing a GOPATH set. Any directory is permitted, as long as
 the go.mod is present.

 (The Go SDK doesn't yet us go modules, so go.mod and go.sum aren't yet
 in the repo.)

 The main blocker is see is updating the Jenkins machines to have the
 latest version of Go (1.12) instead of 1.10, which doesn't support modules.
 This only blocks a final submission, rather than the work fortunately.

 On Tue, Mar 26, 2019, 1:08 PM Udi Meiri  wrote:

> "rm -r ~/.gradle/go/repo/" worked for me (there was more than one
> package with issues).
> My ~/.bashrc has
>   export GOPATH=$HOME/go
> so maybe that's making the difference in my setup.
>
> On Tue, Mar 26, 2019 at 11:28 AM Thomas Weise  wrote:
>
>> Can this be addressed by having "clean" remove all state that
>> gogradle leaves behind? This staleness issue has bitten me a few times 
>> also
>> and it would be good to have a reliable way to deal with it, even if it
>> involves an extra clean.
>>
>>
>> On Tue, Mar 26, 2019 at 11:14 AM Michael Luckey 
>> wrote:
>>
>>> @Udi
>>> Did you try to just delete the
>>> '/usr/local/google/home/ehudm/.gradle/go/repo/cloud.google.com'
>>> folder?
>>>
>>> @Robert
>>> As said before, I am a bit scared about the implications. 

Re: Build blocking on

2019-03-26 Thread Kenneth Knowles
+1 to separating integration tests from "build". It should be able to
succeed without internet access (if deps are cached).

On Tue, Mar 26, 2019 at 3:18 PM Michael Luckey  wrote:

> Of course, we could implement something here. But I am worried about the
> consequences. As gogradle writes into (user) global state this would have
> unexpected side effects.
>
> Consider a developer running Project A - which happens to also use
> gogradle - and during build of that project issues an innocent beam clean.
> Would be no fun to track all that failures arising out of those now deleted
> state.
>
> Unfortunately, I do not have a clear understanding yet, what's going on
> here, why that cache is put there and how it gets inconsistent. Need to
> look into that. (But might be obsoleted anyway by Roberts plans to drop
> gogradle for something different.)
>
> On Tue, Mar 26, 2019 at 7:28 PM Thomas Weise  wrote:
>
>> Can this be addressed by having "clean" remove all state that gogradle
>> leaves behind? This staleness issue has bitten me a few times also and it
>> would be good to have a reliable way to deal with it, even if it involves
>> an extra clean.
>>
>>
>> On Tue, Mar 26, 2019 at 11:14 AM Michael Luckey 
>> wrote:
>>
>>> @Udi
>>> Did you try to just delete the
>>> '/usr/local/google/home/ehudm/.gradle/go/repo/cloud.google.com' folder?
>>>
>>> @Robert
>>> As said before, I am a bit scared about the implications. Shelling out
>>> is done by python, and from build perspective, this does not work very
>>> well, unfortunately. I.e. no caching, up-to-date checks etc...
>>>
>>> But of course, we need to play with this a bit more.
>>>
>>> On Tue, Mar 26, 2019 at 6:24 PM Robert Burke  wrote:
>>>
 Reading the error from the gradle scan, it largely looks like some part
 of the GCP dependencies for the build depends on a package, where the
 commit version is no longer around. The main issue with gogradle is that
 it's entirely distinct from the usual Go workflow, which means deps users
 use are likely to be different to what's in the lock file.

 This work will be tracked in
 https://issues.apache.org/jira/browse/BEAM-5379
 GoGradle hasn't moved to support the new-go way of handling deps, so my
 inclination is to simplify to simple scripts for Gradle that shell out the
 to Go tool for handling Go dep management, over trying to fix GoGradle.

 On Tue, 26 Mar 2019 at 09:43, Udi Meiri  wrote:

> Robert, from what I recall it's not flaky for me - it consistently
> fails. Let me know if there's a way to get more logging about this error.
>
> On Mon, Mar 25, 2019, 19:50 Robert Burke  wrote:
>
>> It's concerning to me that 1) the Go dependency resolution via
>> gogradle is flaky, and 2) that it can block other languages.
>>
>> I suppose 2) makes sense since it's part of the container
>> bootstrapping code, but that makes 1) a serious problem, of which I 
>> wasn't
>> aware.
>> I should have time to investigate this in the next two weeks.
>>
>> On Mon, 25 Mar 2019 at 18:08, Michael Luckey 
>> wrote:
>>
>>> Just for the record,
>>>
>>> using a vm here, because did not yet get all task running on my mac,
>>> and did not want to mess with my setup.
>>>
>>> So installed vanilla ubuntu-18.04 LTS on virtual box, 26GB ram, 6
>>> cores and further
>>>
>>> sudo apt update
>>>
>>> sudo apt install gcc
>>>
>>> sudo apt install make
>>>
>>> sudo apt install perl
>>>
>>> sudo apt install curl
>>>
>>> sudo apt install openjdk-8-jdk
>>>
>>> sudo apt install python
>>>
>>> sudo apt install -y software-properties-common
>>>
>>> sudo add-apt-repository ppa:deadsnakes/ppa
>>>
>>> sudo apt update
>>>
>>> sudo apt install python3.5
>>>
>>> sudo apt-get install apt-transport-https ca-certificates curl
>>> gnupg-agent software-properties-common
>>>
>>> curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo
>>> apt-key add -
>>>
>>> sudo apt-key fingerprint 0EBFCD88
>>>
>>> sudo add-apt-repository "deb [arch=amd64]
>>> https://download.docker.com/linux/ubuntu \
>>>
>>> $(lsb_release -cs) \
>>>
>>> stable"
>>>
>>> sudo apt-get update
>>>
>>> sudo apt-get install docker-ce docker-ce-cli containerd.io
>>>
>>> sudo groupadd docker
>>>
>>> sudo usermod -aG docker $USER
>>>
>>> git config --global user.email "d...@spam.me"
>>>
>>> git config --global user.name "Some Guy"
>>>
>>> curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
>>>
>>> sudo python get-pip.py
>>>
>>> rm get-pip.py
>>>
>>> sudo pip install --upgrade virtualenv
>>>
>>> sudo pip install cython
>>>
>>> sudo apt-get install python-dev
>>>
>>> sudo apt-get install python3-distutils
>>>

Re: Build blocking on

2019-03-26 Thread Mikhail Gryzykhin
I believe what happens is that testPy2Gcp actually runs integration tests
that try to connect to GCP. Without a GCP cluster and configuration on
your machine, I'd expect these tests to fail.

I'd say we should remove the testPy2Gcp task from the "build" task and
explicitly keep it as an integration test.
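
Independent of how the task wiring ends up, a local build can leave the task
out from the command line with Gradle's -x switch. A small sketch; the
function name and the GRADLEW override variable are hypothetical, the task
path is the one used earlier in this thread:

```shell
# Hypothetical helper: run "build" while leaving out the GCP integration task.
# GRADLEW is an assumed override so a different wrapper can be substituted.
build_without_gcp_tests() {
  gradle_cmd="${GRADLEW:-./gradlew}"
  # -x (--exclude-task) is a standard Gradle command-line option.
  "$gradle_cmd" build -x :beam-sdks-python:testPy2Gcp
}
```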

--Mikhail


On Tue, Mar 26, 2019 at 3:12 PM Michael Luckey  wrote:

>
>
> On Tue, Mar 26, 2019 at 10:29 PM Udi Meiri  wrote:
>
>> Luckey, I couldn't recreate your issue, but I still haven't done a full
>> build.
>> I created a new GCE VM with using the ubuntu-1804-bionic-v20190212a image
>> (n1-standard-4 machine type).
>>
>> Ran the following:
>> sudo apt-get update
>> sudo apt-get install python-pip
>> sudo apt-get install python-virtualenv
>> git clone https://github.com/apache/beam.git
>> cd beam
>> ./gradlew :beam-sdks-python:testPy2Gcp
>> [failed: no JAVA_HOME]
>> sudo apt-get install openjdk-8-jdk
>> ./gradlew :beam-sdks-python:testPy2Gcp
>>
>> Got: BUILD SUCCESSFUL in 7m 52s
>>
>
> Nice. Thanks a lot for your help here.
>
> If I understand correctly, this VM is already located within gcp. Could it
> already have some setup, which needs to be done on 'my' VM? For instance I
> was contemplating about that test trying 'to call home', but as I am
> (unfortunately ;) no googler and do not have any gcp specific setup, fails
> here but misses to timeout? This is just some weird assumption, did not yet
> look into the actual implementation.
>
> Which I seemingly need to do here :(
>
>
>> Then I tried:
>> ./gradlew build
>>
>> And ran out of disk space. :) (beam/ is taking 4.5G and the VM boot disk
>> is 10G total)
>>
>
> Ouch :D
>
>
>>
>> On Tue, Mar 26, 2019 at 1:35 PM Robert Burke  wrote:
>>
>>> Michael, your concern is reasonable, especially with the experience with
>>> python, though that does help me bootstrap this work. :)
>>>
>>> The go tools provide caching and avoid redoing work if the source files
>>> haven't changed. This applies most particularly for `go build` and `go
>>> test`. As long as the go code isn't changing at every invocation, this
>>> should be fine. I'm not aware of the same being the case for the usual
>>> python tools.
>>>
>>>  The real trick is ensuring a valid and consistent environment for the
>>> go code.
>>>
>>> The environment question becomes easier for everyone by moving to go
>>> modules, which were designed to provide these kinds of consistent builds.
>>> It also avoids needing a GOPATH set. Any directory is permitted, as long as
>>> the go.mod is present.
>>>
>>> (The Go SDK doesn't yet us go modules, so go.mod and go.sum aren't yet
>>> in the repo.)
>>>
>>> The main blocker is see is updating the Jenkins machines to have the
>>> latest version of Go (1.12) instead of 1.10, which doesn't support modules.
>>> This only blocks a final submission, rather than the work fortunately.
>>>
>>> On Tue, Mar 26, 2019, 1:08 PM Udi Meiri  wrote:
>>>
 "rm -r ~/.gradle/go/repo/" worked for me (there was more than one
 package with issues).
 My ~/.bashrc has
   export GOPATH=$HOME/go
 so maybe that's making the difference in my setup.

 On Tue, Mar 26, 2019 at 11:28 AM Thomas Weise  wrote:

> Can this be addressed by having "clean" remove all state that gogradle
> leaves behind? This staleness issue has bitten me a few times also and it
> would be good to have a reliable way to deal with it, even if it involves
> an extra clean.
>
>
> On Tue, Mar 26, 2019 at 11:14 AM Michael Luckey 
> wrote:
>
>> @Udi
>> Did you try to just delete the
>> '/usr/local/google/home/ehudm/.gradle/go/repo/cloud.google.com'
>> folder?
>>
>> @Robert
>> As said before, I am a bit scared about the implications. Shelling
>> out is done by python, and from build perspective, this does not work 
>> very
>> well, unfortunately. I.e. no caching, up-to-date checks etc...
>>
>> But of course, we need to play with this a bit more.
>>
>> On Tue, Mar 26, 2019 at 6:24 PM Robert Burke 
>> wrote:
>>
>>> Reading the error from the gradle scan, it largely looks like some
>>> part of the GCP dependencies for the build depends on a package, where 
>>> the
>>> commit version is no longer around. The main issue with gogradle is that
>>> it's entirely distinct from the usual Go workflow, which means deps 
>>> users
>>> use are likely to be different to what's in the lock file.
>>>
>>> This work will be tracked in
>>> https://issues.apache.org/jira/browse/BEAM-5379
>>> GoGradle hasn't moved to support the new-go way of handling deps, so
>>> my inclination is to simplify to simple scripts for Gradle that shell 
>>> out
>>> the to Go tool for handling Go dep management, over trying to fix 
>>> GoGradle.
>>>
>>> On Tue, 26 Mar 2019 at 09:43, Udi Meiri  wrote:
>>>
 Robert, from what I recall it's not flaky for me - it consistently

Re: Build blocking on

2019-03-26 Thread Michael Luckey
On Tue, Mar 26, 2019 at 10:29 PM Udi Meiri  wrote:

> Luckey, I couldn't recreate your issue, but I still haven't done a full
> build.
> I created a new GCE VM with using the ubuntu-1804-bionic-v20190212a image
> (n1-standard-4 machine type).
>
> Ran the following:
> sudo apt-get update
> sudo apt-get install python-pip
> sudo apt-get install python-virtualenv
> git clone https://github.com/apache/beam.git
> cd beam
> ./gradlew :beam-sdks-python:testPy2Gcp
> [failed: no JAVA_HOME]
> sudo apt-get install openjdk-8-jdk
> ./gradlew :beam-sdks-python:testPy2Gcp
>
> Got: BUILD SUCCESSFUL in 7m 52s
>

Nice. Thanks a lot for your help here.

If I understand correctly, this VM is already located within gcp. Could it
already have some setup that would still need to be done on 'my' VM? For
instance, I was wondering whether that test tries 'to call home' and, as I am
(unfortunately ;) no googler and do not have any gcp-specific setup, fails
here but fails to time out? This is just a wild guess; I did not yet
look into the actual implementation.

Which I seemingly need to do here :(
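
To tell a hang apart from a plain failure, the task could be run under an
upper time limit. A minimal sketch, assuming timeout(1) from GNU coreutils is
available on the VM; the helper name is made up:

```shell
# Hypothetical helper: run a command with an upper time limit.
# Relies on timeout(1) from GNU coreutils, which exits with status 124
# when the limit is reached.
run_bounded() {
  limit="$1"; shift
  status=0
  timeout "$limit" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "timed out after ${limit}: $*" >&2
  fi
  return "$status"
}
```

Something like 'run_bounded 30m ./gradlew :beam-sdks-python:testPy2Gcp' would
then end with status 124 if the task hangs, and with the task's own status
otherwise.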


> Then I tried:
> ./gradlew build
>
> And ran out of disk space. :) (beam/ is taking 4.5G and the VM boot disk
> is 10G total)
>

Ouch :D


>
> On Tue, Mar 26, 2019 at 1:35 PM Robert Burke  wrote:
>
>> Michael, your concern is reasonable, especially with the experience with
>> python, though that does help me bootstrap this work. :)
>>
>> The go tools provide caching and avoid redoing work if the source files
>> haven't changed. This applies most particularly for `go build` and `go
>> test`. As long as the go code isn't changing at every invocation, this
>> should be fine. I'm not aware of the same being the case for the usual
>> python tools.
>>
>>  The real trick is ensuring a valid and consistent environment for the go
>> code.
>>
>> The environment question becomes easier for everyone by moving to go
>> modules, which were designed to provide these kinds of consistent builds.
>> It also avoids needing a GOPATH set. Any directory is permitted, as long as
>> the go.mod is present.
>>
>> (The Go SDK doesn't yet us go modules, so go.mod and go.sum aren't yet in
>> the repo.)
>>
>> The main blocker is see is updating the Jenkins machines to have the
>> latest version of Go (1.12) instead of 1.10, which doesn't support modules.
>> This only blocks a final submission, rather than the work fortunately.
>>
>> On Tue, Mar 26, 2019, 1:08 PM Udi Meiri  wrote:
>>
>>> "rm -r ~/.gradle/go/repo/" worked for me (there was more than one
>>> package with issues).
>>> My ~/.bashrc has
>>>   export GOPATH=$HOME/go
>>> so maybe that's making the difference in my setup.
>>>
>>> On Tue, Mar 26, 2019 at 11:28 AM Thomas Weise  wrote:
>>>
 Can this be addressed by having "clean" remove all state that gogradle
 leaves behind? This staleness issue has bitten me a few times also and it
 would be good to have a reliable way to deal with it, even if it involves
 an extra clean.


 On Tue, Mar 26, 2019 at 11:14 AM Michael Luckey 
 wrote:

> @Udi
> Did you try to just delete the
> '/usr/local/google/home/ehudm/.gradle/go/repo/cloud.google.com'
> folder?
>
> @Robert
> As said before, I am a bit scared about the implications. Shelling out
> is done by python, and from build perspective, this does not work very
> well, unfortunately. I.e. no caching, up-to-date checks etc...
>
> But of course, we need to play with this a bit more.
>
> On Tue, Mar 26, 2019 at 6:24 PM Robert Burke 
> wrote:
>
>> Reading the error from the gradle scan, it largely looks like some
>> part of the GCP dependencies for the build depends on a package, where 
>> the
>> commit version is no longer around. The main issue with gogradle is that
>> it's entirely distinct from the usual Go workflow, which means deps users
>> use are likely to be different to what's in the lock file.
>>
>> This work will be tracked in
>> https://issues.apache.org/jira/browse/BEAM-5379
>> GoGradle hasn't moved to support the new-go way of handling deps, so
>> my inclination is to simplify to simple scripts for Gradle that shell out
>> the to Go tool for handling Go dep management, over trying to fix 
>> GoGradle.
>>
>> On Tue, 26 Mar 2019 at 09:43, Udi Meiri  wrote:
>>
>>> Robert, from what I recall it's not flaky for me - it consistently
>>> fails. Let me know if there's a way to get more logging about this 
>>> error.
>>>
>>> On Mon, Mar 25, 2019, 19:50 Robert Burke  wrote:
>>>
 It's concerning to me that 1) the Go dependency resolution via
 gogradle is flaky, and 2) that it can block other languages.

 I suppose 2) makes sense since it's part of the container
 bootstrapping code, but that makes 1) a serious problem, of which I 
 wasn't
 aware.
 I should have time to 

Re: Build blocking on

2019-03-26 Thread Michael Luckey
On Tue, Mar 26, 2019 at 9:35 PM Robert Burke  wrote:

> Michael, your concern is reasonable, especially with the experience with
> python, though that does help me bootstrap this work. :)
>
> The go tools provide caching and avoid redoing work if the source files
> haven't changed.
>

Of course, that's pretty cool. But I am currently unable to fully grasp the
consequences. It would still be better if the build system (in our case
currently gradle) tracked this, as it might choose to do different
processing. And of course, the benefits of using the build cache would
also be missed out on.

To clarify, I am not arguing against using/delegating to native tools - in
fact gradle always delegates to native tools, e.g. javac - and gogradle
is doing the same. I am just trying to point out that we should
keep a close eye on the actual implementation to leverage all those gradle
goodies. This is easily done wrong, introducing all kinds of issues in
the build, and should not be done by simply calling out to some shell
script. For instance, our python implementation seems far too often to just
return a '-1' and requires some digging into logs to eventually find the
root cause. Which I believe could be done better.
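
Whatever ends up calling the native tool, the calling side can at least
surface the root cause instead of a bare exit code. A rough sketch of that
idea only, not a proposal for the actual gradle integration; the wrapper name
is hypothetical:

```shell
# Hypothetical wrapper: run a command, keep its full output in a log file,
# and on failure show the last lines together with the real exit status.
run_with_context() {
  log=$(mktemp)
  status=0
  "$@" >"$log" 2>&1 || status=$?
  if [ "$status" -ne 0 ]; then
    echo "command failed with exit status $status: $*" >&2
    tail -n 20 "$log" >&2
  fi
  rm -f "$log"
  return "$status"
}
```

This does nothing for caching or up-to-date checks; those would still have to
come from declaring inputs and outputs to the build system.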

What we would essentially be doing is reimplementing parts of gogradle to
get around the shortcomings of that plugin, which is doable, of course.


> This applies most particularly for `go build` and `go test`. As long as
> the go code isn't changing at every invocation, this should be fine. I'm
> not aware of the same being the case for the usual python tools.
>

>  The real trick is ensuring a valid and consistent environment for the go
> code.
>
> The environment question becomes easier for everyone by moving to go
> modules, which were designed to provide these kinds of consistent builds.
> It also avoids needing a GOPATH set. Any directory is permitted, as long as
> the go.mod is present.
>
> (The Go SDK doesn't yet use go modules, so go.mod and go.sum aren't yet in
> the repo.)
>
> The main blocker I see is updating the Jenkins machines to have the
> latest version of Go (1.12) instead of 1.10, which doesn't support modules.
> This only blocks a final submission, rather than the work, fortunately.
>

I'd really love to assist you in getting that running. So if you think I
could be of help here, just ping me and we could give that a try.


>
> On Tue, Mar 26, 2019, 1:08 PM Udi Meiri  wrote:
>
>> "rm -r ~/.gradle/go/repo/" worked for me (there was more than one package
>> with issues).
>> My ~/.bashrc has
>>   export GOPATH=$HOME/go
>> so maybe that's making the difference in my setup.
>>
>> On Tue, Mar 26, 2019 at 11:28 AM Thomas Weise  wrote:
>>
>>> Can this be addressed by having "clean" remove all state that gogradle
>>> leaves behind? This staleness issue has bitten me a few times also and it
>>> would be good to have a reliable way to deal with it, even if it involves
>>> an extra clean.
>>>
>>>
>>> On Tue, Mar 26, 2019 at 11:14 AM Michael Luckey 
>>> wrote:
>>>
 @Udi
 Did you try to just delete the
 '/usr/local/google/home/ehudm/.gradle/go/repo/cloud.google.com' folder?

 @Robert
 As said before, I am a bit scared about the implications. Shelling out
 is done by python, and from build perspective, this does not work very
 well, unfortunately. I.e. no caching, up-to-date checks etc...

 But of course, we need to play with this a bit more.

 On Tue, Mar 26, 2019 at 6:24 PM Robert Burke 
 wrote:

> Reading the error from the gradle scan, it largely looks like some
> part of the GCP dependencies for the build depends on a package, where the
> commit version is no longer around. The main issue with gogradle is that
> it's entirely distinct from the usual Go workflow, which means deps users
> use are likely to be different to what's in the lock file.
>
> This work will be tracked in
> https://issues.apache.org/jira/browse/BEAM-5379
> GoGradle hasn't moved to support the new-go way of handling deps, so
> my inclination is to simplify to simple scripts for Gradle that shell out
> to the Go tool for handling Go dep management, over trying to fix
> GoGradle.
>
> On Tue, 26 Mar 2019 at 09:43, Udi Meiri  wrote:
>
>> Robert, from what I recall it's not flaky for me - it consistently
>> fails. Let me know if there's a way to get more logging about this error.
>>
>> On Mon, Mar 25, 2019, 19:50 Robert Burke  wrote:
>>
>>> It's concerning to me that 1) the Go dependency resolution via
>>> gogradle is flaky, and 2) that it can block other languages.
>>>
>>> I suppose 2) makes sense since it's part of the container
>>> bootstrapping code, but that makes 1) a serious problem, of which I 
>>> wasn't
>>> aware.
>>> I should have time to investigate this in the next two weeks.
>>>
>>> On Mon, 25 Mar 2019 at 18:08, Michael Luckey 
>>> 

Re: Build blocking on

2019-03-26 Thread Udi Meiri
Luckey, I couldn't recreate your issue, but I still haven't done a full
build.
I created a new GCE VM using the ubuntu-1804-bionic-v20190212a image
(n1-standard-4 machine type).

Ran the following:
sudo apt-get update
sudo apt-get install python-pip
sudo apt-get install python-virtualenv
git clone https://github.com/apache/beam.git
cd beam
./gradlew :beam-sdks-python:testPy2Gcp
[failed: no JAVA_HOME]
sudo apt-get install openjdk-8-jdk
./gradlew :beam-sdks-python:testPy2Gcp

Got: BUILD SUCCESSFUL in 7m 52s

Then I tried:
./gradlew build

And ran out of disk space. :) (beam/ is taking 4.5G and the VM boot disk is
10G total)

On Tue, Mar 26, 2019 at 1:35 PM Robert Burke  wrote:

> Michael, your concern is reasonable, especially with the experience with
> python, though that does help me bootstrap this work. :)
>
> The go tools provide caching and avoid redoing work if the source files
> haven't changed. This applies most particularly for `go build` and `go
> test`. As long as the go code isn't changing at every invocation, this
> should be fine. I'm not aware of the same being the case for the usual
> python tools.
>
>  The real trick is ensuring a valid and consistent environment for the go
> code.
>
> The environment question becomes easier for everyone by moving to go
> modules, which were designed to provide these kinds of consistent builds.
> It also avoids needing a GOPATH set. Any directory is permitted, as long as
> the go.mod is present.
>
> (The Go SDK doesn't yet use go modules, so go.mod and go.sum aren't yet in
> the repo.)
>
> The main blocker I see is updating the Jenkins machines to have the
> latest version of Go (1.12) instead of 1.10, which doesn't support modules.
> This only blocks a final submission, rather than the work, fortunately.
>
> On Tue, Mar 26, 2019, 1:08 PM Udi Meiri  wrote:
>
>> "rm -r ~/.gradle/go/repo/" worked for me (there was more than one package
>> with issues).
>> My ~/.bashrc has
>>   export GOPATH=$HOME/go
>> so maybe that's making the difference in my setup.
>>
>> On Tue, Mar 26, 2019 at 11:28 AM Thomas Weise  wrote:
>>
>>> Can this be addressed by having "clean" remove all state that gogradle
>>> leaves behind? This staleness issue has bitten me a few times also and it
>>> would be good to have a reliable way to deal with it, even if it involves
>>> an extra clean.
>>>
>>>
>>> On Tue, Mar 26, 2019 at 11:14 AM Michael Luckey 
>>> wrote:
>>>
 @Udi
 Did you try to just delete the
 '/usr/local/google/home/ehudm/.gradle/go/repo/cloud.google.com' folder?

 @Robert
 As said before, I am a bit scared about the implications. Shelling out
 is done by python, and from build perspective, this does not work very
 well, unfortunately. I.e. no caching, up-to-date checks etc...

 But of course, we need to play with this a bit more.

 On Tue, Mar 26, 2019 at 6:24 PM Robert Burke 
 wrote:

> Reading the error from the gradle scan, it largely looks like some
> part of the GCP dependencies for the build depends on a package, where the
> commit version is no longer around. The main issue with gogradle is that
> it's entirely distinct from the usual Go workflow, which means deps users
> use are likely to be different to what's in the lock file.
>
> This work will be tracked in
> https://issues.apache.org/jira/browse/BEAM-5379
> GoGradle hasn't moved to support the new-go way of handling deps, so
> my inclination is to simplify to simple scripts for Gradle that shell out
> to the Go tool for handling Go dep management, over trying to fix
> GoGradle.
>
> On Tue, 26 Mar 2019 at 09:43, Udi Meiri  wrote:
>
>> Robert, from what I recall it's not flaky for me - it consistently
>> fails. Let me know if there's a way to get more logging about this error.
>>
>> On Mon, Mar 25, 2019, 19:50 Robert Burke  wrote:
>>
>>> It's concerning to me that 1) the Go dependency resolution via
>>> gogradle is flaky, and 2) that it can block other languages.
>>>
>>> I suppose 2) makes sense since it's part of the container
>>> bootstrapping code, but that makes 1) a serious problem, of which I 
>>> wasn't
>>> aware.
>>> I should have time to investigate this in the next two weeks.
>>>
>>> On Mon, 25 Mar 2019 at 18:08, Michael Luckey 
>>> wrote:
>>>
 Just for the record,

 using a vm here, because did not yet get all task running on my
 mac, and did not want to mess with my setup.

 So installed vanilla ubuntu-18.04 LTS on virtual box, 26GB ram, 6
 cores and further

 sudo apt update

 sudo apt install gcc

 sudo apt install make

 sudo apt install perl

 sudo apt install curl

 sudo apt install openjdk-8-jdk

 sudo apt install python

 sudo apt 

Re: Build blocking on

2019-03-26 Thread Robert Burke
Michael, your concern is reasonable, especially with the experience with
python, though that does help me bootstrap this work. :)

The go tools provide caching and avoid redoing work if the source files
haven't changed. This applies most particularly for `go build` and `go
test`. As long as the go code isn't changing at every invocation, this
should be fine. I'm not aware of the same being the case for the usual
python tools.

 The real trick is ensuring a valid and consistent environment for the go
code.

The environment question becomes easier for everyone by moving to go
modules, which were designed to provide these kinds of consistent builds.
It also avoids needing a GOPATH set. Any directory is permitted, as long as
the go.mod is present.

(The Go SDK doesn't yet use go modules, so go.mod and go.sum aren't yet in
the repo.)

The main blocker I see is updating the Jenkins machines to have the latest
version of Go (1.12) instead of 1.10, which doesn't support modules. This
only blocks a final submission, rather than the work, fortunately.
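
As a rough illustration of the version constraint, here is a small Python
sketch that checks whether a `go version` output line reports a release with
module support. The function name supports_go_modules is hypothetical, and
the 1.11 threshold is an assumption based on modules first shipping in
Go 1.11; the thread itself only states that 1.10 lacks them and 1.12 is the
target.

```python
import re


def supports_go_modules(go_version_output):
    """Return True if a `go version` output line reports Go 1.11 or newer.

    Hypothetical helper: the input is expected to look like
    "go version go1.12.1 linux/amd64".
    """
    match = re.search(r"\bgo(\d+)\.(\d+)", go_version_output)
    if match is None:
        raise ValueError("unrecognized version string: %r" % go_version_output)
    major, minor = int(match.group(1)), int(match.group(2))
    # Assumption: module support starts with Go 1.11.
    return (major, minor) >= (1, 11)
```

On a build machine the input string could come from running `go version`
through subprocess; the check itself needs no Go installation.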

On Tue, Mar 26, 2019, 1:08 PM Udi Meiri  wrote:

> "rm -r ~/.gradle/go/repo/" worked for me (there was more than one package
> with issues).
> My ~/.bashrc has
>   export GOPATH=$HOME/go
> so maybe that's making the difference in my setup.
>
> On Tue, Mar 26, 2019 at 11:28 AM Thomas Weise  wrote:
>
>> Can this be addressed by having "clean" remove all state that gogradle
>> leaves behind? This staleness issue has bitten me a few times also and it
>> would be good to have a reliable way to deal with it, even if it involves
>> an extra clean.
>>
>>
>> On Tue, Mar 26, 2019 at 11:14 AM Michael Luckey 
>> wrote:
>>
>>> @Udi
>>> Did you try to just delete the
>>> '/usr/local/google/home/ehudm/.gradle/go/repo/cloud.google.com' folder?
>>>
>>> @Robert
>>> As said before, I am a bit scared about the implications. Shelling out
>>> is done by python, and from build perspective, this does not work very
>>> well, unfortunately. I.e. no caching, up-to-date checks etc...
>>>
>>> But of course, we need to play with this a bit more.
>>>
>>> On Tue, Mar 26, 2019 at 6:24 PM Robert Burke  wrote:
>>>
 Reading the error from the gradle scan, it largely looks like some part
 of the GCP dependencies for the build depends on a package, where the
 commit version is no longer around. The main issue with gogradle is that
 it's entirely distinct from the usual Go workflow, which means deps users
 use are likely to be different to what's in the lock file.

 This work will be tracked in
 https://issues.apache.org/jira/browse/BEAM-5379
 GoGradle hasn't moved to support the new-go way of handling deps, so my
 inclination is to simplify to simple scripts for Gradle that shell out to
 the Go tool for handling Go dep management, over trying to fix GoGradle.

 On Tue, 26 Mar 2019 at 09:43, Udi Meiri  wrote:

> Robert, from what I recall it's not flaky for me - it consistently
> fails. Let me know if there's a way to get more logging about this error.
>
> On Mon, Mar 25, 2019, 19:50 Robert Burke  wrote:
>
>> It's concerning to me that 1) the Go dependency resolution via
>> gogradle is flaky, and 2) that it can block other languages.
>>
>> I suppose 2) makes sense since it's part of the container
>> bootstrapping code, but that makes 1) a serious problem, of which I 
>> wasn't
>> aware.
>> I should have time to investigate this in the next two weeks.
>>
>> On Mon, 25 Mar 2019 at 18:08, Michael Luckey 
>> wrote:
>>
>>> Just for the record,
>>>
>>> using a vm here, because did not yet get all task running on my mac,
>>> and did not want to mess with my setup.
>>>
>>> So installed vanilla ubuntu-18.04 LTS on virtual box, 26GB ram, 6
>>> cores and further
>>>
>>> sudo apt update
>>>
>>> sudo apt install gcc
>>>
>>> sudo apt install make
>>>
>>> sudo apt install perl
>>>
>>> sudo apt install curl
>>>
>>> sudo apt install openjdk-8-jdk
>>>
>>> sudo apt install python
>>>
>>> sudo apt install -y software-properties-common
>>>
>>> sudo add-apt-repository ppa:deadsnakes/ppa
>>>
>>> sudo apt update
>>>
>>> sudo apt install python3.5
>>>
>>> sudo apt-get install apt-transport-https ca-certificates curl
>>> gnupg-agent software-properties-common
>>>
>>> curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo
>>> apt-key add -
>>>
>>> sudo apt-key fingerprint 0EBFCD88
>>>
>>> sudo add-apt-repository "deb [arch=amd64]
>>> https://download.docker.com/linux/ubuntu \
>>>
>>> $(lsb_release -cs) \
>>>
>>> stable"
>>>
>>> sudo apt-get update
>>>
>>> sudo apt-get install docker-ce docker-ce-cli containerd.io
>>>
>>> sudo groupadd docker
>>>
>>> sudo usermod -aG docker $USER
>>>
>>> git 

Re: Build blocking on

2019-03-26 Thread Udi Meiri
"rm -r ~/.gradle/go/repo/" worked for me (there was more than one package
with issues).
My ~/.bashrc has
  export GOPATH=$HOME/go
so maybe that's making the difference in my setup.

On Tue, Mar 26, 2019 at 11:28 AM Thomas Weise  wrote:

> Can this be addressed by having "clean" remove all state that gogradle
> leaves behind? This staleness issue has bitten me a few times also and it
> would be good to have a reliable way to deal with it, even if it involves
> an extra clean.
>
>
> On Tue, Mar 26, 2019 at 11:14 AM Michael Luckey 
> wrote:
>
>> @Udi
>> Did you try to just delete the
>> '/usr/local/google/home/ehudm/.gradle/go/repo/cloud.google.com' folder?
>>
>> @Robert
>> As said before, I am a bit scared about the implications. Shelling out is
>> done by python, and from build perspective, this does not work very well,
>> unfortunately. I.e. no caching, up-to-date checks etc...
>>
>> But of course, we need to play with this a bit more.
>>
>> On Tue, Mar 26, 2019 at 6:24 PM Robert Burke  wrote:
>>
>>> Reading the error from the gradle scan, it largely looks like some part
>>> of the GCP dependencies for the build depends on a package, where the
>>> commit version is no longer around. The main issue with gogradle is that
>>> it's entirely distinct from the usual Go workflow, which means deps users
>>> use are likely to be different to what's in the lock file.
>>>
>>> This work will be tracked in
>>> https://issues.apache.org/jira/browse/BEAM-5379
>>> GoGradle hasn't moved to support the new-go way of handling deps, so my
>>> inclination is to simplify to simple scripts for Gradle that shell out to
>>> the Go tool for handling Go dep management, over trying to fix GoGradle.
>>>
>>> On Tue, 26 Mar 2019 at 09:43, Udi Meiri  wrote:
>>>
 Robert, from what I recall it's not flaky for me - it consistently
 fails. Let me know if there's a way to get more logging about this error.

 On Mon, Mar 25, 2019, 19:50 Robert Burke  wrote:

> It's concerning to me that 1) the Go dependency resolution via
> gogradle is flaky, and 2) that it can block other languages.
>
> I suppose 2) makes sense since it's part of the container
> bootstrapping code, but that makes 1) a serious problem, of which I wasn't
> aware.
> I should have time to investigate this in the next two weeks.
>
> On Mon, 25 Mar 2019 at 18:08, Michael Luckey 
> wrote:
>
>> Just for the record,
>>
>> using a vm here, because did not yet get all task running on my mac,
>> and did not want to mess with my setup.
>>
>> So installed vanilla ubuntu-18.04 LTS on virtual box, 26GB ram, 6
>> cores and further
>>
>> sudo apt update
>>
>> sudo apt install gcc
>>
>> sudo apt install make
>>
>> sudo apt install perl
>>
>> sudo apt install curl
>>
>> sudo apt install openjdk-8-jdk
>>
>> sudo apt install python
>>
>> sudo apt install -y software-properties-common
>>
>> sudo add-apt-repository ppa:deadsnakes/ppa
>>
>> sudo apt update
>>
>> sudo apt install python3.5
>>
>> sudo apt-get install apt-transport-https ca-certificates curl
>> gnupg-agent software-properties-common
>>
>> curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo
>> apt-key add -
>>
>> sudo apt-key fingerprint 0EBFCD88
>>
>> sudo add-apt-repository "deb [arch=amd64]
>> https://download.docker.com/linux/ubuntu \
>>
>> $(lsb_release -cs) \
>>
>> stable"
>>
>> sudo apt-get update
>>
>> sudo apt-get install docker-ce docker-ce-cli containerd.io
>>
>> sudo groupadd docker
>>
>> sudo usermod -aG docker $USER
>>
>> git config --global user.email "d...@spam.me"
>>
>> git config --global user.name "Some Guy"
>>
>> curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
>>
>> sudo python get-pip.py
>>
>> rm get-pip.py
>>
>> sudo pip install --upgrade virtualenv
>>
>> sudo pip install cython
>>
>> sudo apt-get install python-dev
>>
>> sudo apt-get install python3-distutils
>>
>> sudo apt-get install python3-dev # for python3.x installs
>>
>>
>> git clone https://github.com/apache/beam.git
>>
>> cd beam/
>>
>> ./gradlew build
>>
>> Nothing else changed/added. (hopefully, need to reassure myself here)
>>
>> Unfortunately, this is failing. Need to exclude those python tests
>> (and of course website, which usually fails on Jira links)
>>
>> So I might be missing some env settings for GCP, dunno. Probably
>> missed some docs.
>>
>>
>>
>> On Tue, Mar 26, 2019 at 1:46 AM Michael Luckey 
>> wrote:
>>
>>> Thanks Udi for trying that!
>>>
>>> In fact, the go dependency resolution is flaky. Did not look into
>>> that, but just rerunning usually works. Of course, less than optimal, 
>>> but,

Re: SNAPSHOTS have not been updated since february

2019-03-26 Thread Boyuan Zhang
+Daniel Oliveira 

On Tue, Mar 26, 2019 at 9:57 AM Boyuan Zhang  wrote:

> Sorry for the typo. Ideally, the snapshot publish is *independent* from
> postrelease_snapshot.
>
> On Tue, Mar 26, 2019 at 9:55 AM Boyuan Zhang  wrote:
>
>> Hey,
>>
>> I'm trying to publish the artifacts by commenting "Run Gradle Publish" in
>> my PR, but there are several errors saying "cannot write artifacts into
>> dir"
>> ,
>> anyone has idea on it? Ideally, the snapshot publish is dependent from
>> postrelease_snapshot. The publish task is to build and publish artifacts
>> and the postrelease_snapshot is to verify whether the snapshot works.
>>
>> On Tue, Mar 26, 2019 at 8:45 AM Ahmet Altay  wrote:
>>
>>> I believe this is related to
>>> https://issues.apache.org/jira/browse/BEAM-6840 and +Boyuan Zhang
>>>  has a fix in progress
>>> https://github.com/apache/beam/pull/8132
>>>
>>> On Tue, Mar 26, 2019 at 7:09 AM Ismaël Mejía  wrote:
>>>
 I was trying to validate a fix on the Spark runner and realized that
 Beam SNAPSHOTS have not been updated since February 24 !


 https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-sdks-java-core/2.12.0-SNAPSHOT/

 Can somebody please take a look at why this has not been updated?

 Thanks,
 Ismaël

>>>


Re: Build blocking on

2019-03-26 Thread Thomas Weise
Can this be addressed by having "clean" remove all state that gogradle
leaves behind? This staleness issue has bitten me a few times also and it
would be good to have a reliable way to deal with it, even if it involves
an extra clean.
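
A minimal sketch of such a cleanup step, in Python for illustration only. It
assumes the stale state lives under ~/.gradle/go/repo, the directory that was
removed by hand elsewhere in this thread; whether that covers all state
gogradle leaves behind is not confirmed. The function name
clean_gogradle_repo_cache and its gradle_home parameter are hypothetical.

```python
import shutil
from pathlib import Path


def clean_gogradle_repo_cache(gradle_home=None):
    """Delete <gradle_home>/go/repo if it exists; return True if removed.

    gradle_home defaults to ~/.gradle (an assumption about the local setup).
    """
    home = Path(gradle_home) if gradle_home else Path.home() / ".gradle"
    repo_cache = home / "go" / "repo"
    if not repo_cache.is_dir():
        return False  # nothing to clean
    shutil.rmtree(repo_cache)
    return True
```

A real fix would more likely hook the same deletion into the Gradle "clean"
task rather than a separate script.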


On Tue, Mar 26, 2019 at 11:14 AM Michael Luckey  wrote:

> @Udi
> Did you try to just delete the
> '/usr/local/google/home/ehudm/.gradle/go/repo/cloud.google.com' folder?
>
> @Robert
> As said before, I am a bit scared about the implications. Shelling out is
> done by python, and from build perspective, this does not work very well,
> unfortunately. I.e. no caching, up-to-date checks etc...
>
> But of course, we need to play with this a bit more.
>
> On Tue, Mar 26, 2019 at 6:24 PM Robert Burke  wrote:
>
>> Reading the error from the gradle scan, it largely looks like some part
>> of the GCP dependencies for the build depends on a package, where the
>> commit version is no longer around. The main issue with gogradle is that
>> it's entirely distinct from the usual Go workflow, which means deps users
>> use are likely to be different to what's in the lock file.
>>
>> This work will be tracked in
>> https://issues.apache.org/jira/browse/BEAM-5379
>> GoGradle hasn't moved to support the new-go way of handling deps, so my
>> inclination is to simplify to simple scripts for Gradle that shell out to
>> the Go tool for handling Go dep management, over trying to fix GoGradle.
>>
>> On Tue, 26 Mar 2019 at 09:43, Udi Meiri  wrote:
>>
>>> Robert, from what I recall it's not flaky for me - it consistently
>>> fails. Let me know if there's a way to get more logging about this error.
>>>
>>> On Mon, Mar 25, 2019, 19:50 Robert Burke  wrote:
>>>
 It's concerning to me that 1) the Go dependency resolution via gogradle
 is flaky, and 2) that it can block other languages.

 I suppose 2) makes sense since it's part of the container bootstrapping
 code, but that makes 1) a serious problem, of which I wasn't aware.
 I should have time to investigate this in the next two weeks.

 On Mon, 25 Mar 2019 at 18:08, Michael Luckey 
 wrote:

> Just for the record,
>
> using a vm here, because did not yet get all task running on my mac,
> and did not want to mess with my setup.
>
> So installed vanilla ubuntu-18.04 LTS on virtual box, 26GB ram, 6
> cores and further
>
> sudo apt update
>
> sudo apt install gcc
>
> sudo apt install make
>
> sudo apt install perl
>
> sudo apt install curl
>
> sudo apt install openjdk-8-jdk
>
> sudo apt install python
>
> sudo apt install -y software-properties-common
>
> sudo add-apt-repository ppa:deadsnakes/ppa
>
> sudo apt update
>
> sudo apt install python3.5
>
> sudo apt-get install apt-transport-https ca-certificates curl
> gnupg-agent software-properties-common
>
> curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo
> apt-key add -
>
> sudo apt-key fingerprint 0EBFCD88
>
> sudo add-apt-repository "deb [arch=amd64]
> https://download.docker.com/linux/ubuntu \
>
> $(lsb_release -cs) \
>
> stable"
>
> sudo apt-get update
>
> sudo apt-get install docker-ce docker-ce-cli containerd.io
>
> sudo groupadd docker
>
> sudo usermod -aG docker $USER
>
> git config --global user.email "d...@spam.me"
>
> git config --global user.name "Some Guy"
>
> curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
>
> sudo python get-pip.py
>
> rm get-pip.py
>
> sudo pip install --upgrade virtualenv
>
> sudo pip install cython
>
> sudo apt-get install python-dev
>
> sudo apt-get install python3-distutils
>
> sudo apt-get install python3-dev # for python3.x installs
>
>
> git clone https://github.com/apache/beam.git
>
> cd beam/
>
> ./gradlew build
>
> Nothing else changed/added. (hopefully, need to reassure myself here)
>
> Unfortunately, this is failing. Need to exclude those python tests
> (and of course website, which usually fails on Jira links)
>
> So I might be missing some env settings for GCP, dunno. Probably
> missed some docs.
>
>
>
> On Tue, Mar 26, 2019 at 1:46 AM Michael Luckey 
> wrote:
>
>> Thanks Udi for trying that!
>>
>> In fact, the go dependency resolution is flaky. Did not look into
>> that, but just rerunning usually works. Of course, less than optimal, 
>> but,
>> well...
>>
>> Running build target is of course just an aggregation of task to run.
>> And unfortunately just running that
>>
>> ./gradlew  :beam-sdks-python:testPy2Gcp
>>
>> stalls on my (virtual) machine.
>>
>> On Tue, Mar 26, 2019 at 1:35 AM Udi Meiri  wrote:
>>
>>> Okay, `./gradlew build` failed pretty quickly for me:
>>>
>>> > Task 

Re: Writing bytes to BigQuery with beam

2019-03-26 Thread Pablo Estrada
Sure, we can make users explicitly ask for schema autodetection, instead of
it being the default when no schema is provided. I think that's reasonable.
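
A minimal sketch of what explicit opt-in could look like, in plain Python.
This is not the Beam BigQuery sink API: resolve_schema, its parameters and
encode_bytes_fields are hypothetical names used only to show the decision
order (explicit schema first, autodetection only on request, otherwise the
existing table's schema) and the base64 encoding of BYTES values discussed in
this thread.

```python
import base64


def resolve_schema(schema=None, autodetect=False, fetch_table_schema=None):
    """Decide where the schema comes from; return a (source, schema) pair.

    Hypothetical helper. fetch_table_schema is a zero-argument callable that
    stands in for an API call reading the schema of an existing table.
    """
    if schema is not None:
        return ("explicit", schema)
    if autodetect:
        # Only reached when the user asked for it; never the silent default.
        return ("autodetect", None)
    if fetch_table_schema is not None:
        return ("existing_table", fetch_table_schema())
    raise ValueError("no schema given and autodetection was not requested")


def encode_bytes_fields(row, bytes_fields):
    """Return a copy of row with the named bytes values base64-encoded."""
    encoded = dict(row)
    for name in bytes_fields:
        encoded[name] = base64.b64encode(row[name]).decode("ascii")
    return encoded
```

With this ordering, a write to a pre-existing table without a schema never
triggers autodetection by accident, which is the case that breaks for BYTES.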


On Mon, Mar 25, 2019, 7:19 PM Valentyn Tymofieiev 
wrote:

> Thanks everyone for input on this thread. I think there is a confusion
> between not specifying the schema, and asking BigQuery to do schema
> autodetection. This is not the same thing, however in recent changes to BQ
> IO that happened after 2.11 release, we are forcing schema autodetection,
> when schema is not specified, see: [1].
>
> I think we need to revise this ahead of 2.12. It may be better if users
> explicitly opt-in to schema autodetection if they wish. Autodetection is an
> approximation, and in particular, as we figured out in this thread, it does
> not work correctly for BYTES data.
>
> I suspect that if we disable schema autodetection, and/or make previous
> implementation of BQ sink a default option, we will be able to write BYTES
> data to a previously created BQ table without specifying the schema, and
> making a call to BQ to fetch the schema won't be necessary. We'd need to
> verify that.
>

> Another interesting note, as per Juta's analysis
> ,
> google-cloud-bigquery client does not require additional base64 encoding
> for bytes, so once we migrate to use this client, base64 encoding/decoding
> of Bytes data won't be necessary in Beam.
>
> [1]
> https://github.com/apache/beam/blob/0b71f541e93f3bd69af87ad8a6db46ccb4a01ddc/sdks/python/apache_beam/io/gcp/bigquery_tools.py#L321
> .
> [2]
> https://docs.google.com/document/d/19zvDycWzF82MmtCmxrhqqyXKaRq8slRIjdxE6E8MObA/edit#bookmark=id.7pfrsz1c8hcj
>
> On Mon, Mar 25, 2019 at 2:26 PM Chamikara Jayalath 
> wrote:
>
>>
>>
>> On Mon, Mar 25, 2019 at 2:16 PM Pablo Estrada  wrote:
>>
>>> +Chamikara Jayalath  with the new BigQuery sink,
>>> schema autodetection is supported (it's a very simple thing to have). Do
>>> you think we should not have it?
>>> Best
>>> -P.
>>>
>>
>> Ah good to know. But IMO users should be able to write to existing tables
>> without specifying a schema (when CREATE_DISPOSITION is CREATE_NEVER for
>> example). How do users enable schema auto-detection ? Probably this should
>> not be enabled by default and we should clearly advertise that bytes type
>> is not supported (or support it with extra information). Just my 2 cents.
>>
>> Thanks,
>> Cham
>>
>>
>>>
>>> On Mon, Mar 25, 2019 at 11:01 AM Chamikara Jayalath <
>>> chamik...@google.com> wrote:
>>>


 On Mon, Mar 25, 2019 at 2:03 AM Juta Staes  wrote:

>
> On Mon, 25 Mar 2019 at 06:15, Valentyn Tymofieiev 
> wrote:
>
>> We received feedback on
>> https://issuetracker.google.com/issues/129006689 - BQ developers say
>> that schema identification is done and they discourage using schema
>> autodetection in tables using BYTES. In light of this, I think it may be
>> fair to recommend that Beam users specify BQ schemas as well when they interact
>> with BQ, and call out that writing binary data to BQ will likely fail
>> unless schema is specified. Does that make sense?
>>
>
> Given that schema autodetect does not work for bytes I think it is
> indeed a good solution to require users to specify BQ schemas as well when
> they write to BQ
>
> So new summary:
> 1. Beam will base64-encode raw bytes before passing them to BQ over the
> REST API. This will be a change in behavior for Python 2 (for good reasons).
> 2. When reading data from BQ, all fields of type BYTES will be
> base64-decoded.
> 3. Beam will send an API call to BigQuery to get table schema,
> whenever schema is not supplied, to work around
> https://issuetracker.google.com/issues/129006689. Beam will require
> users to specify the schema when writing bytes to BQ.
>

 I'm not sure why we reached this conclusion. We (Beam) do not use the BQ
 schema auto detection feature currently. So why not just send an API
 call to get the schema when users are writing to existing tables? Also,
 even if we decide to support schema auto detection in the future we will
 not be able to support this for the BYTES type (due to the restriction by BQ).


> Thanks all for your input on this!
> Juta
>
>


Re: New contributor

2019-03-26 Thread Connell O'Callaghan
Welcome Guobao!!!

On Tue, Mar 26, 2019 at 11:09 AM Melissa Pashniak 
wrote:

> Welcome!
>
>
> On Tue, Mar 26, 2019 at 10:17 AM Kenneth Knowles  wrote:
>
>> Welcome! Cool project. A lot of code, and thorough experiments.
>>
>> Kenn
>>
>> On Tue, Mar 26, 2019 at 9:15 AM Chamikara Jayalath 
>> wrote:
>>
>>> Welcome!
>>>
>>> On Tue, Mar 26, 2019 at 8:56 AM Ahmet Altay  wrote:
>>>
 Welcome Guobao!

 On Tue, Mar 26, 2019 at 7:13 AM Ismaël Mejía  wrote:

> Welcome Guobao!
>
> Nice that you are joining us. Looking forward to your contributions!
> Take the time to read the contribution guide
> https://beam.apache.org/contribute/ and don't hesitate to ask any
> question you may have.
>
> On Tue, Mar 26, 2019 at 2:14 PM Alexey Romanenko
>  wrote:
> >
> > Welcome, Guobao! Great to have you on board!
> >
> > Alexey
> >
> > On 26 Mar 2019, at 11:49, Guobao Li  wrote:
> >
> > Hi all,
> >
> > I am Guobao Li from Talend. I am new to Apache Beam. Currently I am
> working on the implementation of CouchbaseIO and hope to contribute in 
> more
> areas in the future.
> >
> > I have some Open Source experience. I contributed on extension of a
> DSL to introduce the architecture of Parameter Server to Apache SystemML
> [1] as part of GSoC 2018. I earned committership due to my work. So I’m 
> very
> glad to be here and continue contributing to Open Source now on Apache 
> Beam.
> >
> > Regards,
> > Guobao Li
> >
> > [1]
> https://summerofcode.withgoogle.com/archive/2018/projects/5148916517437440/
> >
> >
>



Re: Build blocking on

2019-03-26 Thread Michael Luckey
@Udi
Did you try to just delete the
'/usr/local/google/home/ehudm/.gradle/go/repo/cloud.google.com' folder?

@Robert
As said before, I am a bit scared about the implications. Shelling out is
done by python, and from build perspective, this does not work very well,
unfortunately. I.e. no caching, up-to-date checks etc...

But of course, we need to play with this a bit more.

On Tue, Mar 26, 2019 at 6:24 PM Robert Burke  wrote:

> Reading the error from the gradle scan, it largely looks like some part of
> the GCP dependencies for the build depends on a package, where the commit
> version is no longer around. The main issue with gogradle is that it's
> entirely distinct from the usual Go workflow, which means deps users use
> are likely to be different to what's in the lock file.
>
> This work will be tracked in
> https://issues.apache.org/jira/browse/BEAM-5379
> GoGradle hasn't moved to support the new-go way of handling deps, so my
> inclination is to simplify to simple scripts for Gradle that shell out to
> the Go tool for handling Go dep management, over trying to fix GoGradle.
>
> On Tue, 26 Mar 2019 at 09:43, Udi Meiri  wrote:
>
>> Robert, from what I recall it's not flaky for me - it consistently fails.
>> Let me know if there's a way to get more logging about this error.
>>
>> On Mon, Mar 25, 2019, 19:50 Robert Burke  wrote:
>>
>>> It's concerning to me that 1) the Go dependency resolution via gogradle
>>> is flaky, and 2) that it can block other languages.
>>>
>>> I suppose 2) makes sense since it's part of the container bootstrapping
>>> code, but that makes 1) a serious problem, of which I wasn't aware.
>>> I should have time to investigate this in the next two weeks.
>>>
>>> On Mon, 25 Mar 2019 at 18:08, Michael Luckey 
>>> wrote:
>>>
 Just for the record,

 using a vm here, because did not yet get all task running on my mac,
 and did not want to mess with my setup.

 So installed vanilla ubuntu-18.04 LTS on virtual box, 26GB ram, 6 cores
 and further

 sudo apt update

 sudo apt install gcc

 sudo apt install make

 sudo apt install perl

 sudo apt install curl

 sudo apt install openjdk-8-jdk

 sudo apt install python

 sudo apt install -y software-properties-common

 sudo add-apt-repository ppa:deadsnakes/ppa

 sudo apt update

 sudo apt install python3.5

 sudo apt-get install apt-transport-https ca-certificates curl
 gnupg-agent software-properties-common

 curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key
 add -

 sudo apt-key fingerprint 0EBFCD88

 sudo add-apt-repository "deb [arch=amd64]
 https://download.docker.com/linux/ubuntu \

 $(lsb_release -cs) \

 stable"

 sudo apt-get update

 sudo apt-get install docker-ce docker-ce-cli containerd.io

 sudo groupadd docker

 sudo usermod -aG docker $USER

 git config --global user.email "d...@spam.me"

 git config --global user.name "Some Guy"

 curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py

 sudo python get-pip.py

 rm get-pip.py

 sudo pip install --upgrade virtualenv

 sudo pip install cython

 sudo apt-get install python-dev

 sudo apt-get install python3-distutils

 sudo apt-get install python3-dev # for python3.x installs


 git clone https://github.com/apache/beam.git

 cd beam/

 ./gradlew build

 Nothing else changed/added. (hopefully, need to reassure myself here)

 Unfortunately, this is failing. Need to exclude those python tests (and
 of course website, which usually fails on Jira links)

 So I might be missing some env settings for GCP, dunno. Probably missed
 some docs.



 On Tue, Mar 26, 2019 at 1:46 AM Michael Luckey 
 wrote:

> Thanks Udi for trying that!
>
> In fact, the go dependency resolution is flaky. Did not look into
> that, but just rerunning usually works. Of course, less than optimal, but,
> well...
>
> Running build target is of course just an aggregation of task to run.
> And unfortunately just running that
>
> ./gradlew  :beam-sdks-python:testPy2Gcp
>
> stalls on my (virtual) machine.
>
> On Tue, Mar 26, 2019 at 1:35 AM Udi Meiri  wrote:
>
>> Okay, `./gradlew build` failed pretty quickly for me:
>>
>> > Task :beam-sdks-go:resolveBuildDependencies FAILED
>> cloud.google.com/go:
>> commit='4f6c921ec566a33844f4e7879b31cd8575a6982d', urls=[
>> https://code.googlesource.com/gocloud] does not exist in
>> /usr/local/google/home/ehudm/.gradle/go/repo/
>> cloud.google.com/go/625660c387d9403fde4d73cacaf2d2ac, updating will
>> be performed.
>>
>> https://gradle.com/s/x5zqbc5zwd3bg
>>
>> (Now I remember why I stopped using `build` :/)

Re: New contributor

2019-03-26 Thread Melissa Pashniak
Welcome!


On Tue, Mar 26, 2019 at 10:17 AM Kenneth Knowles  wrote:

> Welcome! Cool project. A lot of code, and thorough experiments.
>
> Kenn
>
> On Tue, Mar 26, 2019 at 9:15 AM Chamikara Jayalath 
> wrote:
>
>> Welcome!
>>
>> On Tue, Mar 26, 2019 at 8:56 AM Ahmet Altay  wrote:
>>
>>> Welcome Guobao!
>>>
>>> On Tue, Mar 26, 2019 at 7:13 AM Ismaël Mejía  wrote:
>>>
 Welcome Guobao!

 Nice that you are joining us. Looking forward for your contributions !
 Take the time to read the contribution guide
 https://beam.apache.org/contribute/ and don't hesitate to ask any
 question you may have.

 On Tue, Mar 26, 2019 at 2:14 PM Alexey Romanenko
  wrote:
 >
 > Welcome, Guobao! Great to have you on board!
 >
 > Alexey
 >
 > On 26 Mar 2019, at 11:49, Guobao Li  wrote:
 >
 > Hi all,
 >
 > I am Guobao Li from Talend. I am new to Apache Beam. Currently I am
 working on the implementation of CouchbaseIO and hope to contribute in more
 areas in the future.
 >
 > I have some Open Source experience. I contributed on extension of a
 DSL to introduce the architecture of Parameter Server to Apache SystemML
 [1] as part of GSoC 2018. I earned committership due to my work. So I’m very
 glad to be here and continue contributing to Open Source now on Apache 
 Beam.
 >
 > Regards,
 > Guobao Li
 >
 > [1]
 https://summerofcode.withgoogle.com/archive/2018/projects/5148916517437440/
 >
 >

>>>


Re: Build blocking on

2019-03-26 Thread Robert Burke
Reading the error from the gradle scan, it largely looks like some part of
the GCP dependencies for the build depends on a package whose pinned commit
is no longer available. The main issue with gogradle is that it's
entirely distinct from the usual Go workflow, which means the deps users
actually use are likely to differ from what's in the lock file.

This work will be tracked in https://issues.apache.org/jira/browse/BEAM-5379
GoGradle hasn't moved to support the new Go way of handling deps, so my
inclination is to switch to simple scripts that Gradle invokes to shell out
to the Go tool for Go dep management, rather than trying to fix GoGradle.
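
As a rough illustration of the "simple scripts" idea, here is a minimal
sketch of a helper that a Gradle Exec task could call. It is not what
BEAM-5379 will necessarily end up with: the function name resolve_go_deps
and the GO_BIN override are hypothetical, and it assumes the Go code has
been converted to a Go module (a go.mod file), which is an assumption about
the future layout rather than a description of the current build.

```shell
#!/usr/bin/env bash
# Hypothetical helper: resolve Go dependencies with the Go tool itself
# instead of GoGradle. Assumes the target directory is a Go module.
resolve_go_deps() {
  local module_dir="$1"
  # GO_BIN is an illustrative override so callers can pin a specific go binary.
  local go_bin="${GO_BIN:-go}"

  if [ ! -f "${module_dir}/go.mod" ]; then
    echo "no go.mod found in ${module_dir}" >&2
    return 1
  fi

  # Run in a subshell so the caller's working directory is left unchanged.
  (cd "${module_dir}" && "${go_bin}" mod download)
}
```

A Gradle task would then only need to run something like
`resolve_go_deps sdks/go` (the path is illustrative) and fail the build when
the function returns non-zero.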

On Tue, 26 Mar 2019 at 09:43, Udi Meiri  wrote:

> Robert, from what I recall it's not flaky for me - it consistently fails.
> Let me know if there's a way to get more logging about this error.
>
> On Mon, Mar 25, 2019, 19:50 Robert Burke  wrote:
>
>> It's concerning to me that 1) the Go dependency resolution via gogradle
>> is flaky, and 2) that it can block other languages.
>>
>> I suppose 2) makes sense since it's part of the container bootstrapping
>> code, but that makes 1) a serious problem, of which I wasn't aware.
>> I should have time to investigate this in the next two weeks.
>>
>> On Mon, 25 Mar 2019 at 18:08, Michael Luckey  wrote:
>>
>>> Just for the record,
>>>
>>> using a vm here, because did not yet get all task running on my mac, and
>>> did not want to mess with my setup.
>>>
>>> So installed vanilla ubuntu-18.04 LTS on virtual box, 26GB ram, 6 cores
>>> and further
>>>
>>> sudo apt update
>>>
>>> sudo apt install gcc
>>>
>>> sudo apt install make
>>>
>>> sudo apt install perl
>>>
>>> sudo apt install curl
>>>
>>> sudo apt install openjdk-8-jdk
>>>
>>> sudo apt install python
>>>
>>> sudo apt install -y software-properties-common
>>>
>>> sudo add-apt-repository ppa:deadsnakes/ppa
>>>
>>> sudo apt update
>>>
>>> sudo apt install python3.5
>>>
>>> sudo apt-get install apt-transport-https ca-certificates curl gnupg-agent software-properties-common
>>>
>>> curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
>>>
>>> sudo apt-key fingerprint 0EBFCD88
>>>
>>> sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
>>>
>>> sudo apt-get update
>>>
>>> sudo apt-get install docker-ce docker-ce-cli containerd.io
>>>
>>> sudo groupadd docker
>>>
>>> sudo usermod -aG docker $USER
>>>
>>> git config --global user.email "d...@spam.me"
>>>
>>> git config --global user.name "Some Guy"
>>>
>>> curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
>>>
>>> sudo python get-pip.py
>>>
>>> rm get-pip.py
>>>
>>> sudo pip install --upgrade virtualenv
>>>
>>> sudo pip install cython
>>>
>>> sudo apt-get install python-dev
>>>
>>> sudo apt-get install python3-distutils
>>>
>>> sudo apt-get install python3-dev # for python3.x installs
>>>
>>>
>>> git clone https://github.com/apache/beam.git
>>>
>>> cd beam/
>>>
>>> ./gradlew build
>>>
>>> Nothing else changed/added. (hopefully, need to reassure myself here)
>>>
>>> Unfortunately, this is failing. Need to exclude those python tests (and
>>> of course website, which usually fails on jira links)
>>>
>>> So I might be missing some env settings for gcp, dunno. Probably missed
>>> some docs.
>>>
>>>
>>>
>>> On Tue, Mar 26, 2019 at 1:46 AM Michael Luckey 
>>> wrote:
>>>
 Thanks Udi for trying that!

 In fact, the go dependency resolution is flaky. Did not look into that,
 but just rerunning usually works. Of course, less than optimal, but,
 well...

 Running build target is of course just an aggregation of task to run.
 And unfortunately just running that

 ./gradlew  :beam-sdks-python:testPy2Gcp

 stalls on my (virtual) machine.

 On Tue, Mar 26, 2019 at 1:35 AM Udi Meiri  wrote:

> Okay, `./gradlew build` failed pretty quickly for me:
>
> > Task :beam-sdks-go:resolveBuildDependencies FAILED
> cloud.google.com/go:
> commit='4f6c921ec566a33844f4e7879b31cd8575a6982d', urls=[
> https://code.googlesource.com/gocloud] does not exist in
> /usr/local/google/home/ehudm/.gradle/go/repo/
> cloud.google.com/go/625660c387d9403fde4d73cacaf2d2ac, updating will
> be performed.
>
> https://gradle.com/s/x5zqbc5zwd3bg
>
> (Now I remember why I stopped using `build` :/)
>
> On Mon, Mar 25, 2019 at 5:30 PM Udi Meiri  wrote:
>
>> It shouldn't stall. That's a bug.
>> OTOH, I never use the `build` target.
>> I'll try running that myself.
>>
>> On Mon, Mar 25, 2019, 07:24 Michael Luckey 
>> wrote:
>>
>>> Hi,
>>>
>>> trying to run './gradlew build' on vanilla setup, my build
>>> consistently stalls during execution of python gcp tests, e.g. on both 
>>> of
>>> - > :beam-sdks-python:testPy2Gcp
>>> - > :beam-sdks-python-test-suites-tox-py35:testPy35Gcp
>>>
>>> Console output:
>>> 

Re: New contributor

2019-03-26 Thread Kenneth Knowles
Welcome! Cool project. A lot of code, and thorough experiments.

Kenn

On Tue, Mar 26, 2019 at 9:15 AM Chamikara Jayalath 
wrote:

> Welcome!
>
> On Tue, Mar 26, 2019 at 8:56 AM Ahmet Altay  wrote:
>
>> Welcome Guobao!
>>
>> On Tue, Mar 26, 2019 at 7:13 AM Ismaël Mejía  wrote:
>>
>>> Welcome Guobao!
>>>
>>> Nice that you are joining us. Looking forward for your contributions !
>>> Take the time to read the contribution guide
>>> https://beam.apache.org/contribute/ and don't hesitate to ask any
>>> question you may have.
>>>
>>> On Tue, Mar 26, 2019 at 2:14 PM Alexey Romanenko
>>>  wrote:
>>> >
>>> > Welcome, Guobao! Great to have you on board!
>>> >
>>> > Alexey
>>> >
>>> > On 26 Mar 2019, at 11:49, Guobao Li  wrote:
>>> >
>>> > Hi all,
>>> >
>>> > I am Guobao Li from Talend. I am new to Apache Beam. Currently I am
>>> working on the implementation of CouchbaseIO and hope to contribute in more
>>> areas in the future.
>>> >
>>> > I have some Open Source experience. I contributed on extension of a
>>> DSL to introduce the architecture of Parameter Server to Apache SystemML
>>> [1] as part of GSoC 2018. I earned committership due to my work. So I’m very
>>> glad to be here and continue contributing to Open Source now on Apache Beam.
>>> >
>>> > Regards,
>>> > Guobao Li
>>> >
>>> > [1]
>>> https://summerofcode.withgoogle.com/archive/2018/projects/5148916517437440/
>>> >
>>> >
>>>
>>


Re: SNAPSHOTS have not been updated since february

2019-03-26 Thread Boyuan Zhang
Sorry for the typo. Ideally, the snapshot publish is *independent* from
postrelease_snapshot.

On Tue, Mar 26, 2019 at 9:55 AM Boyuan Zhang  wrote:

> Hey,
>
> I'm trying to publish the artifacts by commenting "Run Gradle Publish" in
> my PR, but there are several errors saying "cannot write artifacts into
> dir"
> ,
> anyone has idea on it? Ideally, the snapshot publish is dependent from
> postrelease_snapshot. The publish task is to build and publish artifacts
> and the postrelease_snapshot is to verify whether the snapshot works.
>
> On Tue, Mar 26, 2019 at 8:45 AM Ahmet Altay  wrote:
>
>> I believe this is related to
>> https://issues.apache.org/jira/browse/BEAM-6840 and +Boyuan Zhang
>>  has a fix in progress
>> https://github.com/apache/beam/pull/8132
>>
>> On Tue, Mar 26, 2019 at 7:09 AM Ismaël Mejía  wrote:
>>
>>> I was trying to validate a fix on the Spark runner and realized that
>>> Beam SNAPSHOTS have not been updated since February 24 !
>>>
>>>
>>> https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-sdks-java-core/2.12.0-SNAPSHOT/
>>>
>>> Can somebody please take a look at why this is not been updated?
>>>
>>> Thanks,
>>> Ismaël
>>>
>>


Re: SNAPSHOTS have not been updated since february

2019-03-26 Thread Boyuan Zhang
Hey,

I'm trying to publish the artifacts by commenting "Run Gradle Publish" in
my PR, but there are several errors saying "cannot write artifacts into dir".
Does anyone have an idea about this? Ideally, the snapshot publish is dependent from
postrelease_snapshot. The publish task is to build and publish artifacts
and the postrelease_snapshot is to verify whether the snapshot works.

On Tue, Mar 26, 2019 at 8:45 AM Ahmet Altay  wrote:

> I believe this is related to
> https://issues.apache.org/jira/browse/BEAM-6840 and +Boyuan Zhang
>  has a fix in progress
> https://github.com/apache/beam/pull/8132
>
> On Tue, Mar 26, 2019 at 7:09 AM Ismaël Mejía  wrote:
>
>> I was trying to validate a fix on the Spark runner and realized that
>> Beam SNAPSHOTS have not been updated since February 24 !
>>
>>
>> https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-sdks-java-core/2.12.0-SNAPSHOT/
>>
>> Can somebody please take a look at why this is not been updated?
>>
>> Thanks,
>> Ismaël
>>
>


Re: SNAPSHOTS have not been updated since february

2019-03-26 Thread Michael Luckey
This was already mentioned a few times. Nightly snapshots rely on 'gradlew
build', which has been failing for different reasons for a few weeks.

Apart from that, we also encounter problems on 'gradlew publish'. As far as
I know artefact uploading (occasionally?) fails. See
https://scans.gradle.com/s/2nds5odjedkky
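
For intermittent failures like the upload problem above, one stopgap is to
re-run the failing step a bounded number of times. The sketch below is only
an illustration and does not address the root cause; the function name
retry and the RETRY_DELAY variable are hypothetical and not part of the
Beam build.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to N times until it succeeds.
# Usage: retry <max_attempts> <command> [args...]
retry() {
  local attempts="$1"
  shift
  local n=1

  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after ${n} attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    # RETRY_DELAY (seconds between attempts) is an illustrative knob.
    sleep "${RETRY_DELAY:-5}"
  done
}
```

For example, `retry 3 ./gradlew publish` would attempt the publish at most
three times and return non-zero if every attempt fails.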



On Tue, Mar 26, 2019 at 4:45 PM Ahmet Altay  wrote:

> I believe this is related to
> https://issues.apache.org/jira/browse/BEAM-6840 and +Boyuan Zhang
>  has a fix in progress
> https://github.com/apache/beam/pull/8132
>
> On Tue, Mar 26, 2019 at 7:09 AM Ismaël Mejía  wrote:
>
>> I was trying to validate a fix on the Spark runner and realized that
>> Beam SNAPSHOTS have not been updated since February 24 !
>>
>>
>> https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-sdks-java-core/2.12.0-SNAPSHOT/
>>
>> Can somebody please take a look at why this is not been updated?
>>
>> Thanks,
>> Ismaël
>>
>


Re: Build blocking on

2019-03-26 Thread Udi Meiri
Robert, from what I recall it's not flaky for me - it consistently fails.
Let me know if there's a way to get more logging about this error.

On Mon, Mar 25, 2019, 19:50 Robert Burke  wrote:

> It's concerning to me that 1) the Go dependency resolution via gogradle is
> flaky, and 2) that it can block other languages.
>
> I suppose 2) makes sense since it's part of the container bootstrapping
> code, but that makes 1) a serious problem, of which I wasn't aware.
> I should have time to investigate this in the next two weeks.
>
> On Mon, 25 Mar 2019 at 18:08, Michael Luckey  wrote:
>
>> Just for the record,
>>
>> using a vm here, because did not yet get all task running on my mac, and
>> did not want to mess with my setup.
>>
>> So installed vanilla ubuntu-18.04 LTS on virtual box, 26GB ram, 6 cores
>> and further
>>
>> sudo apt update
>>
>> sudo apt install gcc
>>
>> sudo apt install make
>>
>> sudo apt install perl
>>
>> sudo apt install curl
>>
>> sudo apt install openjdk-8-jdk
>>
>> sudo apt install python
>>
>> sudo apt install -y software-properties-common
>>
>> sudo add-apt-repository ppa:deadsnakes/ppa
>>
>> sudo apt update
>>
>> sudo apt install python3.5
>>
>> sudo apt-get install apt-transport-https ca-certificates curl gnupg-agent software-properties-common
>>
>> curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
>>
>> sudo apt-key fingerprint 0EBFCD88
>>
>> sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
>>
>> sudo apt-get update
>>
>> sudo apt-get install docker-ce docker-ce-cli containerd.io
>>
>> sudo groupadd docker
>>
>> sudo usermod -aG docker $USER
>>
>> git config --global user.email "d...@spam.me"
>>
>> git config --global user.name "Some Guy"
>>
>> curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
>>
>> sudo python get-pip.py
>>
>> rm get-pip.py
>>
>> sudo pip install --upgrade virtualenv
>>
>> sudo pip install cython
>>
>> sudo apt-get install python-dev
>>
>> sudo apt-get install python3-distutils
>>
>> sudo apt-get install python3-dev # for python3.x installs
>>
>>
>> git clone https://github.com/apache/beam.git
>>
>> cd beam/
>>
>> ./gradlew build
>>
>> Nothing else changed/added. (hopefully, need to reassure myself here)
>>
>> Unfortunately, this is failing. Need to exclude those python tests (and
>> of course website, which usually fails on jira links)
>>
>> So I might be missing some env settings for gcp, dunno. Probably missed
>> some docs.
>>
>>
>>
>> On Tue, Mar 26, 2019 at 1:46 AM Michael Luckey 
>> wrote:
>>
>>> Thanks Udi for trying that!
>>>
>>> In fact, the go dependency resolution is flaky. Did not look into that,
>>> but just rerunning usually works. Of course, less than optimal, but,
>>> well...
>>>
>>> Running build target is of course just an aggregation of task to run.
>>> And unfortunately just running that
>>>
>>> ./gradlew  :beam-sdks-python:testPy2Gcp
>>>
>>> stalls on my (virtual) machine.
>>>
>>> On Tue, Mar 26, 2019 at 1:35 AM Udi Meiri  wrote:
>>>
 Okay, `./gradlew build` failed pretty quickly for me:

 > Task :beam-sdks-go:resolveBuildDependencies FAILED
 cloud.google.com/go:
 commit='4f6c921ec566a33844f4e7879b31cd8575a6982d', urls=[
 https://code.googlesource.com/gocloud] does not exist in
 /usr/local/google/home/ehudm/.gradle/go/repo/
 cloud.google.com/go/625660c387d9403fde4d73cacaf2d2ac, updating will be
 performed.

 https://gradle.com/s/x5zqbc5zwd3bg

 (Now I remember why I stopped using `build` :/)

 On Mon, Mar 25, 2019 at 5:30 PM Udi Meiri  wrote:

> It shouldn't stall. That's a bug.
> OTOH, I never use the `build` target.
> I'll try running that myself.
>
> On Mon, Mar 25, 2019, 07:24 Michael Luckey 
> wrote:
>
>> Hi,
>>
>> trying to run './gradlew build' on vanilla setup, my build
>> consistently stalls during execution of python gcp tests, e.g. on both of
>> - > :beam-sdks-python:testPy2Gcp
>> - > :beam-sdks-python-test-suites-tox-py35:testPy35Gcp
>>
>> Console output:
>>  snip 
>> test_big_query_standard_sql
>> (apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT)
>> ... SKIP: IT is skipped because --test-pipeline-options is not specified
>> test_big_query_standard_sql_kms_key
>> (apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT)
>> ... SKIP: This test requires BQ Dataflow native source support for KMS,
>> which is not available yet.
>> test_multiple_destinations_transform
>> (apache_beam.io.gcp.bigquery_file_loads_test.BigQueryFileLoadsIT) ... 
>> SKIP:
>> IT is skipped because --test-pipeline-options is not specified
>> test_one_job_fails_all_jobs_fail
>> (apache_beam.io.gcp.bigquery_file_loads_test.BigQueryFileLoadsIT) ... 
>> SKIP:
>> IT is skipped because --test-pipeline-options is not 

Re: New contributor

2019-03-26 Thread Chamikara Jayalath
Welcome!

On Tue, Mar 26, 2019 at 8:56 AM Ahmet Altay  wrote:

> Welcome Guobao!
>
> On Tue, Mar 26, 2019 at 7:13 AM Ismaël Mejía  wrote:
>
>> Welcome Guobao!
>>
>> Nice that you are joining us. Looking forward for your contributions !
>> Take the time to read the contribution guide
>> https://beam.apache.org/contribute/ and don't hesitate to ask any
>> question you may have.
>>
>> On Tue, Mar 26, 2019 at 2:14 PM Alexey Romanenko
>>  wrote:
>> >
>> > Welcome, Guobao! Great to have you on board!
>> >
>> > Alexey
>> >
>> > On 26 Mar 2019, at 11:49, Guobao Li  wrote:
>> >
>> > Hi all,
>> >
>> > I am Guobao Li from Talend. I am new to Apache Beam. Currently I am
>> working on the implementation of CouchbaseIO and hope to contribute in more
>> areas in the future.
>> >
>> > I have some Open Source experience. I contributed on extension of a DSL
>> to introduce the architecture of Parameter Server to Apache SystemML [1] as
>> part of GSoC 2018. I earned committership due to my work. So I’m very glad
>> to be here and continue contributing to Open Source now on Apache Beam.
>> >
>> > Regards,
>> > Guobao Li
>> >
>> > [1]
>> https://summerofcode.withgoogle.com/archive/2018/projects/5148916517437440/
>> >
>> >
>>
>


Re: New contributor

2019-03-26 Thread Ahmet Altay
Welcome Guobao!

On Tue, Mar 26, 2019 at 7:13 AM Ismaël Mejía  wrote:

> Welcome Guobao!
>
> Nice that you are joining us. Looking forward for your contributions !
> Take the time to read the contribution guide
> https://beam.apache.org/contribute/ and don't hesitate to ask any
> question you may have.
>
> On Tue, Mar 26, 2019 at 2:14 PM Alexey Romanenko
>  wrote:
> >
> > Welcome, Guobao! Great to have you on board!
> >
> > Alexey
> >
> > On 26 Mar 2019, at 11:49, Guobao Li  wrote:
> >
> > Hi all,
> >
> > I am Guobao Li from Talend. I am new to Apache Beam. Currently I am
> working on the implementation of CouchbaseIO and hope to contribute in more
> areas in the future.
> >
> > I have some Open Source experience. I contributed on extension of a DSL
> to introduce the architecture of Parameter Server to Apache SystemML [1] as
> part of GSoC 2018. I earned committership due to my work. So I’m very glad
> to be here and continue contributing to Open Source now on Apache Beam.
> >
> > Regards,
> > Guobao Li
> >
> > [1]
> https://summerofcode.withgoogle.com/archive/2018/projects/5148916517437440/
> >
> >
>


Re: SNAPSHOTS have not been updated since february

2019-03-26 Thread Ahmet Altay
I believe this is related to https://issues.apache.org/jira/browse/BEAM-6840
and +Boyuan Zhang  has a fix in progress
https://github.com/apache/beam/pull/8132

On Tue, Mar 26, 2019 at 7:09 AM Ismaël Mejía  wrote:

> I was trying to validate a fix on the Spark runner and realized that
> Beam SNAPSHOTS have not been updated since February 24 !
>
>
> https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-sdks-java-core/2.12.0-SNAPSHOT/
>
> Can somebody please take a look at why this is not been updated?
>
> Thanks,
> Ismaël
>


SNAPSHOTS have not been updated since february

2019-03-26 Thread Ismaël Mejía
I was trying to validate a fix on the Spark runner and realized that
Beam SNAPSHOTS have not been updated since February 24!

https://repository.apache.org/content/repositories/snapshots/org/apache/beam/beam-sdks-java-core/2.12.0-SNAPSHOT/

Can somebody please take a look at why this has not been updated?

Thanks,
Ismaël
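
A quick way to check snapshot freshness without browsing the repository is
to read the lastUpdated field of the directory's maven-metadata.xml. The
sketch below is a minimal illustration: the function name last_updated is
hypothetical, and it assumes the metadata file sits next to the artifacts at
the URL above and contains a single-line <lastUpdated> element in the usual
yyyyMMddHHmmss form.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the <lastUpdated> timestamp (yyyyMMddHHmmss)
# found in a maven-metadata.xml document read from standard input.
last_updated() {
  sed -n 's:.*<lastUpdated>\([0-9]*\)</lastUpdated>.*:\1:p' | head -n 1
}

# Example usage (needs network access, shown as a comment only):
#   repo=https://repository.apache.org/content/repositories/snapshots
#   curl -fsSL "${repo}/org/apache/beam/beam-sdks-java-core/2.12.0-SNAPSHOT/maven-metadata.xml" | last_updated
```

The function prints nothing when the element is missing, so a caller can
treat empty output as "could not determine".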


Re: New contributor

2019-03-26 Thread Ismaël Mejía
Welcome Guobao!

Nice that you are joining us. Looking forward to your contributions!
Take the time to read the contribution guide
https://beam.apache.org/contribute/ and don't hesitate to ask any
question you may have.

On Tue, Mar 26, 2019 at 2:14 PM Alexey Romanenko
 wrote:
>
> Welcome, Guobao! Great to have you on board!
>
> Alexey
>
> On 26 Mar 2019, at 11:49, Guobao Li  wrote:
>
> Hi all,
>
> I am Guobao Li from Talend. I am new to Apache Beam. Currently I am working 
> on the implementation of CouchbaseIO and hope to contribute in more areas in 
> the future.
>
> I have some Open Source experience. I contributed on extension of a DSL to 
> introduce the architecture of Parameter Server to Apache SystemML [1] as part 
> of GSoC 2018. I earned committership due to my work. So I’m very glad to be 
> here and continue contributing to Open Source now on Apache Beam.
>
> Regards,
> Guobao Li
>
> [1] 
> https://summerofcode.withgoogle.com/archive/2018/projects/5148916517437440/
>
>


Re: New contributor

2019-03-26 Thread Alexey Romanenko
Welcome, Guobao! Great to have you on board!

Alexey

> On 26 Mar 2019, at 11:49, Guobao Li  wrote:
> 
> Hi all,
> 
> I am Guobao Li from Talend. I am new to Apache Beam. Currently I am working 
> on the implementation of CouchbaseIO and hope to contribute in more areas in 
> the future.
> 
> I have some Open Source experience. I contributed on extension of a DSL to 
> introduce the architecture of Parameter Server to Apache SystemML [1] as part 
> of GSoC 2018. I earned committership due to my work. So I’m very glad to be 
> here and continue contributing to Open Source now on Apache Beam.
> 
> Regards,
> Guobao Li
> 
> [1] 
> https://summerofcode.withgoogle.com/archive/2018/projects/5148916517437440/ 
> 
> 



New contributor

2019-03-26 Thread Guobao Li
Hi all,

I am Guobao Li from Talend. I am new to Apache Beam. Currently I am working
on the implementation of CouchbaseIO and hope to contribute in more areas
in the future.

I have some Open Source experience. I contributed an extension of a DSL that
introduces the Parameter Server architecture to Apache SystemML [1] as
part of GSoC 2018. I earned committership through that work, so I’m very glad
to be here and to continue contributing to Open Source, now on Apache Beam.

Regards,
Guobao Li

[1]
https://summerofcode.withgoogle.com/archive/2018/projects/5148916517437440/