After digging a bit deeper, I was able to verify that those tests block on authorization to GCP.
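In case it helps anyone checking the same thing on their machine: below is a minimal sketch of a credentials check plus a skip guard for such tests. It rests on assumptions. It only looks at the GOOGLE_APPLICATION_CREDENTIALS environment variable and at gcloud's application-default credentials file in its usual Linux/macOS location; it does not cover credentials served by the GCE metadata server. The names gcp_credentials_configured and requires_gcp_credentials are made up for illustration and are not part of Beam.

```python
import os
import unittest


def gcp_credentials_configured(environ=None, home=None):
    """Heuristic: True if file-based GCP credentials look discoverable.

    Assumption: only two places are checked, the file named by
    GOOGLE_APPLICATION_CREDENTIALS and gcloud's application-default file.
    Credentials from the GCE metadata server are NOT detected.
    """
    environ = os.environ if environ is None else environ
    home = os.path.expanduser("~") if home is None else home

    # 1) Explicit key file pointed to by the environment variable.
    explicit = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit and os.path.isfile(explicit):
        return True

    # 2) File written by `gcloud auth application-default login`.
    well_known = os.path.join(
        home, ".config", "gcloud", "application_default_credentials.json")
    return os.path.isfile(well_known)


def requires_gcp_credentials(test_item):
    """Hypothetical decorator: skip a test (or test class) without credentials.

    The check runs once, when the decorator is applied at import time.
    """
    return unittest.skipUnless(
        gcp_credentials_configured(),
        "no GCP credentials configured")(test_item)
```

A test would then be marked with @requires_gcp_credentials. On CI one would probably want the opposite behaviour (fail instead of skip), so that these tests are not skipped by accident.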
It seems that, as I do not have any credentials set, the underlying oauth2
falls back to some local mode. This seems to start a webserver on port 8080
and wait there forever. Accessing that port forwards to some Google page, but
that also fails miserably.

Running

    python setup.py nosetests --tests apache_beam.io.gcp.bigquery_file_loads_test:TestBigQueryFileLoads.test_records_traverse_transform_with_mocks

and hitting 'Ctrl-C' after it got stuck results in the following output:

    'KeyboardInterrupt [while running \'WriteToBigQuery/BigQueryBatchFileLoads/RemoveTempTables/Delete\']\n------------
    Your browser has been opened to visit:

    https://accounts.google.com/o/oauth2/v2/auth?scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fbigquery+https%3A%

    If your browser is on a different machine then exit and re-run this
    application with the command-line parameter

    --noauth_local_webserver

    Failed to find "code" in the query parameters of the redirect.
    Invalid authorization: Try running with --noauth_local_webserver.

I am a bit lost here on how to proceed.

On Tue, Mar 26, 2019 at 11:48 PM Michael Luckey <adude3...@gmail.com> wrote:

>
> On Tue, Mar 26, 2019 at 11:18 PM Mikhail Gryzykhin <mig...@google.com>
> wrote:
>
>> I believe what happens is that testPy2Gcp actually runs integration tests
>> that try to connect to GCP.
>>
>
> Actually I was hoping for an explanation like this. Any suggestion how I
> could confirm that on my behalf?
>
>
>> Without having GCP cluster and configuration on your machine I'd expect
>> these tests to fail.
>>
>
> Hmm... here I am actually unsure, what would be the best to handle such
> cases.
>
> If I understand correctly, we currently skip some tests which do not meet
> expectations, kind of 'can not run on your arch' thingies...
> So I am undecided, whether I'd prefer those tests to be skipped if gcp
> configuration is missing
>
> pro
> * dev is still able to run the tests (whichever task they are associated
> with) without having to separate the failures out. For instance, this
> 'testPy2Gcp' does actually execute 'some tests' - which might be already
> covered by some other calls... But I definitely do not like the idea, to
> put the burden on the developer to track which tasks/tests might be
> executed on local machine. Unless this distinction is really coarse - and
> pre/postcommit is something I really would like to be able to run locally...
>
>
> con
> * we definitely need to make sure, those tests are not accidentally
> skipped on CI servers.
>
>
>>
>> I'd say we should remove testPy2Gcp task from "build" task and explicitly
>> keep it as integration test.
>>
>> --Mikhail
>>
>>
>> On Tue, Mar 26, 2019 at 3:12 PM Michael Luckey <adude3...@gmail.com>
>> wrote:
>>
>>>
>>> On Tue, Mar 26, 2019 at 10:29 PM Udi Meiri <eh...@google.com> wrote:
>>>
>>>> Luckey, I couldn't recreate your issue, but I still haven't done a full
>>>> build.
>>>> I created a new GCE VM using the ubuntu-1804-bionic-v20190212a
>>>> image (n1-standard-4 machine type).
>>>>
>>>> Ran the following:
>>>> sudo apt-get update
>>>> sudo apt-get install python-pip
>>>> sudo apt-get install python-virtualenv
>>>> git clone https://github.com/apache/beam.git
>>>> cd beam
>>>> ./gradlew :beam-sdks-python:testPy2Gcp
>>>> [failed: no JAVA_HOME]
>>>> sudo apt-get install openjdk-8-jdk
>>>> ./gradlew :beam-sdks-python:testPy2Gcp
>>>>
>>>> Got: BUILD SUCCESSFUL in 7m 52s
>>>>
>>>
>>> Nice. Thanks a lot for your help here.
>>>
>>> If I understand correctly, this VM is already located within gcp. Could
>>> it already have some setup, which needs to be done on 'my' VM?
>>> For instance
>>> I was contemplating about that test trying 'to call home', but as I am
>>> (unfortunately ;) no googler and do not have any gcp specific setup, fails
>>> here but misses to timeout? This is just some weird assumption, did not yet
>>> look into the actual implementation.
>>>
>>> Which I seemingly need to do here :(
>>>
>>>
>>>> Then I tried:
>>>> ./gradlew build
>>>>
>>>> And ran out of disk space. :) (beam/ is taking 4.5G and the VM boot
>>>> disk is 10G total)
>>>>
>>>
>>> Ouch :D
>>>
>>>
>>>>
>>>> On Tue, Mar 26, 2019 at 1:35 PM Robert Burke <rob...@frantil.com>
>>>> wrote:
>>>>
>>>>> Michael, your concern is reasonable, especially with the experience
>>>>> with python, though that does help me bootstrap this work. :)
>>>>>
>>>>> The go tools provide caching and avoid redoing work if the source
>>>>> files haven't changed. This applies most particularly for `go build` and
>>>>> `go test`. As long as the go code isn't changing at every invocation, this
>>>>> should be fine. I'm not aware of the same being the case for the usual
>>>>> python tools.
>>>>>
>>>>> The real trick is ensuring a valid and consistent environment for the
>>>>> go code.
>>>>>
>>>>> The environment question becomes easier for everyone by moving to go
>>>>> modules, which were designed to provide these kinds of consistent builds.
>>>>> It also avoids needing a GOPATH set. Any directory is permitted, as long
>>>>> as the go.mod is present.
>>>>>
>>>>> (The Go SDK doesn't yet use go modules, so go.mod and go.sum aren't yet
>>>>> in the repo.)
>>>>>
>>>>> The main blocker I see is updating the Jenkins machines to have the
>>>>> latest version of Go (1.12) instead of 1.10, which doesn't support
>>>>> modules.
>>>>> This only blocks a final submission, rather than the work fortunately.
>>>>>
>>>>> On Tue, Mar 26, 2019, 1:08 PM Udi Meiri <eh...@google.com> wrote:
>>>>>
>>>>>> "rm -r ~/.gradle/go/repo/" worked for me (there was more than one
>>>>>> package with issues).
>>>>>> My ~/.bashrc has
>>>>>> export GOPATH=$HOME/go
>>>>>> so maybe that's making the difference in my setup.
>>>>>>
>>>>>> On Tue, Mar 26, 2019 at 11:28 AM Thomas Weise <t...@apache.org> wrote:
>>>>>>
>>>>>>> Can this be addressed by having "clean" remove all state that
>>>>>>> gogradle leaves behind? This staleness issue has bitten me a few times
>>>>>>> also and it would be good to have a reliable way to deal with it, even
>>>>>>> if it involves an extra clean.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Mar 26, 2019 at 11:14 AM Michael Luckey <adude3...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> @Udi
>>>>>>>> Did you try to just delete the
>>>>>>>> '/usr/local/google/home/ehudm/.gradle/go/repo/cloud.google.com'
>>>>>>>> folder?
>>>>>>>>
>>>>>>>> @Robert
>>>>>>>> As said before, I am a bit scared about the implications. Shelling
>>>>>>>> out is done by python, and from build perspective, this does not work
>>>>>>>> very well, unfortunately. I.e. no caching, up-to-date checks etc...
>>>>>>>>
>>>>>>>> But of course, we need to play with this a bit more.
>>>>>>>>
>>>>>>>> On Tue, Mar 26, 2019 at 6:24 PM Robert Burke <rob...@frantil.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Reading the error from the gradle scan, it largely looks like some
>>>>>>>>> part of the GCP dependencies for the build depends on a package,
>>>>>>>>> where the commit version is no longer around. The main issue with
>>>>>>>>> gogradle is that it's entirely distinct from the usual Go workflow,
>>>>>>>>> which means deps users use are likely to be different to what's in
>>>>>>>>> the lock file.
>>>>>>>>>
>>>>>>>>> This work will be tracked in
>>>>>>>>> https://issues.apache.org/jira/browse/BEAM-5379
>>>>>>>>> GoGradle hasn't moved to support the new-go way of handling deps,
>>>>>>>>> so my inclination is to simplify to simple scripts for Gradle that
>>>>>>>>> shell out to the Go tool for handling Go dep management, over trying
>>>>>>>>> to fix GoGradle.
>>>>>>>>>
>>>>>>>>> On Tue, 26 Mar 2019 at 09:43, Udi Meiri <eh...@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Robert, from what I recall it's not flaky for me - it
>>>>>>>>>> consistently fails. Let me know if there's a way to get more logging
>>>>>>>>>> about this error.
>>>>>>>>>>
>>>>>>>>>> On Mon, Mar 25, 2019, 19:50 Robert Burke <rob...@frantil.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> It's concerning to me that 1) the Go dependency resolution via
>>>>>>>>>>> gogradle is flaky, and 2) that it can block other languages.
>>>>>>>>>>>
>>>>>>>>>>> I suppose 2) makes sense since it's part of the container
>>>>>>>>>>> bootstrapping code, but that makes 1) a serious problem, of which I
>>>>>>>>>>> wasn't aware.
>>>>>>>>>>> I should have time to investigate this in the next two weeks.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, 25 Mar 2019 at 18:08, Michael Luckey <adude3...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Just for the record,
>>>>>>>>>>>>
>>>>>>>>>>>> using a vm here, because did not yet get all task running on my
>>>>>>>>>>>> mac, and did not want to mess with my setup.
>>>>>>>>>>>>
>>>>>>>>>>>> So installed vanilla ubuntu-18.04 LTS on virtual box, 26GB ram,
>>>>>>>>>>>> 6 cores and further
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt update
>>>>>>>>>>>> sudo apt install gcc
>>>>>>>>>>>> sudo apt install make
>>>>>>>>>>>> sudo apt install perl
>>>>>>>>>>>> sudo apt install curl
>>>>>>>>>>>> sudo apt install openjdk-8-jdk
>>>>>>>>>>>> sudo apt install python
>>>>>>>>>>>> sudo apt install -y software-properties-common
>>>>>>>>>>>> sudo add-apt-repository ppa:deadsnakes/ppa
>>>>>>>>>>>> sudo apt update
>>>>>>>>>>>> sudo apt install python3.5
>>>>>>>>>>>> sudo apt-get install apt-transport-https ca-certificates curl gnupg-agent software-properties-common
>>>>>>>>>>>> curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
>>>>>>>>>>>> sudo apt-key fingerprint 0EBFCD88
>>>>>>>>>>>> sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
>>>>>>>>>>>>   $(lsb_release -cs) \
>>>>>>>>>>>>   stable"
>>>>>>>>>>>> sudo apt-get update
>>>>>>>>>>>> sudo apt-get install docker-ce docker-ce-cli containerd.io
>>>>>>>>>>>> sudo groupadd docker
>>>>>>>>>>>> sudo usermod -aG docker $USER
>>>>>>>>>>>> git config --global user.email "d...@spam.me"
>>>>>>>>>>>> git config --global user.name "Some Guy"
>>>>>>>>>>>> curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
>>>>>>>>>>>> sudo python get-pip.py
>>>>>>>>>>>> rm get-pip.py
>>>>>>>>>>>> sudo pip install --upgrade virtualenv
>>>>>>>>>>>> sudo pip install cython
>>>>>>>>>>>> sudo apt-get install python-dev
>>>>>>>>>>>> sudo apt-get install python3-distutils
>>>>>>>>>>>> sudo apt-get install python3-dev # for python3.x installs
>>>>>>>>>>>>
>>>>>>>>>>>> git clone https://github.com/apache/beam.git
>>>>>>>>>>>> cd beam/
>>>>>>>>>>>> ./gradlew build
>>>>>>>>>>>>
>>>>>>>>>>>> Nothing else changed/added. (hopefully, need to reassure myself
>>>>>>>>>>>> here)
>>>>>>>>>>>>
>>>>>>>>>>>> Unfortunately, this is failing. Need to exclude those python
>>>>>>>>>>>> tests (and of course website, which usually fails on jira links)
>>>>>>>>>>>>
>>>>>>>>>>>> So I might be missing some env settings for gcp, dunno.
>>>>>>>>>>>> Probably missed some docs.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Mar 26, 2019 at 1:46 AM Michael Luckey <adude3...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks Udi for trying that!
>>>>>>>>>>>>>
>>>>>>>>>>>>> In fact, the go dependency resolution is flaky. Did not look
>>>>>>>>>>>>> into that, but just rerunning usually works. Of course, less than
>>>>>>>>>>>>> optimal, but, well...
>>>>>>>>>>>>>
>>>>>>>>>>>>> Running build target is of course just an aggregation of task
>>>>>>>>>>>>> to run. And unfortunately just running that
>>>>>>>>>>>>>
>>>>>>>>>>>>> ./gradlew :beam-sdks-python:testPy2Gcp
>>>>>>>>>>>>>
>>>>>>>>>>>>> stalls on my (virtual) machine.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Mar 26, 2019 at 1:35 AM Udi Meiri <eh...@google.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Okay, `./gradlew build` failed pretty quickly for me:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> > Task :beam-sdks-go:resolveBuildDependencies FAILED
>>>>>>>>>>>>>> cloud.google.com/go:
>>>>>>>>>>>>>> commit='4f6c921ec566a33844f4e7879b31cd8575a6982d',
>>>>>>>>>>>>>> urls=[https://code.googlesource.com/gocloud] does not exist in
>>>>>>>>>>>>>> /usr/local/google/home/ehudm/.gradle/go/repo/cloud.google.com/go/625660c387d9403fde4d73cacaf2d2ac,
>>>>>>>>>>>>>> updating will be performed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://gradle.com/s/x5zqbc5zwd3bg
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (Now I remember why I stopped using `build` :/)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Mar 25, 2019 at 5:30 PM Udi Meiri <eh...@google.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It shouldn't stall. That's a bug.
>>>>>>>>>>>>>>> OTOH, I never use the `build` target.
>>>>>>>>>>>>>>> I'll try running that myself.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Mar 25, 2019, 07:24 Michael Luckey <adude3...@gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> trying to run './gradlew build' on vanilla setup, my build
>>>>>>>>>>>>>>>> consistently stalls during execution of python gcp tests, e.g.
>>>>>>>>>>>>>>>> on both of
>>>>>>>>>>>>>>>> - > :beam-sdks-python:testPy2Gcp
>>>>>>>>>>>>>>>> - > :beam-sdks-python-test-suites-tox-py35:testPy35Gcp
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Console output:
>>>>>>>>>>>>>>>> #### snip ####
>>>>>>>>>>>>>>>> test_big_query_standard_sql (apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT) ... SKIP: IT is skipped because --test-pipeline-options is not specified
>>>>>>>>>>>>>>>> test_big_query_standard_sql_kms_key (apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT) ... SKIP: This test requires BQ Dataflow native source support for KMS, which is not available yet.
>>>>>>>>>>>>>>>> test_multiple_destinations_transform (apache_beam.io.gcp.bigquery_file_loads_test.BigQueryFileLoadsIT) ... SKIP: IT is skipped because --test-pipeline-options is not specified
>>>>>>>>>>>>>>>> test_one_job_fails_all_jobs_fail (apache_beam.io.gcp.bigquery_file_loads_test.BigQueryFileLoadsIT) ... SKIP: IT is skipped because --test-pipeline-options is not specified
>>>>>>>>>>>>>>>> test_records_traverse_transform_with_mocks (apache_beam.io.gcp.bigquery_file_loads_test.TestBigQueryFileLoads) ...
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> output ends here, would expect a failed or ok here.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Afterwards no progress - even waiting for hours. Any idea,
>>>>>>>>>>>>>>>> what might be causing this? Do I need to add some GCP
>>>>>>>>>>>>>>>> properties for this task?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Any ideas, what I am doing wrong?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> best,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> michel
>>>>>>>>>>>>>>>>
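
Regarding the stall "even waiting for hours" in the original report: until the root cause is fixed, one option is to run the task under a hard time limit so that a hang shows up as a failure. This is only a sketch; the helper name run_with_timeout and the 15 minute limit in the usage comment are illustrative assumptions, not something Beam's build provides.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd and return its exit code, or None if the time limit was hit.

    subprocess.run kills the direct child on timeout; processes spawned by
    that child (e.g. by a build tool) may survive and need separate cleanup.
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


if __name__ == "__main__":
    # Illustrative usage: give the Gradle task from this thread 15 minutes.
    code = run_with_timeout(
        ["./gradlew", ":beam-sdks-python:testPy2Gcp"], 15 * 60)
    if code is None:
        print("task timed out, probably stalled")
        sys.exit(1)
    sys.exit(code)
```

A return value of None means the limit was reached; any other value is the exit code of the command itself.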