Hi Michael,

I wrote that test and much of that code. I'm quite sorry about the trouble.

The test should use mocks and should not hang when GCP dependencies are missing, so this sounds like a bug in the test. We can deactivate it while I figure out what's going wrong.

Best
-P.
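For illustration only, here is a minimal sketch of the two options mentioned above: injecting a mock so that no real client is ever constructed, and deactivating a test while it is investigated. The Ctrl-C output quoted below points at the RemoveTempTables/Delete step; whether an unmocked client in that step is the actual cause is an assumption. All names here (FakeBigQueryClient, remove_temp_tables, client_factory) are hypothetical and are not the Beam or Google APIs. The sketch uses Python 3's unittest.mock; under Python 2 the separate `mock` package would be needed.

```python
import unittest
from unittest import mock


class FakeBigQueryClient(object):
    """Hypothetical stand-in for a GCP client; not a real Beam or Google API."""

    def __init__(self):
        # A real client would look up credentials here and, without any
        # configured, might block waiting on an interactive OAuth flow.
        raise RuntimeError('would try to authenticate against GCP')

    def delete_table(self, name):
        raise NotImplementedError


def remove_temp_tables(table_names, client_factory=FakeBigQueryClient):
    """Hypothetical code under test: deletes each temporary table."""
    client = client_factory()
    for name in table_names:
        client.delete_table(name)
    return len(table_names)


class RemoveTempTablesTest(unittest.TestCase):

    def test_with_mock_client(self):
        # Inject a mock so the credential-dependent constructor never runs.
        client = mock.Mock()
        deleted = remove_temp_tables(['t1', 't2'],
                                     client_factory=lambda: client)
        self.assertEqual(deleted, 2)
        client.delete_table.assert_has_calls(
            [mock.call('t1'), mock.call('t2')])

    @unittest.skip('deactivated while the hang is investigated')
    def test_deactivated(self):
        # Would construct the real client, so it stays skipped for now.
        remove_temp_tables(['t1'])
```

Saved as a module, this can be run with `python -m unittest <module>`. The point of the sketch is that every step that builds a client has to receive the mock; a single code path that falls back to the default factory is enough to trigger the authentication attempt.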
On Sat, Mar 30, 2019, 2:55 PM Michael Luckey <[email protected]> wrote:

> After digging a bit deeper, I was able to verify that those tests block
> on authorization to GCP.
>
> It seems that, as I do not have any credentials set, the underlying oauth2
> falls back to some local mode. This seems to start a webserver on port 8080
> and wait there forever. Accessing that port forwards to some Google page,
> but that also fails miserably.
>
> Running
>
> python setup.py nosetests --tests
>> apache_beam.io.gcp.bigquery_file_loads_test:TestBigQueryFileLoads.test_records_traverse_transform_with_mocks
>
> and hitting 'Ctrl-C' after it got stuck results in the following output:
>
> 'KeyboardInterrupt [while running
>> \'WriteToBigQuery/BigQueryBatchFileLoads/RemoveTempTables/Delete\']\n------------
>> Your browser has been opened to visit:
>>
>> https://accounts.google.com/o/oauth2/v2/auth?scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fbigquery+https%3A%
>> If your browser is on a different machine then exit and re-run this
>> application with the command-line parameter
>> --noauth_local_webserver
>> Failed to find "code" in the query parameters of the redirect.
>> Invalid authorization: Try running with --noauth_local_webserver.
>
> I am a bit lost here on how to proceed.
>
> On Tue, Mar 26, 2019 at 11:48 PM Michael Luckey <[email protected]>
> wrote:
>
>> On Tue, Mar 26, 2019 at 11:18 PM Mikhail Gryzykhin <[email protected]>
>> wrote:
>>
>>> I believe what happens is that testPy2Gcp actually runs integration
>>> tests that try to connect to GCP.
>>
>> Actually I was hoping for an explanation like this. Any suggestion how I
>> could confirm that on my behalf?
>>
>>> Without having GCP cluster and configuration on your machine I'd expect
>>> these tests to fail.
>>
>> Hmm... here I am actually unsure what would be the best way to handle
>> such cases.
>>
>> If I understand correctly, we currently skip some tests which do not meet
>> expectations, kind of 'can not run on your arch' thingies... So I am
>> undecided whether I'd prefer those tests to be skipped if gcp
>> configuration is missing.
>>
>> pro
>> * dev is still able to run the tests (whichever task they are associated
>>   with) without having to separate the failures out. For instance, this
>>   'testPy2Gcp' does actually execute 'some tests' - which might be already
>>   covered by some other calls... But I definitely do not like the idea of
>>   putting the burden on the developer to track which tasks/tests might be
>>   executed on a local machine. Unless this distinction is really coarse -
>>   and pre/postcommit is something I really would like to be able to run
>>   locally...
>>
>> con
>> * we definitely need to make sure those tests are not accidentally
>>   skipped on CI servers.
>>
>>> I'd say we should remove testPy2Gcp task from "build" task and
>>> explicitly keep it as integration test.
>>>
>>> --Mikhail
>>>
>>> On Tue, Mar 26, 2019 at 3:12 PM Michael Luckey <[email protected]>
>>> wrote:
>>>
>>>> On Tue, Mar 26, 2019 at 10:29 PM Udi Meiri <[email protected]> wrote:
>>>>
>>>>> Luckey, I couldn't recreate your issue, but I still haven't done a
>>>>> full build.
>>>>> I created a new GCE VM using the ubuntu-1804-bionic-v20190212a
>>>>> image (n1-standard-4 machine type).
>>>>>
>>>>> Ran the following:
>>>>> sudo apt-get update
>>>>> sudo apt-get install python-pip
>>>>> sudo apt-get install python-virtualenv
>>>>> git clone https://github.com/apache/beam.git
>>>>> cd beam
>>>>> ./gradlew :beam-sdks-python:testPy2Gcp
>>>>> [failed: no JAVA_HOME]
>>>>> sudo apt-get install openjdk-8-jdk
>>>>> ./gradlew :beam-sdks-python:testPy2Gcp
>>>>>
>>>>> Got: BUILD SUCCESSFUL in 7m 52s
>>>>
>>>> Nice. Thanks a lot for your help here.
>>>>
>>>> If I understand correctly, this VM is already located within gcp.
>>>> Could it already have some setup which needs to be done on 'my' VM? For
>>>> instance, I was contemplating that test trying 'to call home', but as I
>>>> am (unfortunately ;) no googler and do not have any gcp specific setup,
>>>> it fails here but never times out? This is just some weird assumption, I
>>>> did not yet look into the actual implementation.
>>>>
>>>> Which I seemingly need to do here :(
>>>>
>>>>> Then I tried:
>>>>> ./gradlew build
>>>>>
>>>>> And ran out of disk space. :) (beam/ is taking 4.5G and the VM boot
>>>>> disk is 10G total)
>>>>
>>>> Ouch :D
>>>>
>>>>> On Tue, Mar 26, 2019 at 1:35 PM Robert Burke <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Michael, your concern is reasonable, especially with the experience
>>>>>> with python, though that does help me bootstrap this work. :)
>>>>>>
>>>>>> The go tools provide caching and avoid redoing work if the source
>>>>>> files haven't changed. This applies most particularly to `go build`
>>>>>> and `go test`. As long as the go code isn't changing at every
>>>>>> invocation, this should be fine. I'm not aware of the same being the
>>>>>> case for the usual python tools.
>>>>>>
>>>>>> The real trick is ensuring a valid and consistent environment for
>>>>>> the go code.
>>>>>>
>>>>>> The environment question becomes easier for everyone by moving to go
>>>>>> modules, which were designed to provide these kinds of consistent
>>>>>> builds. It also avoids needing a GOPATH set. Any directory is
>>>>>> permitted, as long as the go.mod is present.
>>>>>>
>>>>>> (The Go SDK doesn't yet use go modules, so go.mod and go.sum aren't
>>>>>> yet in the repo.)
>>>>>>
>>>>>> The main blocker I see is updating the Jenkins machines to have the
>>>>>> latest version of Go (1.12) instead of 1.10, which doesn't support
>>>>>> modules. This only blocks a final submission, rather than the work,
>>>>>> fortunately.
>>>>>>
>>>>>> On Tue, Mar 26, 2019, 1:08 PM Udi Meiri <[email protected]> wrote:
>>>>>>
>>>>>>> "rm -r ~/.gradle/go/repo/" worked for me (there was more than one
>>>>>>> package with issues).
>>>>>>> My ~/.bashrc has
>>>>>>> export GOPATH=$HOME/go
>>>>>>> so maybe that's making the difference in my setup.
>>>>>>>
>>>>>>> On Tue, Mar 26, 2019 at 11:28 AM Thomas Weise <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Can this be addressed by having "clean" remove all state that
>>>>>>>> gogradle leaves behind? This staleness issue has bitten me a few
>>>>>>>> times also and it would be good to have a reliable way to deal with
>>>>>>>> it, even if it involves an extra clean.
>>>>>>>>
>>>>>>>> On Tue, Mar 26, 2019 at 11:14 AM Michael Luckey
>>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> @Udi
>>>>>>>>> Did you try to just delete the
>>>>>>>>> '/usr/local/google/home/ehudm/.gradle/go/repo/cloud.google.com'
>>>>>>>>> folder?
>>>>>>>>>
>>>>>>>>> @Robert
>>>>>>>>> As said before, I am a bit scared about the implications. Shelling
>>>>>>>>> out is done by python, and from a build perspective, this does not
>>>>>>>>> work very well, unfortunately. I.e. no caching, up-to-date checks
>>>>>>>>> etc...
>>>>>>>>>
>>>>>>>>> But of course, we need to play with this a bit more.
>>>>>>>>>
>>>>>>>>> On Tue, Mar 26, 2019 at 6:24 PM Robert Burke <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Reading the error from the gradle scan, it largely looks like
>>>>>>>>>> some part of the GCP dependencies for the build depends on a
>>>>>>>>>> package where the commit version is no longer around. The main
>>>>>>>>>> issue with gogradle is that it's entirely distinct from the usual
>>>>>>>>>> Go workflow, which means the deps users use are likely to be
>>>>>>>>>> different from what's in the lock file.
>>>>>>>>>>
>>>>>>>>>> This work will be tracked in
>>>>>>>>>> https://issues.apache.org/jira/browse/BEAM-5379
>>>>>>>>>> GoGradle hasn't moved to support the new-go way of handling deps,
>>>>>>>>>> so my inclination is to simplify to simple scripts for Gradle that
>>>>>>>>>> shell out to the Go tool for handling Go dep management, over
>>>>>>>>>> trying to fix GoGradle.
>>>>>>>>>>
>>>>>>>>>> On Tue, 26 Mar 2019 at 09:43, Udi Meiri <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Robert, from what I recall it's not flaky for me - it
>>>>>>>>>>> consistently fails. Let me know if there's a way to get more
>>>>>>>>>>> logging about this error.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Mar 25, 2019, 19:50 Robert Burke <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> It's concerning to me that 1) the Go dependency resolution via
>>>>>>>>>>>> gogradle is flaky, and 2) that it can block other languages.
>>>>>>>>>>>>
>>>>>>>>>>>> I suppose 2) makes sense since it's part of the container
>>>>>>>>>>>> bootstrapping code, but that makes 1) a serious problem, of
>>>>>>>>>>>> which I wasn't aware.
>>>>>>>>>>>> I should have time to investigate this in the next two weeks.
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, 25 Mar 2019 at 18:08, Michael Luckey
>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Just for the record,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am using a vm here, because I did not yet get all tasks
>>>>>>>>>>>>> running on my mac, and did not want to mess with my setup.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So I installed vanilla ubuntu-18.04 LTS on VirtualBox, 26GB
>>>>>>>>>>>>> ram, 6 cores, and further ran
>>>>>>>>>>>>>
>>>>>>>>>>>>> sudo apt update
>>>>>>>>>>>>> sudo apt install gcc
>>>>>>>>>>>>> sudo apt install make
>>>>>>>>>>>>> sudo apt install perl
>>>>>>>>>>>>> sudo apt install curl
>>>>>>>>>>>>> sudo apt install openjdk-8-jdk
>>>>>>>>>>>>> sudo apt install python
>>>>>>>>>>>>> sudo apt install -y software-properties-common
>>>>>>>>>>>>> sudo add-apt-repository ppa:deadsnakes/ppa
>>>>>>>>>>>>> sudo apt update
>>>>>>>>>>>>> sudo apt install python3.5
>>>>>>>>>>>>> sudo apt-get install apt-transport-https ca-certificates curl gnupg-agent software-properties-common
>>>>>>>>>>>>> curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
>>>>>>>>>>>>> sudo apt-key fingerprint 0EBFCD88
>>>>>>>>>>>>> sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
>>>>>>>>>>>>> $(lsb_release -cs) \
>>>>>>>>>>>>> stable"
>>>>>>>>>>>>> sudo apt-get update
>>>>>>>>>>>>> sudo apt-get install docker-ce docker-ce-cli containerd.io
>>>>>>>>>>>>> sudo groupadd docker
>>>>>>>>>>>>> sudo usermod -aG docker $USER
>>>>>>>>>>>>> git config --global user.email "[email protected]"
>>>>>>>>>>>>> git config --global user.name "Some Guy"
>>>>>>>>>>>>> curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
>>>>>>>>>>>>> sudo python get-pip.py
>>>>>>>>>>>>> rm get-pip.py
>>>>>>>>>>>>> sudo pip install --upgrade virtualenv
>>>>>>>>>>>>> sudo pip install cython
>>>>>>>>>>>>> sudo apt-get install python-dev
>>>>>>>>>>>>> sudo apt-get install python3-distutils
>>>>>>>>>>>>> sudo apt-get install python3-dev # for python3.x installs
>>>>>>>>>>>>>
>>>>>>>>>>>>> git clone https://github.com/apache/beam.git
>>>>>>>>>>>>> cd beam/
>>>>>>>>>>>>> ./gradlew build
>>>>>>>>>>>>>
>>>>>>>>>>>>> Nothing else changed/added. (hopefully, need to reassure
>>>>>>>>>>>>> myself here)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Unfortunately, this is failing. I need to exclude those python
>>>>>>>>>>>>> tests (and of course website, which usually fails on jira
>>>>>>>>>>>>> links).
>>>>>>>>>>>>>
>>>>>>>>>>>>> So I might be missing some env settings for gcp, dunno.
>>>>>>>>>>>>> Probably missed some docs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Mar 26, 2019 at 1:46 AM Michael Luckey
>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks Udi for trying that!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In fact, the go dependency resolution is flaky. I did not look
>>>>>>>>>>>>>> into that, but just rerunning usually works. Of course, less
>>>>>>>>>>>>>> than optimal, but, well...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Running the build target is of course just an aggregation of
>>>>>>>>>>>>>> tasks to run. And unfortunately just running
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ./gradlew :beam-sdks-python:testPy2Gcp
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> stalls on my (virtual) machine.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Mar 26, 2019 at 1:35 AM Udi Meiri <[email protected]>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Okay, `./gradlew build` failed pretty quickly for me:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> > Task :beam-sdks-go:resolveBuildDependencies FAILED
>>>>>>>>>>>>>>> cloud.google.com/go:
>>>>>>>>>>>>>>> commit='4f6c921ec566a33844f4e7879b31cd8575a6982d', urls=[
>>>>>>>>>>>>>>> https://code.googlesource.com/gocloud] does not exist in
>>>>>>>>>>>>>>> /usr/local/google/home/ehudm/.gradle/go/repo/
>>>>>>>>>>>>>>> cloud.google.com/go/625660c387d9403fde4d73cacaf2d2ac,
>>>>>>>>>>>>>>> updating will be performed.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://gradle.com/s/x5zqbc5zwd3bg
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (Now I remember why I stopped using `build` :/)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Mar 25, 2019 at 5:30 PM Udi Meiri <[email protected]>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It shouldn't stall. That's a bug.
>>>>>>>>>>>>>>>> OTOH, I never use the `build` target.
>>>>>>>>>>>>>>>> I'll try running that myself.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Mar 25, 2019, 07:24 Michael Luckey
>>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> trying to run './gradlew build' on a vanilla setup, my build
>>>>>>>>>>>>>>>>> consistently stalls during execution of the python gcp
>>>>>>>>>>>>>>>>> tests, e.g. on both of
>>>>>>>>>>>>>>>>> - > :beam-sdks-python:testPy2Gcp
>>>>>>>>>>>>>>>>> - > :beam-sdks-python-test-suites-tox-py35:testPy35Gcp
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Console output:
>>>>>>>>>>>>>>>>> #### snip ####
>>>>>>>>>>>>>>>>> test_big_query_standard_sql
>>>>>>>>>>>>>>>>> (apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT)
>>>>>>>>>>>>>>>>> ... SKIP: IT is skipped because --test-pipeline-options is not specified
>>>>>>>>>>>>>>>>> test_big_query_standard_sql_kms_key
>>>>>>>>>>>>>>>>> (apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT)
>>>>>>>>>>>>>>>>> ... SKIP: This test requires BQ Dataflow native source support for KMS,
>>>>>>>>>>>>>>>>> which is not available yet.
>>>>>>>>>>>>>>>>> test_multiple_destinations_transform
>>>>>>>>>>>>>>>>> (apache_beam.io.gcp.bigquery_file_loads_test.BigQueryFileLoadsIT)
>>>>>>>>>>>>>>>>> ... SKIP: IT is skipped because --test-pipeline-options is not specified
>>>>>>>>>>>>>>>>> test_one_job_fails_all_jobs_fail
>>>>>>>>>>>>>>>>> (apache_beam.io.gcp.bigquery_file_loads_test.BigQueryFileLoadsIT)
>>>>>>>>>>>>>>>>> ... SKIP: IT is skipped because --test-pipeline-options is not specified
>>>>>>>>>>>>>>>>> test_records_traverse_transform_with_mocks
>>>>>>>>>>>>>>>>> (apache_beam.io.gcp.bigquery_file_loads_test.TestBigQueryFileLoads)
>>>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The output ends here; I would expect a 'failed' or 'ok'.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Afterwards there is no progress, even after waiting for
>>>>>>>>>>>>>>>>> hours. Any idea what might be causing this? Do I need to add
>>>>>>>>>>>>>>>>> some GCP properties for this task?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Any ideas what I am doing wrong?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> best,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> michel
