After digging a bit deeper, I was able to verify that those tests block on
authorization to GCP.

It seems that, as I do not have any credentials set, the underlying oauth2
code falls back to some local mode. This seems to start a web server on port
8080 and wait there forever. Accessing that port forwards to Google, but that
also fails miserably.
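
To check for the waiting listener without going through a browser, a small probe along these lines could be used. This is only a sketch: the address (127.0.0.1, port 8080) is taken from what I observed above, and the helper name `port_is_open` is made up for illustration.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0
    finally:
        sock.close()


if __name__ == "__main__":
    # Assumption: the oauth2 fallback listens on local port 8080, as seen above.
    print(port_is_open("127.0.0.1", 8080))
```

Running it while the test is stuck and again after aborting the test should show whether the listener belongs to the test run.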

Running

python setup.py nosetests --tests apache_beam.io.gcp.bigquery_file_loads_test:TestBigQueryFileLoads.test_records_traverse_transform_with_mocks

and hitting 'Ctrl-C' after it got stuck results in the following output:

'KeyboardInterrupt [while running \'WriteToBigQuery/BigQueryBatchFileLoads/RemoveTempTables/Delete\']\n------------
Your browser has been opened to visit:

https://accounts.google.com/o/oauth2/v2/auth?scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fbigquery+https%3A%
If your browser is on a different machine then exit and re-run this
application with the command-line parameter
  --noauth_local_webserver
Failed to find "code" in the query parameters of the redirect.
Invalid authorization: Try running with --noauth_local_webserver.


I am a bit lost here on how to proceed.
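
One direction would be to skip such tests when no credentials are configured. A minimal sketch, assuming that the `GOOGLE_APPLICATION_CREDENTIALS` environment variable is the signal to look at (the real check might need to consider other credential sources as well); the helper and the test class are hypothetical and not existing Beam code:

```python
import os
import unittest


def gcp_credentials_configured(environ=None):
    """Heuristic: True if GOOGLE_APPLICATION_CREDENTIALS names an existing file.

    Assumption: this environment variable is the relevant signal; other
    credential sources are deliberately ignored in this sketch.
    """
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    return bool(path) and os.path.isfile(path)


class ExampleGcpTest(unittest.TestCase):  # hypothetical test case

    def setUp(self):
        # Skip instead of blocking on an interactive authorization flow.
        if not gcp_credentials_configured():
            self.skipTest("GCP credentials are not configured")

    def test_something_that_needs_gcp(self):
        self.assertTrue(gcp_credentials_configured())
```

The downside is the one already mentioned in this thread: on CI servers we would need to make sure these tests are not skipped by accident.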


On Tue, Mar 26, 2019 at 11:48 PM Michael Luckey <adude3...@gmail.com> wrote:

>
>
> On Tue, Mar 26, 2019 at 11:18 PM Mikhail Gryzykhin <mig...@google.com>
> wrote:
>
>> I believe what happens is that testPy2Gcp actually runs integration tests
>> that try to connect to GCP.
>>
>
> Actually I was hoping for an explanation like this. Any suggestion how I
> could confirm that on my behalf?
>
>
>> Without having GCP cluster and configuration on your machine I'd expect
>> these tests to fail.
>>
>
> Hmm... here I am actually unsure what would be the best way to handle such
> cases.
>
> If I understand correctly, we currently skip some tests which do not meet
> expectations, kind of 'can not run on your arch' thingies... So I am
> undecided whether I'd prefer those tests to be skipped if gcp
> configuration is missing.
>
> pro
> * dev is still able to run the tests (whichever task they are associated
> with) without having to separate the failures out. For instance, this
> 'testPy2Gcp' does actually execute 'some tests' - which might be already
> covered by some other calls... But I definitely do not like the idea of
> putting the burden on the developer to track which tasks/tests might be
> executed on local machine. Unless this distinction is really coarse - and
> pre/postcommit is something I really would like to be able to run locally...
>
>
> con
> * we definitely need to make sure those tests are not accidentally
> skipped on CI servers.
>
>
>>
>> I'd say we should remove testPy2Gcp task from "build" task and explicitly
>> keep it as integration test.
>>
>> --Mikhail
>>
>>
>> On Tue, Mar 26, 2019 at 3:12 PM Michael Luckey <adude3...@gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Tue, Mar 26, 2019 at 10:29 PM Udi Meiri <eh...@google.com> wrote:
>>>
>>>> Luckey, I couldn't recreate your issue, but I still haven't done a full
>>>> build.
>>>> I created a new GCE VM using the ubuntu-1804-bionic-v20190212a
>>>> image (n1-standard-4 machine type).
>>>>
>>>> Ran the following:
>>>> sudo apt-get update
>>>> sudo apt-get install python-pip
>>>> sudo apt-get install python-virtualenv
>>>> git clone https://github.com/apache/beam.git
>>>> cd beam
>>>> ./gradlew :beam-sdks-python:testPy2Gcp
>>>> [failed: no JAVA_HOME]
>>>> sudo apt-get install openjdk-8-jdk
>>>> ./gradlew :beam-sdks-python:testPy2Gcp
>>>>
>>>> Got: BUILD SUCCESSFUL in 7m 52s
>>>>
>>>
>>> Nice. Thanks a lot for your help here.
>>>
>>> If I understand correctly, this VM is already located within gcp. Could
>>> it already have some setup which still needs to be done on 'my' VM? For
>>> instance, I was wondering whether that test tries 'to call home' but, as I
>>> am (unfortunately ;) no googler and do not have any gcp specific setup, fails
>>> here and never times out? This is just some weird assumption; I did not yet
>>> look into the actual implementation.
>>>
>>> Which I seemingly need to do here :(
>>>
>>>
>>>> Then I tried:
>>>> ./gradlew build
>>>>
>>>> And ran out of disk space. :) (beam/ is taking 4.5G and the VM boot
>>>> disk is 10G total)
>>>>
>>>
>>> Ouch :D
>>>
>>>
>>>>
>>>> On Tue, Mar 26, 2019 at 1:35 PM Robert Burke <rob...@frantil.com>
>>>> wrote:
>>>>
>>>>> Michael, your concern is reasonable, especially with the experience
>>>>> with python, though that does help me bootstrap this work. :)
>>>>>
>>>>> The go tools provide caching and avoid redoing work if the source
>>>>> files haven't changed. This applies most particularly for `go build` and
>>>>> `go test`. As long as the go code isn't changing at every invocation, this
>>>>> should be fine. I'm not aware of the same being the case for the usual
>>>>> python tools.
>>>>>
>>>>>  The real trick is ensuring a valid and consistent environment for the
>>>>> go code.
>>>>>
>>>>> The environment question becomes easier for everyone by moving to go
>>>>> modules, which were designed to provide these kinds of consistent builds.
>>>>> It also avoids needing a GOPATH set. Any directory is permitted, as long as
>>>>> the go.mod is present.
>>>>>
>>>>> (The Go SDK doesn't yet use go modules, so go.mod and go.sum aren't yet
>>>>> in the repo.)
>>>>>
>>>>> The main blocker I see is updating the Jenkins machines to have the
>>>>> latest version of Go (1.12) instead of 1.10, which doesn't support modules.
>>>>> This only blocks a final submission rather than the work, fortunately.
>>>>>
>>>>> On Tue, Mar 26, 2019, 1:08 PM Udi Meiri <eh...@google.com> wrote:
>>>>>
>>>>>> "rm -r ~/.gradle/go/repo/" worked for me (there was more than one
>>>>>> package with issues).
>>>>>> My ~/.bashrc has
>>>>>>   export GOPATH=$HOME/go
>>>>>> so maybe that's making the difference in my setup.
>>>>>>
>>>>>> On Tue, Mar 26, 2019 at 11:28 AM Thomas Weise <t...@apache.org> wrote:
>>>>>>
>>>>>>> Can this be addressed by having "clean" remove all state that
>>>>>>> gogradle leaves behind? This staleness issue has bitten me a few times also
>>>>>>> and it would be good to have a reliable way to deal with it, even if it
>>>>>>> involves an extra clean.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Mar 26, 2019 at 11:14 AM Michael Luckey <adude3...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> @Udi
>>>>>>>> Did you try to just delete the
>>>>>>>> '/usr/local/google/home/ehudm/.gradle/go/repo/cloud.google.com'
>>>>>>>> folder?
>>>>>>>>
>>>>>>>> @Robert
>>>>>>>> As said before, I am a bit scared about the implications. Shelling
>>>>>>>> out is done by python, and from build perspective, this does not work very
>>>>>>>> well, unfortunately. I.e. no caching, up-to-date checks etc...
>>>>>>>>
>>>>>>>> But of course, we need to play with this a bit more.
>>>>>>>>
>>>>>>>> On Tue, Mar 26, 2019 at 6:24 PM Robert Burke <rob...@frantil.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Reading the error from the gradle scan, it largely looks like some
>>>>>>>>> part of the GCP dependencies for the build depends on a package, where the
>>>>>>>>> commit version is no longer around. The main issue with gogradle is that
>>>>>>>>> it's entirely distinct from the usual Go workflow, which means deps users
>>>>>>>>> use are likely to be different to what's in the lock file.
>>>>>>>>>
>>>>>>>>> This work will be tracked in
>>>>>>>>> https://issues.apache.org/jira/browse/BEAM-5379
>>>>>>>>> GoGradle hasn't moved to support the new-go way of handling deps,
>>>>>>>>> so my inclination is to simplify to simple scripts for Gradle that shell
>>>>>>>>> out to the Go tool for handling Go dep management, over trying to fix
>>>>>>>>> GoGradle.
>>>>>>>>>
>>>>>>>>> On Tue, 26 Mar 2019 at 09:43, Udi Meiri <eh...@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Robert, from what I recall it's not flaky for me - it
>>>>>>>>>> consistently fails. Let me know if there's a way to get more logging about
>>>>>>>>>> this error.
>>>>>>>>>>
>>>>>>>>>> On Mon, Mar 25, 2019, 19:50 Robert Burke <rob...@frantil.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> It's concerning to me that 1) the Go dependency resolution via
>>>>>>>>>>> gogradle is flaky, and 2) that it can block other languages.
>>>>>>>>>>>
>>>>>>>>>>> I suppose 2) makes sense since it's part of the container
>>>>>>>>>>> bootstrapping code, but that makes 1) a serious problem, of which I wasn't
>>>>>>>>>>> aware.
>>>>>>>>>>> I should have time to investigate this in the next two weeks.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, 25 Mar 2019 at 18:08, Michael Luckey <
>>>>>>>>>>> adude3...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Just for the record,
>>>>>>>>>>>>
>>>>>>>>>>>> I am using a VM here, because I did not yet get all tasks running on
>>>>>>>>>>>> my mac and did not want to mess with my setup.
>>>>>>>>>>>>
>>>>>>>>>>>> So installed vanilla ubuntu-18.04 LTS on virtual box, 26GB ram,
>>>>>>>>>>>> 6 cores and further
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt update
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt install gcc
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt install make
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt install perl
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt install curl
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt install openjdk-8-jdk
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt install python
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt install -y software-properties-common
>>>>>>>>>>>>
>>>>>>>>>>>> sudo add-apt-repository ppa:deadsnakes/ppa
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt update
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt install python3.5
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt-get install apt-transport-https ca-certificates curl gnupg-agent software-properties-common
>>>>>>>>>>>>
>>>>>>>>>>>> curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt-key fingerprint 0EBFCD88
>>>>>>>>>>>>
>>>>>>>>>>>> sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt-get update
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt-get install docker-ce docker-ce-cli containerd.io
>>>>>>>>>>>>
>>>>>>>>>>>> sudo groupadd docker
>>>>>>>>>>>>
>>>>>>>>>>>> sudo usermod -aG docker $USER
>>>>>>>>>>>>
>>>>>>>>>>>> git config --global user.email "d...@spam.me"
>>>>>>>>>>>>
>>>>>>>>>>>> git config --global user.name "Some Guy"
>>>>>>>>>>>>
>>>>>>>>>>>> curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
>>>>>>>>>>>>
>>>>>>>>>>>> sudo python get-pip.py
>>>>>>>>>>>>
>>>>>>>>>>>> rm get-pip.py
>>>>>>>>>>>>
>>>>>>>>>>>> sudo pip install --upgrade virtualenv
>>>>>>>>>>>>
>>>>>>>>>>>> sudo pip install cython
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt-get install python-dev
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt-get install python3-distutils
>>>>>>>>>>>>
>>>>>>>>>>>> sudo apt-get install python3-dev # for python3.x installs
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> git clone https://github.com/apache/beam.git
>>>>>>>>>>>>
>>>>>>>>>>>> cd beam/
>>>>>>>>>>>>
>>>>>>>>>>>> ./gradlew build
>>>>>>>>>>>>
>>>>>>>>>>>> Nothing else changed/added. (hopefully, need to reassure myself
>>>>>>>>>>>> here)
>>>>>>>>>>>>
>>>>>>>>>>>> Unfortunately, this is failing. I need to exclude those python
>>>>>>>>>>>> tests (and of course website, which usually fails on jira links).
>>>>>>>>>>>>
>>>>>>>>>>>> So I might be missing some env settings for gcp, dunno.
>>>>>>>>>>>> Probably missed some docs.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Mar 26, 2019 at 1:46 AM Michael Luckey <
>>>>>>>>>>>> adude3...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks Udi for trying that!
>>>>>>>>>>>>>
>>>>>>>>>>>>> In fact, the go dependency resolution is flaky. I did not look
>>>>>>>>>>>>> into that, but just rerunning usually works. Of course, less than optimal,
>>>>>>>>>>>>> but, well...
>>>>>>>>>>>>>
>>>>>>>>>>>>> Running the build target is of course just an aggregation of tasks
>>>>>>>>>>>>> to run. And unfortunately just running that
>>>>>>>>>>>>>
>>>>>>>>>>>>> ./gradlew  :beam-sdks-python:testPy2Gcp
>>>>>>>>>>>>>
>>>>>>>>>>>>> stalls on my (virtual) machine.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Mar 26, 2019 at 1:35 AM Udi Meiri <eh...@google.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Okay, `./gradlew build` failed pretty quickly for me:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> > Task :beam-sdks-go:resolveBuildDependencies FAILED
>>>>>>>>>>>>>> cloud.google.com/go:
>>>>>>>>>>>>>> commit='4f6c921ec566a33844f4e7879b31cd8575a6982d', urls=[
>>>>>>>>>>>>>> https://code.googlesource.com/gocloud] does not exist in
>>>>>>>>>>>>>> /usr/local/google/home/ehudm/.gradle/go/repo/
>>>>>>>>>>>>>> cloud.google.com/go/625660c387d9403fde4d73cacaf2d2ac,
>>>>>>>>>>>>>> updating will be performed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://gradle.com/s/x5zqbc5zwd3bg
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (Now I remember why I stopped using `build` :/)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Mar 25, 2019 at 5:30 PM Udi Meiri <eh...@google.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It shouldn't stall. That's a bug.
>>>>>>>>>>>>>>> OTOH, I never use the `build` target.
>>>>>>>>>>>>>>> I'll try running that myself.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Mar 25, 2019, 07:24 Michael Luckey <
>>>>>>>>>>>>>>> adude3...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> trying to run './gradlew build' on vanilla setup, my build
>>>>>>>>>>>>>>>> consistently stalls during execution of python gcp tests, e.g. on both of
>>>>>>>>>>>>>>>> - > :beam-sdks-python:testPy2Gcp
>>>>>>>>>>>>>>>> - > :beam-sdks-python-test-suites-tox-py35:testPy35Gcp
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Console output:
>>>>>>>>>>>>>>>> #### snip ####
>>>>>>>>>>>>>>>> test_big_query_standard_sql (apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT) ... SKIP: IT is skipped because --test-pipeline-options is not specified
>>>>>>>>>>>>>>>> test_big_query_standard_sql_kms_key (apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT) ... SKIP: This test requires BQ Dataflow native source support for KMS, which is not available yet.
>>>>>>>>>>>>>>>> test_multiple_destinations_transform (apache_beam.io.gcp.bigquery_file_loads_test.BigQueryFileLoadsIT) ... SKIP: IT is skipped because --test-pipeline-options is not specified
>>>>>>>>>>>>>>>> test_one_job_fails_all_jobs_fail (apache_beam.io.gcp.bigquery_file_loads_test.BigQueryFileLoadsIT) ... SKIP: IT is skipped because --test-pipeline-options is not specified
>>>>>>>>>>>>>>>> test_records_traverse_transform_with_mocks (apache_beam.io.gcp.bigquery_file_loads_test.TestBigQueryFileLoads) ...
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> output ends here, would expect a failed or ok here.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Afterwards no progress - even waiting for hours. Any idea
>>>>>>>>>>>>>>>> what might be causing this? Do I need to add some GCP properties for this
>>>>>>>>>>>>>>>> task?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Any ideas, what I am doing wrong?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> best,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> michel
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
