[ https://issues.apache.org/jira/browse/BEAM-9130?focusedWorklogId=373722&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-373722 ]
ASF GitHub Bot logged work on BEAM-9130:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Jan/20 17:30
            Start Date: 17/Jan/20 17:30
    Worklog Time Spent: 10m
      Work Description: chadrik commented on pull request #10629: [BEAM-9130] Migrate HDFS IT to use tox env.
URL: https://github.com/apache/beam/pull/10629#discussion_r368051711

 ##########
 File path: sdks/python/apache_beam/io/hdfs_integration_test/Dockerfile
 ##########
 @@ -24,22 +24,13 @@ FROM $BASE_IMAGE
  WORKDIR /app
  ENV HDFSCLI_CONFIG /app/sdks/python/apache_beam/io/hdfs_integration_test/hdfscli.cfg
 -RUN pip install --no-cache-dir holdup gsutil
 -RUN gsutil cp gs://dataflow-samples/shakespeare/kinglear.txt .
 -# Install Beam and dependencies.
 -ADD sdks/python /app/sdks/python
 -ADD model /app/model
 -RUN cd sdks/python && \
 -    python setup.py sdist && \
 -    pip install --no-cache-dir $(ls dist/apache-beam-*.tar.gz | tail -n1)[gcp]
 +# Add Beam SDK sources.
 +COPY sdks/python /app/sdks/python
 +COPY model /app/model
 +
 +# This step should look like setupVirtualenv minus virtualenv creation.
 +RUN pip install --no-cache-dir tox==3.11.1 -r sdks/python/build-requirements.txt

 Review comment:
 So our goal is to install `grpcio-tools` (and therefore protobuf, one of its deps) into the same python environment where sdist is run, _before the tests run_. This is easier said than done.

 Installing build-requirements.txt here in the Dockerfile affects the environment where tox is installed and runs, but I'm not sure if it affects the environments where tox runs sdist, especially if tox is trying to create a temp env to run sdist (which I _know_ it does when it's operating in pep517 mode).

 Adding build-requirements.txt to the deps in tox.ini, I think, only affects the environment where the tests run, but not where tox runs sdist.
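 One way to settle which of these environments actually has the build requirements is to run a small probe from each of them (the interpreter in the Docker image, the tox test env, and wherever sdist ends up executing). The sketch below is only an illustration and is not part of the PR: `probe_modules` is a hypothetical helper name, and it assumes `grpc_tools` is the import name provided by the `grpcio-tools` distribution.

```python
from __future__ import print_function

import importlib
import sys


def probe_modules(names):
    """Return {module_name: bool}: True if the module imports in this interpreter."""
    found = {}
    for name in names:
        try:
            importlib.import_module(name)
            found[name] = True
        except ImportError:
            # Also covers a missing parent package, e.g. no `google` at all.
            found[name] = False
    return found


if __name__ == "__main__":
    # sys.prefix tells us which environment this interpreter belongs to.
    print("interpreter prefix:", sys.prefix)
    for name, ok in sorted(probe_modules(["grpc_tools", "google.protobuf"]).items()):
        print(name, "importable" if ok else "MISSING")
```

 Running this from the Dockerfile and from a tox `commands` entry would show whether the two interpreters share the same site-packages. It would not by itself reveal what a temporary sdist environment sees.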
 gen_protos.py also makes its own attempt to install build-requirements.txt if it detects that `grpcio-tools` is not installed, but it does so in a weird way, with a target install directory, which I think might not be reliable for the google .pth file that makes protobufs importable in python2 (see earlier discussion about site dirs).

 So, I've got 2 ideas left for how to hack this together without going full pep517:

 1) Run `python setup.py sdist` here in dockerland, after installing build-requirements, and pass the sdist to tox. That should guarantee that sdist is run in an env where protobuf is properly installed and the google .pth file will work, since we're installing it into the site-packages of the python interpreter running here, and not some temp dir.
 2) Change the logic in gen_protos where it installs build-requirements to also ensure that the directory it installs into is added as a site dir, by calling `site.addsitedir()` on that directory from the environment where we want to be able to import `google.protobuf`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 373722)
    Time Spent: 40m  (was: 0.5h)

> sdks:python:test-suites:direct:py2:hdfsIntegrationTest is failing with ImportError: No module named google.protobuf.message
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-9130
>                 URL: https://issues.apache.org/jira/browse/BEAM-9130
>             Project: Beam
>          Issue Type: Improvement
>          Components: test-failures
>            Reporter: Valentyn Tymofieiev
>            Priority: Major
>              Labels: currently-failing
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> From logs:
> {noformat}
> 16:33:50   File "/usr/local/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
> 16:33:50     self.run()
> 16:33:50   File "/usr/local/lib/python2.7/multiprocessing/process.py", line 114, in run
> 16:33:50     self._target(*self._args, **self._kwargs)
> 16:33:50   File "/app/sdks/python/gen_protos.py", line 357, in _install_grpcio_tools_and_generate_proto_files
> 16:33:50     generate_proto_files(force=force)
> 16:33:50   File "/app/sdks/python/gen_protos.py", line 324, in generate_proto_files
> 16:33:50     generate_urn_files(log, out_dir)
> 16:33:50   File "/app/sdks/python/gen_protos.py", line 65, in generate_urn_files
> 16:33:50     import google.protobuf.message as message
> 16:33:50 ImportError: No module named google.protobuf.message
> 16:33:50 Traceback (most recent call last):
> 16:33:50   File "setup.py", line 305, in <module>
> 16:33:50     'mypy': generate_protos_first(mypy),
> 16:33:50   File "/usr/local/lib/python2.7/site-packages/setuptools/__init__.py", line 145, in setup
> 16:33:50     return distutils.core.setup(**attrs)
> 16:33:50   File "/usr/local/lib/python2.7/distutils/core.py", line 151, in setup
> 16:33:50     dist.run_commands()
> 16:33:50   File "/usr/local/lib/python2.7/distutils/dist.py", line 953, in run_commands
> 16:33:50     self.run_command(cmd)
> 16:33:50   File "/usr/local/lib/python2.7/distutils/dist.py", line 972, in run_command
> 16:33:50     cmd_obj.run()
> 16:33:50   File "/usr/local/lib/python2.7/site-packages/setuptools/command/sdist.py", line 44, in run
> 16:33:50     self.run_command('egg_info')
> 16:33:50   File "/usr/local/lib/python2.7/distutils/cmd.py", line 326, in run_command
> 16:33:50     self.distribution.run_command(command)
> 16:33:50   File "/usr/local/lib/python2.7/distutils/dist.py", line 972, in run_command
> 16:33:50     cmd_obj.run()
> 16:33:50   File "setup.py", line 229, in run
> 16:33:50     gen_protos.generate_proto_files(log=log)
> 16:33:50   File "/app/sdks/python/gen_protos.py", line 291, in generate_proto_files
> 16:33:50     raise ValueError("Proto generation failed (see log for details).")
> 16:33:50 ValueError: Proto generation failed (see log for details
> {noformat}
> {noformat}
> import google.protobuf.message as message
> ImportError: No module named google.protobuf.message
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
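
Editorial illustration of option 2 from the review comment above: a minimal sketch of why a directory that is merely on `sys.path` differs from one registered with `site.addsitedir()`. Only the latter has its `.pth` files processed. Everything here is a stand-in: `demo_addsitedir` is a hypothetical name, the temp directories play the role of a target install directory, and the `.pth` file contains a plain path line rather than whatever the real protobuf `.pth` file carries.

```python
import os
import site
import sys
import tempfile


def demo_addsitedir():
    """Return (seen_before, seen_after) for a path listed in a .pth file."""
    saved_path = list(sys.path)
    try:
        target = tempfile.mkdtemp()  # stand-in for a target install directory
        extra = tempfile.mkdtemp()   # directory that the .pth file points at
        with open(os.path.join(target, "demo.pth"), "w") as f:
            f.write(extra + "\n")

        # Plain sys.path entry: the .pth file inside `target` is ignored.
        sys.path.insert(0, target)
        seen_before = extra in sys.path

        # Site dir: .pth files in `target` are read and their paths appended.
        site.addsitedir(target)
        seen_after = extra in sys.path
        return seen_before, seen_after
    finally:
        sys.path[:] = saved_path  # leave the interpreter as we found it


if __name__ == "__main__":
    print(demo_addsitedir())
```

Whether this is enough for `google.protobuf` specifically depends on what the installed `.pth` file does, which this sketch does not reproduce.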