[ https://issues.apache.org/jira/browse/IMPALA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17711626#comment-17711626 ]
Quanlong Huang commented on IMPALA-10848:
-----------------------------------------

Just realized we have bin/bootstrap_build.sh which is for build-only env setup. It currently only supports Ubuntu16. We just need to extend it to support other OS, e.g. CentOS.

> Provide compile-only option to skip downloading test dependencies
> -----------------------------------------------------------------
>
> Key: IMPALA-10848
> URL: https://issues.apache.org/jira/browse/IMPALA-10848
> Project: IMPALA
> Issue Type: Improvement
> Components: Infrastructure
> Reporter: Quanlong Huang
> Assignee: yx91490
> Priority: Major
> Attachments: pywebhdfs_failure.png
>
>
> Compiling Impala is not easy for a beginner. A portion of failures are in
> downloading/installing dependencies.
> For instance, old versions of Impala may fail to compile since cdh components
> of old GBNs on S3 are removed. However, the artifacts of cdh component are
> only used in testing (minicluster & holding testdata). We can still compile
> without them.
> Take pip dependencies as another example, here is a failure I got from a
> community user. It failed by installing pywebhdfs:
> !pywebhdfs_failure.png!
> However, a simple git grep shows that pywebhdfs is only used in tests:
> {code:bash}
> $ git grep pywebhdfs
> bin/bootstrap_system.sh:# >>> from pywebhdfs.webhdfs import PyWebHdfsClient
> infra/python/deps/requirements.txt:pywebhdfs == 0.3.2
> tests/common/impala_test_suite.py: # HDFS: uses a mixture of pywebhdfs (which is faster than the HDFS CLI) and the
> tests/util/hdfs_util.py:from pywebhdfs.webhdfs import PyWebHdfsClient, errors, _raise_pywebhdfs_exception
> tests/util/hdfs_util.py: _raise_pywebhdfs_exception(response.status_code, response.text)
> tests/util/hdfs_util.py: _raise_pywebhdfs_exception(response.status_code, response.text)
> tests/util/hdfs_util.py: _raise_pywebhdfs_exception(response.status_code, response.text)
> tests/util/hdfs_util.py: _raise_pywebhdfs_exception(response.status_code, response.text)
> {code}
> If the user just wants to compile Impala and deploy it in their existing
> Hadoop cluster, dealing with these failures is a waste of their time.
>
> *Target for this JIRA*
> * Provide a compile-only option to bin/bootstrap_system.sh. It should skip
> downloading/installing unused dependencies like postgresql.
> * Provide a compile-only option to buildall.sh. It should skip downloading
> cdh/cdp components that are unused in compilation.
> * Update our
> [wiki|https://cwiki.apache.org/confluence/display/IMPALA/Building+Impala]
> about this.
>
> Note that we already have some env vars to control the download behaviors,
> e.g. SKIP_PYTHON_DOWNLOAD, SKIP_TOOLCHAIN_BOOTSTRAP. We just need to make the
> compile-only scenario work with minimal requirements and document it.
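
The compile-only targets in the description above could take roughly the following shape in a bootstrap script. This is only a minimal sketch, not the actual logic of bin/bootstrap_system.sh or bin/bootstrap_build.sh: the COMPILE_ONLY variable, the deps_to_install function, and the build-dependency list are hypothetical names invented for illustration. Only postgresql and pywebhdfs are taken from the description as examples of test-only dependencies.

```shell
#!/bin/sh
# Hypothetical sketch: COMPILE_ONLY, deps_to_install and BUILD_DEPS are
# assumptions for illustration, not real Impala settings.
set -eu

# Placeholder list of what a compile needs (illustrative only).
BUILD_DEPS="gcc make"
# Dependencies the issue describes as used only by tests.
TEST_DEPS="postgresql pywebhdfs"

# Print the dependencies that would be installed, on one line.
# When COMPILE_ONLY=true, the test-only dependencies are skipped.
deps_to_install() {
  if [ "${COMPILE_ONLY:-false}" = "true" ]; then
    echo "${BUILD_DEPS}"
  else
    echo "${BUILD_DEPS} ${TEST_DEPS}"
  fi
}

deps_to_install
```

Defaulting the flag to false keeps the existing full setup unchanged, so a user who only wants to compile has to opt in explicitly.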