[ 
https://issues.apache.org/jira/browse/IMPALA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17711626#comment-17711626
 ] 

Quanlong Huang commented on IMPALA-10848:
-----------------------------------------

Just realized we already have bin/bootstrap_build.sh, which sets up a
build-only environment. It currently supports only Ubuntu 16.04, so we just
need to extend it to support other OSes, e.g. CentOS.
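
As a rough sketch of what such an extension could start from, the helper below
guesses the distro family from an os-release file so that a script could branch
on it. The function name detect_os_family and the family labels are
hypothetical and are not taken from bin/bootstrap_build.sh; the only thing
assumed about the host is that it provides /etc/os-release with the standard
ID and ID_LIKE fields.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of bin/bootstrap_build.sh.
# Prints "debian", "redhat", or "unknown" based on the ID and ID_LIKE
# fields of an os-release file (default: /etc/os-release).
detect_os_family() {
  local os_release_file="${1:-/etc/os-release}"
  if [ ! -r "${os_release_file}" ]; then
    echo "unknown"
    return 0
  fi
  # Source the file in a subshell so its variables do not leak into the caller.
  (
    ID=""
    ID_LIKE=""
    . "${os_release_file}"
    case " ${ID} ${ID_LIKE} " in
      *" ubuntu "*|*" debian "*) echo "debian" ;;
      *" centos "*|*" rhel "*|*" fedora "*) echo "redhat" ;;
      *) echo "unknown" ;;
    esac
  )
}
```

A script could then use `case "$(detect_os_family)" in debian) ... ;; redhat)
... ;; esac` to choose between distro-specific package installation steps.
Which packages each branch would need is not covered here.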

> Provide compile-only option to skip downloading test dependencies
> -----------------------------------------------------------------
>
>                 Key: IMPALA-10848
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10848
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Infrastructure
>            Reporter: Quanlong Huang
>            Assignee: yx91490
>            Priority: Major
>         Attachments: pywebhdfs_failure.png
>
>
> Compiling Impala is not easy for a beginner. Some of the failures occur while
> downloading or installing dependencies.
> For instance, old versions of Impala may fail to compile because the cdh
> components of old GBNs have been removed from S3. However, the cdh component
> artifacts are only used in testing (minicluster & holding test data). We can
> still compile without them.
> Take pip dependencies as another example. Here is a failure I got from a
> community user. It failed while installing pywebhdfs:
> !pywebhdfs_failure.png!
> However, a simple git grep shows that pywebhdfs is only used in tests:
> {code:bash}
> $ git grep pywebhdfs
> bin/bootstrap_system.sh:#  >>> from pywebhdfs.webhdfs import PyWebHdfsClient
> infra/python/deps/requirements.txt:pywebhdfs == 0.3.2
> tests/common/impala_test_suite.py:    #     HDFS: uses a mixture of pywebhdfs (which is faster than the HDFS CLI) and the
> tests/util/hdfs_util.py:from pywebhdfs.webhdfs import PyWebHdfsClient, errors, _raise_pywebhdfs_exception
> tests/util/hdfs_util.py:      _raise_pywebhdfs_exception(response.status_code, response.text)
> tests/util/hdfs_util.py:      _raise_pywebhdfs_exception(response.status_code, response.text)
> tests/util/hdfs_util.py:      _raise_pywebhdfs_exception(response.status_code, response.text)
> tests/util/hdfs_util.py:      _raise_pywebhdfs_exception(response.status_code, response.text)
> {code}
> If the user just wants to compile Impala and deploy it in their existing
> Hadoop cluster, dealing with these failures is a waste of their time.
> *Target for this JIRA*
>  * Provide a compile-only option for bin/bootstrap_system.sh. It should skip
> downloading/installing unused dependencies like postgresql.
>  * Provide a compile-only option for buildall.sh. It should skip downloading
> cdh/cdp components that are not used in compilation.
>  * Update our
> [wiki|https://cwiki.apache.org/confluence/display/IMPALA/Building+Impala]
> about this.
> Note that we already have some env vars to control the download behavior,
> e.g. SKIP_PYTHON_DOWNLOAD, SKIP_TOOLCHAIN_BOOTSTRAP. We just need to make the
> compile-only scenario work with minimal requirements and document it.
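
For illustration, the env vars named in the description could be applied to a
single command with a small wrapper like the one below. This is only a sketch:
the wrapper name run_with_download_skips is hypothetical, and the value "true"
is an assumption about how the scripts interpret these variables, so check how
they are read in your checkout. Skipping the toolchain bootstrap presumably
only makes sense when the toolchain is already available locally.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: runs the given command with the two download-related
# env vars from the issue description set for that command only.
# ASSUMPTION: "true" is the value the build scripts treat as "skip".
run_with_download_skips() {
  env SKIP_PYTHON_DOWNLOAD=true SKIP_TOOLCHAIN_BOOTSTRAP=true "$@"
}

# Example (not executed here):
#   run_with_download_skips ./buildall.sh
```

Using env keeps the settings scoped to the wrapped command, so the calling
shell's environment is left unchanged.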



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

