Kengo Seki created AIRFLOW-2515:
-----------------------------------

             Summary: Add dependency on thrift_sasl so that HiveServer2Hook works
                 Key: AIRFLOW-2515
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2515
             Project: Apache Airflow
          Issue Type: Bug
          Components: dependencies, hive_hooks, hooks
            Reporter: Kengo Seki
            Assignee: Kengo Seki


Installing the "hive" extra does not currently install the thrift_sasl module:

{code}
$ pip install --upgrade -e ".[hive]"

(snip)

Successfully installed apache-airflow
$ pip show thrift_sasl
$ 
{code}

However, HiveServer2Hook (more precisely, impyla, on which HiveServer2Hook 
depends) requires that module even when Kerberos is disabled:

{code}
$ ipython
Python 3.5.2 (default, Nov 23 2017, 16:37:01) 
Type 'copyright', 'credits' or 'license' for more information
IPython 6.3.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from airflow.hooks.hive_hooks import HiveServer2Hook

In [2]: h = HiveServer2Hook()

In [3]: conn = h.get_conn()
[2018-05-23 11:42:30,452] {base_hook.py:83} INFO - Using connection to: localhost
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)

(snip)

    147 
    148     # Initializes a sasl client
--> 149     from thrift_sasl import TSaslClientTransport
    150     try:
    151         import sasl  # pylint: disable=import-error

ImportError: No module named 'thrift_sasl'
{code}
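
The missing modules can also be detected up front, without going through get_conn(). The following is only a minimal sketch: `missing_modules` and `REQUIRED` are hypothetical names used for illustration and are not part of Airflow or impyla. The module list is taken from the impyla README.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names in `names` that cannot be located."""
    # find_spec() returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules that impyla's README lists for Hive and/or Kerberos support.
# (Hypothetical constant, for illustration only.)
REQUIRED = ["thrift_sasl", "sasl"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

In the environment shown above, this check would be expected to report thrift_sasl as missing, since `pip show thrift_sasl` printed nothing.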

This is also [documented|https://github.com/cloudera/impyla#dependencies] in 
impyla's README:

{quote}
For Hive and/or Kerberos support:

```
pip install thrift_sasl==0.2.1
pip install sasl
```
{quote}
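
One possible shape of the fix is to add thrift_sasl to the list of packages behind the "hive" extra. The snippet below is a hedged sketch, not the actual contents of Airflow's setup.py: the existing entries and their version pins are assumptions, and whether to pin thrift_sasl (for example to the 0.2.1 mentioned in the README) is left open.

```python
# Hypothetical excerpt modeled on a setuptools-style setup.py.
# The real list of "hive" dependencies in Airflow may differ.
hive = [
    "impyla",       # assumed existing entry; HiveServer2Hook depends on it
    "thrift_sasl",  # proposed addition so that HiveServer2Hook works
]

# Passed to setuptools.setup(extras_require=...) in a real setup.py,
# so that `pip install -e ".[hive]"` also installs thrift_sasl.
extras_require = {
    "hive": hive,
}
```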



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
