Kengo Seki created AIRFLOW-2515:
-----------------------------------

             Summary: Add dependency on thrift_sasl so that HiveServer2Hook works
                 Key: AIRFLOW-2515
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2515
             Project: Apache Airflow
          Issue Type: Bug
          Components: dependencies, hive_hooks, hooks
            Reporter: Kengo Seki
            Assignee: Kengo Seki

Installing the "hive" extra does not currently install the thrift_sasl module:

{code}
$ pip install --upgrade -e ".[hive]"
(snip)
Successfully installed apache-airflow
$ pip show thrift_sasl
$
{code}

In fact, however, HiveServer2Hook (more precisely, impyla, on which HiveServer2Hook depends) requires that module even if Kerberos is disabled:

{code}
$ ipython
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.3.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from airflow.hooks.hive_hooks import HiveServer2Hook

In [2]: h = HiveServer2Hook()

In [3]: conn = h.get_conn()
[2018-05-23 11:42:30,452] {base_hook.py:83} INFO - Using connection to: localhost
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
(snip)
    147
    148         # Initializes a sasl client
--> 149         from thrift_sasl import TSaslClientTransport
    150         try:
    151             import sasl  # pylint: disable=import-error

ImportError: No module named 'thrift_sasl'
{code}

This is also [documented|https://github.com/cloudera/impyla#dependencies] in impyla's README:

{quote}
For Hive and/or Kerberos support:

```
pip install thrift_sasl==0.2.1
pip install sasl
```
{quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
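
To check whether an environment is affected without a running HiveServer2, it may be enough to test whether the module from the traceback can be found on the import path. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical and is not part of Airflow or impyla.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found.

    Hypothetical helper for illustration; not part of Airflow or impyla.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # thrift_sasl is the module that the traceback in this report fails to import.
    missing = missing_modules(["thrift_sasl"])
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("thrift_sasl is importable")
```

Because `find_spec` only looks the module up and does not execute it, this check reports a missing installation but would not detect a module that is installed yet fails while importing.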