[ https://issues.apache.org/jira/browse/AIRFLOW-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kengo Seki closed AIRFLOW-2514.
-------------------------------
    Resolution: Fixed

> HiveServer2Hook doesn't work on Python2 due to thrift version conflict
> ----------------------------------------------------------------------
>
>                 Key: AIRFLOW-2514
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2514
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: hive_hooks, hooks
>            Reporter: Kengo Seki
>            Priority: Major
>
> impyla on which HiveServer2Hook depends doesn't work with Thrift 0.10.0+ on
> Python2. Example:
> {code}
> $ pip show thrift
> Name: thrift
> Version: 0.11.0
> (snip)
> $ ipython
> (snip)
> In [1]: from airflow.hooks.hive_hooks import HiveServer2Hook
> In [2]: HiveServer2Hook().get_conn().cursor()
> [2018-05-23 10:21:02,117] {base_hook.py:83} INFO - Using connection to:
> localhost
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> <ipython-input-2-f76a25f124cf> in <module>()
> ----> 1 HiveServer2Hook().get_conn().cursor()
> (snip)
> /home/sekikn/.virtualenvs/a/local/lib/python2.7/site-packages/impala/_thrift_gen/TCLIService/TCLIService.pyc
> in write(self, oprot)
>    1067     def write(self, oprot):
>    1068         if oprot.__class__ == TBinaryProtocol.TBinaryProtocolAccelerated
> and self.thrift_spec is not None and fastbinary is not None:
> -> 1069             oprot.trans.write(fastbinary.encode_binary(self,
> (self.__class__, self.thrift_spec)))
>    1070             return
>    1071         oprot.writeStructBegin('OpenSession_args')
> TypeError: expecting list of size 2 for struct args
> {code}
> [This problem is already
> reported|https://github.com/cloudera/impyla/issues/286] and therefore [impyla
> pins Thrift version to
> 0.9.3|https://github.com/cloudera/impyla/commit/94a8eff9cda0cdb16b180c7079961449c8385997].
> On the other hand, hmsclient (introduced by AIRFLOW-2336) needs Thrift
> 0.11.0+.
> With the lower version, importing hmsclient fails as follows:
> {code}
> $ pip show thrift
> Name: thrift
> Version: 0.10.0
> (snip)
> $ python -m airflow.hooks.hive_hooks
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
>     "__main__", fname, loader, pkg_name)
>   File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
>     exec code in run_globals
>   File "/home/sekikn/dev/incubator-airflow/airflow/hooks/hive_hooks.py", line
> 33, in <module>
>     import hmsclient
>   File
> "/home/sekikn/.virtualenvs/a/local/lib/python2.7/site-packages/hmsclient/__init__.py",
> line 2, in <module>
>     from .hmsclient import HMSClient
>   File
> "/home/sekikn/.virtualenvs/a/local/lib/python2.7/site-packages/hmsclient/hmsclient.py",
> line 23, in <module>
>     from .genthrift.hive_metastore import ThriftHiveMetastore
>   File
> "/home/sekikn/.virtualenvs/a/local/lib/python2.7/site-packages/hmsclient/genthrift/hive_metastore/ThriftHiveMetastore.py",
> line 11, in <module>
>     from thrift.TRecursive import fix_spec
> ImportError: No module named TRecursive
> {code}
> As a result, HiveServer2Hook is not available on Python2 now.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
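
The conflict described in the report can be summarised with a small sketch. It only encodes the two thresholds stated above (impyla fails with Thrift 0.10.0+ on Python2, hmsclient needs Thrift 0.11.0+); it does not read the packages' real metadata. The names `parse_version` and `thrift_compatibility` are hypothetical helpers written for this illustration, not part of Airflow, impyla, or hmsclient.

```python
def parse_version(text):
    """Turn a dotted version such as '0.11.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def thrift_compatibility(thrift_version):
    """Report which dependency can work with this Thrift version on Python 2.

    Assumption: the thresholds come from the issue text, not from the
    packages themselves.
    """
    version = parse_version(thrift_version)
    return {
        # impyla is reported to break with Thrift 0.10.0 and later on Python 2
        "impyla": version < (0, 10, 0),
        # hmsclient is reported to need Thrift 0.11.0 or later
        "hmsclient": version >= (0, 11, 0),
    }


if __name__ == "__main__":
    for candidate in ("0.9.3", "0.10.0", "0.11.0"):
        print(candidate, thrift_compatibility(candidate))
```

Under these assumptions no Thrift version satisfies both dependencies at once on Python2, which matches the conclusion of the report.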