This is an automated email from the ASF dual-hosted git repository.
chengpan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kyuubi.git
The following commit(s) were added to refs/heads/master by this push:
new 9daf74d9c3 [KYUUBI #6908] Connection class ssl context object parameter
9daf74d9c3 is described below
commit 9daf74d9c38f83bc4873052cbe5935187a5b75a9
Author: Alex Wojtowicz <[email protected]>
AuthorDate: Tue Feb 25 22:22:14 2025 +0800
[KYUUBI #6908] Connection class ssl context object parameter
**Why are the changes needed:**
I am currently trying to connect to a HiveServer2 instance behind an NGINX
proxy that requires mTLS. PyHive seems to lack a way to establish an mTLS
connection, so applications such as Airflow cannot communicate directly with
that HiveServer2 instance.
The change lets callers pass in the parameters needed to establish a proper
mTLS SSL context. I believe that accepting a caller-created ssl_context object
is the quickest and cleanest way to do so: it leaves the responsibility of
configuring the context to downstream implementations and users, and it also
keeps the code short.
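As a rough sketch of the intended usage, a caller could build an mTLS-capable
context with the standard `ssl` module and hand it to the connection. The
helper names `build_mtls_context` and `connect_with_mtls`, and any certificate
paths, host, or port passed to them, are illustrative assumptions and are not
part of this patch. Only the `ssl_context` keyword comes from the change itself.

```python
import ssl


def build_mtls_context(ca_file=None, client_cert=None, client_key=None):
    """Hypothetical helper: build a client-side SSLContext suitable for mTLS."""
    # create_default_context() turns on hostname checking and CERT_REQUIRED.
    # ca_file=None falls back to the system's default trust store.
    context = ssl.create_default_context(cafile=ca_file)
    if client_cert is not None:
        # Presenting a client certificate is the "mutual" half of mTLS.
        context.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return context


def connect_with_mtls(host, port, context):
    """Hypothetical helper: open a PyHive connection over HTTPS with the context."""
    # Imported lazily so build_mtls_context() works without PyHive installed.
    from pyhive import hive

    # ssl_context is the parameter added by this patch; when it is supplied,
    # check_hostname and ssl_cert are not consulted.
    return hive.connect(host=host, port=port, scheme="https", ssl_context=context)
```

Keeping the context construction outside of `Connection` means options such as
a private CA bundle or a client key pair do not each need their own
constructor parameter.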
**How was this patch tested:**
Corresponding pytest fixtures have been added, using the mock module to see
if ssl_context object was properly accessed, or if the default one created in
the Connection initialization was properly configured.
I was not able to run the pytest fixtures specifically because I was lacking
the JDBC driver. This is my first time contributing to open source, and I am
happy to run the tests if provided guidance. The change passed a clean build
and test of the entire Kyuubi project in my local dev environment.
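The fallback behavior described above can be checked without a running
HiveServer2 by isolating the selection logic. The following is a minimal
sketch, not the code in `hive.py`: `resolve_ssl_context` is a hypothetical
function, and the contents of the local `ssl_cert_parameter_map` are an
assumption about the mapping that `hive.py` uses.

```python
import ssl

# Assumed stand-in for the mapping referenced in hive.py.
ssl_cert_parameter_map = {
    "none": ssl.CERT_NONE,
    "optional": ssl.CERT_OPTIONAL,
    "required": ssl.CERT_REQUIRED,
}


def resolve_ssl_context(ssl_context=None, check_hostname=None, ssl_cert=None):
    """Hypothetical helper: return the caller's context, or build the default one."""
    if ssl_context is not None:
        # A caller-supplied context wins; the other two arguments are ignored.
        return ssl_context
    context = ssl.create_default_context()
    # check_hostname arrives as a string, so only "true" enables it.
    context.check_hostname = check_hostname == "true"
    context.verify_mode = ssl_cert_parameter_map.get(ssl_cert or "none", ssl.CERT_NONE)
    return context
```

Note that the `ssl` module rejects `CERT_NONE` while hostname checking is
enabled, so `check_hostname="true"` only makes sense together with
`ssl_cert="optional"` or `ssl_cert="required"`.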
**Was this patch authored or co-authored using generative AI tooling?**
Yes, Generated-by Cursor-AI with Claude Sonnet 3.5 agent
Closes #6935 from alexio215/connection-class-ssl-context-param.
Closes #6908
539b29962 [Cheng Pan] Update python/pyhive/tests/test_hive.py
14c607489 [Alex Wojtowicz] Simplified testing, following pattern of other
tests, need proper SSL setup with nginx to test ssl_context fully
b947f2454 [Alex Wojtowicz] Added exception handling since JDBC driver will
not run in python tests
11f9002bf [Alex Wojtowicz] Passing in fully configured mock object before
creating connection
009c5cf24 [Alex Wojtowicz] Added back doc string documentation
e3280bcd8 [Alex Wojtowicz] Python testing
529de8a12 [Alex Wojtowicz] Added ssl_context object. If no obj is provided,
then it continues to use default provided parameters
Lead-authored-by: Alex Wojtowicz <[email protected]>
Co-authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
---
python/pyhive/hive.py | 16 +++++++++-------
python/pyhive/tests/test_hive.py | 21 +++++++++++++++++++++
2 files changed, 30 insertions(+), 7 deletions(-)
diff --git a/python/pyhive/hive.py b/python/pyhive/hive.py
index c1287488e9..e199dc25da 100644
--- a/python/pyhive/hive.py
+++ b/python/pyhive/hive.py
@@ -159,7 +159,8 @@ class Connection(object):
password=None,
check_hostname=None,
ssl_cert=None,
- thrift_transport=None
+ thrift_transport=None,
+ ssl_context=None
):
"""Connect to HiveServer2
@@ -172,19 +173,20 @@ class Connection(object):
:param password: Use with auth='LDAP' or auth='CUSTOM' only
:param thrift_transport: A ``TTransportBase`` for custom advanced usage.
Incompatible with host, port, auth, kerberos_service_name, and password.
-
+ :param ssl_context: A custom SSL context to use for HTTPS connections. If provided,
+ this overrides check_hostname and ssl_cert parameters.
The way to support LDAP and GSSAPI is originated from cloudera/Impyla:
https://github.com/cloudera/impyla/blob/255b07ed973d47a3395214ed92d35ec0615ebf62
/impala/_thrift_api.py#L152-L160
"""
if scheme in ("https", "http") and thrift_transport is None:
port = port or 1000
- ssl_context = None
if scheme == "https":
- ssl_context = create_default_context()
- ssl_context.check_hostname = check_hostname == "true"
- ssl_cert = ssl_cert or "none"
- ssl_context.verify_mode = ssl_cert_parameter_map.get(ssl_cert, CERT_NONE)
+ if ssl_context is None:
+ ssl_context = create_default_context()
+ ssl_context.check_hostname = check_hostname == "true"
+ ssl_cert = ssl_cert or "none"
+ ssl_context.verify_mode = ssl_cert_parameter_map.get(ssl_cert, CERT_NONE)
thrift_transport = thrift.transport.THttpClient.THttpClient(
uri_or_host="{scheme}://{host}:{port}/cliservice/".format(
scheme=scheme, host=host, port=port
diff --git a/python/pyhive/tests/test_hive.py b/python/pyhive/tests/test_hive.py
index 17206430aa..c2a020e17f 100644
--- a/python/pyhive/tests/test_hive.py
+++ b/python/pyhive/tests/test_hive.py
@@ -15,6 +15,7 @@ import subprocess
import time
import unittest
from decimal import Decimal
+import ssl
import mock
import pytest
@@ -238,6 +239,26 @@ class TestHive(unittest.TestCase, DBAPITestCase):
subprocess.check_call(['sudo', 'cp', orig_none, des])
_restart_hs2()
+ @pytest.mark.skip(reason="Need a proper setup for SSL context testing")
+ def test_basic_ssl_context(self):
+ """Test that connection works with a custom SSL context that mimics the default behavior."""
+ # Create an SSL context similar to what Connection creates by default
+ ssl_context = ssl.create_default_context()
+ ssl_context.check_hostname = False
+ ssl_context.verify_mode = ssl.CERT_NONE
+
+ # Connect using the same parameters as self.connect() but with our custom context
+ with contextlib.closing(hive.connect(
+ host=_HOST,
+ port=10000,
+ configuration={'mapred.job.tracker': 'local'},
+ ssl_context=ssl_context
+ )) as connection:
+ with contextlib.closing(connection.cursor()) as cursor:
+ # Use the same query pattern as other tests
+ cursor.execute('SELECT 1 FROM one_row')
+ self.assertEqual(cursor.fetchall(), [(1,)])
+
def _restart_hs2():
subprocess.check_call(['sudo', 'service', 'hive-server2', 'restart'])