This is an automated email from the ASF dual-hosted git repository.

chengpan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kyuubi.git


The following commit(s) were added to refs/heads/master by this push:
     new 9daf74d9c3 [KYUUBI #6908] Connection class ssl context object parameter
9daf74d9c3 is described below

commit 9daf74d9c38f83bc4873052cbe5935187a5b75a9
Author: Alex Wojtowicz <[email protected]>
AuthorDate: Tue Feb 25 22:22:14 2025 +0800

    [KYUUBI #6908] Connection class ssl context object parameter
    
    **Why are the changes needed:**
    I am trying to connect to a HiveServer2 instance behind an NGINX proxy that
    requires mTLS. pyHive seems to lack the capability to establish an mTLS
    connection, so applications such as Airflow cannot communicate directly
    with that HiveServer2 instance.
    
    The change lets callers pass in the parameters needed to establish a proper
    mTLS SSL context. I believe that accepting a caller-created ssl_context
    object is the quickest and cleanest way to do so, since it leaves the
    responsibility of configuring the context to users and downstream
    implementations. It also keeps the code short.
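
    As a rough illustration of the use case above, a caller might build the
    context with the standard ssl module and hand it to the connection. This is
    only a sketch: the helper names build_mtls_context and connect_with_mtls,
    the host name, and every file path are hypothetical, and the pyhive import
    is deferred so the context helper can be tried on its own.

```python
import ssl


def build_mtls_context(ca_file=None, client_cert=None, client_key=None):
    """Return an SSLContext that verifies the server and, when a client
    certificate is supplied, presents it for mutual TLS.

    All arguments are hypothetical file paths chosen by the caller.
    """
    # create_default_context() turns on hostname checking and CERT_REQUIRED.
    # With ca_file=None the system trust store is used.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if client_cert is not None:
        # client_key may stay None when the private key is in the same PEM file.
        context.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return context


def connect_with_mtls(host, port, context):
    """Open a pyhive connection over HTTPS with a caller-built context."""
    # Imported lazily so build_mtls_context() works without pyhive installed.
    from pyhive import hive

    # ssl_context is the parameter added by this commit; scheme="https" is
    # needed for the HTTP transport branch that consumes it.
    return hive.connect(host=host, port=port, scheme="https", ssl_context=context)
```

    A call could then look like
    connect_with_mtls("hive.example.com", 443, build_mtls_context("ca.pem",
    "client.pem", "client.key")), where the host and the three paths are
    placeholders.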
    
    **How was this patch tested:**
    Corresponding pytest fixtures have been added, using the mock module to
    check whether the ssl_context object was properly accessed, or whether the
    default one created during Connection initialization was properly
    configured.
    
    I was not able to run the pytest fixtures specifically because I was
    lacking the JDBC driver. This is my first time contributing to open source,
    and I am happy to run the tests if provided guidance. A clean build and
    test of the entire kyuubi project passed in my local dev environment.
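
    The selection logic this patch introduces can also be checked without a
    running HiveServer2. The sketch below copies that logic into a standalone,
    hypothetical helper named resolve_ssl_context; it is not part of pyhive,
    and the contents of SSL_CERT_PARAMETER_MAP are an assumption about what
    pyhive's ssl_cert_parameter_map holds.

```python
import ssl

# Assumption: mirrors the mapping pyhive uses for its ssl_cert parameter.
SSL_CERT_PARAMETER_MAP = {
    "none": ssl.CERT_NONE,
    "optional": ssl.CERT_OPTIONAL,
    "required": ssl.CERT_REQUIRED,
}


def resolve_ssl_context(scheme, ssl_context=None, check_hostname=None, ssl_cert=None):
    """Hypothetical stand-in for the branch changed in Connection.__init__.

    A caller-supplied context is returned untouched; otherwise, for https,
    a default context is built from check_hostname and ssl_cert.
    """
    if scheme == "https" and ssl_context is None:
        ssl_context = ssl.create_default_context()
        # check_hostname arrives as a string, so only "true" enables it.
        ssl_context.check_hostname = check_hostname == "true"
        ssl_cert = ssl_cert or "none"
        ssl_context.verify_mode = SSL_CERT_PARAMETER_MAP.get(ssl_cert, ssl.CERT_NONE)
    return ssl_context


# A supplied context wins over check_hostname and ssl_cert.
custom = ssl.create_default_context()
assert resolve_ssl_context("https", custom, ssl_cert="none") is custom
assert custom.verify_mode == ssl.CERT_REQUIRED

# Without one, the old defaults still apply: no hostname check, CERT_NONE.
default = resolve_ssl_context("https")
assert default.check_hostname is False
assert default.verify_mode == ssl.CERT_NONE
```

    Hostname checking is switched off before verify_mode is lowered, which
    matters because the ssl module rejects CERT_NONE while check_hostname is
    still enabled.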
    
    **Was this patch authored or co-authored using generative AI tooling**
    Yes, Generated-by Cursor-AI with Claude Sonnet 3.5 agent
    
    Closes #6935 from alexio215/connection-class-ssl-context-param.
    
    Closes #6908
    
    539b29962 [Cheng Pan] Update python/pyhive/tests/test_hive.py
    14c607489 [Alex Wojtowicz] Simplified testing, following pattern of other tests, need proper SSL setup with nginx to test ssl_context fully
    b947f2454 [Alex Wojtowicz] Added exception handling since JDBC driver will not run in python tests
    11f9002bf [Alex Wojtowicz] Passing in fully configured mock object before creating connection
    009c5cf24 [Alex Wojtowicz] Added back doc string documentation
    e3280bcd8 [Alex Wojtowicz] Python testing
    529de8a12 [Alex Wojtowicz] Added ssl_context object. If no obj is provided, then it continues to use default provided parameters
    
    Lead-authored-by: Alex Wojtowicz <[email protected]>
    Co-authored-by: Cheng Pan <[email protected]>
    Signed-off-by: Cheng Pan <[email protected]>
---
 python/pyhive/hive.py            | 16 +++++++++-------
 python/pyhive/tests/test_hive.py | 21 +++++++++++++++++++++
 2 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/python/pyhive/hive.py b/python/pyhive/hive.py
index c1287488e9..e199dc25da 100644
--- a/python/pyhive/hive.py
+++ b/python/pyhive/hive.py
@@ -159,7 +159,8 @@ class Connection(object):
         password=None,
         check_hostname=None,
         ssl_cert=None,
-        thrift_transport=None
+        thrift_transport=None,
+        ssl_context=None
     ):
         """Connect to HiveServer2
 
@@ -172,19 +173,20 @@ class Connection(object):
         :param password: Use with auth='LDAP' or auth='CUSTOM' only
         :param thrift_transport: A ``TTransportBase`` for custom advanced usage.
             Incompatible with host, port, auth, kerberos_service_name, and password.
-
+        :param ssl_context: A custom SSL context to use for HTTPS connections. If provided,
+            this overrides check_hostname and ssl_cert parameters.
         The way to support LDAP and GSSAPI is originated from cloudera/Impyla:
         https://github.com/cloudera/impyla/blob/255b07ed973d47a3395214ed92d35ec0615ebf62
         /impala/_thrift_api.py#L152-L160
         """
         if scheme in ("https", "http") and thrift_transport is None:
             port = port or 1000
-            ssl_context = None
             if scheme == "https":
-                ssl_context = create_default_context()
-                ssl_context.check_hostname = check_hostname == "true"
-                ssl_cert = ssl_cert or "none"
-                ssl_context.verify_mode = ssl_cert_parameter_map.get(ssl_cert, CERT_NONE)
+                if ssl_context is None:
+                    ssl_context = create_default_context()
+                    ssl_context.check_hostname = check_hostname == "true"
+                    ssl_cert = ssl_cert or "none"
+                    ssl_context.verify_mode = ssl_cert_parameter_map.get(ssl_cert, CERT_NONE)
             thrift_transport = thrift.transport.THttpClient.THttpClient(
                 uri_or_host="{scheme}://{host}:{port}/cliservice/".format(
                     scheme=scheme, host=host, port=port
diff --git a/python/pyhive/tests/test_hive.py b/python/pyhive/tests/test_hive.py
index 17206430aa..c2a020e17f 100644
--- a/python/pyhive/tests/test_hive.py
+++ b/python/pyhive/tests/test_hive.py
@@ -15,6 +15,7 @@ import subprocess
 import time
 import unittest
 from decimal import Decimal
+import ssl
 
 import mock
 import pytest
@@ -238,6 +239,26 @@ class TestHive(unittest.TestCase, DBAPITestCase):
             subprocess.check_call(['sudo', 'cp', orig_none, des])
             _restart_hs2()
 
+    @pytest.mark.skip(reason="Need a proper setup for SSL context testing")
+    def test_basic_ssl_context(self):
+        """Test that connection works with a custom SSL context that mimics the default behavior."""
+        # Create an SSL context similar to what Connection creates by default
+        ssl_context = ssl.create_default_context()
+        ssl_context.check_hostname = False
+        ssl_context.verify_mode = ssl.CERT_NONE
+        
+        # Connect using the same parameters as self.connect() but with our custom context
+        with contextlib.closing(hive.connect(
+            host=_HOST,
+            port=10000,
+            configuration={'mapred.job.tracker': 'local'},
+            ssl_context=ssl_context
+        )) as connection:
+            with contextlib.closing(connection.cursor()) as cursor:
+                # Use the same query pattern as other tests
+                cursor.execute('SELECT 1 FROM one_row')
+                self.assertEqual(cursor.fetchall(), [(1,)])
+
 
 def _restart_hs2():
     subprocess.check_call(['sudo', 'service', 'hive-server2', 'restart'])
