Re: [I] HIVE SSL mTLS capability [airflow]
alexio215 commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2695338668

Just wanted to add that PyHive has since been adopted by the apache/kyuubi project. The PR for this support has been made upstream and is awaiting release. The issue can remain closed.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
Re: [I] HIVE SSL mTLS capability [airflow]
eladkal commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2692638260

I'm closing this issue because the feature is missing in the upstream library: https://github.com/dropbox/PyHive/issues/480

Should upstream add support for it, feel free to open a PR directly (no need for an issue in Airflow).
Re: [I] HIVE SSL mTLS capability [airflow]
eladkal closed issue #46023: HIVE SSL mTLS capability
URL: https://github.com/apache/airflow/issues/46023
Re: [I] HIVE SSL mTLS capability [airflow]
alexio215 commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2646644945

> I think I understand what you want to do. The way I see it, there are two options.
>
> 1. Open a PR and if it's something simple and relevant it will also be promoted in the open source (I would love to do a CR for you).
> 2. Implement your own wrapper for the operator in your organization.

Hello, thank you for the help. I have opened an issue to make a PR for PyHive first, since it fundamentally lacks the capability. Once I get that merged, I will come back here to make a PR for the Airflow provider.
Re: [I] HIVE SSL mTLS capability [airflow]
nevcohen commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2637612311

I think I understand what you want to do. The way I see it, there are two options.

1. Open a PR and if it's something simple and relevant it will also be promoted in the open source (I would love to do a CR for you).
2. Implement your own wrapper for the operator in your organization.
Re: [I] HIVE SSL mTLS capability [airflow]
alexio215 commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2632478556

> > > So I'm currently looking at using two capabilities. The first is to connect to an NGINX proxy that requires SSL certs and expects mTLS, in order to serve Hive commands locally through our cluster into the HiveServer2 instance running right behind it. The second, which I am hoping for down the line, is to find or create support for a direct connection from PyHive to a HiveServer2 running with SSL, and to perform mTLS. The problem with this, however, is that Python does not natively support the .jks format that HiveServer2 expects, hence the use of an NGINX proxy. However, looking at PyHive and its most recent issues, it seems to me that PyHive does not support SSL connections either:
> > > [dropbox/PyHive#257](https://github.com/dropbox/PyHive/issues/257)
> > > Forgive me for any misunderstanding as well, this is all a learning process for me at the same time. Thank you for the patience and help [@nevcohen](https://github.com/nevcohen)
> >
> > So how do you connect to Hive from code today?
>
> Thank you for the patience, this has taken some digging on my end, getting accustomed to what is currently practiced in my org. Currently our PyHive queries are written in a more manual script and sent to an NGINX server that redirects the appropriate traffic to a HiveServer2 proxy. The Thrift communication is wrapped in HTTPS using the THttpClient module from the Thrift library. I have found this to exist within PyHive as well.
>
> This lives in, and is exposed through, the Connection constructor of PyHive:
>
> ```python
> # Excerpt from inside PyHive's Connection constructor (not standalone code).
> if scheme in ("https", "http") and thrift_transport is None:
>     port = port or 1000
>     ssl_context = None
>     if scheme == "https":
>         ssl_context = create_default_context()
>         ssl_context.check_hostname = check_hostname == "true"
>         ssl_cert = ssl_cert or "none"
>         ssl_context.verify_mode = ssl_cert_parameter_map.get(ssl_cert, CERT_NONE)
>     thrift_transport = thrift.transport.THttpClient.THttpClient(
>         uri_or_host="{scheme}://{host}:{port}/cliservice/".format(
>             scheme=scheme, host=host, port=port
>         ),
>         ssl_context=ssl_context,
>     )
> ```
>
> My goal is to add a method using the ssl library that creates an SSL context from the extras provided and attaches it to the connection being created if a "use_https_proxy" boolean is specified within the proxy. Further, an "enable_mtls" boolean option will be included to allow for cases where someone needs to use mTLS.

I am still finding a way to do this through the PyHive scheme and default constructor parameters.
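One possible route through the constructor parameters, offered only as a sketch: the excerpt quoted above only builds its own transport when `thrift_transport is None`, so a pre-built `THttpClient` carrying a custom SSL context could be handed in instead. The helper names `cliservice_uri` and `connect_with_custom_transport` are hypothetical, and the assumption that PyHive rejects `host`/`port` when `thrift_transport` is supplied should be checked against the installed version.

```python
import ssl
from typing import Any


def cliservice_uri(host: str, port: int = 1000, scheme: str = "https") -> str:
    """Build the same /cliservice/ URI shape used in the PyHive excerpt."""
    return "{scheme}://{host}:{port}/cliservice/".format(
        scheme=scheme, host=host, port=port
    )


def connect_with_custom_transport(
    host: str, port: int, ssl_context: ssl.SSLContext, **conn_kwargs: Any
):
    """Open a PyHive connection over a caller-supplied HTTPS transport.

    Hypothetical helper. PyHive and thrift are imported lazily so this
    module can be loaded without them installed.
    """
    from pyhive import hive
    from thrift.transport.THttpClient import THttpClient

    transport = THttpClient(
        uri_or_host=cliservice_uri(host, port), ssl_context=ssl_context
    )
    # Assumption: host/port are not passed here because the transport
    # already carries the endpoint.
    return hive.Connection(thrift_transport=transport, **conn_kwargs)
```

With this shape, any client certificate loaded into `ssl_context` is presented during the TLS handshake with the NGINX proxy, without changing PyHive itself.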
Re: [I] HIVE SSL mTLS capability [airflow]
alexio215 commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2632079316

> > So I'm currently looking at using two capabilities. The first is to connect to an NGINX proxy that requires SSL certs and expects mTLS, in order to serve Hive commands locally through our cluster into the HiveServer2 instance running right behind it. The second, which I am hoping for down the line, is to find or create support for a direct connection from PyHive to a HiveServer2 running with SSL, and to perform mTLS. The problem with this, however, is that Python does not natively support the .jks format that HiveServer2 expects, hence the use of an NGINX proxy. However, looking at PyHive and its most recent issues, it seems to me that PyHive does not support SSL connections either:
> > [dropbox/PyHive#257](https://github.com/dropbox/PyHive/issues/257)
> > Forgive me for any misunderstanding as well, this is all a learning process for me at the same time. Thank you for the patience and help [@nevcohen](https://github.com/nevcohen)
>
> So how do you connect to Hive from code today?

Thank you for the patience, this has taken some digging on my end, getting accustomed to what is currently practiced in my org. Currently our PyHive queries are written in a more manual script and sent to an NGINX server that redirects the appropriate traffic to a HiveServer2 proxy. The Thrift communication is wrapped in HTTPS using the THttpClient module from the Thrift library. I have found this to exist within PyHive as well.

This lives in, and is exposed through, the Connection constructor of PyHive:

```python
# Excerpt from inside PyHive's Connection constructor (not standalone code).
if scheme in ("https", "http") and thrift_transport is None:
    port = port or 1000
    ssl_context = None
    if scheme == "https":
        ssl_context = create_default_context()
        ssl_context.check_hostname = check_hostname == "true"
        ssl_cert = ssl_cert or "none"
        ssl_context.verify_mode = ssl_cert_parameter_map.get(ssl_cert, CERT_NONE)
    thrift_transport = thrift.transport.THttpClient.THttpClient(
        uri_or_host="{scheme}://{host}:{port}/cliservice/".format(
            scheme=scheme, host=host, port=port
        ),
        ssl_context=ssl_context,
    )
```

My goal is to add a method using the ssl library that creates an SSL context from the extras provided and attaches it to the connection being created if a "use_https_proxy" boolean is specified within the proxy. Further, an "enable_mtls" boolean option will be included to allow for cases where someone needs to use mTLS.
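As a rough illustration of the method described above, here is a minimal sketch using only Python's standard `ssl` module. The function name `build_ssl_context` and the extras keys `ssl_ca_file`, `ssl_client_cert`, and `ssl_client_key` are hypothetical and not part of the Airflow provider or PyHive; only `use_https_proxy` and `enable_mtls` come from the comment. It also assumes the certificates are available as PEM files, since Python's `ssl` module does not read JKS keystores.

```python
import ssl
from typing import Any, Mapping, Optional


def build_ssl_context(extras: Mapping[str, Any]) -> Optional[ssl.SSLContext]:
    """Create an SSL context from connection extras (hypothetical keys).

    Returns None when "use_https_proxy" is not set, so callers can fall
    back to a plain connection.
    """
    if not extras.get("use_https_proxy", False):
        return None

    # create_default_context() enables certificate and hostname
    # verification; an optional PEM CA bundle replaces the system CAs.
    context = ssl.create_default_context(cafile=extras.get("ssl_ca_file"))

    if extras.get("enable_mtls", False):
        cert_file = extras.get("ssl_client_cert")
        key_file = extras.get("ssl_client_key")
        if not cert_file or not key_file:
            raise ValueError(
                "enable_mtls requires ssl_client_cert and ssl_client_key"
            )
        # Present this client certificate to the server (mutual TLS).
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)

    return context
```

The returned context could then be passed as the `ssl_context` argument of `THttpClient`, as in the PyHive excerpt.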
Re: [I] HIVE SSL mTLS capability [airflow]
nevcohen commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2629484716

> So I'm currently looking at using two capabilities. The first is to connect to an NGINX proxy that requires SSL certs and expects mTLS, in order to serve Hive commands locally through our cluster into the HiveServer2 instance running right behind it. The second, which I am hoping for down the line, is to find or create support for a direct connection from PyHive to a HiveServer2 running with SSL, and to perform mTLS. The problem with this, however, is that Python does not natively support the .jks format that HiveServer2 expects, hence the use of an NGINX proxy. However, looking at PyHive and its most recent issues, it seems to me that PyHive does not support SSL connections either:
> https://github.com/dropbox/PyHive/issues/257
>
> Forgive me for any misunderstanding as well, this is all a learning process for me at the same time. Thank you for the patience and help @nevcohen

So how do you connect to Hive from code today?
Re: [I] HIVE SSL mTLS capability [airflow]
alexio215 commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2626422301

So I'm currently looking at using two capabilities. The first is to connect to an NGINX proxy that requires SSL certs and expects mTLS, in order to serve Hive commands locally through our cluster into the HiveServer2 instance running right behind it. The second, which I am hoping for down the line, is to find or create support for a direct connection from PyHive to a HiveServer2 running with SSL, and to perform mTLS. The problem with this, however, is that Python does not natively support the .jks format that HiveServer2 expects, hence the use of an NGINX proxy. However, looking at PyHive and its most recent issues, it seems to me that PyHive does not support SSL connections either:
https://github.com/dropbox/PyHive/issues/257

Forgive me for any misunderstanding as well, this is all a learning process for me at the same time. Thank you for the patience and help @nevcohen
Re: [I] HIVE SSL mTLS capability [airflow]
nevcohen commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2622767231

The `hive cli` or `hive server 2` [connections](https://airflow.apache.org/docs/apache-airflow-providers-apache-hive/stable/connections/index.html) don't work for you?
Re: [I] HIVE SSL mTLS capability [airflow]
alexio215 commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2613250678

On pause for now and considering ramifications of this in my environment
Re: [I] HIVE SSL mTLS capability [airflow]
alexio215 commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2613128436

My current inclination in solving this problem is to use [jpype](https://pypi.org/project/jpype1/) to funnel Python requests through a JVM running the JDBC driver adjacent to Airflow. The goal is to write Hive DAGs natively in Python but communicate in Java, which is more native to Hive and supports the JKS key format, Hive's default.
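A minimal sketch of that idea, under several assumptions: the helper names `hive_jdbc_url` and `run_query` are hypothetical, the Hive JDBC driver jar is available locally, and the URL parameters (`ssl`, `sslTrustStore`, `trustStorePassword`, `twoWay`, `sslKeyStore`, `keyStorePassword`) match the Hive JDBC driver version in use, which should be verified against its documentation. JPype is imported lazily, so only `run_query` needs it installed.

```python
from typing import List, Optional


def hive_jdbc_url(
    host: str,
    port: int = 10000,
    database: str = "default",
    truststore: Optional[str] = None,
    truststore_password: Optional[str] = None,
    keystore: Optional[str] = None,
    keystore_password: Optional[str] = None,
) -> str:
    """Assemble a HiveServer2 JDBC URL with SSL and optional two-way SSL."""
    parts = [f"jdbc:hive2://{host}:{port}/{database}", "ssl=true"]
    if truststore:
        parts.append(f"sslTrustStore={truststore}")
        if truststore_password:
            parts.append(f"trustStorePassword={truststore_password}")
    if keystore:
        # A client keystore implies mutual TLS (assumed parameter names).
        parts.append("twoWay=true")
        parts.append(f"sslKeyStore={keystore}")
        if keystore_password:
            parts.append(f"keyStorePassword={keystore_password}")
    return ";".join(parts)


def run_query(jdbc_url: str, sql: str, driver_jar: str) -> List[list]:
    """Run a query through the JDBC driver inside an in-process JVM."""
    import jpype  # lazy import: requires the jpype1 package

    if not jpype.isJVMStarted():
        jpype.startJVM(classpath=[driver_jar])
    jpype.JClass("org.apache.hive.jdbc.HiveDriver")  # make sure the driver is loaded
    driver_manager = jpype.JClass("java.sql.DriverManager")
    connection = driver_manager.getConnection(jdbc_url)
    try:
        result_set = connection.createStatement().executeQuery(sql)
        column_count = result_set.getMetaData().getColumnCount()
        rows = []
        while result_set.next():
            rows.append(
                [result_set.getObject(i) for i in range(1, column_count + 1)]
            )
        return rows
    finally:
        connection.close()
```

The appeal of this design is that the JKS truststore and keystore are read by the JVM directly, so no conversion to PEM is needed on the Python side.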
Re: [I] HIVE SSL mTLS capability [airflow]
boring-cyborg[bot] commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2613115204

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
Re: [I] HIVE SSL mTLS capability [airflow]
alexio215 commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2613116453

Hello, thank you for having me. It is my first time contributing to any open source project, so please bear with me. Happy to learn from any wisdom shared