Re: [I] HIVE SSL mTLS capability [airflow]

2025-03-03 Thread via GitHub


alexio215 commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2695338668

   Just wanted to add a note that the new pyHive has been adopted by the 
apache/kyuubi project. The PR for this support has been made upstream and is 
awaiting release.
   
   Issue can remain closed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] HIVE SSL mTLS capability [airflow]

2025-03-02 Thread via GitHub


eladkal commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2692638260

   I'm closing this issue because the feature is missing in the upstream library: 
https://github.com/dropbox/PyHive/issues/480
   Should upstream add support for it, feel free to open a PR directly (no need 
for an issue in Airflow).





Re: [I] HIVE SSL mTLS capability [airflow]

2025-03-02 Thread via GitHub


eladkal closed issue #46023: HIVE SSL mTLS capability
URL: https://github.com/apache/airflow/issues/46023





Re: [I] HIVE SSL mTLS capability [airflow]

2025-02-09 Thread via GitHub


alexio215 commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2646644945

   > I think I understand what you want to do, the way I see it there are two 
options. 
   > 
   > 1. Open a PR and if it's something simple and relevant it will also be 
promoted in the open source (I would love to do a CR for you).
   > 2. Implement your own wrapper for the operator in your organization.
   
   Hello, thank you for the help. I have opened an issue so I can make a PR for 
PyHive first, since it fundamentally lacks the capability. Once I get that 
merged, I will come back here to make a PR for the Airflow provider.





Re: [I] HIVE SSL mTLS capability [airflow]

2025-02-05 Thread via GitHub


nevcohen commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2637612311

   I think I understand what you want to do, the way I see it there are two 
options. 
   
   1. Open a PR and if it's something simple and relevant it will also be 
promoted in the open source (I would love to do a CR for you).
   2. Implement your own wrapper for the operator in your organization.





Re: [I] HIVE SSL mTLS capability [airflow]

2025-02-03 Thread via GitHub


alexio215 commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2632478556

   > > > So I'm currently looking at using two capabilities. The first, is to 
connect to an NGINX proxy that requires SSL certs and expects mTLS to serve 
HIVE commands locally through our cluster, into the HIVE2SERVER running right 
behind it. The second, down the line that I am hoping for, is to find or create 
support for direct connection with pyHIVE to HIVE2SERVER running with SSL, and 
to perform mTLS. The problem with this however is that I notice that python 
does not natively support the .jks format that HIVE2SERVER expects, hence the 
use of an NGINX proxy. However, looking at pyHIVE, and its most recent issues, 
to me it seems that pyHIVE as well does not support SSL connection:
   > > > [dropbox/PyHive#257](https://github.com/dropbox/PyHive/issues/257)
   > > > Forgive me for any misunderstanding as well, this is all a learning 
process to me at the same time. Thank you for the patience and help 
[@nevcohen](https://github.com/nevcohen)
   > > 
   > > 
   > > So today how do you connect to hive using a code?
   > 
   > Thank you for the patience, this has taken some digging on my end, getting 
accustomed to what is currently practiced in my org. Currently our pyHive 
queries are written in a more manual script and sent to an NGINX server that 
redirects appropriate traffic to a Hive2Server proxy. The Thrift communication 
is wrapped in HTTPS using the THttpClient module from the Thrift library. I 
have found this to exist within pyHive as well.
   > 
   > This lives and is made accessible within the `Connection` constructor of pyHive:
   > `if scheme in ("https", "http") and thrift_transport is None:
   >     port = port or 1000
   >     ssl_context = None
   >     if scheme == "https":
   >         ssl_context = create_default_context()
   >         ssl_context.check_hostname = check_hostname == "true"
   >         ssl_cert = ssl_cert or "none"
   >         ssl_context.verify_mode = ssl_cert_parameter_map.get(ssl_cert, CERT_NONE)
   >     thrift_transport = thrift.transport.THttpClient.THttpClient(
   >         uri_or_host="{scheme}://{host}:{port}/cliservice/".format(
   >             scheme=scheme, host=host, port=port
   >         ),
   >         ssl_context=ssl_context,
   >     )`
   > 
   > My goal is to add a method that uses the ssl library to create an SSL 
context from the extras provided and attaches it to the connection being 
created if a "use_https_proxy" boolean is specified within the proxy. Further, 
an "enable_mtls" boolean option will be included to allow for cases where 
someone needs to use mTLS.
   
   I am now looking for a way to do this through the pyHive scheme and default 
constructor parameters.
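
One route that might already work with the existing constructor parameters is 
the `thrift_transport` argument that the quoted snippet checks: build the 
`THttpClient` yourself with a client certificate loaded into its SSL context. 
A minimal sketch, assuming the `thrift` and `PyHive` packages are installed; 
`cliservice_uri`, `connect_via_mtls_proxy`, and the file path parameters are 
hypothetical names, and the behaviour should be verified against the PyHive 
version in use.

```python
import ssl


def cliservice_uri(scheme: str, host: str, port: int) -> str:
    """Build the HTTP endpoint in the same shape as the quoted PyHive snippet."""
    return "{scheme}://{host}:{port}/cliservice/".format(
        scheme=scheme, host=host, port=port
    )


def connect_via_mtls_proxy(host, port, ca_file, cert_file, key_file):
    """Open a PyHive connection through a proxy that requires a client cert."""
    # Imported lazily so cliservice_uri works without the optional packages.
    from pyhive import hive
    from thrift.transport.THttpClient import THttpClient

    # Verify the proxy against our CA and present a client certificate (mTLS).
    context = ssl.create_default_context(cafile=ca_file)
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)

    transport = THttpClient(cliservice_uri("https", host, port), ssl_context=context)
    # Assumption: host/port/auth are not passed together with thrift_transport,
    # since PyHive treats those arguments as conflicting.
    return hive.Connection(thrift_transport=transport)
```

The PEM files would have to be exported from the JKS stores beforehand, since 
Python's ssl module reads PEM rather than JKS.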





Re: [I] HIVE SSL mTLS capability [airflow]

2025-02-03 Thread via GitHub


alexio215 commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2632079316

   > > So I'm currently looking at using two capabilities. The first, is to 
connect to an NGINX proxy that requires SSL certs and expects mTLS to serve 
HIVE commands locally through our cluster, into the HIVE2SERVER running right 
behind it. The second, down the line that I am hoping for, is to find or create 
support for direct connection with pyHIVE to HIVE2SERVER running with SSL, and 
to perform mTLS. The problem with this however is that I notice that python 
does not natively support the .jks format that HIVE2SERVER expects, hence the 
use of an NGINX proxy. However, looking at pyHIVE, and its most recent issues, 
to me it seems that pyHIVE as well does not support SSL connection:
   > > [dropbox/PyHive#257](https://github.com/dropbox/PyHive/issues/257)
   > > Forgive me for any misunderstanding as well, this is all a learning 
process to me at the same time. Thank you for the patience and help 
[@nevcohen](https://github.com/nevcohen)
   > 
   > So today how do you connect to hive using a code?
   
   Thank you for the patience, this has taken some digging on my end, getting 
accustomed to what is currently practiced in my org. Currently our pyHive 
queries are written in a more manual script and sent to an NGINX server that 
redirects appropriate traffic to a Hive2Server proxy. The Thrift communication 
is wrapped in HTTPS using the THttpClient module from the Thrift library. I 
have found this to exist within pyHive as well.
   
   This lives and is made accessible within the `Connection` constructor of pyHive:
   `if scheme in ("https", "http") and thrift_transport is None:
       port = port or 1000
       ssl_context = None
       if scheme == "https":
           ssl_context = create_default_context()
           ssl_context.check_hostname = check_hostname == "true"
           ssl_cert = ssl_cert or "none"
           ssl_context.verify_mode = ssl_cert_parameter_map.get(ssl_cert, CERT_NONE)
       thrift_transport = thrift.transport.THttpClient.THttpClient(
           uri_or_host="{scheme}://{host}:{port}/cliservice/".format(
               scheme=scheme, host=host, port=port
           ),
           ssl_context=ssl_context,
       )`
   
   My goal is to add a method that uses the ssl library to create an SSL context 
from the extras provided and attaches it to the connection being created if a 
"use_https_proxy" boolean is specified within the proxy. Further, an 
"enable_mtls" boolean option will be included to allow for cases where someone 
needs to use mTLS.
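
A rough sketch of what such a helper could look like, using only the standard 
`ssl` module. The `use_https_proxy` and `enable_mtls` keys come from the 
description above; `build_ssl_context` and the `ca_file`, `client_cert_file`, 
`client_key_file`, and `client_key_password` keys are hypothetical names 
chosen for illustration, not part of the Airflow provider or pyHive.

```python
import ssl
from typing import Any, Mapping, Optional


def build_ssl_context(extras: Mapping[str, Any]) -> Optional[ssl.SSLContext]:
    """Return an SSLContext built from connection extras, or None for plain HTTP."""
    if not extras.get("use_https_proxy", False):
        return None

    # Verify the server against the given CA bundle, or the system trust
    # store when the hypothetical "ca_file" key is absent.
    context = ssl.create_default_context(cafile=extras.get("ca_file"))

    if extras.get("enable_mtls", False):
        # mTLS: present a client certificate and key (PEM files) to the server.
        context.load_cert_chain(
            certfile=extras["client_cert_file"],
            keyfile=extras.get("client_key_file"),
            password=extras.get("client_key_password"),
        )
    return context
```

Unlike the pyHive snippet above, which falls back to `CERT_NONE`, this sketch 
keeps the defaults of `create_default_context()`, so the server certificate 
and hostname are verified.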





Re: [I] HIVE SSL mTLS capability [airflow]

2025-02-02 Thread via GitHub


nevcohen commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2629484716

   > So I'm currently looking at using two capabilities. The first, is to 
connect to an NGINX proxy that requires SSL certs and expects mTLS to serve 
HIVE commands locally through our cluster, into the HIVE2SERVER running right 
behind it. The second, down the line that I am hoping for, is to find or create 
support for direct connection with pyHIVE to HIVE2SERVER running with SSL, and 
to perform mTLS. The problem with this however is that I notice that python 
does not natively support the .jks format that HIVE2SERVER expects, hence the 
use of an NGINX proxy. However, looking at pyHIVE, and its most recent issues, 
to me it seems that pyHIVE as well does not support SSL connection:
   > https://github.com/dropbox/PyHive/issues/257
   > 
   > Forgive me for any misunderstanding as well, this is all a learning 
process to me at the same time. Thank you for the patience and help @nevcohen 
   
   So today how do you connect to hive using a code?





Re: [I] HIVE SSL mTLS capability [airflow]

2025-01-30 Thread via GitHub


alexio215 commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2626422301

   So I'm currently looking at using two capabilities. The first, is to connect 
to an NGINX proxy that requires SSL certs and expects mTLS to serve HIVE 
commands locally through our cluster, into the HIVE2SERVER running right behind 
it. The second, down the line that I am hoping for, is to find or create 
support for direct connection with pyHIVE to HIVE2SERVER running with SSL, and 
to perform mTLS. The problem with this however is that I notice that python 
does not natively support the .jks format that HIVE2SERVER expects, hence the 
use of an NGINX proxy. However, looking at pyHIVE, and its most recent issues, 
to me it seems that pyHIVE as well does not support SSL connection:
   https://github.com/dropbox/PyHive/issues/257
   
   Forgive me for any misunderstanding as well, this is all a learning process 
to me at the same time. Thank you for the patience and help @nevcohen 
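
On the .jks point: Python's `ssl` module reads PEM files, so one possible 
workaround is to export the trusted certificates from the JKS truststore to a 
PEM bundle. A minimal sketch, assuming the third-party `pyjks` package 
(imported as `jks`) is available; `der_certs_to_pem_bundle` and 
`truststore_to_pem` are hypothetical helper names, and private key entries are 
not handled here.

```python
import ssl
from typing import Iterable


def der_certs_to_pem_bundle(der_certs: Iterable[bytes]) -> str:
    """Concatenate DER-encoded certificates into a single PEM bundle."""
    return "".join(ssl.DER_cert_to_PEM_cert(der) for der in der_certs)


def truststore_to_pem(jks_path: str, password: str) -> str:
    """Return every trusted certificate entry of a JKS file as PEM text."""
    import jks  # third-party "pyjks" package, imported lazily

    keystore = jks.KeyStore.load(jks_path, password)
    # Each trusted certificate entry exposes its DER bytes as `.cert`.
    return der_certs_to_pem_bundle(entry.cert for entry in keystore.certs.values())
```

The resulting text can be written to a file and passed as `cafile` to 
`ssl.create_default_context()`.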





Re: [I] HIVE SSL mTLS capability [airflow]

2025-01-29 Thread via GitHub


nevcohen commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2622767231

   The `hive cli` or `hive server 2` 
[connections](https://airflow.apache.org/docs/apache-airflow-providers-apache-hive/stable/connections/index.html)
 don't work for you? 





Re: [I] HIVE SSL mTLS capability [airflow]

2025-01-24 Thread via GitHub


alexio215 commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2613250678

   On pause for now and considering ramifications of this in my environment





Re: [I] HIVE SSL mTLS capability [airflow]

2025-01-24 Thread via GitHub


alexio215 commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2613128436

   My current inclination in solving this problem is to use 
[jpype](https://pypi.org/project/jpype1/) to funnel Python requests through a 
JVM running the JDBC driver adjacent to Airflow.
   
   The goal is to write Hive DAGs natively in Python but communicate through 
Java, which is more native to Hive and supports the JKS key format, the default 
for Hive.
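
A sketch of how that could look with JPype's DB-API layer, assuming the 
`jpype1` package, a local JVM, and the Hive JDBC driver jar are available. 
`hive_jdbc_url` and `connect_through_jvm` are hypothetical helpers, and the 
session variable names in the usage note are assumptions to be checked against 
the Hive JDBC documentation for the driver version in use.

```python
from typing import Mapping, Sequence

HIVE_DRIVER_CLASS = "org.apache.hive.jdbc.HiveDriver"


def hive_jdbc_url(host: str, port: int, database: str,
                  session_vars: Mapping[str, str]) -> str:
    """Compose a jdbc:hive2 URL; session variables are appended as ;key=value."""
    suffix = "".join(";{}={}".format(key, value) for key, value in session_vars.items())
    return "jdbc:hive2://{}:{}/{}{}".format(host, port, database, suffix)


def connect_through_jvm(url: str, driver_jars: Sequence[str]):
    """Start a JVM with the driver jars on the classpath and open a connection."""
    # Lazy import: JPype and a JVM are only needed when actually connecting.
    import jpype
    import jpype.dbapi2

    if not jpype.isJVMStarted():
        jpype.startJVM(classpath=list(driver_jars))
    return jpype.dbapi2.connect(url, driver=HIVE_DRIVER_CLASS)
```

For example, `hive_jdbc_url("hs2.example.com", 10000, "default", {"ssl": "true", 
"sslTrustStore": "/path/to/truststore.jks"})` would point the driver at a JKS 
truststore directly, so no PEM conversion is needed on the Python side.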





Re: [I] HIVE SSL mTLS capability [airflow]

2025-01-24 Thread via GitHub


boring-cyborg[bot] commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2613115204

   Thanks for opening your first issue here! Be sure to follow the issue 
template! If you are willing to raise PR to address this issue please do so, no 
need to wait for approval.
   





Re: [I] HIVE SSL mTLS capability [airflow]

2025-01-24 Thread via GitHub


alexio215 commented on issue #46023:
URL: https://github.com/apache/airflow/issues/46023#issuecomment-2613116453

   Hello, thank you for having me. It is my first time contributing to any open 
source project, so please bear with me.
   
   Happy to learn from any wisdom shared

