[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-17 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 27:

Oh, looks like you ran the verify job without 'dry run' set, so when I +2ed it 
right before the job completed it ended up submitted it even though I had one 
more outstanding comment. That's my bad.

I think that's fine, though, and of course no need to revert it just to fix 
that one minor thing.


--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 27
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 17 Feb 2021 22:59:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 26: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 26
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 17 Feb 2021 22:52:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..

IMPALA-10496: SAML implementation in Impala

The bulk of the SAML2 related code is done on Java side because:
- There is already an implementation for Hive on review (HIVE-24543).
- The only SAML lib for c++ seems to be OpenSaml, which is seemed
  quite hard to use and a heavy dependency.

Doing authentication in Java needed some plumbing, as the hs2-http
port is listened to in c++ and http related processing happens in
THttpServer/THttpTransport, which is not a "real" web server, just
a simple http implementation that processes the headers and passes
content to the thrift service.
- Http headers (and in one case body) are inspected and if it is
  SAML related, the http request is wrapped in TWrappedHttpRequest
  and sent to the Frontend. The Frontend processes it and returns
  a TWrappedHttpResponse with the info to return to the client.
- After the last SAML message (with the bearer token) we generate
  an auth cookie in c++ (which can be validated in c++),  so later
  requests in the session don't need to call to Java.

SAML auth can work alongside LDAP and Kerberos - for each hs2-http
request the path and the http headers are inspected to decide
whether it is SAML related, and if not, then we fallback to other
auth mechanisms. This "mixed mode" has no tests yet, so I consider it
experimental.

Planned followup work:
- It would be great to import the logic implemented in Hive instead
  of copy-pasting most of it. I plan to do this in a followup commit,
  as this needs changes on the Hive side too.
- Adding more tests will be much easier once we will have a hs2-http
  client that supports SAML. See IMPALA-10496 for Impyla support.
- Currently the debug webserver does not support SAML auth.
  Implementing SAML for the webserver is problematic on the statestore
  which doesn't have a Frontend.

Testing:
- Added EE tests that use Python's urllib2 to sent SAML
  requests to Impala. Impala works slightly differently
  during tests (saml2_ee_test_mode=true).

Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Reviewed-on: http://gerrit.cloudera.org:8080/16833
Reviewed-by: Thomas Tauber-Marshall 
Tested-by: Impala Public Jenkins 
---
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication-test.cc
M be/src/rpc/authentication.cc
M be/src/rpc/authentication.h
M be/src/rpc/hs2-http-test.cc
M be/src/rpc/thrift-server.h
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/transport/THttpServer.cpp
M be/src/transport/THttpServer.h
M be/src/transport/THttpTransport.cpp
M be/src/transport/THttpTransport.h
M be/src/util/backend-gflag-util.cc
M bin/rat_exclude_files.txt
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M common/thrift/metrics.json
M fe/pom.xml
A fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlHttpServlet.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java
A fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlUtils.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HttpSamlAuthenticationException.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HttpSamlNoGroupsMatchedException.java
A fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java
A fe/src/main/java/org/apache/impala/authentication/saml/NullSessionStore.java
A fe/src/main/java/org/apache/impala/authentication/saml/WrappedWebContext.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M java/pom.xml
A testdata/authentication/saml2_sso.jks
A testdata/authentication/saml2_sso_metadata.xml
A tests/custom_cluster/test_saml2_sso.py
38 files changed, 2,202 insertions(+), 53 deletions(-)

Approvals:
  Thomas Tauber-Marshall: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 27
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 

[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-17 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 26: Code-Review+2

(1 comment)

Looks good, just one final thing I noticed

http://gerrit.cloudera.org:8080/#/c/16833/26/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/16833/26/be/src/rpc/authentication.cc@1368
PS26, Line 1368: if (use_saml) {
This case makes me kind of nervous - easy to imagine a user setting up SAML and 
not realizing that they're leaving other endpoints entirely unsecured. Maybe we 
should log an error in that case, eg. if either FLAGS_beeswax_port or 
FLAGS_hs2_port are non-zero so we're actually using external_auth_provider_, or 
even disallow it by returning an error.



--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 26
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 17 Feb 2021 22:45:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 25:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8146/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 25
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 17 Feb 2021 17:18:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 26:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6894/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 26
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 17 Feb 2021 17:03:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-17 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 25:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/rpc/authentication.h
File be/src/rpc/authentication.h:

http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/rpc/authentication.h@58
PS24, Line 58:   /// Returns the authentication provider to use for "external" 
communication with
> brief comment
Done


http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/rpc/authentication.cc@125
PS24, Line 125:
> nit: allows
Done


http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/rpc/authentication.cc@573
PS24, Line 573: }
> Brief comment about the parameters here.
Done


http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/rpc/authentication.cc@629
PS24, Line 629: est
> nit: we prefer nullptr, here and elsewhere
Done


http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/rpc/authentication.cc@683
PS24, Line 683:
> probably worth explicitly saying what still needs to be done here
I added some comments that explain why did I do this.


http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/transport/THttpServer.h
File be/src/transport/THttpServer.h:

http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/transport/THttpServer.h@85
PS24, Line 85: purposes
> typo: purposes
Done


http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/transport/THttpServer.cpp
File be/src/transport/THttpServer.cpp:

http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/transport/THttpServer.cpp@228
PS24, Line 228:   // Try authenticating with cookies first.
> nit: unnecessary whitespace
Done


http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/transport/THttpServer.cpp@252
PS24, Line 252: throw TTransportException("HTTP auth - SAML redirection.");
  :   }
  : } else if (!auth_value_
> might be nice to put this logic in some sort of resetConnection() helper fu
Done


http://gerrit.cloudera.org:8080/#/c/16833/24/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/16833/24/common/thrift/metrics.json@1315
PS24, Line 1315: emon that
> This maybe gets a little confusing, since usually when we talk about the "w
Done


http://gerrit.cloudera.org:8080/#/c/16833/24/common/thrift/metrics.json@1329
PS24, Line 1329: I
> typo: space
Done


http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java:

http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@76
PS23, Line 76:
> This path is already parsed in the backend, so I think that it is easier to
Added validation to authentication.cc . This also revealed a test issue that 
the port is only correct on the first Impalad. For now the fix was to change 
the tests to use a 1 sized cluster.



--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 25
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 17 Feb 2021 17:03:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 25:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java:

http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java@20
PS25, Line 20: // copy of 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/AuthTokenGenerator.java
line too long (135 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java:

http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java@32
PS25, Line 32: // 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSamlAuthTokenGenerator.java
line too long (135 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java@52
PS25, Line 52:   private static final Logger LOG = 
LoggerFactory.getLogger(HiveSamlAuthTokenGenerator.class);
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java:

http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java@31
PS25, Line 31: // 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSamlGroupNameFilter.java
line too long (132 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java:

http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java@20
PS25, Line 20: // copy of 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSamlRelayStateInfo.java
line too long (139 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java:

http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java@32
PS25, Line 32: // slightly modified copy of 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSamlRelayStateInfo.java
line too long (157 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java:

http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@43
PS25, Line 43: // modified version of 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSaml2Client.java
line too long (144 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@69
PS25, Line 69: //TODO handle the replayCache as described in 
http://www.pac4j.org/docs/clients/saml.html
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@147
PS25, Line 147: // This is done to keep original structure by Vihang + keep 
ImpalaSamlClient as the only
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@187
PS25, Line 187:   // 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java
line too long (129 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@188
PS25, Line 188:   private String doSamlAuth(WrappedWebContext webContext) 
throws HttpSamlAuthenticationException {
line too long (98 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/25/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-17 Thread Csaba Ringhofer (Code Review)
Hello Thomas Tauber-Marshall, Vihang Karajgaonkar, Joe McDonnell, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16833

to look at the new patch set (#25).

Change subject: IMPALA-10496: SAML implementation in Impala
..

IMPALA-10496: SAML implementation in Impala

The bulk of the SAML2 related code is done on Java side because:
- There is already an implementation for Hive on review (HIVE-24543).
- The only SAML lib for c++ seems to be OpenSaml, which is seemed
  quite hard to use and a heavy dependency.

Doing authentication in Java needed some plumbing, as the hs2-http
port is listened to in c++ and http related processing happens in
THttpServer/THttpTransport, which is not a "real" web server, just
a simple http implementation that processes the headers and passes
content to the thrift service.
- Http headers (and in one case body) are inspected and if it is
  SAML related, the http request is wrapped in TWrappedHttpRequest
  and sent to the Frontend. The Frontend processes it and returns
  a TWrappedHttpResponse with the info to return to the client.
- After the last SAML message (with the bearer token) we generate
  an auth cookie in c++ (which can be validated in c++),  so later
  requests in the session don't need to call to Java.

SAML auth can work alongside LDAP and Kerberos - for each hs2-http
request the path and the http headers are inspected to decide
whether it is SAML related, and if not, then we fallback to other
auth mechanisms. This "mixed mode" has no tests yet, so I consider it
experimental.

Planned followup work:
- It would be great to import the logic implemented in Hive instead
  of copy-pasting most of it. I plan to do this in a followup commit,
  as this needs changes on the Hive side too.
- Adding more tests will be much easier once we will have a hs2-http
  client that supports SAML. See IMPALA-10496 for Impyla support.
- Currently the debug webserver does not support SAML auth.
  Implementing SAML for the webserver is problematic on the statestore
  which doesn't have a Frontend.

Testing:
- Added EE tests that use Python's urllib2 to sent SAML
  requests to Impala. Impala works slightly differently
  during tests (saml2_ee_test_mode=true).

Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
---
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication-test.cc
M be/src/rpc/authentication.cc
M be/src/rpc/authentication.h
M be/src/rpc/hs2-http-test.cc
M be/src/rpc/thrift-server.h
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/transport/THttpServer.cpp
M be/src/transport/THttpServer.h
M be/src/transport/THttpTransport.cpp
M be/src/transport/THttpTransport.h
M be/src/util/backend-gflag-util.cc
M bin/rat_exclude_files.txt
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M common/thrift/metrics.json
M fe/pom.xml
A fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlHttpServlet.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java
A fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlUtils.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HttpSamlAuthenticationException.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HttpSamlNoGroupsMatchedException.java
A fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java
A fe/src/main/java/org/apache/impala/authentication/saml/NullSessionStore.java
A fe/src/main/java/org/apache/impala/authentication/saml/WrappedWebContext.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M java/pom.xml
A testdata/authentication/saml2_sso.jks
A testdata/authentication/saml2_sso_metadata.xml
A tests/custom_cluster/test_saml2_sso.py
38 files changed, 2,202 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/16833/25
--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 25
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang 

[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-16 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 24: Code-Review+1

(1 comment)

Patch looks good to me. Thanks for making the suggested changes.

http://gerrit.cloudera.org:8080/#/c/16833/24/tests/custom_cluster/test_saml2_sso.py
File tests/custom_cluster/test_saml2_sso.py:

http://gerrit.cloudera.org:8080/#/c/16833/24/tests/custom_cluster/test_saml2_sso.py@48
PS24, Line 48: Most
Thanks for adding a detailed doc. Looks great!



--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 24
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 17 Feb 2021 01:15:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-16 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 24:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/rpc/authentication.h
File be/src/rpc/authentication.h:

http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/rpc/authentication.h@58
PS24, Line 58:   AuthProvider* GetExternalHttpAuthProvider();
brief comment


http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/rpc/authentication.cc@125
PS24, Line 125: allow
nit: allows


http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/rpc/authentication.cc@573
PS24, Line 573: bool ParseParams(std::map& params_to_check,
Brief comment about the parameters here.


http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/rpc/authentication.cc@629
PS24, Line 629: NULL
nit: we prefer nullptr, here and elsewhere


http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/rpc/authentication.cc@683
PS24, Line 683: TODO
probably worth explicitly saying what still needs to be done here


http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/transport/THttpServer.h
File be/src/transport/THttpServer.h:

http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/transport/THttpServer.h@85
PS24, Line 85: porpuses
typo: purposes


http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/transport/THttpServer.cpp
File be/src/transport/THttpServer.cpp:

http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/transport/THttpServer.cpp@228
PS24, Line 228:
nit: unnecessary whitespace


http://gerrit.cloudera.org:8080/#/c/16833/24/be/src/transport/THttpServer.cpp@252
PS24, Line 252: saml_port_ = -1;
  : auth_value_ = "";
  : cookie_value_ = "";
might be nice to put this logic in some sort of resetConnection() helper 
function that could be called here, below in the 'if (!authorized && 
!fallback_to_other_auths)' block and at the very end of headersDone() when 
we've succeeded.

Could help avoid bugs, eg. if I follow correctly we're currently not resetting 
saml_port to -1 when SAML succeeds. Possibly that doesn't matter, but better to 
be safe.


http://gerrit.cloudera.org:8080/#/c/16833/24/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/16833/24/common/thrift/metrics.json@1315
PS24, Line 1315: webserver
This maybe gets a little confusing, since usually when we talk about the 
"webserver" we're referring to the webui. I think you can just leave this word 
off and it'll be clear since it says the "HiveServer2 HTTP API", here and below


http://gerrit.cloudera.org:8080/#/c/16833/24/common/thrift/metrics.json@1329
PS24, Line 1329: IC
typo: space



--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 24
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 17 Feb 2021 01:07:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 24:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8143/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 24
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 17 Feb 2021 00:40:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-16 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 24:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/16833/23//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16833/23//COMMIT_MSG@10
PS23, Line 10:  implementa
> I think mentioning the upstream HIVE jira (HIVE-24543) here would be a usef
Done


http://gerrit.cloudera.org:8080/#/c/16833/23/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/16833/23/be/src/common/global-flags.cc@352
PS23, Line 352: // ++++
> nit, blank new line.
Done


http://gerrit.cloudera.org:8080/#/c/16833/23/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/16833/23/be/src/rpc/authentication.cc@1338
PS23, Line 1338:   LOG(INFO) << "External communication can be also 
authenticated with SAML2 SSO";
> do we need to add a similar msg as above for SAML?
Done


http://gerrit.cloudera.org:8080/#/c/16833/23/be/src/service/frontend.h
File be/src/service/frontend.h:

http://gerrit.cloudera.org:8080/#/c/16833/23/be/src/service/frontend.h@201
PS23, Line 201:   // Call FE to create a http response that redirects to the 
SSO service.
  :   Status GetSaml2Redirect(const TWrappedHttpRequest& request,
  :   TWrappedHttpResponse* response);
  :
  :   // Call FE to validate the SAML2 AuthNResponse.
> Can we add some comments here?
Done


http://gerrit.cloudera.org:8080/#/c/16833/23/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/16833/23/common/thrift/metrics.json@1318
PS23, Line 1318: ],
   : "label": "Hive
> Do we need this for catalogd and statestored?
Thanks for spotting - actually the description and the label were also wrong.


http://gerrit.cloudera.org:8080/#/c/16833/23/common/thrift/metrics.json@1330
PS23, Line 1330: "units": "NONE",
   : "kind": "COUNT
> Do we need this for catalogd and statestored?
Done


http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java:

http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java@20
PS23, Line 20: https://github.com/vihangk1/hive/blob/master_saml
> This points to a private branch. Unfortunately, we have not merged this yet
Yes, I plan to update these once the Hive patch is merged, but this may happen 
in a follow up patch if this change is merged before Hive.


http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlHttpServlet.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlHttpServlet.java:

http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlHttpServlet.java@58
PS23, Line 58: generateFormData(webContext, "http://127.0.0.1:; +
> This can be removed. Also, the latest hive patch uses 127.0.0.1 instead of
Done


http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlHttpServlet.java@66
PS23, Line 66: tokenGenerator.get(nameId, relayState), true,
> this can be removed. Also, please replace localhost with 127.0.0.1
Done


http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java:

http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java@23
PS23, Line 23: g clientIden
> the latest hive patch renames this field to clientIdentifier which is more
Done


http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java:

http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@76
PS23, Line 76:
> Do we need to validate this call back URL? e.g. the port number is same as
This path is already parsed in the backend, so I think that it is easier to 
check this there. I will add validation in a later patch.


http://gerrit.cloudera.org:8080/#/c/16833/23/tests/custom_cluster/test_saml2_sso.py
File tests/custom_cluster/test_saml2_sso.py:

http://gerrit.cloudera.org:8080/#/c/16833/23/tests/custom_cluster/test_saml2_sso.py@37
PS23, Line 37: return response
> I think it would be generally more readable if we have a class level commen
Done



[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 24:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java:

http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java@20
PS24, Line 20: // copy of 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/AuthTokenGenerator.java
line too long (135 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java:

http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java@32
PS24, Line 32: // 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSamlAuthTokenGenerator.java
line too long (135 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java@52
PS24, Line 52:   private static final Logger LOG = 
LoggerFactory.getLogger(HiveSamlAuthTokenGenerator.class);
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java:

http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java@31
PS24, Line 31: // 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSamlGroupNameFilter.java
line too long (132 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java:

http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java@20
PS24, Line 20: // copy of 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSamlRelayStateInfo.java
line too long (139 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java:

http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java@32
PS24, Line 32: // slightly modified copy of 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSamlRelayStateInfo.java
line too long (157 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java:

http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@43
PS24, Line 43: // modified version of 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSaml2Client.java
line too long (144 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@69
PS24, Line 69: //TODO handle the replayCache as described in 
http://www.pac4j.org/docs/clients/saml.html
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@147
PS24, Line 147: // This is done to keep original structure by Vihang + keep 
ImpalaSamlClient as the only
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@187
PS24, Line 187:   // 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java
line too long (129 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@188
PS24, Line 188:   private String doSamlAuth(WrappedWebContext webContext) 
throws HttpSamlAuthenticationException {
line too long (98 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/24/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-16 Thread Csaba Ringhofer (Code Review)
Hello Thomas Tauber-Marshall, Vihang Karajgaonkar, Joe McDonnell, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16833

to look at the new patch set (#24).

Change subject: IMPALA-10496: SAML implementation in Impala
..

IMPALA-10496: SAML implementation in Impala

The bulk of the SAML2 related code is done on Java side because:
- There is already an implementation for Hive on review (HIVE-24543).
- The only SAML lib for c++ seems to be OpenSaml, which is seemed
  quite hard to use and a heavy dependency.

Doing authentication in Java needed some plumbing, as the hs2-http
port is listened to in c++ and http related processing happens in
THttpServer/THttpTransport, which is not a "real" web server, just
a simple http implementation that processes the headers and passes
content to the thrift service.
- Http headers (and in one case body) are inspected and if it is
  SAML related, the http request is wrapped in TWrappedHttpRequest
  and sent to the Frontend. The Frontend processes it and returns
  a TWrappedHttpResponse with the info to return to the client.
- After the last SAML message (with the bearer token) we generate
  an auth cookie in c++ (which can be validated in c++),  so later
  requests in the session don't need to call to Java.

SAML auth can work alongside LDAP and Kerberos - for each hs2-http
request the path and the http headers are inspected to decide
whether it is SAML related, and if not, then we fallback to other
auth mechanisms. This "mixed mode" has no tests yet, so I consider it
experimental.

Planned followup work:
- It would be great to import the logic implemented in Hive instead
  of copy-pasting most of it. I plan to do this in a followup commit,
  as this needs changes on the Hive side too.
- Adding more tests will be much easier once we will have a hs2-http
  client that supports SAML. See IMPALA-10496 for Impyla support.

Testing:
- Added EE tests that use Python's urllib2 to sent SAML
  requests to Impala. Impala works slightly differently
  during tests (saml2_ee_test_mode=true).

Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
---
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication-test.cc
M be/src/rpc/authentication.cc
M be/src/rpc/authentication.h
M be/src/rpc/hs2-http-test.cc
M be/src/rpc/thrift-server.h
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/transport/THttpServer.cpp
M be/src/transport/THttpServer.h
M be/src/transport/THttpTransport.cpp
M be/src/transport/THttpTransport.h
M be/src/util/backend-gflag-util.cc
M bin/rat_exclude_files.txt
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M common/thrift/metrics.json
M fe/pom.xml
A fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlHttpServlet.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java
A fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlUtils.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HttpSamlAuthenticationException.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HttpSamlNoGroupsMatchedException.java
A fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java
A fe/src/main/java/org/apache/impala/authentication/saml/NullSessionStore.java
A fe/src/main/java/org/apache/impala/authentication/saml/WrappedWebContext.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M java/pom.xml
A testdata/authentication/saml2_sso.jks
A testdata/authentication/saml2_sso_metadata.xml
A tests/custom_cluster/test_saml2_sso.py
38 files changed, 2,166 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/16833/24
--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 24
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-16 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 23:

(14 comments)

I mostly have some minor suggestions, but overall the patch looks good to me 
and I can +1 once the comments are resolved. Will wait for Thomas to look at 
this patch as well.

http://gerrit.cloudera.org:8080/#/c/16833/23//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16833/23//COMMIT_MSG@10
PS23, Line 10: POC in Hive
I think mentioning the upstream HIVE jira (HIVE-24543) here would be a useful 
reference since it links to the design doc.


http://gerrit.cloudera.org:8080/#/c/16833/23/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/16833/23/be/src/common/global-flags.cc@352
PS23, Line 352:
nit, blank new line.


http://gerrit.cloudera.org:8080/#/c/16833/23/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/16833/23/be/src/rpc/authentication.cc@1338
PS23, Line 1338:   sap->InitSaml();
do we need to add a similar msg as above for SAML?


http://gerrit.cloudera.org:8080/#/c/16833/23/be/src/service/frontend.h
File be/src/service/frontend.h:

http://gerrit.cloudera.org:8080/#/c/16833/23/be/src/service/frontend.h@201
PS23, Line 201:   Status GetSaml2Redirect(const TWrappedHttpRequest& request,
  :   TWrappedHttpResponse* response);
  :
  :   Status ValidateSaml2Response(
  :   const TWrappedHttpRequest& request, TWrappedHttpResponse* 
response);
Can we add some comments here?


http://gerrit.cloudera.org:8080/#/c/16833/23/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/16833/23/common/thrift/metrics.json@1318
PS23, Line 1318:   "CATALOGSERVER",
   :   "STATESTORE"
Do we need this for catalogd and statestored?


http://gerrit.cloudera.org:8080/#/c/16833/23/common/thrift/metrics.json@1330
PS23, Line 1330:   "CATALOGSERVER",
   :   "STATESTORE"
Do we need this for catalogd and statestored?


http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java:

http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java@20
PS23, Line 20: https://github.com/vihangk1/hive/blob/master_saml
This points to a private branch. Unfortunately, we have not merged this yet in 
the master branch. This comment might need an update once the Hive patch is 
merged.


http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlHttpServlet.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlHttpServlet.java:

http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlHttpServlet.java@58
PS23, Line 58: //TODO(Vihang) do we need a https://localhost here?
This can be removed. Also, the latest hive patch uses 127.0.0.1 instead of 
localhost as per the specification RFC 8252 section 7.3 
(https://tools.ietf.org/html/rfc8252#section-7.3).


http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlHttpServlet.java@66
PS23, Line 66: //TODO(Vihang) do we need a https://localhost her
this can be removed. Also, please replace localhost with 127.0.0.1


http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java:

http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java@23
PS23, Line 23: codeVerifier
the latest hive patch renames this field to clientIdentifier which is more 
readable in my opinion.


http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java:

http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@76
PS23, Line 76: conf.getSaml2SpCallbackUrl()
Do we need to validate this call back URL? e.g. the port number is same as http 
port? and it is using https?


http://gerrit.cloudera.org:8080/#/c/16833/23/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@196
PS23, Line 196: codeVerifier
may be rename this to clientIdentifier.


http://gerrit.cloudera.org:8080/#/c/16833/23/tests/custom_cluster/test_saml2_sso.py
File tests/custom_cluster/test_saml2_sso.py:


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 23: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 23
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 10 Feb 2021 21:06:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 22:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8117/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 22
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 10 Feb 2021 15:36:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-10 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 23:

PS 22 is a one line fix for the ldap test failures introduced + unintended 
removal of the rebase that was done by jenkins.


--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 23
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 10 Feb 2021 15:23:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 23:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6881/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 23
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 10 Feb 2021 15:21:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 22:

(37 comments)

http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java:

http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java@20
PS22, Line 20: // copy of 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/AuthTokenGenerator.java
line too long (135 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java:

http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java@32
PS22, Line 32: // 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSamlAuthTokenGenerator.java
line too long (135 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java@52
PS22, Line 52:   private static final Logger LOG = 
LoggerFactory.getLogger(HiveSamlAuthTokenGenerator.class);
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java:

http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java@31
PS22, Line 31: // 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSamlGroupNameFilter.java
line too long (132 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java:

http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java@20
PS22, Line 20: // copy of 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSamlRelayStateInfo.java
line too long (139 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java:

http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java@32
PS22, Line 32: // slightly modified copy of 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSamlRelayStateInfo.java
line too long (157 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java:

http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@43
PS22, Line 43: // modified version of 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSaml2Client.java
line too long (144 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@69
PS22, Line 69: //TODO handle the replayCache as described in 
http://www.pac4j.org/docs/clients/saml.html
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@148
PS22, Line 148: // This is done to keep original structure by Vihang + keep 
ImpalaSamlClient as the only
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@188
PS22, Line 188:   // 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java
line too long (129 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@189
PS22, Line 189:   private String doSamlAuth(WrappedWebContext webContext) 
throws HttpSamlAuthenticationException {
line too long (98 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/22/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-10 Thread Csaba Ringhofer (Code Review)
Hello Thomas Tauber-Marshall, Vihang Karajgaonkar, Joe McDonnell, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16833

to look at the new patch set (#22).

Change subject: IMPALA-10496: SAML implementation in Impala
..

IMPALA-10496: SAML implementation in Impala

The bulk of the SAML2 related code is done on Java side because:
- There is already a POC in Hive that could be reused.
- The only SAML lib for c++ seems to be OpenSaml, which is seemed
  quite hard to use and a heavy dependency.

Doing authentication in Java needed some plumbing, as the hs2-http
port is listened to in c++ and http related processing happens in
THttpServer/THttpTransport, which is not a "real" web server, just
a simple http implementation that processes the headers and passes
content to the thrift service.
- Http headers (and in one case body) are inspected and if it is
  SAML related, the http request is wrapped in TWrappedHttpRequest
  and sent to the Frontend. The Frontend processes it and returns
  a TWrappedHttpResponse with the info to return to the client.
- After the last SAML message (with the bearer token) we generate
  an auth cookie in c++ (which can be validated in c++),  so later
  requests in the session don't need to call to Java.

State of implementation:
- The java side is more or less ok, will be updated when the Hive
  implementation changes. I would do a proper cleanup / documentation
  once the Hive code is more final.
- Compatibility with other auth mechanisms should be decided:
  - Whether other clients should be able to auth with ldap/kerberos
is not clear yet.

Testing:
- Added EE tests that use Python's urllib2 to sent SAML
  requests to Impala. Impala works slightly differently
  during tests (saml2_ee_test_mode=true).

Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
---
M be/src/common/global-flags.cc
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication-test.cc
M be/src/rpc/authentication.cc
M be/src/rpc/authentication.h
M be/src/rpc/hs2-http-test.cc
M be/src/rpc/thrift-server.h
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/transport/THttpServer.cpp
M be/src/transport/THttpServer.h
M be/src/transport/THttpTransport.cpp
M be/src/transport/THttpTransport.h
M be/src/util/backend-gflag-util.cc
M bin/rat_exclude_files.txt
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M common/thrift/metrics.json
M fe/pom.xml
A fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlHttpServlet.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java
A fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlUtils.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HttpSamlAuthenticationException.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HttpSamlNoGroupsMatchedException.java
A fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java
A fe/src/main/java/org/apache/impala/authentication/saml/NullSessionStore.java
A fe/src/main/java/org/apache/impala/authentication/saml/WrappedWebContext.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M java/pom.xml
A testdata/authentication/saml2_sso.jks
A testdata/authentication/saml2_sso_metadata.xml
A tests/custom_cluster/test_saml2_sso.py
39 files changed, 2,142 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/16833/22
--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 22
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 21: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6878/


--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 21
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 10 Feb 2021 03:53:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 20:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8108/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 20
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 09 Feb 2021 22:31:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 21:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6878/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 21
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 09 Feb 2021 22:16:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..


Patch Set 20:

(37 comments)

http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java:

http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java@20
PS20, Line 20: // copy of 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/AuthTokenGenerator.java
line too long (135 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java:

http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java@32
PS20, Line 32: // 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSamlAuthTokenGenerator.java
line too long (135 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java@52
PS20, Line 52:   private static final Logger LOG = 
LoggerFactory.getLogger(HiveSamlAuthTokenGenerator.class);
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java:

http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java@31
PS20, Line 31: // 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSamlGroupNameFilter.java
line too long (132 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java:

http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java@20
PS20, Line 20: // copy of 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSamlRelayStateInfo.java
line too long (139 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java:

http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java@32
PS20, Line 32: // slightly modified copy of 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSamlRelayStateInfo.java
line too long (157 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java:

http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@43
PS20, Line 43: // modified version of 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/auth/saml/HiveSaml2Client.java
line too long (144 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@69
PS20, Line 69: //TODO handle the replayCache as described in 
http://www.pac4j.org/docs/clients/saml.html
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@148
PS20, Line 148: // This is done to keep original structure by Vihang + keep 
ImpalaSamlClient as the only
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@188
PS20, Line 188:   // 
https://github.com/vihangk1/hive/blob/master_saml/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java
line too long (129 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java@189
PS20, Line 189:   private String doSamlAuth(WrappedWebContext webContext) 
throws HttpSamlAuthenticationException {
line too long (98 > 90)


http://gerrit.cloudera.org:8080/#/c/16833/20/fe/src/main/java/org/apache/impala/service/BackendConfig.java
File fe/src/main/java/org/apache/impala/service/BackendConfig.java:


[Impala-ASF-CR] IMPALA-10496: SAML implementation in Impala

2021-02-09 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has uploaded a new patch set (#20). ( 
http://gerrit.cloudera.org:8080/16833 )

Change subject: IMPALA-10496: SAML implementation in Impala
..

IMPALA-10496: SAML implementation in Impala

The bulk of the SAML2 related code is done on Java side because:
- There is already a POC in Hive that could be reused.
- The only SAML lib for c++ seems to be OpenSaml, which is seemed
  quite hard to use and a heavy dependency.

Doing authentication in Java needed some plumbing, as the hs2-http
port is listened to in c++ and http related processing happens in
THttpServer/THttpTransport, which is not a "real" web server, just
a simple http implementation that processes the headers and passes
content to the thrift service.
- Http headers (and in one case body) are inspected and if it is
  SAML related, the http request is wrapped in TWrappedHttpRequest
  and sent to the Frontend. The Frontend processes it and returns
  a TWrappedHttpResponse with the info to return to the client.
- After the last SAML message (with the bearer token) we generate
  an auth cookie in c++ (which can be validated in c++),  so later
  requests in the session don't need to call to Java.

State of implementation:
- The java side is more or less ok, will be updated when the Hive
  implementation changes. I would do a proper cleanup / documentation
  once the Hive code is more final.
- Compatibility with other auth mechanisms should be decided:
  - Whether other clients should be able to auth with ldap/kerberos
is not clear yet.

Testing:
- Added EE tests that use Python's urllib2 to sent SAML
  requests to Impala. Impala works slightly differently
  during tests (saml2_ee_test_mode=true).

Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
---
M be/src/common/global-flags.cc
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication-test.cc
M be/src/rpc/authentication.cc
M be/src/rpc/authentication.h
M be/src/rpc/hs2-http-test.cc
M be/src/rpc/thrift-server.h
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/transport/THttpServer.cpp
M be/src/transport/THttpServer.h
M be/src/transport/THttpTransport.cpp
M be/src/transport/THttpTransport.h
M be/src/util/backend-gflag-util.cc
M bin/rat_exclude_files.txt
M common/thrift/BackendGflags.thrift
M common/thrift/Frontend.thrift
M common/thrift/metrics.json
M fe/pom.xml
A fe/src/main/java/org/apache/impala/authentication/saml/AuthTokenGenerator.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlAuthTokenGenerator.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlGroupNameFilter.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlHttpServlet.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateInfo.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlRelayStateStore.java
A fe/src/main/java/org/apache/impala/authentication/saml/HiveSamlUtils.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HttpSamlAuthenticationException.java
A 
fe/src/main/java/org/apache/impala/authentication/saml/HttpSamlNoGroupsMatchedException.java
A fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClient.java
A fe/src/main/java/org/apache/impala/authentication/saml/NullSessionStore.java
A fe/src/main/java/org/apache/impala/authentication/saml/WrappedWebContext.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M java/pom.xml
A testdata/authentication/saml2_sso.jks
A testdata/authentication/saml2_sso_metadata.xml
A tests/custom_cluster/test_saml2_sso.py
39 files changed, 2,142 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/16833/20
--
To view, visit http://gerrit.cloudera.org:8080/16833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia0c026cba1b90e7ff6ec5ae49be78b0d1edd8dfa
Gerrit-Change-Number: 16833
Gerrit-PatchSet: 20
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vihang Karajgaonkar