[Impala-ASF-CR] HS2 + HTTP(S) + BASIC/LDAP based thrift server endpoint
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/13299 ) Change subject: HS2 + HTTP(S) + BASIC/LDAP based thrift server endpoint .. Patch Set 2: (17 comments) http://gerrit.cloudera.org:8080/#/c/13299/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13299/2//COMMIT_MSG@21 PS2, Line 21: Before this patch is committed, the intention is : to submit the changes to those files that are shown in this review as : a patch on Impala's native-toolchain Thrift. It seems to me like it would be easier to just copy-paste this, since the Thrift Protocol/Transport interfaces are typically pretty stable across Thrift versions. Or, contribute the changes upstream to Thrift itself. Carrying a largeish patch in the context of native-toolchain seems kinda like worst of both worlds to me. http://gerrit.cloudera.org:8080/#/c/13299/2//COMMIT_MSG@28 PS2, Line 28: sends a huge payload that can potentially crash the server. yea, assuming this is meant to be exposed to the internet, we might want to consider fuzz testing the pre-authenticated endpoint in an ASAN build. http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@502 PS2, Line 502: LOG(ERROR) << "Failed to determine buffer length to decode base64 auth string."; for all of the log messages in this function, can we include the remote socket address? http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@505 PS2, Line 505: char out[out_max]; stack allocating here seems quite dangerous without constraining the length (eg to 1kb or something) http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@507 PS2, Line 507: if (!Base64Decode(base64, in_len, out_max, out, &out_len)) { Looking at Base64Decode, it doesn't seem to null-terminate the output, but down a few lines you're using 'out' as if it's a null-terminated C string. I'd feel safer about all this code if we used C++ strings. gutil has this function: bool Base64Unescape(const char* src, int slen, string* dest); which will take care of all the buffer sizing stuff for you http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@521 PS2, Line 521: size_t pass_len = out_len - user_len - 1; again I'd feel safer about the above code if we used C++ strings, like: pair u_p = strings::Split(up_string, strings::delimiter::Limit(":", 1)); (I can't necessarily spot a bug in your C string manipulation, but I also know that the above won't have a bug) http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@957 PS2, Line 957: SetConnectionUsername this function is a bit weirdly named, since in the HTTP case, it isn't setting the connection username a tall, but rather setting up a hook that will later set a connection username. Maybe a moot point based on my comment elsewhere that this needs to be per-request state rather than per-connection state for HTTP http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@977 PS2, Line 977: const string& err = Substitute("Bad transport type: $0", underlying_transport_type); can we just LOG(FATAL) here since it would be a coding bug? http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@998 PS2, Line 998: return Status::Expected(err); same http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/service/impala-server.cc@2424 PS2, Line 2424: if (hs2_http_port > 0) { Maybe we should use '-1' to mean disable? port 0 usually means "use an ephemeral port' http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.h File be/src/transport/THttpServer.h: http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.h@71 PS2, Line 71: std::vector that's a bit of an odd choice of type instead of std::string http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.h@80 PS2, Line 80: THttpServerTransportFactory(bool requireBasicAuth = false) explicit http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.cpp File be/src/transport/THttpServer.cpp: http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.cpp@68 PS2, Line 68: strncmp probably needs to be made case-insensitive (odd that x-forwarded-for is not using THRIFT_strncasecmp). http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.cpp@69 PS2, Line 69: 7 probably need to also check that sz >= 7, otherwise we might read past the end of value (one of the advantages of pulling THttpServer into Impala code itself instead of trying to patch Thrift woudl be we could use nice convenience functions like HasPrefi
[Impala-ASF-CR] HS2 + HTTP(S) + BASIC/LDAP based thrift server endpoint
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13299 ) Change subject: HS2 + HTTP(S) + BASIC/LDAP based thrift server endpoint .. Patch Set 2: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/3185/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/13299 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic5569ac62ef3af2868b5d0581f5029dac736b2ff Gerrit-Change-Number: 13299 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Sat, 11 May 2019 00:10:31 + Gerrit-HasComments: No
[Impala-ASF-CR] HS2 + HTTP(S) + BASIC/LDAP based thrift server endpoint
Hello Bharath Vissapragada, Michael Ho, Sudhanshu Arora, Todd Lipcon, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13299 to look at the new patch set (#2). Change subject: HS2 + HTTP(S) + BASIC/LDAP based thrift server endpoint .. HS2 + HTTP(S) + BASIC/LDAP based thrift server endpoint This patch provides an option to use HTTP based transport for HiveServer2 endpoint on coordinators that the clients can connect to query. HTTP(S) also works when external TLS is enabled using --ssl_server_certificate. Implemented only for HS2 compatible thrift server since, unlike beeswax, its session management does not need to be tied to the underlying TCP conneciton. Thirft's http transport is modified to support BASIC authentication via ldap. For convenience of developing and reviewing, this patch is based on another that copied THttpServer and THttpTransport into Impala's codebase. Before this patch is committed, the intention is to submit the changes to those files that are shown in this review as a patch on Impala's native-toolchain Thrift. TODO = - Audit code to see if it can handle DOS kinda cases where the client sends a huge payload that can potentially crash the server. - Figure out how to do automated testing with LDAP. Testing === - Parameterized JdbcTest to work for HS2 + HTTP mode (no TLS). Manual testing with Beeline client (from Apache Hive), which has builtin support to connect to HTTP(S) based HS2 compatible endpoints. Example -- HTTP mode: > start-impala-cluster.py > JDBC_URL="jdbc:hive2://localhost:/default;transportMode=http" > beeline -u "$JDBC_URL" -- HTTPS mode: > cd $IMPALA_HOME > SSL_ARGS="--ssl_client_ca_certificate=./be/src/testutil/server-cert.pem \ --ssl_server_certificate=./be/src/testutil/server-cert.pem \ --ssl_private_key=./be/src/testutil/server-key.pem --hostname=localhost" > start-impala-cluster.py --impalad_args="$SSL_ARGS" \ --catalogd_args="$SSL_ARGS" --state_store_args="$SSL_ARGS" - Create a local trust store using 'keytool' and import the certificate from server-cert.pem (./clientkeystore in the example). > JDBC_URL="jdbc:hive2://localhost:/default;ssl=true;sslTrustStore= \ ./clientkeystore;trustStorePassword=password;transportMode=http" > beeline -u "$JDBC_URL" -- BASIC Auth with LDAP: > LDAP_ARGS="--enable_ldap_auth --ldap_uri='ldap://...' \ --ldap_bind_pattern='...' --ldap_passwords_in_clear_ok" > start-impala-cluster.py --impalad_args="$LDAP_ARGS" > JDBC_URL="jdbc:hive2://localhost:28000/default;user=...;password=\ ...;transportMode=http" > beeline -u "$JDBC_URL" -- HTTPS mode with LDAP: > start-impala-cluster.py --impalad_args="$LDAP_ARGS $SSL_ARGS" \ --catalogd_args="$SSL_ARGS" --state_store_args="$SSL_ARGS" > JDBC_URL="jdbc:hive2://localhost:28000/default;user=...;password=\ ...;ssl=true;sslTrustStore=./clientkeystore;trustStorePassword=\ password;transportMode=http" > beeline -u "$JDBC_URL" Change-Id: Ic5569ac62ef3af2868b5d0581f5029dac736b2ff --- M be/src/rpc/auth-provider.h M be/src/rpc/authentication-test.cc M be/src/rpc/authentication.cc M be/src/rpc/thrift-server.cc M be/src/rpc/thrift-server.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/impalad-main.cc M be/src/testutil/in-process-servers.cc M be/src/testutil/in-process-servers.h M be/src/transport/THttpServer.cpp M be/src/transport/THttpServer.h M be/src/transport/THttpTransport.cpp M be/src/transport/THttpTransport.h M bin/start-impala-cluster.py M common/thrift/metrics.json M fe/src/test/java/org/apache/impala/service/JdbcTest.java M fe/src/test/java/org/apache/impala/testutil/ImpalaJdbcClient.java M tests/common/impala_cluster.py 19 files changed, 480 insertions(+), 116 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/13299/2 -- To view, visit http://gerrit.cloudera.org:8080/13299 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic5569ac62ef3af2868b5d0581f5029dac736b2ff Gerrit-Change-Number: 13299 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] HS2 + HTTP(S) + BASIC/LDAP based thrift server endpoint
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13299 ) Change subject: HS2 + HTTP(S) + BASIC/LDAP based thrift server endpoint .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/13299/2/bin/start-impala-cluster.py File bin/start-impala-cluster.py: http://gerrit.cloudera.org:8080/#/c/13299/2/bin/start-impala-cluster.py@204 PS2, Line 204: flake8: E203 whitespace before ':' -- To view, visit http://gerrit.cloudera.org:8080/13299 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic5569ac62ef3af2868b5d0581f5029dac736b2ff Gerrit-Change-Number: 13299 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 10 May 2019 23:14:33 + Gerrit-HasComments: Yes