[Impala-ASF-CR] IMPALA-5164: Fix flaky benchmarks
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5164: Fix flaky benchmarks .. Patch Set 8: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6935 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary
Dimitris Tsirogiannis has uploaded a new patch set (#6). Change subject: IMPALA-4965: Authorize access to runtime profile and exec summary .. IMPALA-4965: Authorize access to runtime profile and exec summary Bug: When Sentry-based authorization is enabled, a user that isn't authorized to EXPLAIN a statement that uses a view can still access unauthorized information, such as view's definition, by running the statement and asking for the query profile or the execution summary. Fix: During query compilation, determine if the user can access the the runtime profile or the execution summary. Upon request for a runtime profile or execution summary from a user, determine based on that information and the user that is asking for the profile if the runtime profile (or execution summary) will be returned or an authorization error. The authorization rule enforced is the following: - User A runs statement S, A asks for profile, A has profile access: Runtime profile is returned - User A runs statement S, A asks for profile, A doesn't have profile access: Authorization error - User A runs statement S, user B asks for profile: Authorization error. This patch doesn't enforce access to the runtime profile or execution summary through the Web UI. Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15 --- M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/auth-util.cc M be/src/util/auth-util.h M common/thrift/ImpalaInternalService.thrift M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/analysis/ShowCreateTableStmt.java M shell/impala_shell.py M tests/authorization/test_authorization.py 16 files changed, 272 insertions(+), 50 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/7064/6 -- To view, visit http://gerrit.cloudera.org:8080/7064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker
[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4965: Authorize access to runtime profile and exec summary .. Patch Set 4: (6 comments) http://gerrit.cloudera.org:8080/#/c/7064/4/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: Line 1094: << "or execution summary."; > indentation Done http://gerrit.cloudera.org:8080/#/c/7064/4/be/src/service/client-request-state.h File be/src/service/client-request-state.h: Line 220: Status CheckProfileAccess(const std::string& user); > const function? removed http://gerrit.cloudera.org:8080/#/c/7064/4/be/src/service/impala-server.cc File be/src/service/impala-server.cc: Line 636: Status ImpalaServer::QueryStateRecord::CheckProfileAccess(const string& user) { > What do you think of moving this into auth-util.h like the other function s Done http://gerrit.cloudera.org:8080/#/c/7064/4/be/src/service/impala-server.h File be/src/service/impala-server.h: Line 639: Status CheckProfileAccess(const std::string& user); > const fn removed http://gerrit.cloudera.org:8080/#/c/7064/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: Line 162: private boolean enablePrivChecks_ = true; > I was hoping we could remove this now because it should have the same effec That was my initial intention but the way it is used in PartitionSet.java made me change my mind. There, we explicitly disable priv req registration, so it felt weird to make a call to setMaskPrivChecks. http://gerrit.cloudera.org:8080/#/c/7064/4/shell/impala_shell.py File shell/impala_shell.py: Line 515: print_to_stderr(e) > are these changes for testing? Without this change, we would always get a "Could not retrieve summary" error (L518) instead of an auth error. -- To view, visit http://gerrit.cloudera.org:8080/7064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4192: Disentangle Expr and ExprContext
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5483 to look at the new patch set (#19). Change subject: IMPALA-4192: Disentangle Expr and ExprContext .. IMPALA-4192: Disentangle Expr and ExprContext This change separates Expr and ExprContext. This is a preparatory step for factoring out static data (e.g. Exprs) of plan fragments to be shared by multiple plan fragment instances. This change includes the followings: 1. Include aggregate functions (AggFn) as Expr. This separates AggFn from its evaluator. AggFn is similar to existing Expr as both are represented as a tree of Expr nodes but it doesn't really make sense to call Get*Val() on AggFn. This change restructures the class hierarchy: much of the existing Expr class is now renamed to ScalarExpr. Expr is the parent class of both AggFn and ScalarExpr. Expr is defined to be a tree with root of either AggFn or ScalarExpr and all descendants being ScalarExpr. 2. ExprContext is renamed to ScalarExprEvaluator which is the interface for evaluating ScalarExpr; AggFnEvaluator is the interface for evaluating AggFn. Multiple evaluators can be instantiated per Expr. Expr contains static states of an expression while evaluator contains runtime states needed for execution (i.e. evaluating the expression). 3. Update all exec nodes to instantiate Expr and their evaluators separately. ExecNode::Init() will be responsible for creating all the Exprs in an ExecNode while their evaluators are created in ExecNode::Prepare(). Certain evaluators are also moved into the data structures which actually utilize them. For instance, HashTableCtx now owns the build and probe expression evaluators. Similarly, TupleRowComparator and Sorter also own the evaluators. ExecNode which utilizes these data structures are only responsible for creating the expressions used by these data structures. 4. All codegen functions take Exprs instead of evaluators. 5. The assignment of index into the FunctionContext vector is now done during the construction of ScalarExpr. Evaluators are only responsible for allocating and initializing the FunctionContexts. 6. Open(), Prepare() are now removed from Expr classes. The interface for creating any Expr is via either ScalarExpr::Create() or AggFn::Create() which will convert a thrift Expr into an initialized Expr object. Similarly, Create() interface is used for creating evaluators from an intialized Expr object. This separation allows the future change to introduce PlanNode data structures. The plan is to move all ExecNode::Init() logic to PlanNode and call them once per plan fragment. Change-Id: Iefdc9aeeba033355cb9497e3a5d2363627dcf2f3 --- M be/src/benchmarks/expr-benchmark.cc M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/common/init.cc M be/src/exec/CMakeLists.txt M be/src/exec/aggregation-node-ir.cc M be/src/exec/aggregation-node.cc M be/src/exec/aggregation-node.h M be/src/exec/analytic-eval-node.cc M be/src/exec/analytic-eval-node.h M be/src/exec/blocking-join-node.cc M be/src/exec/blocking-join-node.h M be/src/exec/data-sink.cc M be/src/exec/data-sink.h M be/src/exec/data-source-scan-node.cc M be/src/exec/exchange-node.cc M be/src/exec/exchange-node.h M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/filter-context.cc M be/src/exec/filter-context.h M be/src/exec/hash-join-node-ir.cc M be/src/exec/hash-join-node.cc M be/src/exec/hash-join-node.h M be/src/exec/hash-table-ir.cc M be/src/exec/hash-table-test.cc M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/exec/hbase-scan-node.cc M be/src/exec/hbase-table-sink.cc M be/src/exec/hbase-table-sink.h M be/src/exec/hbase-table-writer.cc M be/src/exec/hbase-table-writer.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-avro-table-writer.cc M be/src/exec/hdfs-avro-table-writer.h M be/src/exec/hdfs-parquet-scanner-ir.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-rcfile-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scanner-ir.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-sequence-scanner.h M be/src/exec/hdfs-sequence-table-writer.cc M be/src/exec/hdfs-sequence-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-sink.h M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h M be/src/exec/hdfs-text-tabl
[Impala-ASF-CR] IMPALA-5315 Cast to timestamp fails for YYYY-M-D format
Lars Volker has posted comments on this change. Change subject: IMPALA-5315 Cast to timestamp fails for -M-D format .. Patch Set 3: Code-Review+1 Thank you for fixing this! -- To view, visit http://gerrit.cloudera.org:8080/7009 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib9a184a09d7e7783f04d47588537612c2ecec28f Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Vincent Tran Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vincent Tran Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary
Alex Behm has posted comments on this change. Change subject: IMPALA-4965: Authorize access to runtime profile and exec summary .. Patch Set 4: (6 comments) Thanks, looks much nicer! http://gerrit.cloudera.org:8080/#/c/7064/4/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: Line 1094: << "or execution summary."; indentation http://gerrit.cloudera.org:8080/#/c/7064/4/be/src/service/client-request-state.h File be/src/service/client-request-state.h: Line 220: Status CheckProfileAccess(const std::string& user); const function? http://gerrit.cloudera.org:8080/#/c/7064/4/be/src/service/impala-server.cc File be/src/service/impala-server.cc: Line 636: Status ImpalaServer::QueryStateRecord::CheckProfileAccess(const string& user) { What do you think of moving this into auth-util.h like the other function so it's obvious that both implementations do the same thing? We could even consolidate the code into a single function CheckProfileAccess() that takes all 3 params, i.e. we have CheckProfileAccess(user, QueryStateRecord); CheckProfileAccess(user, ClientRequestState); CheckProfileAccess(user, effective_user, has_access); I don't fee too strongly, but seems nice to have the auth logic and error message only once. http://gerrit.cloudera.org:8080/#/c/7064/4/be/src/service/impala-server.h File be/src/service/impala-server.h: Line 639: Status CheckProfileAccess(const std::string& user); const fn http://gerrit.cloudera.org:8080/#/c/7064/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: Line 162: private boolean enablePrivChecks_ = true; I was hoping we could remove this now because it should have the same effect as setMaskPrivChecks(null). Any reason we need to keep it? http://gerrit.cloudera.org:8080/#/c/7064/4/shell/impala_shell.py File shell/impala_shell.py: Line 515: print_to_stderr(e) are these changes for testing? -- To view, visit http://gerrit.cloudera.org:8080/7064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary
Dimitris Tsirogiannis has uploaded a new patch set (#5). Change subject: IMPALA-4965: Authorize access to runtime profile and exec summary .. IMPALA-4965: Authorize access to runtime profile and exec summary Bug: When Sentry-based authorization is enabled, a user that isn't authorized to EXPLAIN a statement that uses a view can still access unauthorized information, such as view's definition, by running the statement and asking for the query profile or the execution summary. Fix: During query compilation, determine if the user can access the the runtime profile or the execution summary. Upon request for a runtime profile or execution summary from a user, determine based on that information and the user that is asking for the profile if the runtime profile (or execution summary) will be returned or an authorization error. The authorization rule enforced is the following: - User A runs statement S, A asks for profile, A has profile access: Runtime profile is returned - User A runs statement S, A asks for profile, A doesn't have profile access: Authorization error - User A runs statement S, user B asks for profile: Authorization error. This patch doesn't enforce access to the runtime profile or execution summary through the Web UI. Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15 --- M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/auth-util.cc M be/src/util/auth-util.h M common/thrift/ImpalaInternalService.thrift M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/analysis/ShowCreateTableStmt.java M shell/impala_shell.py M tests/authorization/test_authorization.py 16 files changed, 265 insertions(+), 49 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/7064/5 -- To view, visit http://gerrit.cloudera.org:8080/7064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker
[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4965: Authorize access to runtime profile and exec summary .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/7064/1/tests/authorization/test_authorization.py File tests/authorization/test_authorization.py: Line 92: def test_access_runtime_profile(self): > Nice test! Done Line 135: execute_statement_req.statement = """create view if not exists tpch.customer_view as > Can we put this into a different db? Might be good to make sure this view i We need a db in which we have access to the tables for the positive test. Open to recommendations if you can think of a better option. Line 137: execute_statement_resp = self.hs2_client.ExecuteStatement(execute_statement_req) > also test that EXPLAIN works Done Line 145: # Repeat as a deleted user > delegated Done Line 281: """Runs statement 'stmt' and retrieves the runtime profile and exec summary. If > remove 'statement' Done -- To view, visit http://gerrit.cloudera.org:8080/7064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4965: Authorize access to runtime profile and exec summary .. Patch Set 4: Missed the comments in the test file. Sending a new patch in a while. -- To view, visit http://gerrit.cloudera.org:8080/7064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4965: Authorize access to runtime profile and exec summary .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/7064/3/be/src/service/client-request-state.h File be/src/service/client-request-state.h: Line 151: const bool user_has_profile_access() const { > single line if it fits Done Line 222: /// error. If base64_encoded is true, outputs the base64-encoded profile output, > This API is confusing to me. Why does base64_encoded influence the auth beh Removed. http://gerrit.cloudera.org:8080/#/c/7064/3/be/src/service/impala-server.cc File be/src/service/impala-server.cc: Line 606: return request_state->GetRuntimeProfile(user, base64_encoded, output); > We not have many different GetRuntimeProfile*() and GetExecSummary() functi Added the CheckProfileAccess calls and removed the Get* functions. http://gerrit.cloudera.org:8080/#/c/7064/3/be/src/service/impala-server.h File be/src/service/impala-server.h: Line 560: std::string limited_profile_str; > Remove? Done http://gerrit.cloudera.org:8080/#/c/7064/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: Line 2535: public void registerPrivReq(PrivilegeRequest privReq) { > As discussed, I think this logic can be further simplified and made more ex Done -- To view, visit http://gerrit.cloudera.org:8080/7064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary
Dimitris Tsirogiannis has uploaded a new patch set (#4). Change subject: IMPALA-4965: Authorize access to runtime profile and exec summary .. IMPALA-4965: Authorize access to runtime profile and exec summary Bug: When Sentry-based authorization is enabled, a user that isn't authorized to EXPLAIN a statement that uses a view can still access unauthorized information, such as view's definition, by running the statement and asking for the query profile or the execution summary. Fix: During query compilation, determine if the user can access the the runtime profile or the execution summary. Upon request for a runtime profile or execution summary from a user, determine based on that information and the user that is asking for the profile if the runtime profile (or execution summary) will be returned or an authorization error. The authorization rule enforced is the following: - User A runs statement S, A asks for profile, A has profile access: Runtime profile is returned - User A runs statement S, A asks for profile, A doesn't have profile access: Authorization error - User A runs statement S, user B asks for profile: Authorization error. This patch doesn't enforce access to the runtime profile or execution summary through the Web UI. Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15 --- M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/util/auth-util.cc M be/src/util/auth-util.h M common/thrift/ImpalaInternalService.thrift M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/analysis/ShowCreateTableStmt.java M shell/impala_shell.py M tests/authorization/test_authorization.py 16 files changed, 257 insertions(+), 49 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/7064/4 -- To view, visit http://gerrit.cloudera.org:8080/7064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker
[Impala-ASF-CR] IMPALA-5400: Execute tests in subplans.test
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5400: Execute tests in subplans.test .. Patch Set 4: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/696/ -- To view, visit http://gerrit.cloudera.org:8080/7038 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I02b4f47553fb8f5fe3425cde2e0bcb3245c39b91 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5400: Execute tests in subplans.test
Alex Behm has posted comments on this change. Change subject: IMPALA-5400: Execute tests in subplans.test .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7038 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I02b4f47553fb8f5fe3425cde2e0bcb3245c39b91 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-HasComments: No
Re: [Impala-ASF-CR] IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables.
Jakub, thanks for your contribution and congratulations on your first commit! Nice work! On Tue, Jun 6, 2017 at 7:51 PM, Impala Public Jenkins (Code Review) < ger...@cloudera.org> wrote: > Impala Public Jenkins has submitted this change and it was merged. > > Change subject: IMPALA-2525: Treat parquet ENUMs as STRINGs when creating > impala tables. > .. > > > IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables. > > Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475 > Reviewed-on: http://gerrit.cloudera.org:8080/6550 > Reviewed-by: Alex Behm > Tested-by: Impala Public Jenkins > --- > M docs/topics/impala_parquet.xml > M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java > M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java > M testdata/bin/create-load-data.sh > A testdata/data/schemas/enum/enum.parquet > M testdata/workloads/functional-query/queries/QueryTest/ > create-table-like-file.test > 6 files changed, 33 insertions(+), 1 deletion(-) > > Approvals: > Impala Public Jenkins: Verified > Alex Behm: Looks good to me, approved > > > > -- > To view, visit http://gerrit.cloudera.org:8080/6550 > To unsubscribe, visit http://gerrit.cloudera.org:8080/settings > > Gerrit-MessageType: merged > Gerrit-Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475 > Gerrit-PatchSet: 9 > Gerrit-Project: Impala-ASF > Gerrit-Branch: master > Gerrit-Owner: Jakub Kukul > Gerrit-Reviewer: Alex Behm > Gerrit-Reviewer: Attila Jeges > Gerrit-Reviewer: Impala Public Jenkins > Gerrit-Reviewer: Jakub Kukul > Gerrit-Reviewer: Jim Apple > Gerrit-Reviewer: Lars Volker > Gerrit-Reviewer: Taras Bobrovytsky > > -- > You received this message because you are subscribed to the Google Groups > "impala-cr" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to impala-cr+unsubscr...@cloudera.com. > For more options, visit https://groups.google.com/a/cloudera.com/d/optout. >
[Impala-ASF-CR] IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables.
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables. .. IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables. Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475 Reviewed-on: http://gerrit.cloudera.org:8080/6550 Reviewed-by: Alex Behm Tested-by: Impala Public Jenkins --- M docs/topics/impala_parquet.xml M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M testdata/bin/create-load-data.sh A testdata/data/schemas/enum/enum.parquet M testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test 6 files changed, 33 insertions(+), 1 deletion(-) Approvals: Impala Public Jenkins: Verified Alex Behm: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/6550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jakub Kukul Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jakub Kukul Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables.
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables. .. Patch Set 8: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jakub Kukul Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jakub Kukul Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5016: Try to rewrite coalesce() function in SimplifyConditionalsRule.
yu feng has uploaded a new patch set (#2). Change subject: IMPALA-5016: Try to rewrite coalesce() function in SimplifyConditionalsRule. .. IMPALA-5016: Try to rewrite coalesce() function in SimplifyConditionalsRule. try to rewrite coalesce() function with head-most nulls, constant values and HDFS table partition metadatas. which will make sence in partiton purning Testing: add some unit tests for SimplifyConditionalsRule with coalesce() function. run all test cases in SimplifyConditionalsRule and all passed. Change-Id: I0325a9e437261b15313bbdf2f22d83376a84b575 --- M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java 2 files changed, 100 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/7015/2 -- To view, visit http://gerrit.cloudera.org:8080/7015 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0325a9e437261b15313bbdf2f22d83376a84b575 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: yu feng Gerrit-Reviewer: Alex Behm
[Impala-ASF-CR] IMPALA-4164: Avoid overly aggressive inlining in LLVM IR
Michael Ho has posted comments on this change. Change subject: IMPALA-4164: Avoid overly aggressive inlining in LLVM IR .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/6941/8/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: PS8, Line 1091: () Missing pass_builder.OptLevel, and pass_builder.SizeLevel. The default inlining threshold for Inliner is 225 which matches -O2 so we luck out. -- To view, visit http://gerrit.cloudera.org:8080/6941 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2d87ae8d222b415587e7320cb9072e4a8d6615ce Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5164: Fix flaky benchmarks
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5164: Fix flaky benchmarks .. Patch Set 8: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/695/ -- To view, visit http://gerrit.cloudera.org:8080/6935 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization
Taras Bobrovytsky has uploaded a new patch set (#4). Change subject: IMPALA-5036: Parquet count star optimization .. IMPALA-5036: Parquet count star optimization Instead of materializing empty rows when computing count star, we use the data stored in the Parquet RowGroup.num_rows field. The Parquet scanner tuple is modified to have one slot into which we will write the num rows statistic. The aggregate function is changed from count to a special sum function that gets initialized to 0. We also add a rewrite rule so that count() is rewritten to count(*) in order to make sure that this optimization is applied in all cases. Testing: - Added functional and planner tests Change-Id: I536b85c014821296aed68a0c68faadae96005e62 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/FunctionParams.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java A fe/src/main/java/org/apache/impala/rewrite/NormalizeCountStarRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test A testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-query/queries/QueryTest/aggregation.test R testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test M tests/query_test/test_aggregation.py M tests/query_test/test_parquet_stats.py 24 files changed, 865 insertions(+), 79 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6812/4 -- To view, visit http://gerrit.cloudera.org:8080/6812 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-5164: Fix flaky benchmarks
Hello Impala Public Jenkins, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6935 to look at the new patch set (#8). Change subject: IMPALA-5164: Fix flaky benchmarks .. IMPALA-5164: Fix flaky benchmarks Improve benchmarks by detecting involuntary context switches. If a server is heavily loaded due to some other jobs running, benchmarking will not be reliable. We can detect this by using getrusage() to measure context switches. If 90% or more of benchmark time is lost due to switching, we abort the benchmark, but catch the failure and exit silently. We subtract out the time during which we might have been switched out from the total result time, and only count runs which were uninterrupted. If 10% of the result is valid, that seems barely legitimate, but we will warn when anything less than 50% of the benchmark ran without switching. Testing: ran the free-lists-benchmark until the first test passed and then started "make -j 60" on an Impala directory. Got this result: E0519 23:35:41.482010 8766 benchmark.cc:161] Benchmark failed to complete due to context switching. W0519 23:35:41.548840 8766 benchmark.cc:162] Exiting silently from FreeLists 0 iters on 128 kb to avoid spurious test failure. Also, fixed two benchmarks that had crashes on start due to static initialization order problems and missing pieces of initialization. Some benchmarks are still crashing (expr-benchmark and network-perf-be), but those will require a lot more work to fix. Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274 --- M be/src/benchmarks/free-lists-benchmark.cc M be/src/benchmarks/hash-benchmark.cc M be/src/benchmarks/lock-benchmark.cc M be/src/util/benchmark-test.cc M be/src/util/benchmark.cc M be/src/util/benchmark.h 6 files changed, 108 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/6935/8 -- To view, visit http://gerrit.cloudera.org:8080/6935 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-5061: Populate null count in parquet::statistics
Pooja Nilangekar has posted comments on this change. Change subject: IMPALA-5061: Populate null_count in parquet::statistics .. Patch Set 5: (11 comments) http://gerrit.cloudera.org:8080/#/c/7058/4//COMMIT_MSG Commit Message: Line 10: value is encountered by parquet table writer. The value is written > nit typo Done http://gerrit.cloudera.org:8080/#/c/7058/4/be/src/exec/hdfs-parquet-table-writer.cc File be/src/exec/hdfs-parquet-table-writer.cc: Line 197: if (page_stats_base_ != nullptr) page_stats_base_->UpdateNullCount(); > nit: single line Done http://gerrit.cloudera.org:8080/#/c/7058/2/be/src/exec/parquet-column-stats.h File be/src/exec/parquet-column-stats.h: Line 79: > For frequently called functions it's often better to return a bool instead Done http://gerrit.cloudera.org:8080/#/c/7058/4/be/src/exec/parquet-column-stats.h File be/src/exec/parquet-column-stats.h: PS4, Line 63: ; > remove Done PS4, Line 105: ing > column? Done PS4, Line 116: r'. > nit: lowercase Done http://gerrit.cloudera.org:8080/#/c/7058/4/be/src/exec/parquet-column-stats.inline.h File be/src/exec/parquet-column-stats.inline.h: Line 32: template > It looks like this will fit into one line in the header. If so, please move Done Line 186: has_values_ = true; > You could also change the interface of Update() to take a min and max param Done Line 205: template <> > If you add a "int64_t count" parameter to the UpdateNullCount method, you c Done http://gerrit.cloudera.org:8080/#/c/7058/4/tests/query_test/test_insert_parquet.py File tests/query_test/test_insert_parquet.py: Line 324: > How about moving this to L330, initialize it from stats.null_count and then Done Line 480: is set for columns with null values.""" > Please remove the trailing whitespace. You can also set-up git in a way tha Done -- To view, visit http://gerrit.cloudera.org:8080/7058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4c49a63af84c2234f0633be63206cb52eb7e8ebb Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pooja Nilangekar Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Pooja Nilangekar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5061: Populate null count in parquet::statistics
Pooja Nilangekar has uploaded a new patch set (#5). Change subject: IMPALA-5061: Populate null_count in parquet::statistics .. IMPALA-5061: Populate null_count in parquet::statistics The null_count in the statistics field is updated each time a null value is encountered by parquet table writer. The value is written to the parquet header if it has one or more null values in the row_group. Testing: Modified the existing end-to-end test in the test_insert_parquet.py file to make sure each parquet header has the appropriate null_count. Verified the correctness of the nulltable test and added an additional test which populates a parquet file with the functional_parquet.zipcode_incomes table and ensures that the expected null_count is populated. Change-Id: I4c49a63af84c2234f0633be63206cb52eb7e8ebb --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/parquet-column-stats.cc M be/src/exec/parquet-column-stats.h M be/src/exec/parquet-column-stats.inline.h M tests/query_test/test_insert_parquet.py 6 files changed, 144 insertions(+), 98 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/7058/5 -- To view, visit http://gerrit.cloudera.org:8080/7058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4c49a63af84c2234f0633be63206cb52eb7e8ebb Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pooja Nilangekar Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Pooja Nilangekar
[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization
Taras Bobrovytsky has uploaded a new patch set (#3). Change subject: IMPALA-5036: Parquet count star optimization .. IMPALA-5036: Parquet count star optimization Instead of materializing empty rows when computing count star, we use the data stored in the Parquet RowGroup.num_rows field. The Parquet scanner tuple is modified to have one slot into which we will write the num rows statistic. The aggregate function is changed from count to a special sum function that gets initialized to 0. We also add a rewrite rule so that count() is rewritten to count(*) in order to make sure that this optimization is applied in all cases. Testing: - Added functional and planner tests Change-Id: I536b85c014821296aed68a0c68faadae96005e62 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/FunctionParams.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java A fe/src/main/java/org/apache/impala/rewrite/NormalizeCountStarRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test A testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-query/queries/QueryTest/aggregation.test R testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test M tests/query_test/test_aggregation.py M tests/query_test/test_parquet_stats.py 24 files changed, 914 insertions(+), 71 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6812/3 -- To view, visit http://gerrit.cloudera.org:8080/6812 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization
Taras Bobrovytsky has uploaded a new patch set (#2). Change subject: IMPALA-5036: Parquet count star optimization .. IMPALA-5036: Parquet count star optimization Instead of materializing empty rows when computing count star, we use the data stored in the Parquet num rows statistic. The Parquet scanner tuple is modified to have one slot into which we will write the num rows statistic. The aggregate function is changed from count to sum. Testing: - Ran tests locally Change-Id: I536b85c014821296aed68a0c68faadae96005e62 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/FunctionParams.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java A fe/src/main/java/org/apache/impala/rewrite/NormalizeCountStarRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test A testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-query/queries/QueryTest/aggregation.test R testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test M tests/query_test/test_aggregation.py M tests/query_test/test_parquet_stats.py 25 files changed, 915 insertions(+), 325 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/6812/2 -- To view, visit http://gerrit.cloudera.org:8080/6812 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-4850 [DOCS] Create table "comment comes after "partioned by"
Laurel Hale has posted comments on this change. Change subject: IMPALA-4850 [DOCS] Create table "comment comes after "partioned by" .. Patch Set 1: (1 comment) > (1 comment) http://gerrit.cloudera.org:8080/#/c/7080/1/docs/topics/impala_create_table.xml File docs/topics/impala_create_table.xml: Line 172: [COMMENT 'table_comment'] > There seem to be a few things amiss. Thanks for your feedback. I don't see the reference to the PARTITIONED BY clause in the section of sql-parser.cup you pasted above. I looked at sql-parser.cup and don't see anything about the PARTITIONED BY clause other than the following section which follows the "tbl_options" section you pasted above: -- partitioned_data_layout ::= partition_param_list:partition_params {: RESULT = TableDataLayout.createKuduPartitionedLayout(partition_params); :} | /* empty */ {: RESULT = TableDataLayout.createEmptyLayout(); :} ; partition_column_defs ::= KW_PARTITIONED KW_BY LPAREN column_def_list:col_defs RPAREN {: RESULT = col_defs; :} ; // The PARTITION BY clause contains any number of HASH() clauses followed by exactly zero // or one RANGE clauses partition_param_list ::= KW_PARTITION KW_BY hash_partition_param_list:list {: RESULT = list; :} | KW_PARTITION KW_BY range_partition_param:rng {: RESULT = Lists.newArrayList(rng); :} | KW_PARTITION KW_BY hash_partition_param_list:list COMMA range_partition_param:rng {: list.add(rng); RESULT = list; :} ; - Also, I've been testing and am getting many errors on many of the code examples in this file. For example, SORT BY throws an error in a simple CREATE TABLE statement (not a CTAS): -- [vc0136.halxg.cloudera.com:21000] > create table laurel.komono (name string, number int, store_location string) > partitioned by (room string) > sort by store_location > comment 'impala-4850 test'; Query: create table laurel.komono (name string, number int, store_location string) partitioned by (room string) sort by store_location comment 'impala-4850 test' ERROR: AnalysisException: Syntax error in line 3: sort by store_location ^ Encountered: IDENTIFIER Expected: CACHED, COMMENT, LOCATION, ROW, STORED, TBLPROPERTIES, UNCACHED, WITH CAUSED BY: Exception: Syntax error -- This indicates that all of the code examples should be tested in this file, which exceeds the scope of this Jira. My testing did verify, however, that the table comment should be placed immediately after the PARTITIONED BY clause: -- [vc0136.halxg.cloudera.com:21000] > create table laurel.favorite_places (rating int, name string, country string, why string) > partitioned by (color string) > comment 'test of impala-4850'; Query: create table laurel.favorite_places (rating int, name string, country string, why string) partitioned by (color string) comment 'test of impala-4850' Fetched 0 row(s) in 0.13s [vc0136.halxg.cloudera.com:21000] > show tables; Query: show tables ++ | name | ++ | favorite_places| | favorite_words | | french_departments | ++ I will make that fix and discuss the other code examples with John Russell to decide how/when that work can be done. -- To view, visit http://gerrit.cloudera.org:8080/7080 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I543ff1dbfe1ab8a7e0a0a668130ab060e3af0a5f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Laurel Hale Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: John Russell Gerrit-Reviewer: Laurel Hale Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4107 [DOCS] APPX MEDIAN cuts string to 10 chars
Laurel Hale has uploaded a new patch set (#3). Change subject: IMPALA-4107 [DOCS] APPX_MEDIAN cuts string to 10 chars .. IMPALA-4107 [DOCS] APPX_MEDIAN cuts string to 10 chars In the "Restrictions" section of the "APPX_MEDIAN Function" topic, added information about how the function truncates string values: "The APPX_MEDIAN function returns only the first 10 characters for string values (varchar, char). Additional characters are truncated." In this patch, I added 'string' to the list of string values that are truncated by the APPX_MEDIAN function and checked the source XML for the white space called out in the review comment. It was not present in the source. I can only speculate that it was a distortion caused by the Gerrit UI. Change-Id: I141787452e74ecdb491765cd7fd4c9a771c5bbc2 --- M docs/topics/impala_appx_median.xml 1 file changed, 7 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/7094/3 -- To view, visit http://gerrit.cloudera.org:8080/7094 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I141787452e74ecdb491765cd7fd4c9a771c5bbc2 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Laurel Hale Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: John Russell Gerrit-Reviewer: Laurel Hale
[Impala-ASF-CR] IMPALA-5355: Fix the order of Sentry roles and privileges
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-5355: Fix the order of Sentry roles and privileges .. IMPALA-5355: Fix the order of Sentry roles and privileges After a single Impalad is restarted, it is possible that order in which it receives roles and privileges from the statestore is incorrect. The correct order is for the role to appear first in the update, before the privilege that references it. If a user updates a role, its catalog version number can become larger than the catalog numbers of the privileges that reference it. This causes the role to come after the privilege in the initial metastore update. The issue is fixed by doing two passes over the catalog objects in the Impalad. The first pass updates the top level objects. The second pass updates the dependent objects Testing: - Added a test that reproduced the problem. Change-Id: I7072e95b74952ce5a51ea1b6e2ae3e80fb0940e0 Reviewed-on: http://gerrit.cloudera.org:8080/7004 Reviewed-by: Taras Bobrovytsky Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M tests/authorization/test_grant_revoke.py M tests/common/impala_test_suite.py 3 files changed, 86 insertions(+), 17 deletions(-) Approvals: Impala Public Jenkins: Verified Taras Bobrovytsky: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I7072e95b74952ce5a51ea1b6e2ae3e80fb0940e0 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-5355: Fix the order of Sentry roles and privileges
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5355: Fix the order of Sentry roles and privileges .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7072e95b74952ce5a51ea1b6e2ae3e80fb0940e0 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function
anujphadke has posted comments on this change. Change subject: IMPALA-4848: Add WIDTH_BUCKET() function .. Patch Set 5: (13 comments) http://gerrit.cloudera.org:8080/#/c/6023/4/be/src/exprs/math-functions-ir.cc File be/src/exprs/math-functions-ir.cc: Line 31: #include "exprs/decimal-operators.h" > order alphabetically Done Line 430: const T1& max_range, const IntVal& num_buckets) { > move declarations closer to where they are used Done Line 432: bool overflow = false; > // FE casts all the arguments to the same type. Done Line 433: // FE casts all the arguments to the same type. > formatting, space after ',' Done Line 434: int input_scale = ctx->impl()->GetConstFnAttr(FunctionContextImpl::ARG_TYPE_SCALE, 1); > formatting, space after ',' Done Line 442: > single line and formatting (please fix formatting everywhere) Done Line 447: result.val = num_buckets.val; > How about we make the return value a BIGINT and simplify this code (no need Done Line 457: input_scale, input_precision, input_scale, false, &overflow); > use int128_t consistently Done Line 460: error_msg << "Overflow while evaluating the difference between min_range: " << > Please make the error more specific, so we can see where the overflow happe Done Line 472: ctx->SetError(error_msg.str().c_str()); > Don't you need to convert the arguments before the multiplication? Done Line 475: > typo: resulting Done Line 476: // resulting scale should be 2 * input_scale as per multiplication rules > I think we'll need to be more stingy with the types. For example, the input Discussed this with Alex. For this particular scenario where we might need to store results with decimal8Values inputs into int256_t. In this particular case- where expr and min_range are decimal8Values. This subtraction can overflow into a decimal16Values ex: expr = max(decimal8Value) min_range = -1 (or any negative value) width_size = expr.template Subtract<__int128_t>(input_scale, min_range, input_scale, input_precision, input_scale, false, &overflow); width_size has to be decimal16values (to store this overflowed value) even for decimal8values. Should we templetize this function? Since we convert intermediate results to decimal16value . http://gerrit.cloudera.org:8080/#/c/6023/4/be/src/exprs/math-functions.h File be/src/exprs/math-functions.h: Line 111: const IntVal& num_buckets); > weird indentation Done -- To view, visit http://gerrit.cloudera.org:8080/6023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: anujphadke Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3603 [DOCS] Document handling of NaN values
Laurel Hale has uploaded a new change for review. http://gerrit.cloudera.org:8080/7098 Change subject: IMPALA-3603 [DOCS] Document handling of NaN values .. IMPALA-3603 [DOCS] Document handling of NaN values Added information in the "DOUBLE Data Type" (impala_double.html) and the "FLOAT Data Type" (impala_float.html) topics about how Impala handles NaN values. This information has been added to the subsection "Usage Notes:" in each topic: "Impala does not evaluate NaN (not a number) values as equal to any other numeric values, including NaN. For example, the following statement, which evaluates equality between two NaN values returns 'false': SELECT CAST('nan' AS DOUBLE|FLOAT)=CAST('nan' AS DOUBLE|FLOAT); Change-Id: Id9485b6790d58fafdae32332d2634cbe893d7fb0 --- M docs/topics/impala_double.xml M docs/topics/impala_float.xml 2 files changed, 20 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/7098/1 -- To view, visit http://gerrit.cloudera.org:8080/7098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Id9485b6790d58fafdae32332d2634cbe893d7fb0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Laurel Hale
[Impala-ASF-CR] IMPALA-5400: Execute tests in subplans.test
anujphadke has posted comments on this change. Change subject: IMPALA-5400: Execute tests in subplans.test .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/7038/3/testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test File testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test: Line 264: bigint, string > nit: lower case types as well Done -- To view, visit http://gerrit.cloudera.org:8080/7038 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I02b4f47553fb8f5fe3425cde2e0bcb3245c39b91 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5400: Execute tests in subplans.test
Hello Alex Behm, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7038 to look at the new patch set (#4). Change subject: IMPALA-5400: Execute tests in subplans.test .. IMPALA-5400: Execute tests in subplans.test This change executes the tests added to subplans.test and removes a test which incorrectly references subplannull_data.test (a file which does not exist) Change-Id: I02b4f47553fb8f5fe3425cde2e0bcb3245c39b91 --- M testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test D testdata/workloads/functional-query/queries/QueryTest/subplans.test M tests/query_test/test_queries.py 3 files changed, 47 insertions(+), 239 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7038/4 -- To view, visit http://gerrit.cloudera.org:8080/7038 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02b4f47553fb8f5fe3425cde2e0bcb3245c39b91 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke
[Impala-ASF-CR] IMPALA-5164: Fix flaky benchmarks
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5164: Fix flaky benchmarks .. Patch Set 7: Verified-1 Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/693/ -- To view, visit http://gerrit.cloudera.org:8080/6935 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function
anujphadke has uploaded a new patch set (#5). Change subject: IMPALA-4848: Add WIDTH_BUCKET() function .. IMPALA-4848: Add WIDTH_BUCKET() function Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f --- M be/src/exprs/expr-test.cc M be/src/exprs/math-functions-ir.cc M be/src/exprs/math-functions.h M common/function-registry/impala_functions.py M fe/src/main/java/org/apache/impala/analysis/Expr.java 5 files changed, 130 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/5 -- To view, visit http://gerrit.cloudera.org:8080/6023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: anujphadke
[Impala-ASF-CR] IMPALA-5394: Set socket timeouts while opening TSaslTransport
John Sherman has posted comments on this change. Change subject: IMPALA-5394: Set socket timeouts while opening TSaslTransport .. Patch Set 2: > Thanks for the write-up, very helpful! I'll give that a try. I think the hard part would be in selecting the thread pool size in TAcceptQueue::serve. Because lets say I set it to 5 and the flaky client tries to connect 5 times within the tcp keep alive period, then impalad is back in the situation of not being able to accept connections until the sockets start timing out. I'm tempted to keep the timeouts combined with switching to the TAcceptQueue implementation. I'll read the code a bit more to get a better understanding. -- To view, visit http://gerrit.cloudera.org:8080/7061 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I56a5f3d9cf931cff14eae7f236fea018236a6255 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Sherman Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5315 Cast to timestamp fails for YYYY-M-D format
Vincent Tran has uploaded a new patch set (#3). Change subject: IMPALA-5315 Cast to timestamp fails for -M-D format .. IMPALA-5315 Cast to timestamp fails for -M-D format This change allows casting of a string in 'lazy' date format to timestamp. The supported lazy date formats are: -M-D -MM-D -M-DD Change-Id: Ib9a184a09d7e7783f04d47588537612c2ecec28f --- M be/src/exprs/expr-test.cc M be/src/runtime/timestamp-parse-util.cc M be/src/runtime/timestamp-parse-util.h 3 files changed, 43 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/7009/3 -- To view, visit http://gerrit.cloudera.org:8080/7009 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib9a184a09d7e7783f04d47588537612c2ecec28f Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Vincent Tran Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vincent Tran
[Impala-ASF-CR] IMPALA-5315 Cast to timestamp fails for YYYY-M-D format
Vincent Tran has posted comments on this change. Change subject: IMPALA-5315 Cast to timestamp fails for -M-D format .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/7009/2/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: Line 2210: // Test casting of lazy day and/or month format string to timestamp > nit: this should probably now read day, not date. Done http://gerrit.cloudera.org:8080/#/c/7009/2/be/src/runtime/timestamp-parse-util.cc File be/src/runtime/timestamp-parse-util.cc: PS2, Line 260: DEFAULT_DATE_FMT_LEN - 2: > Maybe it would be clearer to replace these with DEFAULT_DATE_FMT_LEN-1 and That makes sense. Done. -- To view, visit http://gerrit.cloudera.org:8080/7009 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib9a184a09d7e7783f04d47588537612c2ecec28f Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Vincent Tran Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vincent Tran Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5160: adjust spill buffer size based on planner estimates
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5160: adjust spill buffer size based on planner estimates .. Patch Set 8: Cleaned up this patch, removing the increased explain level of the existing planner tests. -- To view, visit http://gerrit.cloudera.org:8080/6963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I57b5b4c528325d478c8a9b834a6bc5dedab54b5b Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5160: adjust spill buffer size based on planner estimates
Tim Armstrong has uploaded a new patch set (#8). Change subject: IMPALA-5160: adjust spill buffer size based on planner estimates .. IMPALA-5160: adjust spill buffer size based on planner estimates Scale down the buffer size in hash joins and hash aggregations if estimates indicate that the build side of the join is small. This greatly reduces minimum memory requirements for joins in some common cases, e.g. small dimension tables. Currently this is not plumbed through to the backend and only takes effect in planner tests. Testing: Added targeted planner tests for small/mid/large/unknown memory requirements for aggregations and joins. Increase explain level for TPC-H and TPC-DS to provide visibility into the memory estimates on those data sets. Change-Id: I57b5b4c528325d478c8a9b834a6bc5dedab54b5b --- M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java A fe/src/main/java/org/apache/impala/util/BitUtil.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java A testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test 10 files changed, 983 insertions(+), 39 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/6963/8 -- To view, visit http://gerrit.cloudera.org:8080/6963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I57b5b4c528325d478c8a9b834a6bc5dedab54b5b Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries .. Patch Set 16: (1 comment) http://gerrit.cloudera.org:8080/#/c/6726/16/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: Line 430: if (!dictionary_results_.Get(dict_index)) { I'm thinking that this branch lengthened the critical path through this function for the non-selective case, since the two loads can't necessarily happen in parallel. Hard to know if that's the source of the regression without profiling. We could try to do something like this so that the two loads can happen in parallel: void* slot = tuple->GetSlot(tuple_offset_); T* val_ptr = reinterpret_cast(slot); dict_decoder_.GetValue(dict_index, val_ptr); if (!dictionary_results_.Get(dict_index)) { filtered_rows_->Set(*num_values + val_count, true); } Or maybe even change Set() so that it uses branch-free code and do something like: filtered_rows_->Set(!dictionary_results_.Get(dict_index)); -- To view, visit http://gerrit.cloudera.org:8080/6726 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054 Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4864 Speed up single slot predicates with dictionaries
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries .. Patch Set 16: (1 comment) http://gerrit.cloudera.org:8080/#/c/6726/16/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: Line 420: LIKELY(dictionary_results_.num_bits() > 0)) { > Not liking this branch, but it is unavoidable without pre-computation overh I think the predicate evaluation on 40,000 values is probably cheap enough to just do it. In most cases we would have to evaluate the predicates anyway when processing the dictionary-encoded pages. I don't think I fully understand the inexpensive predicate check that you're envisioning. Does it really need to be done per-row instead of per-batch (like the IS_DICT_ENCODED branch). -- To view, visit http://gerrit.cloudera.org:8080/6726 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054 Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5259: [DOCS] Doc REFRESH FUNCTIONS
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-5259: [DOCS] Doc REFRESH FUNCTIONS .. Patch Set 4: Code-Review+2 (1 comment) Looks good to me. http://gerrit.cloudera.org:8080/#/c/7044/3/docs/shared/impala_common.xml File docs/shared/impala_common.xml: PS3, Line 1213: s, > OK, I added the wording and some code snippets in the example to clarify th Ok, I think it's clear now. -- To view, visit http://gerrit.cloudera.org:8080/7044 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic41fec781396b69e6df06b8de0b29c42ad51ce8f Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: John Russell Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables.
Alex Behm has posted comments on this change. Change subject: IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables. .. Patch Set 8: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/6550/8/testdata/bin/create-load-data.sh File testdata/bin/create-load-data.sh: Line 144: hadoop fs -put $SCHEMA_SRC_DIR/enum ${SCHEMA_DEST_DIR}/ > I want to put the whole directory actually, because it's not possible to ha Ahh, got it, thanks! -- To view, visit http://gerrit.cloudera.org:8080/6550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jakub Kukul Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jakub Kukul Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables.
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables. .. Patch Set 8: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/694/ -- To view, visit http://gerrit.cloudera.org:8080/6550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jakub Kukul Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jakub Kukul Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5259: [DOCS] Doc REFRESH FUNCTIONS
John Russell has posted comments on this change. Change subject: IMPALA-5259: [DOCS] Doc REFRESH FUNCTIONS .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/7044/3/docs/shared/impala_common.xml File docs/shared/impala_common.xml: PS3, Line 1213: s. > You should add that these functions will be visible in Impala after running OK, I added the wording and some code snippets in the example to clarify the sequence of operations. -- To view, visit http://gerrit.cloudera.org:8080/7044 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic41fec781396b69e6df06b8de0b29c42ad51ce8f Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: John Russell Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5388: Only retry RPC on lost connection in send call
Juan Yu has posted comments on this change. Change subject: IMPALA-5388: Only retry RPC on lost connection in send call .. Patch Set 8: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/7063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I176975f2aa521d5be8a40de51067b1497923d09b Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alan Choi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5259: [DOCS] Doc REFRESH FUNCTIONS
John Russell has uploaded a new patch set (#4). Change subject: IMPALA-5259: [DOCS] Doc REFRESH FUNCTIONS .. IMPALA-5259: [DOCS] Doc REFRESH FUNCTIONS The text from impala_common.xml is reused verbatim under the REFRESH page and in the UDFs page by a #include-like mechanism. Change-Id: Ic41fec781396b69e6df06b8de0b29c42ad51ce8f --- M docs/shared/impala_common.xml M docs/topics/impala_refresh.xml M docs/topics/impala_udf.xml 3 files changed, 26 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/7044/4 -- To view, visit http://gerrit.cloudera.org:8080/7044 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic41fec781396b69e6df06b8de0b29c42ad51ce8f Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: John Russell Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-5388: Only retry RPC on lost connection in send call
Michael Ho has posted comments on this change. Change subject: IMPALA-5388: Only retry RPC on lost connection in send call .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/7063/4/be/src/rpc/thrift-util.cc File be/src/rpc/thrift-util.cc: PS4, Line 183: IsSendCnxLostTException > See comments below. Thanks for the comment. PS8 will now match against "send time out" too -- To view, visit http://gerrit.cloudera.org:8080/7063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I176975f2aa521d5be8a40de51067b1497923d09b Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alan Choi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5435: Increase runtime filter test timeouts (again)
Henry Robinson has posted comments on this change. Change subject: IMPALA-5435: Increase runtime filter test timeouts (again) .. Patch Set 1: No, I haven't tried (the issue is flakey in ASAN builds anyhow). If I'm able I'll do so later today. -- To view, visit http://gerrit.cloudera.org:8080/7097 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2280e08910430e271da2173e465731bba5aef6cf Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5435: Increase runtime filter test timeouts (again)
Michael Ho has posted comments on this change. Change subject: IMPALA-5435: Increase runtime filter test timeouts (again) .. Patch Set 1: Were you able to verify the fix with a private ASAN build ? -- To view, visit http://gerrit.cloudera.org:8080/7097 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2280e08910430e271da2173e465731bba5aef6cf Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5435: Increase runtime filter test timeouts (again)
Michael Ho has posted comments on this change. Change subject: IMPALA-5435: Increase runtime filter test timeouts (again) .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7097 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2280e08910430e271da2173e465731bba5aef6cf Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
Alex Behm has posted comments on this change. Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/7045/4/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: Line 610: List tblNames = impaladCatalog_.get().getTableNames(dbName, matcher); > Can you clarify - is this an issue that you see introduced by this patch? getDbsMetadata() calls fe.getTableNames() which gets the catalog again I see your point though that your patch does not introduce this issue, so nvm my comment. -- To view, visit http://gerrit.cloudera.org:8080/7045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables.
Jakub Kukul has posted comments on this change. Change subject: IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables. .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/6550/8/testdata/bin/create-load-data.sh File testdata/bin/create-load-data.sh: Line 144: hadoop fs -put $SCHEMA_SRC_DIR/enum ${SCHEMA_DEST_DIR}/ > enum.parquet I want to put the whole directory actually, because it's not possible to have a `CREATE TABLE` with `LOCATION` pointing to a file. -- To view, visit http://gerrit.cloudera.org:8080/6550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jakub Kukul Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jakub Kukul Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5388: Only retry RPC on lost connection in send call
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-5388: Only retry RPC on lost connection in send call .. Patch Set 8: (1 comment) I looked at the TSaslTransport layer since it seemed like we didn't vet that along with TSocket and TSSLSocket, there are a couple of places where exceptions can be thrown (during reads and writes, not during SASL negotiation), but if we're going to treat them the same as TSSLExceptions, then we can leave the code as it is, i.e. just return a RPC_GENERAL_ERROR. Just wanted to add this as a note here. http://gerrit.cloudera.org:8080/#/c/7063/8/be/src/runtime/client-cache.h File be/src/runtime/client-cache.h: Line 316: try { > We already have a log statement in Reopen(). If Reopen() fails, we should l Good point. I think that should be fine since we only call Reopen() here and nowhere else. -- To view, visit http://gerrit.cloudera.org:8080/7063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I176975f2aa521d5be8a40de51067b1497923d09b Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alan Choi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables.
Alex Behm has posted comments on this change. Change subject: IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables. .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/6550/8/testdata/bin/create-load-data.sh File testdata/bin/create-load-data.sh: Line 144: hadoop fs -put $SCHEMA_SRC_DIR/enum ${SCHEMA_DEST_DIR}/ enum.parquet -- To view, visit http://gerrit.cloudera.org:8080/6550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jakub Kukul Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jakub Kukul Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables.
Jakub Kukul has posted comments on this change. Change subject: IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables. .. Patch Set 8: (2 comments) http://gerrit.cloudera.org:8080/#/c/6550/7/testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test File testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test: Line 58: '$FILESYSTEM_PREFIX/test-warehouse/schemas/enum/enum.parquet' > In order to make the end-to-end test work, you need to add STORED AS PARQUE Done Line 70: > Jakub, this test failed in pre-commit tests. It returned 0 rows. Any idea? Thanks for your help. -- To view, visit http://gerrit.cloudera.org:8080/6550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jakub Kukul Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jakub Kukul Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables.
Hello Impala Public Jenkins, Jim Apple, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6550 to look at the new patch set (#8). Change subject: IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables. .. IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables. Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475 --- M docs/topics/impala_parquet.xml M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M testdata/bin/create-load-data.sh A testdata/data/schemas/enum/enum.parquet M testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test 6 files changed, 33 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/6550/8 -- To view, visit http://gerrit.cloudera.org:8080/6550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jakub Kukul Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jakub Kukul Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
Henry Robinson has posted comments on this change. Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading .. Patch Set 4: (2 comments) Will post a new patch when I'm sure I'm addressing Alex's concerns. http://gerrit.cloudera.org:8080/#/c/7045/4/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: Line 610: List tblNames = impaladCatalog_.get().getTableNames(dbName, matcher); > I'm a little worried about the repercussions of operations seeing different Can you clarify - is this an issue that you see introduced by this patch? getDbsMetadata() retrieves the catalog reference once. So it should see a consistent snapshot throughout its execution (or at least as consistent as it did previously). Previously, the naked catalog reference would be updated asynchronously. The only change here is an atomic ref that makes sure the update is published cross-thread. I agree that this is all a bit confusing. We should be able to snapshot a version of the catalog and keep it fixed during analysis. Line 890: AnalysisContext analysisCtx = new AnalysisContext(impaladCatalog_.get(), queryCtx, > Are we guaranteed that this new catalog returns true for isReady()? I believe so (since the new catalog is created from the statestore update). I'll look further and make the code clearer. -- To view, visit http://gerrit.cloudera.org:8080/7045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5435: Increase runtime filter test timeouts (again)
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/7097 Change subject: IMPALA-5435: Increase runtime filter test timeouts (again) .. IMPALA-5435: Increase runtime filter test timeouts (again) Codegen time under ASAN can take ~10s, making the 15s timeouts for runtime filter tests a bit small. Double those timeouts to 30s. Change-Id: I2280e08910430e271da2173e465731bba5aef6cf --- M testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test 2 files changed, 34 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/7097/1 -- To view, visit http://gerrit.cloudera.org:8080/7097 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I2280e08910430e271da2173e465731bba5aef6cf Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson
[Impala-ASF-CR] IMPALA-5388: Only retry RPC on lost connection in send call
Michael Ho has posted comments on this change. Change subject: IMPALA-5388: Only retry RPC on lost connection in send call .. Patch Set 8: (2 comments) http://gerrit.cloudera.org:8080/#/c/7063/8/be/src/runtime/client-cache.h File be/src/runtime/client-cache.h: Line 310: // TODO: ThriftClient should return proper error codes. > DCHECK(client_is_unrecoverable_); Done Line 316: try { > nit: Should we add a higher level LOG here stating the retry? We already have a log statement in Reopen(). If Reopen() fails, we should log in line 314 so I guess it should be okay to skip the log statement here. Are you concerned about certain cases in which the log statement will help us with debugging ? -- To view, visit http://gerrit.cloudera.org:8080/7063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I176975f2aa521d5be8a40de51067b1497923d09b Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alan Choi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5413: Skip test seq writer hive compatibility on remote clusters.
Alex Behm has posted comments on this change. Change subject: IMPALA-5413: Skip test_seq_writer_hive_compatibility on remote clusters. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/7046/1/tests/query_test/test_compressed_formats.py File tests/query_test/test_compressed_formats.py: Line 179: output = self.run_stmt_in_hive('select count(*) from %s' % table_name) > Hmm, good point. It appears this is not the case -- we do have some other t So no idea what is happening here? -- To view, visit http://gerrit.cloudera.org:8080/7046 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1cc8824800e4339874b9c4e3a84969baf848d941 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables.
Alex Behm has posted comments on this change. Change subject: IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables. .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/6550/7/testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test File testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test: Line 58: '$FILESYSTEM_PREFIX/test-warehouse/schemas/enum.parquet' In order to make the end-to-end test work, you need to add STORED AS PARQUET and LOCATION clauses. -- To view, visit http://gerrit.cloudera.org:8080/6550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jakub Kukul Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jakub Kukul Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5164: Fix flaky benchmarks
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5164: Fix flaky benchmarks .. Patch Set 7: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/693/ -- To view, visit http://gerrit.cloudera.org:8080/6935 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5164: Fix flaky benchmarks
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5164: Fix flaky benchmarks .. Patch Set 7: Code-Review+2 Rebase before starting merge -- To view, visit http://gerrit.cloudera.org:8080/6935 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5164: Fix flaky benchmarks
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5164: Fix flaky benchmarks .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6935 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5388: Only retry RPC on lost connection in send call
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-5388: Only retry RPC on lost connection in send call .. Patch Set 8: (2 comments) http://gerrit.cloudera.org:8080/#/c/7063/8/be/src/runtime/client-cache.h File be/src/runtime/client-cache.h: Line 310: // TODO: ThriftClient should return proper error codes. DCHECK(client_is_unrecoverable_); Line 316: try { nit: Should we add a higher level LOG here stating the retry? -- To view, visit http://gerrit.cloudera.org:8080/7063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I176975f2aa521d5be8a40de51067b1497923d09b Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alan Choi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5400: Execute tests in subplans.test
Alex Behm has posted comments on this change. Change subject: IMPALA-5400: Execute tests in subplans.test .. Patch Set 3: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/7038/3/testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test File testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test: Line 264: BIGINT, STRING nit: lower case types as well -- To view, visit http://gerrit.cloudera.org:8080/7038 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I02b4f47553fb8f5fe3425cde2e0bcb3245c39b91 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables.
Alex Behm has posted comments on this change. Change subject: IMPALA-2525: Treat parquet ENUMs as STRINGs when creating impala tables. .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/6550/7/testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test File testdata/workloads/functional-query/queries/QueryTest/create-table-like-file.test: Line 70: select * FROM $DATABASE.like_enumtype_file Jakub, this test failed in pre-commit tests. It returned 0 rows. Any idea? 18:17:20 ] MainThread: Comparing QueryTestResults (expected vs actual): 18:17:20 ] 'BEAR','Winnie' != None 18:17:20 ] Number of rows returned (expected vs actual): 1 != 0 -- To view, visit http://gerrit.cloudera.org:8080/6550 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia7a2e20c3ab83eb3fac422c3b33c117856fec475 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jakub Kukul Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jakub Kukul Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5388: Only retry RPC on lost connection in send call
Michael Ho has posted comments on this change. Change subject: IMPALA-5388: Only retry RPC on lost connection in send call .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/7063/7/be/src/rpc/thrift-util.cc File be/src/rpc/thrift-util.cc: Line 194: DCHECK_EQ("0.9.0", PACKAGE_VERSION) << Substitute("Please update $0() to match " > and we discussed, let's also handle the TTransportException::TIMED_OUT, "se Done -- To view, visit http://gerrit.cloudera.org:8080/7063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I176975f2aa521d5be8a40de51067b1497923d09b Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alan Choi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5388: Only retry RPC on lost connection in send call
Michael Ho has uploaded a new patch set (#8). Change subject: IMPALA-5388: Only retry RPC on lost connection in send call .. IMPALA-5388: Only retry RPC on lost connection in send call Previously, DoRpc() blacklists only a couple of conditions which shouldn't retry the RPC on exception. This is fragile as the errors could have happened after the payload has been successfully sent to the destination. Such aggressive retry behavior can lead to duplicated row batches being sent, causing wrong results in queries. This change fixes the problem by whitelisting the conditions in which the RPC can be retried. Specifically, it pattern-matches against certain errors in TSocket::write_partial() in the thrift library and only retries the RPC in those cases. With SSL enabled, we will never retry. We should investigate whether there are some cases in which it's safe to retry. This change also adds fault injection in the TransmitData() RPC caller's path to emulate different exception cases. Change-Id: I176975f2aa521d5be8a40de51067b1497923d09b --- M be/src/common/global-flags.cc M be/src/rpc/thrift-util.cc M be/src/rpc/thrift-util.h M be/src/runtime/backend-client.h M be/src/runtime/client-cache.h M be/src/testutil/CMakeLists.txt A be/src/testutil/fault-injection-util.cc M be/src/testutil/fault-injection-util.h A tests/custom_cluster/test_rpc_exception.py 9 files changed, 290 insertions(+), 54 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/7063/8 -- To view, visit http://gerrit.cloudera.org:8080/7063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I176975f2aa521d5be8a40de51067b1497923d09b Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alan Choi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5394: Set socket timeouts while opening TSaslTransport
Henry Robinson has posted comments on this change. Change subject: IMPALA-5394: Set socket timeouts while opening TSaslTransport .. Patch Set 2: Thanks for the write-up, very helpful! For backend services we have TAcceptQueueServer which moves accept()'ed sockets onto a thread pool for calling open(). See IMPALA-4135, and https://github.com/apache/incubator-impala/commit/a9c40595549bfb74d2b9796a0a7098361793492e. It looks like from your stack trace You might look into that as a drop-in replacement (I think you need to change the HS2 server to Threaded, from ThreadPool) to see if that shows any improvement. You might need to increase the size of the thread pool in TAcceptQueueServer::serve to >1, and implement the locking fix, to see an improvement. If that doesn't seem to work, then setting the timeouts here is ok - just want to make sure we've considered everything first. -- To view, visit http://gerrit.cloudera.org:8080/7061 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I56a5f3d9cf931cff14eae7f236fea018236a6255 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Sherman Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5259: [DOCS] Doc REFRESH FUNCTIONS
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-5259: [DOCS] Doc REFRESH FUNCTIONS .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/7044/3/docs/shared/impala_common.xml File docs/shared/impala_common.xml: PS3, Line 1213: s. You should add that these functions will be visible in Impala after running REFRESH FUNCTIONS -- To view, visit http://gerrit.cloudera.org:8080/7044 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic41fec781396b69e6df06b8de0b29c42ad51ce8f Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: John Russell Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5400: Execute tests in subplans.test
anujphadke has posted comments on this change. Change subject: IMPALA-5400: Execute tests in subplans.test .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/7038/2/testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test File testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test: Line 257: select count(okey), opriority > please make the casing of these new tests consistent with the others in thi Done Line 331: RESULTS: > remove the "VERIFY_IS_EQUAL_SORTED" part since it's not needed here Done -- To view, visit http://gerrit.cloudera.org:8080/7038 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I02b4f47553fb8f5fe3425cde2e0bcb3245c39b91 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5400: Execute tests in subplans.test
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7038 to look at the new patch set (#3). Change subject: IMPALA-5400: Execute tests in subplans.test .. IMPALA-5400: Execute tests in subplans.test This change executes the tests added to subplans.test and removes a test which incorrectly references subplannull_data.test (a file which does not exist) Change-Id: I02b4f47553fb8f5fe3425cde2e0bcb3245c39b91 --- M testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test D testdata/workloads/functional-query/queries/QueryTest/subplans.test M tests/query_test/test_queries.py 3 files changed, 44 insertions(+), 236 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7038/3 -- To view, visit http://gerrit.cloudera.org:8080/7038 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02b4f47553fb8f5fe3425cde2e0bcb3245c39b91 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke
[Impala-ASF-CR] IMPALA-3643/IMPALA-5344: Fix FE tests on Java 8
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-3643/IMPALA-5344: Fix FE tests on Java 8 .. Patch Set 2: Dimitris, also there are changes the behavior of the union node. Pass through is enabled more often now (which is a good thing). I think it would be hard to make planner tests agnostic to that. -- To view, visit http://gerrit.cloudera.org:8080/7073 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iad8e1dccec3a51293a109c420bd2b88b9d1e0625 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5357: Fix unixtime to UTC TimestampValue perf
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-5357: Fix unixtime to UTC TimestampValue perf .. IMPALA-5357: Fix unixtime to UTC TimestampValue perf Fixes the severe perf issue when calling gmtime_r to convert a unix time because that libc function takes a global lock. Instead, use boost ptime::from_time_t when possible. Unfortunately only a range of input times are supported without overflowing the underlying time_duration struct (dates between 1677-2262), so those dates outside that range are handled with the slow path calling into gmtime_r. This requires using a patched build of boost which includes an upstream fix for the time_duration class, see: https://github.com/cloudera/native-toolchain/commit/6e726b4b8164d53814f75b78a681fcf6df4a887a Testing: * Ran an exhaustive test run successfully. * Wrote a test program to validate that all time_t values between 1677 and 2262 are converted to the same ptime when using the new code path and the old path. Doing so required running all night, so I'm not going to attempt to add this test. I think a single validation is acceptable. Perf impact: Microbenchmark of the new path (conversion using boost) and the old path (conversion using libc gmtime_r) resulted in the expected speedup from not using a global lock. Machine Info: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz Function10%ile 50%ile 90%ile 10%ile 50%ile 90%ile iters/ms iters/ms iters/ms (relative) (relative) (relative) - libc-1 0.1920.196 0.2 1X 1X 1X boost-1 0.333 0.34 0.34 1.73X 1.73X 1.7X libc-2 0.1120.1150.116 0.584X 0.586X 0.581X boost-2 0.6270.6540.667 3.26X 3.33X 3.33X libc-4 0.042 0.0478 0.0515 0.218X 0.244X 0.258X boost-4 0.863 1.27 1.3 4.49X 6.5X 6.5X libc-80.0326 0.0328 0.0329 0.169X 0.167X 0.164X boost-8 0.7410.902 1.1 3.85X 4.6X 5.5X Change-Id: I9d611d21310c7b4f93d8e0bc845eb85125abcac9 Reviewed-on: http://gerrit.cloudera.org:8080/7082 Reviewed-by: Matthew Jacobs Tested-by: Impala Public Jenkins --- M be/src/runtime/timestamp-test.cc M be/src/runtime/timestamp-value.cc 2 files changed, 66 insertions(+), 13 deletions(-) Approvals: Impala Public Jenkins: Verified Matthew Jacobs: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I9d611d21310c7b4f93d8e0bc845eb85125abcac9 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-5357: Fix unixtime to UTC TimestampValue perf
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5357: Fix unixtime to UTC TimestampValue perf .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9d611d21310c7b4f93d8e0bc845eb85125abcac9 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5137: [DOCS] Document TIMESTAMP for Kudu tables
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5137: [DOCS] Document TIMESTAMP for Kudu tables .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7035 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib889198eb2c918c969c7613dd1ddf65a801f7926 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5137: [DOCS] Document TIMESTAMP for Kudu tables
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-5137: [DOCS] Document TIMESTAMP for Kudu tables .. IMPALA-5137: [DOCS] Document TIMESTAMP for Kudu tables Change-Id: Ib889198eb2c918c969c7613dd1ddf65a801f7926 Reviewed-on: http://gerrit.cloudera.org:8080/7035 Reviewed-by: Matthew Jacobs Tested-by: Impala Public Jenkins --- M docs/shared/impala_common.xml M docs/topics/impala_authorization.xml M docs/topics/impala_datetime_functions.xml M docs/topics/impala_kudu.xml M docs/topics/impala_timestamp.xml 5 files changed, 154 insertions(+), 34 deletions(-) Approvals: Impala Public Jenkins: Verified Matthew Jacobs: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7035 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ib889198eb2c918c969c7613dd1ddf65a801f7926 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-5413: Skip test seq writer hive compatibility on remote clusters.
David Knupp has posted comments on this change. Change subject: IMPALA-5413: Skip test_seq_writer_hive_compatibility on remote clusters. .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/7046/1/tests/query_test/test_compressed_formats.py File tests/query_test/test_compressed_formats.py: Line 179: output = self.run_stmt_in_hive('select count(*) from %s' % table_name) > For clarification: We cannot really run any statements in Hive on a remote Hmm, good point. It appears this is not the case -- we do have some other tests in other modules that use this method. It must be something else. Line 179: output = self.run_stmt_in_hive('select count(*) from %s' % table_name) > Just looked at the JIRA and it's not really clear to me what Hive is doing. Thanks for looking at this Alex. I see what you mean. I'll try to figure out why this is happening. -- To view, visit http://gerrit.cloudera.org:8080/7046 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1cc8824800e4339874b9c4e3a84969baf848d941 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5137: [DOCS] Document TIMESTAMP for Kudu tables
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5137: [DOCS] Document TIMESTAMP for Kudu tables .. Patch Set 3: Build started: http://jenkins.impala.io:8080/job/gerrit-docs-submit/128/ -- To view, visit http://gerrit.cloudera.org:8080/7035 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib889198eb2c918c969c7613dd1ddf65a801f7926 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3643/IMPALA-5344: Fix FE tests on Java 8
Jim Apple has posted comments on this change. Change subject: IMPALA-3643/IMPALA-5344: Fix FE tests on Java 8 .. Patch Set 5: > Many of these changes are essentially needed just for generating > consistent tests results when we switch between different java > versions. I wonder if this the right approach. We're essentially > "forcing" the use of a data structure with different > performance/memory tradeoffs than we had before. It may not have > significant impact on performance/memory usage, but creates > confusion. For example, I believe we may end up overusing linked* > just to ensure results stability. Wouldn't it be better is we made > the test results verification agnostic to order? Just some > thoughts... When the order of some of these things change, the query execution can change which can affect performance (though hopefully not correctness). I'd imagine we would want the order to remain stable so as not to have performance regressions. -- To view, visit http://gerrit.cloudera.org:8080/7073 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iad8e1dccec3a51293a109c420bd2b88b9d1e0625 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5137: [DOCS] Document TIMESTAMP for Kudu tables
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-5137: [DOCS] Document TIMESTAMP for Kudu tables .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7035 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib889198eb2c918c969c7613dd1ddf65a801f7926 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: John Russell Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5355: Fix the order of Sentry roles and privileges
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5355: Fix the order of Sentry roles and privileges .. Patch Set 5: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/692/ -- To view, visit http://gerrit.cloudera.org:8080/7004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7072e95b74952ce5a51ea1b6e2ae3e80fb0940e0 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5355: Fix the order of Sentry roles and privileges
Taras Bobrovytsky has uploaded a new patch set (#5). Change subject: IMPALA-5355: Fix the order of Sentry roles and privileges .. IMPALA-5355: Fix the order of Sentry roles and privileges After a single Impalad is restarted, it is possible that order in which it receives roles and privileges from the statestore is incorrect. The correct order is for the role to appear first in the update, before the privilege that references it. If a user updates a role, its catalog version number can become larger than the catalog numbers of the privileges that reference it. This causes the role to come after the privilege in the initial metastore update. The issue is fixed by doing two passes over the catalog objects in the Impalad. The first pass updates the top level objects. The second pass updates the dependent objects Testing: - Added a test that reproduced the problem. Change-Id: I7072e95b74952ce5a51ea1b6e2ae3e80fb0940e0 --- M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M tests/authorization/test_grant_revoke.py M tests/common/impala_test_suite.py 3 files changed, 86 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/7004/5 -- To view, visit http://gerrit.cloudera.org:8080/7004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7072e95b74952ce5a51ea1b6e2ae3e80fb0940e0 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-5355: Fix the order of Sentry roles and privileges
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-5355: Fix the order of Sentry roles and privileges .. Patch Set 5: Code-Review+2 Forwarding the +2. Thanks for the review! -- To view, visit http://gerrit.cloudera.org:8080/7004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7072e95b74952ce5a51ea1b6e2ae3e80fb0940e0 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4965: Authorize access to runtime profile and exec summary
Alex Behm has posted comments on this change. Change subject: IMPALA-4965: Authorize access to runtime profile and exec summary .. Patch Set 3: (10 comments) Some ideas to discuss / think through. http://gerrit.cloudera.org:8080/#/c/7064/3/be/src/service/client-request-state.h File be/src/service/client-request-state.h: Line 151: const bool user_has_profile_access() const { single line if it fits Line 222: /// error. If base64_encoded is true, outputs the base64-encoded profile output, This API is confusing to me. Why does base64_encoded influence the auth behavior? Also an empty user will always be authorized right? http://gerrit.cloudera.org:8080/#/c/7064/3/be/src/service/impala-server.cc File be/src/service/impala-server.cc: Line 606: return request_state->GetRuntimeProfile(user, base64_encoded, output); We not have many different GetRuntimeProfile*() and GetExecSummary() functions. I'm wondering if we can do the auth check more directly here, for example, with a function Status CheckProfileAccess(user); The function could be implemented in both ClientRequestState and the QueryLogRecord. Apart from that, most of the existing code would not have to change. Another alternative along these lines is to have two CheckProfileAccess() functions in auth.h. One version would take a ClientRequestState and a user, the other would take a QueryLogRecord and the user. Happy to discuss the idea before you try it. http://gerrit.cloudera.org:8080/#/c/7064/3/be/src/service/impala-server.h File be/src/service/impala-server.h: Line 560: std::string limited_profile_str; Remove? http://gerrit.cloudera.org:8080/#/c/7064/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: Line 2535: public void registerPrivReq(PrivilegeRequest privReq) { As discussed, I think this logic can be further simplified and made more explicit. http://gerrit.cloudera.org:8080/#/c/7064/1/tests/authorization/test_authorization.py File tests/authorization/test_authorization.py: Line 92: def test_access_runtime_profile(self): Nice test! Line 135: execute_statement_req.statement = """create view if not exists tpch.customer_view as Can we put this into a different db? Might be good to make sure this view is dropped even if there is a failure. Line 137: execute_statement_resp = self.hs2_client.ExecuteStatement(execute_statement_req) also test that EXPLAIN works Line 145: # Repeat as a deleted user delegated Line 281: """Runs statement 'stmt' and retrieves the runtime profile and exec summary. If remove 'statement' -- To view, visit http://gerrit.cloudera.org:8080/7064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5355: Fix the order of Sentry roles and privileges
Hello Dimitris Tsirogiannis, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7004 to look at the new patch set (#5). Change subject: IMPALA-5355: Fix the order of Sentry roles and privileges .. IMPALA-5355: Fix the order of Sentry roles and privileges After a single Impalad is restarted, it is possible that order in which it receives roles and privileges from the statestore is incorrect. The correct order is for the role to appear first in the update, before the privilege that references it. If a user updates a role, its catalog version number can become larger than the catalog numbers of the privileges that reference it. This causes the role to come after the privilege in the initial metastore update. The issue is fixed by doing two passes over the catalog objects in the Impalad. The first pass updates the top level objects. The second pass updates the dependent objects Testing: - Added a test that reproduced the problem. Change-Id: I7072e95b74952ce5a51ea1b6e2ae3e80fb0940e0 --- M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M tests/authorization/test_grant_revoke.py M tests/common/impala_test_suite.py 3 files changed, 86 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/7004/5 -- To view, visit http://gerrit.cloudera.org:8080/7004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7072e95b74952ce5a51ea1b6e2ae3e80fb0940e0 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-3643/IMPALA-5344: Fix FE tests on Java 8
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-3643/IMPALA-5344: Fix FE tests on Java 8 .. Patch Set 5: Many of these changes are essentially needed just for generating consistent tests results when we switch between different java versions. I wonder if this the right approach. We're essentially "forcing" the use of a data structure with different performance/memory tradeoffs than we had before. It may not have significant impact on performance/memory usage, but creates confusion. For example, I believe we may end up overusing linked* just to ensure results stability. Wouldn't it be better is we made the test results verification agnostic to order? Just some thoughts... -- To view, visit http://gerrit.cloudera.org:8080/7073 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iad8e1dccec3a51293a109c420bd2b88b9d1e0625 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5388: Only retry RPC on lost connection in send call
Dan Hecht has posted comments on this change. Change subject: IMPALA-5388: Only retry RPC on lost connection in send call .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/7063/7/be/src/rpc/thrift-util.cc File be/src/rpc/thrift-util.cc: Line 194:strstr(e.what(), "Socket send returned 0.") != nullptr); and we discussed, let's also handle the TTransportException::TIMED_OUT, "send timeout expired" case, and rename this function to IsSendInCompleteTException() or similar. -- To view, visit http://gerrit.cloudera.org:8080/7063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I176975f2aa521d5be8a40de51067b1497923d09b Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alan Choi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5394: Set socket timeouts while opening TSaslTransport
John Sherman has posted comments on this change. Change subject: IMPALA-5394: Set socket timeouts while opening TSaslTransport .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/7061/2/be/src/transport/TSaslServerTransport.cpp File be/src/transport/TSaslServerTransport.cpp: PS2, Line 158: socket->setRecvTimeout(0); : socket->setSendTimeout(0); > Ah, that's what I was missing, thanks John. Still trying to make sure I und Sorry, reposting this comment so it's not all on one line: If I'm reading the thrift code correctly, fixing the lock doesn't really help the specific issue I've seen. The C++ thrift server implementation ends up calling getTransport on the same thread that does the accept of the connection. So if the getTransport implementation does reads/writes that ends up blocking, it prevents new connections from being accepted until the read/write completes or times out. We ran into this issue when a client connecting to the hs2_port started the SASL handshake then for some reason quits communicating with the server (the packets from the client to impalad started to not make it to impalad even the RST packet that the client sends after retrying). In this case all other connection requests to the hs2_port end up being queued until I think the tcp keepalive kicks in (I think the default time is 2 hours) or impalad is restarted. (The Java implementation of TThreadPoolServer puts the accept! ed connection on a different thread before calling getTransport https://github.com/apache/thrift/blob/master/lib/java/src/org/apache/thrift/server/TThreadPoolServer.java) If the C++ thrift implementation put the connection on a different thread before calling getTransport, fixing the locking would be helpful. C++ implementation: https://github.com/apache/thrift/blob/0.10.0/lib/cpp/src/thrift/server/TServerFramework.cpp lines 147-157 Backtrace ends up lookings something like this (from IMPALA-5268): #6 0x01b394d2 in apache::thrift::transport::TSSLSocket::read(unsigned char*, unsigned int) () No symbol table info available. #7 0x01b36003 in unsigned int apache::thrift::transport::readAll(apache::thrift::transport::TSocket&, unsigned char*, unsigned int) () No symbol table info available. #8 0x00b52e15 in apache::thrift::transport::TSaslTransport::receiveSaslMessage(apache::thrift::transport::NegotiationStatus*, unsigned int*) () No symbol table info available. #9 0x00b50733 in apache::thrift::transport::TSaslServerTransport::handleSaslStartMessage() () No symbol table info available. #10 0x00b52f9e in apache::thrift::transport::TSaslTransport::open() () No symbol table info available. #11 0x00b51431 in apache::thrift::transport::TSaslServerTransport::Factory::getTransport(boost::shared_ptr) () No symbol table info available. #12 0x01b4066d in apache::thrift::server::TThreadPoolServer::serve() () No symbol table info available. #13 0x009f2e60 in impala::ThriftServer::ThriftServerEventProcessor::Supervise() () No symbol table info available. -- To view, visit http://gerrit.cloudera.org:8080/7061 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I56a5f3d9cf931cff14eae7f236fea018236a6255 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Sherman Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5355: Fix the order of Sentry roles and privileges
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-5355: Fix the order of Sentry roles and privileges .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/7004/4//COMMIT_MSG Commit Message: PS4, Line 14: it's > its Done PS4, Line 17: We fix the issue by incrementing the catalog version of the : privileges when a role is updated > I believe the strategy has changed. Update the commit msg? Done -- To view, visit http://gerrit.cloudera.org:8080/7004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7072e95b74952ce5a51ea1b6e2ae3e80fb0940e0 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5394: Set socket timeouts while opening TSaslTransport
John Sherman has posted comments on this change. Change subject: IMPALA-5394: Set socket timeouts while opening TSaslTransport .. Patch Set 2: > (1 comment) https://github.com/apache/thrift/blob/0.10.0/lib/cpp/src/thrift/server/TServerFramework.cpp lines 147-157 If I'm reading the thrift code correctly, fixing the lock doesn't really help the specific issue I've seen. The C++ thrift server implementation ends up calling getTransport on the same thread that does the accept of the connection. So if the getTransport implementation does reads/writes that ends up blocking, it prevents new connections from being accepted until the read/write completes or times out. We ran into this issue when a client connecting to the hs2_port started the SASL handshake then for some reason quits communicating with the server (the packets from the client to impalad started to not make it to impalad even the RST packet that the client sends after retrying). In this case all other connection requests to the hs2_port end up being queued until I think the tcp keepalive kicks in (I think the default time is 2 hours) or impalad is restarted. (The Java implementation of TThreadPoolServer puts the accepted connection on a different thread before calling getTran! sport https://github.com/apache/thrift/blob/master/lib/java/src/org/apache/thrift/server/TThreadPoolServer.java) If the C++ thrift implementation put the connection on a different thread before calling getTransport, fixing the locking would be helpful. Backtrace ends up lookings something like this (from IMPALA-5268): #6 0x01b394d2 in apache::thrift::transport::TSSLSocket::read(unsigned char*, unsigned int) () No symbol table info available. #7 0x01b36003 in unsigned int apache::thrift::transport::readAll(apache::thrift::transport::TSocket&, unsigned char*, unsigned int) () No symbol table info available. #8 0x00b52e15 in apache::thrift::transport::TSaslTransport::receiveSaslMessage(apache::thrift::transport::NegotiationStatus*, unsigned int*) () No symbol table info available. #9 0x00b50733 in apache::thrift::transport::TSaslServerTransport::handleSaslStartMessage() () No symbol table info available. #10 0x00b52f9e in apache::thrift::transport::TSaslTransport::open() () No symbol table info available. #11 0x00b51431 in apache::thrift::transport::TSaslServerTransport::Factory::getTransport(boost::shared_ptr) () No symbol table info available. #12 0x01b4066d in apache::thrift::server::TThreadPoolServer::serve() () No symbol table info available. #13 0x009f2e60 in impala::ThriftServer::ThriftServerEventProcessor::Supervise() () No symbol table info available. -- To view, visit http://gerrit.cloudera.org:8080/7061 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I56a5f3d9cf931cff14eae7f236fea018236a6255 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Sherman Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3548: Prune runtime filters based on query options in the FE
Attila Jeges has uploaded a new change for review. http://gerrit.cloudera.org:8080/7096 Change subject: IMPALA-3548: Prune runtime filters based on query options in the FE .. IMPALA-3548: Prune runtime filters based on query options in the FE Currently, the FE generates a number of runtime filters and assigns them to plan nodes without taking the value of RUNTIME_FILTER_MODE and DISABLE_ROW_RUNTIME_FILTERING query options into account. The backend then removes filters from exec nodes, based on the following rules: 1. If DISABLE_ROW_RUNTIME_FILTERING is set, filters are removed from the exec nodes that are marked as targets not bound by partition columns. 2. If RUNTIME_FILTER_MODE is set to LOCAL, filters are removed from the exec nodes that are marked as remote targets. This may cause some confusion to users because they may see runtime filters in the output of explain that are not applied when the query is executed. This change moves the logic of runtime filter pruning to the planner in the FE. Pruning based on the value of DISABLE_ROW_RUNTIME_FILTERING query option takes place as early as when runtime filters are assigned to the single node plan. Pruning based on RUNTIME_FILTER_MODE query option on the other hand has to wait until the distributed plan is ready (since it requires information about the locality of runtime filter targets). Change-Id: I422e48847428cab9887aee899fed47a8de95c323 --- M be/src/exec/hash-join-node.cc M be/src/exec/hdfs-scan-node-base.cc M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java A testdata/workloads/functional-query/queries/QueryTest/runtime_filters_distrib_pruning.test A testdata/workloads/functional-query/queries/QueryTest/runtime_filters_singlenode_pruning.test M tests/query_test/test_runtime_filters.py 8 files changed, 665 insertions(+), 76 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/7096/2 -- To view, visit http://gerrit.cloudera.org:8080/7096 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I422e48847428cab9887aee899fed47a8de95c323 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges
[Impala-ASF-CR] IMPALA-5137: [DOCS] Document TIMESTAMP for Kudu tables
John Russell has uploaded a new patch set (#3). Change subject: IMPALA-5137: [DOCS] Document TIMESTAMP for Kudu tables .. IMPALA-5137: [DOCS] Document TIMESTAMP for Kudu tables Change-Id: Ib889198eb2c918c969c7613dd1ddf65a801f7926 --- M docs/shared/impala_common.xml M docs/topics/impala_authorization.xml M docs/topics/impala_datetime_functions.xml M docs/topics/impala_kudu.xml M docs/topics/impala_timestamp.xml 5 files changed, 154 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/7035/3 -- To view, visit http://gerrit.cloudera.org:8080/7035 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib889198eb2c918c969c7613dd1ddf65a801f7926 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: John Russell Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-5061: Populate null count in parquet::statistics
Lars Volker has posted comments on this change. Change subject: IMPALA-5061: Populate null_count in parquet::statistics .. Patch Set 4: (13 comments) http://gerrit.cloudera.org:8080/#/c/7058/4//COMMIT_MSG Commit Message: Line 10: value is encountered by paquet table writer. The value is written nit typo http://gerrit.cloudera.org:8080/#/c/7058/4/be/src/exec/hdfs-parquet-table-writer.cc File be/src/exec/hdfs-parquet-table-writer.cc: Line 197: if(page_stats_base_ != nullptr) { nit: single line http://gerrit.cloudera.org:8080/#/c/7058/2/be/src/exec/parquet-column-stats.h File be/src/exec/parquet-column-stats.h: Line 79: static bool ReadCountFromThrift( > If the return type is int64, what would the return value be in case the sca For frequently called functions it's often better to return a bool instead of the more heavy-weight Status. In this case you would probably pass an int64_t* to return the value. Let's remove the method here altogether and re-introduce it in a change that handles reading the stats. http://gerrit.cloudera.org:8080/#/c/7058/4/be/src/exec/parquet-column-stats.h File be/src/exec/parquet-column-stats.h: PS4, Line 63: NULL_COUNT remove PS4, Line 105: row column? PS4, Line 105: Initilizes typo PS4, Line 116: NULL nit: lowercase http://gerrit.cloudera.org:8080/#/c/7058/4/be/src/exec/parquet-column-stats.inline.h File be/src/exec/parquet-column-stats.inline.h: Line 32: inline void ColumnStatsBase::UpdateNullCount() { It looks like this will fit into one line in the header. If so, please move it there. Line 186: if (cs->has_values_) { You could also change the interface of Update() to take a min and max parameter and then clean up this method by calling it. You don't have to do it in this change though if you think it's too much. Line 205: null_count_ += cs->null_count_; If you add a "int64_t count" parameter to the UpdateNullCount method, you can call it from here. http://gerrit.cloudera.org:8080/#/c/7058/2/tests/query_test/test_insert_parquet.py File tests/query_test/test_insert_parquet.py: Line 427: ColumnStats('bigint_col', 0, 90, 0), > Done Cool :) http://gerrit.cloudera.org:8080/#/c/7058/4/tests/query_test/test_insert_parquet.py File tests/query_test/test_insert_parquet.py: Line 324: null_count = None How about moving this to L330, initialize it from stats.null_count and then check it's not None? Line 480: """Test that we don't write min/max statistics for null columns. Ensure null_count Please remove the trailing whitespace. You can also set-up git in a way that it warns about trailing whitespace during commit time. Even better, you may be able to configure your editor to highlight it for you. -- To view, visit http://gerrit.cloudera.org:8080/7058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4c49a63af84c2234f0633be63206cb52eb7e8ebb Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pooja Nilangekar Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Pooja Nilangekar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5137: [DOCS] Document TIMESTAMP for Kudu tables
John Russell has posted comments on this change. Change subject: IMPALA-5137: [DOCS] Document TIMESTAMP for Kudu tables .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/7035/2/docs/shared/impala_common.xml File docs/shared/impala_common.xml: PS2, Line 3521: values in Kudu tables. > Kudu datatype. Done http://gerrit.cloudera.org:8080/#/c/7035/2/docs/topics/impala_kudu.xml File docs/topics/impala_kudu.xml: Line 975: partition '2015-01-01' <= values < '2016-01-01', > still some tabs here :) Doh. I also found a stray one in an unrelated file by grepping, which I'll also fix. -- To view, visit http://gerrit.cloudera.org:8080/7035 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib889198eb2c918c969c7613dd1ddf65a801f7926 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: John Russell Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5259: [DOCS] Doc REFRESH FUNCTIONS
John Russell has posted comments on this change. Change subject: IMPALA-5259: [DOCS] Doc REFRESH FUNCTIONS .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/7044/2/docs/shared/impala_common.xml File docs/shared/impala_common.xml: PS2, Line 1213: which can be > It's not possible to add C++ udfs through Hive. I would start the sentence Done PS2, Line 1213: to > Remove this word Done -- To view, visit http://gerrit.cloudera.org:8080/7044 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic41fec781396b69e6df06b8de0b29c42ad51ce8f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: John Russell Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5259: [DOCS] Doc REFRESH FUNCTIONS
John Russell has uploaded a new patch set (#3). Change subject: IMPALA-5259: [DOCS] Doc REFRESH FUNCTIONS .. IMPALA-5259: [DOCS] Doc REFRESH FUNCTIONS The text from impala_common.xml is reused verbatim under the REFRESH page and in the UDFs page by a #include-like mechanism. Change-Id: Ic41fec781396b69e6df06b8de0b29c42ad51ce8f --- M docs/shared/impala_common.xml M docs/topics/impala_refresh.xml M docs/topics/impala_udf.xml 3 files changed, 23 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/7044/3 -- To view, visit http://gerrit.cloudera.org:8080/7044 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic41fec781396b69e6df06b8de0b29c42ad51ce8f Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Taras Bobrovytsky