[Impala-ASF-CR] IMPALA-2373: Extrapolate row counts for HDFS tables.
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-2373: Extrapolate row counts for HDFS tables. .. Patch Set 3: (8 comments) http://gerrit.cloudera.org:8080/#/c/6840/2/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: PS2, Line 77: > Created in c'tor. There is no constructor that allows setting both values t Better, thanks. http://gerrit.cloudera.org:8080/#/c/6840/3/fe/src/main/java/org/apache/impala/catalog/View.java File fe/src/main/java/org/apache/impala/catalog/View.java: PS3, Line 118: tableStats_ = new TTableStats(-1); : tableStats_.setTotal_file_bytes(-1); I believe you don't need it now that you put the initialization in Table's ctor. http://gerrit.cloudera.org:8080/#/c/6840/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: PS3, Line 617: computeStats(Analyzer analyzer) { Significantly better. Thanks PS3, Line 655: statsNumRows_, totalBytes_ There are class members, why are you passing them as parameter? Is this function called somewhere else? PS3, Line 673: are ignored "because...". You may want to briefly mention why this is the case. PS3, Line 683: extrapolatedNumRows_ = tbl_.getExtrapolatedNumRows(totalBytes); This doesn't necessarily belong in this function. I'd move it to L654 since at this point totalBytes_ has been computed. http://gerrit.cloudera.org:8080/#/c/6840/3/tests/metadata/test_explain.py File tests/metadata/test_explain.py: PS3, Line 127: 50 Hm, I would expect this to be '100' since the query doesn't filter any partitions and '50' if the query was something like "select .. where p = 1". PS3, Line 131: nit: extra space -- To view, visit http://gerrit.cloudera.org:8080/6840 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I972c8a03ed70211734631a7dc9085cb33622ebc4 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5342: Add comments of loaded tables in the response of GetTables
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-5342: Add comments of loaded tables in the response of GetTables .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/6933/1/fe/src/main/java/org/apache/impala/service/MetadataOp.java File fe/src/main/java/org/apache/impala/service/MetadataOp.java: Line 218: // comments[i][j] are the comments of tableNames[j] in dbs[i]. > is the comment Done Line 291: comment = table.getMetaStoreTable().getParameters().get("comment"); > Is there an HMS constant for this? Couldn't find any. Let me know if you've seen one. http://gerrit.cloudera.org:8080/#/c/6933/1/fe/src/test/java/org/apache/impala/service/FrontendTest.java File fe/src/test/java/org/apache/impala/service/FrontendTest.java: Line 184: Db testDb = addTestDb(dbName, "Stores tables with comments"); > Should we fix this for DB comments as well? Or does that already work as ex Good question. Actually, the HS2 API doesn't seem to include the comment in the GetSchemas request, so I am not sure we should include it. -- To view, visit http://gerrit.cloudera.org:8080/6933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I61f327168a93ceb4bd60b47474f39bfa405ae07d Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5342: Add comments of loaded tables in the response of GetTables
Dimitris Tsirogiannis has uploaded a new patch set (#2). Change subject: IMPALA-5342: Add comments of loaded tables in the response of GetTables .. IMPALA-5342: Add comments of loaded tables in the response of GetTables This commit changes the response of HiveServer2 GetTables request to return the comments (if any) of loaded tables. For unloaded tables or for tables with no comments an empty string is returned. Testing: - Added a new Frontend test. Change-Id: I61f327168a93ceb4bd60b47474f39bfa405ae07d --- M fe/src/main/java/org/apache/impala/service/MetadataOp.java M fe/src/test/java/org/apache/impala/service/FrontendTest.java 2 files changed, 48 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/6933/2 -- To view, visit http://gerrit.cloudera.org:8080/6933 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I61f327168a93ceb4bd60b47474f39bfa405ae07d Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis
[Impala-ASF-CR] Ninja support: use MAKE CMD in run-backend-tests.sh when possible
Jim Apple has uploaded a new change for review. http://gerrit.cloudera.org:8080/6942 Change subject: Ninja support: use MAKE_CMD in run-backend-tests.sh when possible .. Ninja support: use MAKE_CMD in run-backend-tests.sh when possible Without this, buildall.sh -ninja fails to run the backend tests or runs them with the Makefiles that were created when buildall.sh was last run without the -ninja flag. Change-Id: Idb920dd4b08d8ef5fbc0bf1ea1b424a0c544e1db --- M bin/run-backend-tests.sh 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/6942/1 -- To view, visit http://gerrit.cloudera.org:8080/6942 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Idb920dd4b08d8ef5fbc0bf1ea1b424a0c544e1db Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple
[Impala-ASF-CR] IMPALA-4164: Use inline hints instead of always-inline
Michael Ho has uploaded a new patch set (#2). Change subject: IMPALA-4164: Use inline hints instead of always-inline .. IMPALA-4164: Use inline hints instead of always-inline When generating IR functions during codegen, we used to always tag the functions with the "AlwaysInline" attribute. That potentially leads to excessive inlining, leading to very long optimization / compilation time with marginal performance benefit at runtime. One of the reasons for doing it was that the "target-cpu" and "target-features" attributes are missing in the generated IR functions so the LLVM inliner considers them incompatible with the cross-compiled functions. As a result, the inliner will not inline the generated IR functions into cross-compiled functions unless we put the "AlwaysInline" attributes with the generated IR functions. This change fixes the problem above by adding the "target-cpu" and "target-features" attributes to the generated IR functions. In addition, we now switch to using "InlineHint" attribute in place of "AlwaysInline" and let LLVM make the decision of whether it's sensible to inline a function. With this change, the codegen time of a query with a long predicate went from 15s to 4s and the overall runtime went from 19s to 8s. Change-Id: I2d87ae8d222b415587e7320cb9072e4a8d6615ce --- M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h A testdata/workloads/targeted-perf/queries/primitive_long_predicate.test 3 files changed, 308 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/6941/2 -- To view, visit http://gerrit.cloudera.org:8080/6941 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2d87ae8d222b415587e7320cb9072e4a8d6615ce Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho
[Impala-ASF-CR] IMPALA-4164: Use inline hints instead of always-inline
Michael Ho has uploaded a new change for review. http://gerrit.cloudera.org:8080/6941 Change subject: IMPALA-4164: Use inline hints instead of always-inline .. IMPALA-4164: Use inline hints instead of always-inline When generating IR functions during codegen, we used to always tag the functions with the "AlwaysInline" attribute. That potentially leads to excessive inlining, leading to very long optimization / compilation time with marginal performance benefit at runtime. One of the reasons for doing it was that the "target-cpu" and "target-features" attributes are missing in the generated IR functions so the LLVM inliner considers them incompatible with the cross-compiled functions. As a result, the inliner will not inline the generated IR functions into cross-compiled functions unless we put the "AlwaysInline" attributes with the generated IR functions. This change fixes the problem above by adding the "target-cpu" and "target-features" attributes to the generated IR functions. In addition, we now switch to using "InlineHint" attribute in place of "AlwaysInline" and let LLVM make the decision of whether it's sensible to inline a function. With this change, the codegen time of a query with a long predicate went from 15s to 4s and the overall runtime went from 19s to 8s. Change-Id: I2d87ae8d222b415587e7320cb9072e4a8d6615ce --- M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h A testdata/workloads/targeted-perf/queries/primitive_long_predicate.test 3 files changed, 307 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/6941/1 -- To view, visit http://gerrit.cloudera.org:8080/6941 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I2d87ae8d222b415587e7320cb9072e4a8d6615ce Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho
[Impala-ASF-CR] IMPALA-5164: Fix flaky benchmarks
Zach Amsden has uploaded a new patch set (#2). Change subject: IMPALA-5164: Fix flaky benchmarks .. IMPALA-5164: Fix flaky benchmarks Improve benchmarks by detecting involuntary context switches. If a server is heavily loaded due to some other jobs running, benchmarking will not be reliable. We can detect this by using getrusage() to measure context switches. If 90% or more of benchmark time is lost due to switching, we abort the benchmark, but catch the failure and exit silently. We subtract out the time during which we might have been switched out from the total result time, and only count runs which were uninterrupted. If 10% of the result is valid, that seems barely legitimate, but we will warn when anything less than 50% of the benchmark ran without switching. Testing: ran the free-lists-benchmark until the first test passed and then started "make -j 60" on an Impala directory. Got this result: E0519 23:35:41.482010 8766 benchmark.cc:161] Benchmark failed to complete due to context switching. W0519 23:35:41.548840 8766 benchmark.cc:162] Exiting silently from FreeLists 0 iters on 128 kb to avoid spurious test failure. Also, fixed two benchmarks that had crashes on start due to static initialization order problems and missing pieces of initialization. Some benchmarks are still crashing (expr-benchmark and network-perf-be), but those will require a lot more work to fix. Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274 --- M be/src/benchmarks/free-lists-benchmark.cc M be/src/benchmarks/hash-benchmark.cc M be/src/util/benchmark.cc 3 files changed, 82 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/6935/2 -- To view, visit http://gerrit.cloudera.org:8080/6935 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden