[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 8: I made some additional changes in patch 8 as a result of thinking about your comment. Dan, can you take a look one more time? -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 8 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 03 Oct 2017 02:55:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. IMPALA-4939, IMPALA-4940: Decimal V2 multiplication Implement the new DECIMAL return type rules for multiply expressions, active when query option DECIMAL_V2=1. The algorithm for determining the type of the result of multiplication is described in the JIRA. DECIMAL V1: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,38)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,30)| +---+ DECIMAL V2: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,37)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,6) | +---+ In this patch, we also fix the early multiplication overflow. We compute a 256 bit integer intermediate value, which we then attempt to scale down and round. Performance: I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost identical. For TPCH Q1, there was an improvement from 21 seconds to 16 seconds. I did not see any regressions. The performance improvement is due to the way we check for overflows after this patch (by counting the leading zeros instead of dividing). It can be clealy seen in this query: select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1)) before: 7.85s after: 2.03s I noticed performance regressions in the following cases: - When we need to convert to a 256 bit integer before multiplying, which was introduced in this patch. Whether this happens depends on the resulting precision and the value of the inputs. In the following extreme case, the intermediate value is converted to a 256 bit integer every time. select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37)) before: 14.56s (returns null) after: 126.17s - When we need to scale down the intermediate value. In the following query the result is decimal(38,6) after the patch, so the intermediate needs to be scaled down. select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19)) before: 7.25s after: 13.06s These regressions are possible only when the resulting precision is 38 which is not common in typical workloads. Note: The actual queries that I ran for the benchmark are not exactly as above. I constructed tables with millions of rows with those values. I ran the queries with DECIMAL_v2=1 option before and after the patch. Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M be/src/util/bit-util.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java 4 files changed, 332 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/8 -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 8 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/7438/7/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/7438/7/be/src/runtime/decimal-value.inline.h@249 PS7, Line 249: } else { > maybe a quick comment: Yes, your understanding is correct. Added comment. -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 7 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 03 Oct 2017 02:53:19 + Gerrit-HasComments: Yes
[native-toolchain-CR] Bump Kudu version to bec2a24
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8192 ) Change subject: Bump Kudu version to bec2a24 .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1d1416e475488db580a4f9cb8ed2882440d9abcd Gerrit-Change-Number: 8192 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 03 Oct 2017 02:44:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4786: Clean up how ImpalaServers are created
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8076 ) Change subject: IMPALA-4786: Clean up how ImpalaServers are created .. IMPALA-4786: Clean up how ImpalaServers are created ImpalaServer had to be created via an awkward CreateImpalaServer() method that took a lot of arguments. This patch refactors that code (in anticipation of some KRPC changes) to follow a more normal lifecycle: 1. ImpalaServer* server = new ImpalaServer(ExecEnv*) 2. RETURN_IF_ERROR(server->Init()) // for error-returning init operations 3. RETURN_IF_ERROR(server->Start()) 4. server->Join() Also add ExecEnv::Init(), and move calls to ExecEnv::StartServices() to ImpalaServer::StartServices(). This captures a dependency that KRPC will rely on - where initialization of both ExecEnv and ImpalaServer need to happen before services are started. This sets up a clean-up of InProcessImpalaServer, which is too heavy for the work that it does. That work is deferred to a follow-on patch. This is a slightly cleaned up version of Henry's abandoned patch: https://gerrit.cloudera.org/#/c/7673/ Change-Id: If388c5618258a9c4529cd1d63e956566b92bd0d8 Reviewed-on: http://gerrit.cloudera.org:8080/8076 Reviewed-by: Sailesh MukilTested-by: Impala Public Jenkins --- M be/src/exprs/expr-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/scheduling/scheduler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/impalad-main.cc M be/src/testutil/in-process-servers.cc M be/src/testutil/in-process-servers.h M be/src/util/hdfs-util-test.cc 10 files changed, 128 insertions(+), 137 deletions(-) Approvals: Sailesh Mukil: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8076 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: If388c5618258a9c4529cd1d63e956566b92bd0d8 Gerrit-Change-Number: 8076 Gerrit-PatchSet: 7 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4786: Clean up how ImpalaServers are created
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8076 ) Change subject: IMPALA-4786: Clean up how ImpalaServers are created .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8076 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If388c5618258a9c4529cd1d63e956566b92bd0d8 Gerrit-Change-Number: 8076 Gerrit-PatchSet: 6 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 03 Oct 2017 02:20:16 + Gerrit-HasComments: No
[native-toolchain-CR] Bump Kudu version to bec2a24
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/8192 ) Change subject: Bump Kudu version to bec2a24 .. Patch Set 1: http://unittest.jenkins.cloudera.com/job/verify-impala-toolchain-package-build/473/ -- To view, visit http://gerrit.cloudera.org:8080/8192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1d1416e475488db580a4f9cb8ed2882440d9abcd Gerrit-Change-Number: 8192 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 03 Oct 2017 01:38:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4736: Add SIGUSR1 behavior to help string for 'minidump path' flag
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8164 ) Change subject: IMPALA-4736: Add SIGUSR1 behavior to help string for 'minidump_path' flag .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8164 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8f4da8ff12c10e4a84a339ddb4036d1682b3a2a0 Gerrit-Change-Number: 8164 Gerrit-PatchSet: 6 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 03 Oct 2017 01:33:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4736: Add SIGUSR1 behavior to help string for 'minidump path' flag
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8164 ) Change subject: IMPALA-4736: Add SIGUSR1 behavior to help string for 'minidump_path' flag .. IMPALA-4736: Add SIGUSR1 behavior to help string for 'minidump_path' flag Change-Id: I8f4da8ff12c10e4a84a339ddb4036d1682b3a2a0 Reviewed-on: http://gerrit.cloudera.org:8080/8164 Reviewed-by: Dan HechtTested-by: Impala Public Jenkins --- M be/src/common/global-flags.cc 1 file changed, 4 insertions(+), 3 deletions(-) Approvals: Dan Hecht: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8164 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I8f4da8ff12c10e4a84a339ddb4036d1682b3a2a0 Gerrit-Change-Number: 8164 Gerrit-PatchSet: 7 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Re-apply: IMPALA-5589: change "set" in impala-shell to show empty string for unset query options
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8096 ) Change subject: Re-apply: IMPALA-5589: change "set" in impala-shell to show empty string for unset query options .. Re-apply: IMPALA-5589: change "set" in impala-shell to show empty string for unset query options (Re-applies reverted commit 387bde0639ffd8ef580ccbf727152954e62bacbe. The commit broke ASAN tests due to a race in how test infrastructure re-uses connections. The fix for that is in an adjacent commit.) When converting TQueryOptions to a map, we now convert unset options to the empty string. Within TQueryOptions we have optional options (like mt_dop or compression_codec) with no default specified. In this case, the user was seeing 0 for numeric types and the first enum option for enumeration types (e.g., "NONE" in the compression case). This was confusing as the implementation handles this "null" case differently (e.g., using SNAPPY as the default codec in the case reported in the JIRA). When running "set" in impala-shell, the difference is as follows: -BUFFER_POOL_LIMIT: [0] +BUFFER_POOL_LIMIT: [] -COMPRESSION_CODEC: [NONE] +COMPRESSION_CODEC: [] -MT_DOP: [0] +MT_DOP: [] -RESERVATION_REQUEST_TIMEOUT: [0] +RESERVATION_REQUEST_TIMEOUT: [] -SEQ_COMPRESSION_MODE: [0] +SEQ_COMPRESSION_MODE: [] -V_CPU_CORES: [0] +V_CPU_CORES: [] Obviously, the empty string is a valid value for a string-typed option, where it will be impossible to tell the difference between "unset" and "set to empty string." Today, there are two string-typed options: debug_string defaults to "" and request_pool has no default. An alternative would have been to use a special token like "_unset" or to introduce a new field in the beeswax Thrift ConfigVariable struct. I think the empty string approach is clearest. The other users of this state, which I believe are HiveServer2's OpenSession() call and HiveServer2's response to a "SET" query are affected. They benefit from the same fix, and a new test has been added to test_hs2.py. I did a mild refactoring in the HS2 tests to write a helper method for the very common pattern of excecuting a query. Testing: * Manual testing with impala-shell * Modified impala-shell tests to check this explicitly for one case. * Modified HS2 test to check this as well as the SET k=v statement, which looked otherwise untested. Change-Id: I29f5d8ab874cb1338077f16019a9537766cac0c4 Reviewed-on: http://gerrit.cloudera.org:8080/8096 Reviewed-by: Dan Hecht Tested-by: Impala Public Jenkins --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M testdata/workloads/functional-query/queries/QueryTest/set.test M tests/hs2/hs2_test_suite.py M tests/hs2/test_hs2.py M tests/shell/test_shell_commandline.py 7 files changed, 106 insertions(+), 60 deletions(-) Approvals: Dan Hecht: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8096 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I29f5d8ab874cb1338077f16019a9537766cac0c4 Gerrit-Change-Number: 8096 Gerrit-PatchSet: 5 Gerrit-Owner: Philip Zeyliger Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] Re-apply: IMPALA-5589: change "set" in impala-shell to show empty string for unset query options
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8096 ) Change subject: Re-apply: IMPALA-5589: change "set" in impala-shell to show empty string for unset query options .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8096 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I29f5d8ab874cb1338077f16019a9537766cac0c4 Gerrit-Change-Number: 8096 Gerrit-PatchSet: 4 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Tue, 03 Oct 2017 01:11:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Hello Joe McDonnell, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8056 to look at the new patch set (#6). Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet test_scanners_fuzz.py currently tests compressed parquet but does not test uncompressed parquet. This fix adds a new test case for uncompressed parquet. Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 43 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/8056/6 -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-Change-Number: 8056 Gerrit-PatchSet: 6 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8056 ) Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/8056/5/tests/query_test/test_scanners_fuzz.py File tests/query_test/test_scanners_fuzz.py: http://gerrit.cloudera.org:8080/#/c/8056/5/tests/query_test/test_scanners_fuzz.py@103 PS5, Line 103: with no compression. > Nit: I would emphasize that this has a new table. Something like: "into a n Done http://gerrit.cloudera.org:8080/#/c/8056/5/tests/query_test/test_scanners_fuzz.py@110 PS5, Line 110: for orig_tbl_name in tbl_list: : src_table_name = "parquet_uncomp_src_" + orig_tbl_name : dst_table_name = "parquet_uncomp_dst_" + orig_tbl_name > Nit: Done -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-Change-Number: 8056 Gerrit-PatchSet: 5 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 03 Oct 2017 00:57:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5394: Change ThriftServer() to always use TAcceptQueueServer
John Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/7061 ) Change subject: IMPALA-5394: Change ThriftServer() to always use TAcceptQueueServer .. Patch Set 10: Ok kicked off the job and used your comment suggestion. -- To view, visit http://gerrit.cloudera.org:8080/7061 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56a5f3d9cf931cff14eae7f236fea018236a6255 Gerrit-Change-Number: 7061 Gerrit-PatchSet: 10 Gerrit-Owner: John ShermanGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 03 Oct 2017 00:20:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5394: Change ThriftServer() to always use TAcceptQueueServer
Hello Henry Robinson, Matthew Jacobs, Sailesh Mukil, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7061 to look at the new patch set (#10). Change subject: IMPALA-5394: Change ThriftServer() to always use TAcceptQueueServer .. IMPALA-5394: Change ThriftServer() to always use TAcceptQueueServer - Previously TThreadPoolServer called getTransport() on a client from the Server thread (the thread that did the accepts). - TSaslServerTransport->getTransport() called TSaslTransport->open() - TSaslServerTransport->open() tried to negotiate SASL which calls read/write - If read/write blocks indefinitely, the TThreadPoolServer could not accept connections until tcp_keepalive kicked in. - Set the underlying TSocket's recvTimeout and sendTimeout before the TSaslServerTransport->open() and reset them to 0 after open() completes. - Added sasl_connect_tcp_timeout_ms flag that defaults to 30 milliseconds (5 minutes) - Add the ability for TAcceptQueueServer to limit the maximum number of concurrent tasks - Added a test case to thrift-server-test to test max_concurrent_connections enforcement - Changed the remaining Thrift servers to use TAcceptQueueServer. (hs2/beeswax/network-perf-benchmark) - The timeout is still needed in TAcceptQueueServer since SetupConnection follows a similar pattern that TThreadPoolServer does. - Removed support for TThreadPool from ThriftServer() since it is no longer used anywhere. ThriftServer() now always uses TAcceptQueueServer. - Deprecated enable_accept_queue_server flag and removed supporting code. Change-Id: I56a5f3d9cf931cff14eae7f236fea018236a6255 --- M be/src/benchmarks/network-perf-benchmark.cc M be/src/common/global-flags.cc M be/src/rpc/TAcceptQueueServer.cpp M be/src/rpc/TAcceptQueueServer.h M be/src/rpc/thrift-server-test.cc M be/src/rpc/thrift-server.cc M be/src/rpc/thrift-server.h M be/src/service/impala-server.cc M be/src/transport/TSaslServerTransport.cpp M common/thrift/metrics.json 10 files changed, 158 insertions(+), 112 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/7061/10 -- To view, visit http://gerrit.cloudera.org:8080/7061 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I56a5f3d9cf931cff14eae7f236fea018236a6255 Gerrit-Change-Number: 7061 Gerrit-PatchSet: 10 Gerrit-Owner: John ShermanGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-5394: Change ThriftServer() to always use TAcceptQueueServer
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/7061 ) Change subject: IMPALA-5394: Change ThriftServer() to always use TAcceptQueueServer .. Patch Set 9: (1 comment) Yes, a run of the gerrit-verify-dryrun-external is a good idea. If that passes, we can +2 this patch and merge it. I just have one nit review about a code comment otherwise. http://gerrit.cloudera.org:8080/#/c/7061/9/be/src/transport/TSaslServerTransport.cpp File be/src/transport/TSaslServerTransport.cpp: http://gerrit.cloudera.org:8080/#/c/7061/9/be/src/transport/TSaslServerTransport.cpp@162 PS9, Line 162: // It would be possible for the transport to be created multiple times if this method : // was called concurrently on the same 'trans' object. The current usage of this : // method is that it is always called serially. Instead of saying it's always called serially, I think this comment is clearer: "This method should never be called concurrently with the same 'trans' object. Therefore, it is safe to drop the transportMap_mutex_ here." -- To view, visit http://gerrit.cloudera.org:8080/7061 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56a5f3d9cf931cff14eae7f236fea018236a6255 Gerrit-Change-Number: 7061 Gerrit-PatchSet: 9 Gerrit-Owner: John ShermanGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 03 Oct 2017 00:09:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5990: Part 1: JNI-based LZ4 de/compression
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8150 ) Change subject: IMPALA-5990: Part 1: JNI-based LZ4 de/compression .. Patch Set 6: Code-Review+1 Carry +1 -- To view, visit http://gerrit.cloudera.org:8080/8150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I237802944875b07080db0159ff8ec548150fd95e Gerrit-Change-Number: 8150 Gerrit-PatchSet: 6 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Mon, 02 Oct 2017 23:49:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5990: Part 1: JNI-based LZ4 de/compression
Hello Bharath Vissapragada, Philip Zeyliger, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8150 to look at the new patch set (#6). Change subject: IMPALA-5990: Part 1: JNI-based LZ4 de/compression .. IMPALA-5990: Part 1: JNI-based LZ4 de/compression Adds LZ4 de/compression in FeSupport geared towards eventually compressing catalog objects end-to-end in their journey from the catalogd over the statestored to coordinator impalads. A follow-on change will use the new LZ4 functions to do the actual compression. Testing: - Added a new unit test Change-Id: I237802944875b07080db0159ff8ec548150fd95e --- M be/src/service/fe-support.cc M be/src/util/jni-util.cc M be/src/util/jni-util.h M fe/src/main/java/org/apache/impala/service/FeSupport.java A fe/src/test/java/org/apache/impala/service/Lz4CompressionTest.java 5 files changed, 411 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/8150/6 -- To view, visit http://gerrit.cloudera.org:8080/8150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I237802944875b07080db0159ff8ec548150fd95e Gerrit-Change-Number: 8150 Gerrit-PatchSet: 6 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-5990: Part 1: JNI-based LZ4 de/compression
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8150 ) Change subject: IMPALA-5990: Part 1: JNI-based LZ4 de/compression .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/8150/5/be/src/service/fe-support.cc File be/src/service/fe-support.cc: http://gerrit.cloudera.org:8080/#/c/8150/5/be/src/service/fe-support.cc@457 PS5, Line 457: // It does not matter if 'src' is copied because we only read from it. > nit: May be document src_off, dst_off and return value for readability. Documented return values. I think the 'off' and 'len' params are standard enough to not need extra comments. http://gerrit.cloudera.org:8080/#/c/8150/5/be/src/service/fe-support.cc@481 PS5, Line 481: env->SetByteArrayRegion(dst, 0, len, reinterpret_cast(dst_data)); > Check the return value of this? Given it made a copy, its likely that this Function returns void. Added code to handle exceptions. http://gerrit.cloudera.org:8080/#/c/8150/5/be/src/util/jni-util.h File be/src/util/jni-util.h: http://gerrit.cloudera.org:8080/#/c/8150/5/be/src/util/jni-util.h@184 PS5, Line 184: !is_copy_ && > My understanding is that we should call ReleasePrimitiveArrayCritical() irr Good catch, that was a bug. Fixed. http://gerrit.cloudera.org:8080/#/c/8150/5/be/src/util/jni-util.h@192 PS5, Line 192: ; > WARN_UNUSED_RESULT Done -- To view, visit http://gerrit.cloudera.org:8080/8150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I237802944875b07080db0159ff8ec548150fd95e Gerrit-Change-Number: 8150 Gerrit-PatchSet: 5 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Mon, 02 Oct 2017 23:49:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4236: Codegen CopyRows() for select nodes
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8196 ) Change subject: IMPALA-4236: Codegen CopyRows() for select nodes .. Patch Set 1: (12 comments) http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG@8 PS1, Line 8: Can you add a testing section? It would be good to mention what test coverage we have for this code path (it's certainly possible that we have some gaps). http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG@9 PS1, Line 9: Preformance Performance http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG@10 PS1, Line 10: perdicates predicates http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG@11 PS1, Line 11: [column_name > value AND] Can you include a concrete example of the query to make it easier for others to reproduce? http://gerrit.cloudera.org:8080/#/c/8196/1//COMMIT_MSG@18 PS1, Line 18: | Select Node | 12s385ms | 1m1s| 234ms | 797ms | Nice! http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node-ir.cc File be/src/exec/select-node-ir.cc: http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node-ir.cc@29 PS1, Line 29: child_row_batch_->num_rows() This probably doesn't matter, but you could save a few instructions by hoisting this out of the loop. E.g. int child_batch_num_rows = ...; Pretty sure the compiler can't infer that num_rows() is constant across loop iterations (since 'this' and 'this->child_row_batch_' could alias memory that's written in the loop so it will end up chasing the pointers on every iteration. http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node-ir.cc@41 PS1, Line 41: COUNTER_SET(rows_returned_counter_, num_rows_returned_); COUNTER_SET can be pretty expensive because of the atomic operation. We can actually move it out of this function entirely into the caller. http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc File be/src/exec/select-node.cc: http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc@55 PS1, Line 55: NULL Use nullptr in new code (here and below). http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc@65 PS1, Line 65: if (codegen_status.ok()) { Instead of this branch the control flow may be easier to follow if you factor out the CopyRows codegen logic into a separate Status-returning function that returns early on an error. This will be more true if you start checking FinalizeFunction() below. http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc@71 PS1, Line 71: DCHECK(copy_rows_fn != NULL); We should probably check if this is NULL and fail codegen if so. I'm not sure if we do this everywhere in the code but we probably should be doing so (although it isn't super-important, since we shouldn't be failing verification in the first place). (The invariants around this function are a bit wonky at the moment since we don't really have good testing around this, but FinalizeFunction() is documented as returning NULL in some cases). http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc@99 PS1, Line 99: bool copy_rows_success = false; Just declare it inside the loop where it's needed? When I see it declared up here I tend to think that the value lives across loop iterations. http://gerrit.cloudera.org:8080/#/c/8196/1/be/src/exec/select-node.cc@114 PS1, Line 114: copy_rows_success = false; It's assigned on both branches so this isn't necessary. I think some people consider this good defensive coding practice but it seems unnecessary to me. -- To view, visit http://gerrit.cloudera.org:8080/8196 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie0d496d004418468e16b6f564f90f45ebbf87c1e Gerrit-Change-Number: 8196 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 02 Oct 2017 23:40:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5394: Change ThriftServer() to always use TAcceptQueueServer
John Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/7061 ) Change subject: IMPALA-5394: Change ThriftServer() to always use TAcceptQueueServer .. Patch Set 9: I've done local testing which caught a regression on my patch due to a change in how thread pools are initialized now. Should I run this through the new gerrit-verify-dryrun-external Jenkins job? -- To view, visit http://gerrit.cloudera.org:8080/7061 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56a5f3d9cf931cff14eae7f236fea018236a6255 Gerrit-Change-Number: 7061 Gerrit-PatchSet: 9 Gerrit-Owner: John ShermanGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Mon, 02 Oct 2017 23:23:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5990: Part 1: JNI-based LZ4 de/compression
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/8150 ) Change subject: IMPALA-5990: Part 1: JNI-based LZ4 de/compression .. Patch Set 5: Code-Review+1 (4 comments) The patch looks good to me once the comments are fixed. http://gerrit.cloudera.org:8080/#/c/8150/5/be/src/service/fe-support.cc File be/src/service/fe-support.cc: http://gerrit.cloudera.org:8080/#/c/8150/5/be/src/service/fe-support.cc@457 PS5, Line 457: // It does not matter if 'src' is copied because we only read from it. nit: May be document src_off, dst_off and return value for readability. http://gerrit.cloudera.org:8080/#/c/8150/5/be/src/service/fe-support.cc@481 PS5, Line 481: env->SetByteArrayRegion(dst, 0, len, reinterpret_cast(dst_data)); Check the return value of this? Given it made a copy, its likely that this call can fail too? http://gerrit.cloudera.org:8080/#/c/8150/5/be/src/util/jni-util.h File be/src/util/jni-util.h: http://gerrit.cloudera.org:8080/#/c/8150/5/be/src/util/jni-util.h@184 PS5, Line 184: !is_copy_ && My understanding is that we should call ReleasePrimitiveArrayCritical() irrespective of whether a copy is made. My reasoning is that it seems to be unblocking the GC thread, which I think is required. http://gerrit.cloudera.org:8080/#/c/8150/5/be/src/util/jni-util.h@192 PS5, Line 192: ; WARN_UNUSED_RESULT -- To view, visit http://gerrit.cloudera.org:8080/8150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I237802944875b07080db0159ff8ec548150fd95e Gerrit-Change-Number: 8150 Gerrit-PatchSet: 5 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Mon, 02 Oct 2017 23:19:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4236: Codegen CopyRows() for select nodes
Bikramjeet Vig has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8196 Change subject: IMPALA-4236: Codegen CopyRows() for select nodes .. IMPALA-4236: Codegen CopyRows() for select nodes Preformance: Query run on tpch_parquet.lineitem with perdicates of the form [column_name > value AND] +--+-+ | | 500 Predicates | 1 Predicate| | ++-++-+ | | After| Before| After| Before| +--++-++-+ | Select Node | 12s385ms | 1m1s| 234ms | 797ms | | Codegen time | 2s619ms| 1s962ms | 200ms | 181ms | +--++-++-+ Change-Id: Ie0d496d004418468e16b6f564f90f45ebbf87c1e --- M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/exec/CMakeLists.txt A be/src/exec/select-node-ir.cc M be/src/exec/select-node.cc M be/src/exec/select-node.h 6 files changed, 96 insertions(+), 27 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/8196/1 -- To view, visit http://gerrit.cloudera.org:8080/8196 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ie0d496d004418468e16b6f564f90f45ebbf87c1e Gerrit-Change-Number: 8196 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig
[Impala-ASF-CR] IMPALA-5394: Change ThriftServer() to always use TAcceptQueueServer
Hello Henry Robinson, Matthew Jacobs, Sailesh Mukil, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7061 to look at the new patch set (#9). Change subject: IMPALA-5394: Change ThriftServer() to always use TAcceptQueueServer .. IMPALA-5394: Change ThriftServer() to always use TAcceptQueueServer - Previously TThreadPoolServer called getTransport() on a client from the Server thread (the thread that did the accepts). - TSaslServerTransport->getTransport() called TSaslTransport->open() - TSaslServerTransport->open() tried to negotiate SASL which calls read/write - If read/write blocks indefinitely, the TThreadPoolServer could not accept connections until tcp_keepalive kicked in. - Set the underlying TSocket's recvTimeout and sendTimeout before the TSaslServerTransport->open() and reset them to 0 after open() completes. - Added sasl_connect_tcp_timeout_ms flag that defaults to 30 milliseconds (5 minutes) - Add the ability for TAcceptQueueServer to limit the maximum number of concurrent tasks - Added a test case to thrift-server-test to test max_concurrent_connections enforcement - Changed the remaining Thrift servers to use TAcceptQueueServer. (hs2/beeswax/network-perf-benchmark) - The timeout is still needed in TAcceptQueueServer since SetupConnection follows a similar pattern that TThreadPoolServer does. - Removed support for TThreadPool from ThriftServer() since it is no longer used anywhere. ThriftServer() now always uses TAcceptQueueServer. - Deprecated enable_accept_queue_server flag and removed supporting code. Change-Id: I56a5f3d9cf931cff14eae7f236fea018236a6255 --- M be/src/benchmarks/network-perf-benchmark.cc M be/src/common/global-flags.cc M be/src/rpc/TAcceptQueueServer.cpp M be/src/rpc/TAcceptQueueServer.h M be/src/rpc/thrift-server-test.cc M be/src/rpc/thrift-server.cc M be/src/rpc/thrift-server.h M be/src/service/impala-server.cc M be/src/transport/TSaslServerTransport.cpp M common/thrift/metrics.json 10 files changed, 159 insertions(+), 112 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/7061/9 -- To view, visit http://gerrit.cloudera.org:8080/7061 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I56a5f3d9cf931cff14eae7f236fea018236a6255 Gerrit-Change-Number: 7061 Gerrit-PatchSet: 9 Gerrit-Owner: John ShermanGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-5383: [DOCS] Document unpartitioned Kudu tables
John Russell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8180 ) Change subject: IMPALA-5383: [DOCS] Document unpartitioned Kudu tables .. Patch Set 3: (1 comment) Thomas, good catch. I rearranged the CDATA [ ] delimiters to allow tags in the first part while preserving the < characters unescaped in the second part. Also had to rebase to make the substitution resolve. http://gerrit.cloudera.org:8080/#/c/8180/3/docs/topics/impala_create_table.xml File docs/topics/impala_create_table.xml: http://gerrit.cloudera.org:8080/#/c/8180/3/docs/topics/impala_create_table.xml@483 PS3, Line 483: -- Single partition. Only for > When I built this locally, these tags were present in the output. Done -- To view, visit http://gerrit.cloudera.org:8080/8180 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2b466e1e482d62de84253c0cb406668fd5ad5eb Gerrit-Change-Number: 8180 Gerrit-PatchSet: 3 Gerrit-Owner: John RussellGerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: John Russell Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 02 Oct 2017 23:11:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5383: [DOCS] Document unpartitioned Kudu tables
Hello Thomas Tauber-Marshall, Ambreen Kazi, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8180 to look at the new patch set (#4). Change subject: IMPALA-5383: [DOCS] Document unpartitioned Kudu tables .. IMPALA-5383: [DOCS] Document unpartitioned Kudu tables Change-Id: Ia2b466e1e482d62de84253c0cb406668fd5ad5eb --- M docs/impala_keydefs.ditamap M docs/topics/impala_create_table.xml 2 files changed, 25 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/8180/4 -- To view, visit http://gerrit.cloudera.org:8080/8180 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2b466e1e482d62de84253c0cb406668fd5ad5eb Gerrit-Change-Number: 8180 Gerrit-PatchSet: 4 Gerrit-Owner: John RussellGerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: John Russell Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-5425: Add test for validating input when setting query options
Hello Lars Volker, Tim Armstrong, Alex Behm, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7805 to look at the new patch set (#13). Change subject: IMPALA-5425: Add test for validating input when setting query options .. IMPALA-5425: Add test for validating input when setting query options This patch adds multiple query option validation testcases to be/src/service/query-options-test.cc The test cases include parsing edge cases, bondary values, special cases for some options and some testcases moved from testdata/workloads/functional-query/queries/QueryTest/set.test This patch also fixes a bug generating wrong error message for query option RUNTIME_FILTER_WAIT_TIME_MS. Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M testdata/workloads/functional-query/queries/QueryTest/set.test 3 files changed, 301 insertions(+), 127 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/7805/13 -- To view, visit http://gerrit.cloudera.org:8080/7805 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7 Gerrit-Change-Number: 7805 Gerrit-PatchSet: 13 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4736: Add SIGUSR1 behavior to help string for 'minidump path' flag
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8164 ) Change subject: IMPALA-4736: Add SIGUSR1 behavior to help string for 'minidump_path' flag .. Patch Set 6: Code-Review+2 Carrying forward Lars' +2. -- To view, visit http://gerrit.cloudera.org:8080/8164 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8f4da8ff12c10e4a84a339ddb4036d1682b3a2a0 Gerrit-Change-Number: 8164 Gerrit-PatchSet: 6 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 02 Oct 2017 22:32:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 7: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/7438/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/7438/6//COMMIT_MSG@67 PS6, Line 67: after: 126.17s > No, not always. Look at line 73 in this file. Yeah, but that's "just" 2x, as opposed to 10x. http://gerrit.cloudera.org:8080/#/c/7438/7/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/7438/7/be/src/runtime/decimal-value.inline.h@249 PS7, Line 249: } else { maybe a quick comment: // We've verified the intermediate value will fit in int128. is my understanding correct? -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 7 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Mon, 02 Oct 2017 22:29:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5425: Add test for validating input when setting query options
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7805 ) Change subject: IMPALA-5425: Add test for validating input when setting query options .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/7805/11/testdata/workloads/functional-query/queries/QueryTest/set.test File testdata/workloads/functional-query/queries/QueryTest/set.test: http://gerrit.cloudera.org:8080/#/c/7805/11/testdata/workloads/functional-query/queries/QueryTest/set.test@a145 PS11, Line 145: > Error message is not verified. Do you suggest leaving them here or testing Maybe we should leave these then. I forget, are these generic messages produced by the query option code itself, or specific to each individual option? -- To view, visit http://gerrit.cloudera.org:8080/7805 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7 Gerrit-Change-Number: 7805 Gerrit-PatchSet: 11 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 02 Oct 2017 22:22:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5425: Add test for validating input when setting query options
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7805 ) Change subject: IMPALA-5425: Add test for validating input when setting query options .. Patch Set 12: (2 comments) http://gerrit.cloudera.org:8080/#/c/7805/12/be/src/service/query-options-test.cc File be/src/service/query-options-test.cc: http://gerrit.cloudera.org:8080/#/c/7805/12/be/src/service/query-options-test.cc@70 PS12, Line 70: if 'str' would be successfully read as expected : // value. I don't understand that. Maybe this should say: Create a shortcut function to verify that the given query option has the 'expected_value' value after setting the 'option_def' option to 'str'. or something like that? http://gerrit.cloudera.org:8080/#/c/7805/12/be/src/service/query-options-test.cc@86 PS12, Line 86: // Create a shortcut function to test if 'str' would result into a failure. I think that should be reworded as per above. -- To view, visit http://gerrit.cloudera.org:8080/7805 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7 Gerrit-Change-Number: 7805 Gerrit-PatchSet: 12 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 02 Oct 2017 22:20:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Taras Bobrovytsky has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. IMPALA-4939, IMPALA-4940: Decimal V2 multiplication Implement the new DECIMAL return type rules for multiply expressions, active when query option DECIMAL_V2=1. The algorithm for determining the type of the result of multiplication is described in the JIRA. DECIMAL V1: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,38)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,30)| +---+ DECIMAL V2: +---+ | typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) | +---+ | DECIMAL(38,37)| +---+ +---+ | typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) | +---+ | DECIMAL(38,6) | +---+ In this patch, we also fix the early multiplication overflow. We compute a 256 bit integer intermediate value, which we then attempt to scale down and round. Performance: I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost identical. For TPCH Q1, there was an improvement from 21 seconds to 16 seconds. I did not see any regressions. The performance improvement is due to the way we check for overflows after this patch (by counting the leading zeros instead of dividing). It can be clealy seen in this query: select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1)) before: 7.85s after: 2.03s I noticed performance regressions in the following cases: - When we need to convert to a 256 bit integer before multiplying, which was introduced in this patch. Whether this happens depends on the resulting precision and the value of the inputs. In the following extreme case, the intermediate value is converted to a 256 bit integer every time. select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37)) before: 14.56s (returns null) after: 126.17s - When we need to scale down the intermediate value. In the following query the result is decimal(38,6) after the patch, so the intermediate needs to be scaled down. select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19)) before: 7.25s after: 13.06s These regressions are possible only when the resulting precision is 38 which is not common in typical workloads. Note: The actual queries that I ran for the benchmark are not exactly as above. I constructed tables with millions of rows with those values. I ran the queries with DECIMAL_v2=1 option before and after the patch. Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M be/src/util/bit-util.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java 4 files changed, 292 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/7 -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 7 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4786: Clean up how ImpalaServers are created
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8076 ) Change subject: IMPALA-4786: Clean up how ImpalaServers are created .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1287/ -- To view, visit http://gerrit.cloudera.org:8080/8076 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If388c5618258a9c4529cd1d63e956566b92bd0d8 Gerrit-Change-Number: 8076 Gerrit-PatchSet: 6 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Mon, 02 Oct 2017 22:16:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4786: Clean up how ImpalaServers are created
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/8076 ) Change subject: IMPALA-4786: Clean up how ImpalaServers are created .. Patch Set 6: Code-Review+2 Hit IMPALA-5999 twice in a row. Trying GVO again. Rebase, carry +2. -- To view, visit http://gerrit.cloudera.org:8080/8076 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If388c5618258a9c4529cd1d63e956566b92bd0d8 Gerrit-Change-Number: 8076 Gerrit-PatchSet: 6 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Mon, 02 Oct 2017 22:16:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5529: [DOCS] New trunc() signatures
Greg Rahn has posted comments on this change. ( http://gerrit.cloudera.org:8080/8189 ) Change subject: IMPALA-5529: [DOCS] New trunc() signatures .. Patch Set 3: > (1 comment) There should be TRUNC(timestamp) and TRUNC(number) -- Oracle docs have exactly this. https://docs.oracle.com/database/121/SQLRF/functions237.htm#SQLRF06150 https://docs.oracle.com/database/121/SQLRF/functions236.htm#SQLRF06151 -- To view, visit http://gerrit.cloudera.org:8080/8189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ice4753dee4f7b8e09c35508a9cad1e36f4ab2826 Gerrit-Change-Number: 8189 Gerrit-PatchSet: 3 Gerrit-Owner: John RussellGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: John Russell Gerrit-Reviewer: Kim Jin Chul Gerrit-Comment-Date: Mon, 02 Oct 2017 22:15:18 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Explain Boost setting needed for 96-bit timestamps
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/7983 ) Change subject: [DOCS] Explain Boost setting needed for 96-bit timestamps .. Patch Set 2: If we're not going to move this forward can we abandon it? -- To view, visit http://gerrit.cloudera.org:8080/7983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4b67cd7762f682c3a054e0d9641080aa51801c83 Gerrit-Change-Number: 7983 Gerrit-PatchSet: 2 Gerrit-Owner: John RussellGerrit-Reviewer: John Russell Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 02 Oct 2017 22:13:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8056 ) Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet .. Patch Set 5: (2 comments) I'm close to a +1 on this. Only minor thoughts left for me. I just invited Tim to the review. http://gerrit.cloudera.org:8080/#/c/8056/5/tests/query_test/test_scanners_fuzz.py File tests/query_test/test_scanners_fuzz.py: http://gerrit.cloudera.org:8080/#/c/8056/5/tests/query_test/test_scanners_fuzz.py@103 PS5, Line 103: with no compression. Nit: I would emphasize that this has a new table. Something like: "into a new table with no compression" http://gerrit.cloudera.org:8080/#/c/8056/5/tests/query_test/test_scanners_fuzz.py@110 PS5, Line 110: for orig_tbl_name in tbl_list: : src_table_name = "parquet_uncomp_src_" + orig_tbl_name : dst_table_name = "parquet_uncomp_dst_" + orig_tbl_name Nit: I think I would prefer to have "dst_table_name" be "fuzz_table_name". Also, in other code, I think it would be good to be explicit about this being the fuzz_db and fuzz_table rather than dst_db and dst_table. -- To view, visit http://gerrit.cloudera.org:8080/8056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc Gerrit-Change-Number: 8056 Gerrit-PatchSet: 5 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnellGerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 02 Oct 2017 22:05:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5425: Add test for validating input when setting query options
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/7805 ) Change subject: IMPALA-5425: Add test for validating input when setting query options .. Patch Set 12: (16 comments) http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc File be/src/service/query-options-test.cc: http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@59 PS11, Line 59: Enu > what's the reasoning behind the terminology "Enum" here? oh, is it that th Done http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@68 PS11, Line 68: > what's that? It's an outdated comment and is updated. http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@71 PS11, Line 71: > is that modified? if yes, we generally use pointer instead of reference. If Done http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@78 PS11, Line 78: template > same comments Done http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@86 PS11, Line 86: // Create a shortcut function to test if 'str' would result into a failure. > document what this test case is verifying. Done http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@92 PS11, Line 92: T> > what is a "byte case"? oh, maybe it means query option that take a number Done http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@94 PS11, Line 94: [options, > document what that is Done http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@95 PS11, Line 95: ueryOption(opt > same comment as above Done http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@132 PS11, Line 132: : TestError("1%"); > that's the explanation I think is missing above. Done http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@161 PS11, Line 161:{M > explain what it means to "test" an enum. Done http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@161 PS11, Line 161: can_range_length), {-1, I64_MAX}} > what is that trying to convey? It tries to explain the parameter accept_integer. Now documented in params. http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@162 PS11, Line 162: {MAKE_OPTIONDEF(rm_initial_mem),{-1, I64_MAX}}, : {MAKE_OPTIONDEF(buffer_pool_limit), {-1, I64_MAX}}, : {MAKE_OPTIONDEF(max_ > document the params (include template param). Done http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@218 PS11, Line 218: / ENTRY(_, TPrefetch > let's be clearer about what we verify. Done http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@243 PS11, Line 243: estEnumCase(, CASE(compression_codec, > same I'm not sure how to clarify what we verify here. The rules are described in the comments below and there isn't much to summary. http://gerrit.cloudera.org:8080/#/c/7805/11/testdata/workloads/functional-query/queries/QueryTest/set.test File testdata/workloads/functional-query/queries/QueryTest/set.test: http://gerrit.cloudera.org:8080/#/c/7805/11/testdata/workloads/functional-query/queries/QueryTest/set.test@a145 PS11, Line 145: > does the new unit test really give this same coverage? does it verify these Error message is not verified. Do you suggest leaving them here or testing them in query-option-test.cc? If we leave them here we might add verification of other error messages as well. Testing them in query-option-test involves redesigning it completely. http://gerrit.cloudera.org:8080/#/c/7805/11/testdata/workloads/functional-query/queries/QueryTest/set.test@a293 PS11, Line 293: > same question See above. -- To view, visit http://gerrit.cloudera.org:8080/7805 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7 Gerrit-Change-Number: 7805 Gerrit-PatchSet: 12 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 02 Oct 2017 22:03:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5425: Add test for validating input when setting query options
Tianyi Wang has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/7805 ) Change subject: IMPALA-5425: Add test for validating input when setting query options .. IMPALA-5425: Add test for validating input when setting query options This patch adds multiple query option validation testcases to be/src/service/query-options-test.cc The test cases include parsing edge cases, bondary values, special cases for some options and some testcases moved from testdata/workloads/functional-query/queries/QueryTest/set.test This patch also fixes a bug generating wrong error message for query option RUNTIME_FILTER_WAIT_TIME_MS. Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M testdata/workloads/functional-query/queries/QueryTest/set.test 3 files changed, 301 insertions(+), 154 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/7805/12 -- To view, visit http://gerrit.cloudera.org:8080/7805 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7 Gerrit-Change-Number: 7805 Gerrit-PatchSet: 12 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 ) Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala .. Patch Set 9: (4 comments) http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc File be/src/exprs/timestamp-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@136 PS9, Line 136: if(!tv.HasDateAndTime()){ here an elsewhere - please follow the style used by the rest of impala. http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@137 PS9, Line 137: UnixMicrosToUtcTimestamp is that meaningful to the user? http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@139 PS9, Line 139: } are the ones you added the only places where your new validation might produce a !HasDateAndTime() TimestampValue? http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@751 PS9, Line 751: datetime.date().year(); it still seems like this is doing the same validation. should we remove this an instead do the result_value.HasDateAndTime() check here? Or is this really a different case? -- To view, visit http://gerrit.cloudera.org:8080/7954 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff Gerrit-Change-Number: 7954 Gerrit-PatchSet: 9 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Mon, 02 Oct 2017 21:51:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 ) Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h File be/src/util/dict-encoding.h: http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@412 PS3, Line 412: ConsumeBytes(sizeof(value)); > using num_values * size of type might not work when dealing with variable s For strings, the StringValue is a pointer and a length. In the case of the dictionary, the pointer is pointing into the dictionary (allocated from dictionary_pool_ in InitDictionary()). Since the memory for the dictionary page is already tracked, we only need to track these StringValue's, which are fixed-size. Either way, we should aggregate the Consume() calls here. -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 3 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet VigGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Mon, 02 Oct 2017 21:48:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5448: fix invalid number of splits reported in Parquet scan node
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8147 ) Change subject: IMPALA-5448: fix invalid number of splits reported in Parquet scan node .. Patch Set 2: (6 comments) http://gerrit.cloudera.org:8080/#/c/8147/2/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: http://gerrit.cloudera.org:8080/#/c/8147/2/be/src/exec/hdfs-scan-node-base.h@557 PS2, Line 557: __builtin_popcount Should call BitUtil::Popcount(), which will use hardware acceleration if appropriate. http://gerrit.cloudera.org:8080/#/c/8147/2/be/src/exec/hdfs-scan-node-base.h@579 PS2, Line 579: bit_map We put an underscore at the end of private members, i.e. 'bit_map_' http://gerrit.cloudera.org:8080/#/c/8147/2/be/src/exec/hdfs-scan-node-base.h@582 PS2, Line 582: /// Mapping of file formats (file type, compression types set) to the number of Not your change, but it should mention the second entry in the tuple - whether the split was skipped. http://gerrit.cloudera.org:8080/#/c/8147/2/testdata/datasets/functional/functional_schema_template.sql File testdata/datasets/functional/functional_schema_template.sql: http://gerrit.cloudera.org:8080/#/c/8147/2/testdata/datasets/functional/functional_schema_template.sql@1581 PS2, Line 1581: -- IMPALA-5448: parquet files with multiple compression types We moved to loading "special" files as part of the tests rather than part of the data loading in a lot of cases. I think that is better practically because if you change this template then everyone has to reload data. I commented on an instance of the alternative approach that we should switch to. http://gerrit.cloudera.org:8080/#/c/8147/2/tests/query_test/test_scanners.py File tests/query_test/test_scanners.py: http://gerrit.cloudera.org:8080/#/c/8147/2/tests/query_test/test_scanners.py@82 PS2, Line 82: def test_hdfs_parquet_scan_node_profile(self, vector): This only applies to parquet so should go in TestParquet below (TestScannersAllTableFormats runs the test for all table formats). http://gerrit.cloudera.org:8080/#/c/8147/2/tests/query_test/test_scanners.py@337 PS2, Line 337: def test_corrupt_rle_counts(self, vector, unique_database): This is an example of the alternative way of loading data files as part of the test. -- To view, visit http://gerrit.cloudera.org:8080/8147 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaacc2d775032f5707061e704f12e0a63cde695d1 Gerrit-Change-Number: 8147 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong HuangGerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 02 Oct 2017 21:47:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4670: Introduces RpcMgr class
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7901 ) Change subject: IMPALA-4670: Introduces RpcMgr class .. Patch Set 8: Code-Review+2 (7 comments) http://gerrit.cloudera.org:8080/#/c/7901/8/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: http://gerrit.cloudera.org:8080/#/c/7901/8/be/src/rpc/rpc-mgr.h@90 PS8, Line 90: thrads threads http://gerrit.cloudera.org:8080/#/c/7901/8/be/src/rpc/rpc-mgr.h@90 PS8, Line 90: some this is kind of a detail but "some" implies that requests can bypass the queue when there is an available service thread. is that actually the case? http://gerrit.cloudera.org:8080/#/c/7901/6/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: http://gerrit.cloudera.org:8080/#/c/7901/6/be/src/rpc/rpc-mgr.h@76 PS6, Line 76: /// terminated. RpcMgr::Init() and RpcMgr::Shutdown() are not thread safe. > There is none actually. The assumption is that Init() is called from the si Okay. The case I'm worried about with Shutdown() is if this happens while RPCs are in flight, what happens. It'd be nice to move Impala toward having clean exit behavior, rather than adding more things that could crash when e.g. the impalad process gets a term signal. http://gerrit.cloudera.org:8080/#/c/7901/6/be/src/rpc/rpc-mgr.h@133 PS6, Line 133: > Normally, we shouldn't shutdown as there is really no clean way to shut dow See my comment above about clean process shutdown. Not sure there's anything to tackle now but let's think it through. http://gerrit.cloudera.org:8080/#/c/7901/6/be/src/rpc/rpc-mgr.h@156 PS6, Line 156: > This is similar to a shared_ptr in a way. KRPC messenger internally uses a Not suggesting we change it, just want to check that the scoped_refptr is dictated by the krpc layer API. http://gerrit.cloudera.org:8080/#/c/7901/8/be/src/scheduling/scheduler.cc File be/src/scheduling/scheduler.cc: http://gerrit.cloudera.org:8080/#/c/7901/8/be/src/scheduling/scheduler.cc@72 PS8, Line 72: is delete http://gerrit.cloudera.org:8080/#/c/7901/8/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/7901/8/be/src/service/impala-server.cc@1662 PS8, Line 1662: } okay. or maybe the exec env should just have the resolved IP in it from ExecEnv::StartServices()? -- To view, visit http://gerrit.cloudera.org:8080/7901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8adb10ae375d7bf945394c38a520f12d29cf7b46 Gerrit-Change-Number: 7901 Gerrit-PatchSet: 8 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Mon, 02 Oct 2017 21:42:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4736: Add SIGUSR1 behavior to help string for 'minidump path' flag
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8164 ) Change subject: IMPALA-4736: Add SIGUSR1 behavior to help string for 'minidump_path' flag .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1286/ -- To view, visit http://gerrit.cloudera.org:8080/8164 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8f4da8ff12c10e4a84a339ddb4036d1682b3a2a0 Gerrit-Change-Number: 8164 Gerrit-PatchSet: 6 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 02 Oct 2017 21:31:22 + Gerrit-HasComments: No
[Impala-ASF-CR] Re-apply: IMPALA-5589: change "set" in impala-shell to show empty string for unset query options
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8096 ) Change subject: Re-apply: IMPALA-5589: change "set" in impala-shell to show empty string for unset query options .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1285/ -- To view, visit http://gerrit.cloudera.org:8080/8096 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I29f5d8ab874cb1338077f16019a9537766cac0c4 Gerrit-Change-Number: 8096 Gerrit-PatchSet: 4 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Mon, 02 Oct 2017 21:07:57 + Gerrit-HasComments: No
[Impala-ASF-CR] Re-apply: IMPALA-5589: change "set" in impala-shell to show empty string for unset query options
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8096 ) Change subject: Re-apply: IMPALA-5589: change "set" in impala-shell to show empty string for unset query options .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8096 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I29f5d8ab874cb1338077f16019a9537766cac0c4 Gerrit-Change-Number: 8096 Gerrit-PatchSet: 4 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Mon, 02 Oct 2017 21:07:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals
Hello Lars Volker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8051 to look at the new patch set (#2). Change subject: IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals .. IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals The timestamp conversion from negative fractional Decimal types (interpreted as unix timestamp) was wrong - the whole part was rounded toward zero, and fractional part was being added instead of being subtracted. Example for the wrong behaivour: +-+ | cast(cast(-0.1 as decimal(20,10)) as timestamp) | +-+ | 1970-01-01 00:00:00.1 | +-+ while casting to double works correctly: +-+ | cast(cast(-0.1 as double) as timestamp) | +-+ | 1969-12-31 23:59:59.9 | +-+ Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0 --- M be/src/exprs/decimal-operators-ir.cc M be/src/exprs/expr-test.cc 2 files changed, 9 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/8051/2 -- To view, visit http://gerrit.cloudera.org:8080/8051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0 Gerrit-Change-Number: 8051 Gerrit-PatchSet: 2 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-4252: Move runtime filters to ScanNode
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8148 ) Change subject: IMPALA-4252: Move runtime filters to ScanNode .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17bdc869046dc2cd837d02f333441fa6324ff086 Gerrit-Change-Number: 8148 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 02 Oct 2017 20:01:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4252: Move runtime filters to ScanNode
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8148 ) Change subject: IMPALA-4252: Move runtime filters to ScanNode .. IMPALA-4252: Move runtime filters to ScanNode As a preliminary step towards adding runtime filters for Kudu scans, this patch moves runtime filter related code from HdfsScanNodeBase to ScanNode so that it's available to KuduScanNodeBase. The change was mechanical with no logic changes, except for moving the calculation of the max wait time into WaitForRuntimeFilters(). Testing: - Ran existing runtime filter tests. Change-Id: I17bdc869046dc2cd837d02f333441fa6324ff086 Reviewed-on: http://gerrit.cloudera.org:8080/8148 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/kudu-scan-node-base.cc M be/src/exec/kudu-scan-node-base.h M be/src/exec/scan-node.cc M be/src/exec/scan-node.h 6 files changed, 131 insertions(+), 114 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I17bdc869046dc2cd837d02f333441fa6324ff086 Gerrit-Change-Number: 8148 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5983: Fix crash in to/from utc timestamp("10:00:00", 'MSK')
Hello Lars Volker, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8139 to look at the new patch set (#4). Change subject: IMPALA-5983: Fix crash in to/from_utc_timestamp("10:00:00", 'MSK') .. IMPALA-5983: Fix crash in to/from_utc_timestamp("10:00:00", 'MSK') Moscow timezone is handled differrently than other timezones, and it was possible to cause unhandled boost exception by trying to convert "dateless" timestamps like "10:00:00". These timestamps cannot be handled by timestamp conversion generally, because daylight saving time logic needs month and day to work correctly. The condition HasDateOrTime() incorrectly suggested that these timestamps can be handled, so I made the condition stricter. Change-Id: I592389333a16a15a680beed389e2702dfc16fe1d --- M be/src/exprs/expr-test.cc M be/src/exprs/timestamp-functions.cc 2 files changed, 7 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/8139/4 -- To view, visit http://gerrit.cloudera.org:8080/8139 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I592389333a16a15a680beed389e2702dfc16fe1d Gerrit-Change-Number: 8139 Gerrit-PatchSet: 4 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-4622: [DOCS] New Kudu ALTER TABLE syntax
John Russell has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/8191 ) Change subject: IMPALA-4622: [DOCS] New Kudu ALTER TABLE syntax .. IMPALA-4622: [DOCS] New Kudu ALTER TABLE syntax ALTER TABLE syntax for changing Kudu column default and storage attributes. Also SET COMMENT for non-Kudu tables. Change-Id: I7647dfe8acfdb598caf561d41d74e38a8b66ce9d --- M docs/topics/impala_alter_table.xml 1 file changed, 69 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/8191/2 -- To view, visit http://gerrit.cloudera.org:8080/8191 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7647dfe8acfdb598caf561d41d74e38a8b66ce9d Gerrit-Change-Number: 8191 Gerrit-PatchSet: 2 Gerrit-Owner: John Russell
[Impala-ASF-CR] IMPALA-4736: Add SIGUSR1 behavior to help string for 'minidump path' flag
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/8164 ) Change subject: IMPALA-4736: Add SIGUSR1 behavior to help string for 'minidump_path' flag .. Patch Set 6: (1 comment) Can you re-run the jenkins test? It failed because of TestKuduClientTimeout.test_catalogd_timeout, which was removed in https://github.com/cloudera/Impala/commit/fc54882d9a4769263d800d0654048f0ab1252153 , so I hope that the rebase has solved the issue. http://gerrit.cloudera.org:8080/#/c/8164/4/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/8164/4/be/src/common/global-flags.cc@103 PS4, Line 103: be w > Remove "only", since it conflicts with the following sentence. Done -- To view, visit http://gerrit.cloudera.org:8080/8164 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8f4da8ff12c10e4a84a339ddb4036d1682b3a2a0 Gerrit-Change-Number: 8164 Gerrit-PatchSet: 6 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 02 Oct 2017 18:10:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3504: [DOCS] Document utc timestamp()
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/8190 ) Change subject: IMPALA-3504: [DOCS] Document utc_timestamp() .. Patch Set 1: all these seem to be added in the may-july 2017 time-frame utc_to_unix_micros() was added as a part of IMPALA-5137 in may 2017 timestamp_from_unix_micros() added in IMPALA-5338 (may 2017) was renamed to unix_micros_to_utc_timestamp() in IMPALA-5539 (july 2017) -- To view, visit http://gerrit.cloudera.org:8080/8190 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2662fc79d588f22a24a5067429a57b3c0d0f0f0 Gerrit-Change-Number: 8190 Gerrit-PatchSet: 1 Gerrit-Owner: John RussellGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Greg Rahn Gerrit-Comment-Date: Mon, 02 Oct 2017 18:02:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 ) Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h File be/src/runtime/timestamp-value.h: http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h@86 PS2, Line 86: } > But what we need is to log to the warning log, like the code in timestamp-f I have added some logging to warning. The string -> timestamp cast is not related to this issue, but I think it is very useful to warn the user about the invalid formatting. I have left the logging in TimestampValue::Validate, maybe it will be useful for postmortem analysis one day. http://gerrit.cloudera.org:8080/#/c/7954/8/be/src/runtime/timestamp-value.cc File be/src/runtime/timestamp-value.cc: http://gerrit.cloudera.org:8080/#/c/7954/8/be/src/runtime/timestamp-value.cc@116 PS8, Line 116: /// The time zone of the resulting ptime is local time. This is called by > nit: for style consistency combine lines 115 & 116. Done -- To view, visit http://gerrit.cloudera.org:8080/7954 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff Gerrit-Change-Number: 7954 Gerrit-PatchSet: 2 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Mon, 02 Oct 2017 18:00:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala
Hello Lars Volker, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7954 to look at the new patch set (#9). Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala .. IMPALA-5664: Unix time to timestamp conversions may crash Impala TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros() are incorrect only in case of the last second of 1399, because these sub-second values are rounded first towards 1400-01-01 00:00:00, which is accepted as a valid date, and the sub-second part is subtracted afterwards, leading to a date outside the valid interval. The maximum case, -12-31 59:59:59 is a bit different, because as I understand, with nanosecond precision posix times, the maximum value is actually 1-01-01. 00:00:00 minus 1 nanosec. TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues both <1400 and 1<=. These timestamps can cause problems, because most code assumes that if HasDate/HasTime is true, then it really is a valid timestamp. To fix this, the posix times are checked in the constructor of TimestampValue, and if it is outside the valid interval,both time_ and date_ are set to not_a_date_time. Also added logging to "cast to timestamp" functions. Test: select cast(-17987443200-0.1 as timestamp); This query no longer crashes, but returns NULL, similarly to other < 1400 timestamps. Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff --- M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-stats.inline.h M be/src/exprs/cast-functions-ir.cc M be/src/exprs/timestamp-functions-ir.cc M be/src/runtime/timestamp-test.cc M be/src/runtime/timestamp-value.cc M be/src/runtime/timestamp-value.h 7 files changed, 66 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/9 -- To view, visit http://gerrit.cloudera.org:8080/7954 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff Gerrit-Change-Number: 7954 Gerrit-PatchSet: 9 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-4951: Fix database visibility for user with only column privilege
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/8168 ) Change subject: IMPALA-4951: Fix database visibility for user with only column privilege .. Patch Set 5: Code-Review+2 rebased, carried +2 -- To view, visit http://gerrit.cloudera.org:8080/8168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id77904876729c0223fd6ace2d5e7199bd700a33a Gerrit-Change-Number: 8168 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 02 Oct 2017 17:16:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5529: [DOCS] New trunc() signatures
John Russell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8189 ) Change subject: IMPALA-5529: [DOCS] New trunc() signatures .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8189/3/docs/topics/impala_datetime_functions.xml File docs/topics/impala_datetime_functions.xml: http://gerrit.cloudera.org:8080/#/c/8189/3/docs/topics/impala_datetime_functions.xml@2174 PS3, Line 2174: trunc(numeric n), : trunc(decimal d, integer scale) Why don't we add a pointer from dtrunc / truncate to here. Those are covered under "Mathematical Functions". Actually it feels strange to have trunc() doing double duty as a date/time functiona and a math function, both listed on the Date/Time Functions page. But I don't want to make entries labelled "trunc()" on both pages. Maybe I can reuse some of the text and examples for numeric trunc() under truncate() on the math functions page. -- To view, visit http://gerrit.cloudera.org:8080/8189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ice4753dee4f7b8e09c35508a9cad1e36f4ab2826 Gerrit-Change-Number: 8189 Gerrit-PatchSet: 3 Gerrit-Owner: John RussellGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: John Russell Gerrit-Reviewer: Kim Jin Chul Gerrit-Comment-Date: Mon, 02 Oct 2017 17:15:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5425: Add test for validating input when setting query options
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7805 ) Change subject: IMPALA-5425: Add test for validating input when setting query options .. Patch Set 11: > (16 comments) > > I agree this is generally pretty tough to read. We wouldn't want to > add so much metaprogramming to impala, but I'm okay with moving > forward with it if we keep it to the unit test, provided we clarify > the comments (which are extra important in this case). The naming of fields using structs rather than pair<> definitely helped, though. -- To view, visit http://gerrit.cloudera.org:8080/7805 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7 Gerrit-Change-Number: 7805 Gerrit-PatchSet: 11 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 02 Oct 2017 17:01:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5425: Add test for validating input when setting query options
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7805 ) Change subject: IMPALA-5425: Add test for validating input when setting query options .. Patch Set 11: (16 comments) I agree this is generally pretty tough to read. We wouldn't want to add so much metaprogramming to impala, but I'm okay with moving forward with it if we keep it to the unit test, provided we clarify the comments (which are extra important in this case). http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc File be/src/service/query-options-test.cc: http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@59 PS11, Line 59: Enum what's the reasoning behind the terminology "Enum" here? oh, is it that this is only used to test query options that take an enumeration for possible values? let's be more explicit. http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@68 PS11, Line 68: 'value' what's that? http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@71 PS11, Line 71: TQueryOptions& options is that modified? if yes, we generally use pointer instead of reference. If not, then "const ref" is okay. also document these params http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@78 PS11, Line 78: // Create a shortcut function to test if 'value' would result into a failure. same comments http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@86 PS11, Line 86: TEST(QueryOptions, SetNonExistentOptions) { document what this test case is verifying. http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@92 PS11, Line 92: byte cases what is a "byte case"? oh, maybe it means query option that take a number of bytes as input. let's be explicit about this. and what does it mean to "test a set" of them? please summarize what we verify. http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@94 PS11, Line 94: typename T document what that is http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@95 PS11, Line 95: TQueryOptions& same comment as above http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@132 PS11, Line 132: Byte size options accept integers and integers with a : // suffix like "KB". that's the explanation I think is missing above. http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@161 PS11, Line 161: Test explain what it means to "test" an enum. http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@161 PS11, Line 161: It may or may not accept integers what is that trying to convey? http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@162 PS11, Line 162: template : void TestEnumCase(TQueryOptions& options, const EnumCase& test_case, : bool accept_integer) { document the params (include template param). http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@218 PS11, Line 218: Test integer options let's be clearer about what we verify. http://gerrit.cloudera.org:8080/#/c/7805/11/be/src/service/query-options-test.cc@243 PS11, Line 243: Test options with non regular validation rule. same http://gerrit.cloudera.org:8080/#/c/7805/11/testdata/workloads/functional-query/queries/QueryTest/set.test File testdata/workloads/functional-query/queries/QueryTest/set.test: http://gerrit.cloudera.org:8080/#/c/7805/11/testdata/workloads/functional-query/queries/QueryTest/set.test@a145 PS11, Line 145: does the new unit test really give this same coverage? does it verify these error messages? http://gerrit.cloudera.org:8080/#/c/7805/11/testdata/workloads/functional-query/queries/QueryTest/set.test@a293 PS11, Line 293: same question -- To view, visit http://gerrit.cloudera.org:8080/7805 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7 Gerrit-Change-Number: 7805 Gerrit-PatchSet: 11 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 02 Oct 2017 17:00:18 + Gerrit-HasComments: Yes
[native-toolchain-CR] Bump Kudu version to bec2a24
Thomas Tauber-Marshall has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8192 Change subject: Bump Kudu version to bec2a24 .. Bump Kudu version to bec2a24 Change-Id: I1d1416e475488db580a4f9cb8ed2882440d9abcd --- M buildall.sh 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/92/8192/1 -- To view, visit http://gerrit.cloudera.org:8080/8192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I1d1416e475488db580a4f9cb8ed2882440d9abcd Gerrit-Change-Number: 8192 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall
[Impala-ASF-CR](asf-site) A blog post about IMPALA-4916
Jim Apple has abandoned this change. ( http://gerrit.cloudera.org:8080/5995 ) Change subject: A blog post about IMPALA-4916 .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/5995 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-MessageType: abandon Gerrit-Change-Id: Ifa90bf5621ef6466a4821f77a6e8a8b20c5512ae Gerrit-Change-Number: 5995 Gerrit-PatchSet: 3 Gerrit-Owner: Jim AppleGerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-2190: [DOCS] from timestamp() and to timestamp()
Andre Calfa has posted comments on this change. ( http://gerrit.cloudera.org:8080/8046 ) Change subject: IMPALA-2190: [DOCS] from_timestamp() and to_timestamp() .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8046 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5f6dfd3fb99a70975d712bbef6c05900eddadd27 Gerrit-Change-Number: 8046 Gerrit-PatchSet: 2 Gerrit-Owner: John RussellGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Andre Calfa Gerrit-Reviewer: Greg Rahn Gerrit-Comment-Date: Mon, 02 Oct 2017 16:24:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4252: Move runtime filters to ScanNode
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8148 ) Change subject: IMPALA-4252: Move runtime filters to ScanNode .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1284/ -- To view, visit http://gerrit.cloudera.org:8080/8148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17bdc869046dc2cd837d02f333441fa6324ff086 Gerrit-Change-Number: 8148 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 02 Oct 2017 16:00:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5383: [DOCS] Document unpartitioned Kudu tables
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/8180 ) Change subject: IMPALA-5383: [DOCS] Document unpartitioned Kudu tables .. Patch Set 3: (1 comment) > Thomas, can I tag you in for a few Kudu-related reviews for 5.13, > in place of MJ? Yep, happy to take a look http://gerrit.cloudera.org:8080/#/c/8180/3/docs/topics/impala_create_table.xml File docs/topics/impala_create_table.xml: http://gerrit.cloudera.org:8080/#/c/8180/3/docs/topics/impala_create_table.xml@483 PS3, Line 483: -- Single partition. Only for When I built this locally, these tags were present in the output. -- To view, visit http://gerrit.cloudera.org:8080/8180 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia2b466e1e482d62de84253c0cb406668fd5ad5eb Gerrit-Change-Number: 8180 Gerrit-PatchSet: 3 Gerrit-Owner: John RussellGerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: John Russell Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 02 Oct 2017 15:54:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4252: Move runtime filters to ScanNode
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/8148 ) Change subject: IMPALA-4252: Move runtime filters to ScanNode .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8148/1/be/src/exec/scan-node.cc File be/src/exec/scan-node.cc: http://gerrit.cloudera.org:8080/#/c/8148/1/be/src/exec/scan-node.cc@79 PS1, Line 79: // TODO: Move this to Prepare() > Not sure, it was left by Michael (I'll ping him about it) in the review "Di Yes, I should have probably left more details. The original plan for the MT work is to have one filter bank shared across all fragment instances. The single PlanNode instance of ScanNode will then register the filter during the PlanNode tree creation. Then, during ScanNode::Prepare() for an ExecTree instance (inside a fragment instance), we will just look up the filter in the shared filter bank and fill in the filter_ctxs_. -- To view, visit http://gerrit.cloudera.org:8080/8148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17bdc869046dc2cd837d02f333441fa6324ff086 Gerrit-Change-Number: 8148 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 02 Oct 2017 06:44:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3504: [DOCS] Document utc timestamp()
John Russell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8190 Change subject: IMPALA-3504: [DOCS] Document utc_timestamp() .. IMPALA-3504: [DOCS] Document utc_timestamp() This function seems to be related to unix_micros_to_utc_timestamp() and utc_to_unix_micros() which are also not currently documented. Did they all come in at the same time or have the *unix_micros* ones existed for a while? Change-Id: Ia2662fc79d588f22a24a5067429a57b3c0d0f0f0 --- M docs/topics/impala_datetime_functions.xml 1 file changed, 56 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/8190/1 -- To view, visit http://gerrit.cloudera.org:8080/8190 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ia2662fc79d588f22a24a5067429a57b3c0d0f0f0 Gerrit-Change-Number: 8190 Gerrit-PatchSet: 1 Gerrit-Owner: John Russell
[Impala-ASF-CR] IMPALA-2190: [DOCS] from timestamp() and to timestamp()
John Russell has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/8046 ) Change subject: IMPALA-2190: [DOCS] from_timestamp() and to_timestamp() .. IMPALA-2190: [DOCS] from_timestamp() and to_timestamp() These functions have been around for a while but didn't get picked up in the big wave of built-in functions documented in Impala 2.3. Change-Id: I5f6dfd3fb99a70975d712bbef6c05900eddadd27 (cherry picked from commit 60b7540c9938c6d4fcff26d610cd8c0b0ef6cbce) --- M docs/topics/impala_datetime_functions.xml 1 file changed, 142 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/8046/2 -- To view, visit http://gerrit.cloudera.org:8080/8046 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5f6dfd3fb99a70975d712bbef6c05900eddadd27 Gerrit-Change-Number: 8046 Gerrit-PatchSet: 2 Gerrit-Owner: John Russell