[Impala-ASF-CR] IMPALA-5768: Better developer documentation
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5768: Better developer documentation .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7350 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach AmsdenGerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] MPALA-5776: Write partial tuple to the correct mempool
Taras Bobrovytsky has uploaded a new patch set (#2). Change subject: MPALA-5776: Write partial tuple to the correct mempool .. MPALA-5776: Write partial tuple to the correct mempool In the text scanner, we were writing the partial tuple variable length data to data_buffer_pool_ mempool which caused strange behavior, such as incorrect results. If we are scanning compressed data, the pool gets attached to the row batch at the end of a GetNext() call and gets freed before the next GetNext() call. This is wrong because we expect the data in the partial tuple to survive between the GetNext() calls. If we are scanning non compressed data, data_buffer_pool_ never gets cleared and grows over time until the scanner finishes reading the scan range. We fix the problem by writing the varlen partial tuple data to boundary_pool_, which is where the constant length partial tuple data is written. Testing: - Ran some tests locally on ASAN build. - No new tests were added, because it is difficult to construct test cases due to the issue being non-deterministic. Change-Id: I60ba5c113aefd17f697c1888fd46a237ef396540 --- M be/src/exec/hdfs-text-scanner.cc 1 file changed, 13 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/7639/2 -- To view, visit http://gerrit.cloudera.org:8080/7639 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I60ba5c113aefd17f697c1888fd46a237ef396540 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] MPALA-5776: Write partial tuple to the correct mempool
Taras Bobrovytsky has uploaded a new patch set (#2). Change subject: MPALA-5776: Write partial tuple to the correct mempool .. MPALA-5776: Write partial tuple to the correct mempool In the text scanner, we were writing the partial tuple variable length data to data_buffer_pool_ mempool which caused strange behavior, such as incorrect results. If we are scanning compressed data, the pool gets attached to the row batch at the end of a GetNext() call and gets freed before the next GetNext() call. This is wrong because we expect the data in the partial tuple to survive between the GetNext() calls. If we are scanning non compressed data, data_buffer_pool_ never gets cleared and grows over time until the scanner finishes reading the scan range. We fix the problem by writing the varlen partial tuple data to boundary_pool_, which is where the constant length partial tuple data is written. Testing: - Ran some tests locally on ASAN build. - No new tests were added, because it is difficult to construct test cases due to the issue being non-deterministic. Change-Id: I60ba5c113aefd17f697c1888fd46a237ef396540 --- M be/src/exec/hdfs-text-scanner.cc 1 file changed, 13 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/7639/2 -- To view, visit http://gerrit.cloudera.org:8080/7639 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I60ba5c113aefd17f697c1888fd46a237ef396540 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] MPALA-5776: Write partial tuple to the correct mempool
Taras Bobrovytsky has uploaded a new patch set (#2). Change subject: MPALA-5776: Write partial tuple to the correct mempool .. MPALA-5776: Write partial tuple to the correct mempool In the text scanner, we were writing the partial tuple variable length data to data_buffer_pool_ mempool which caused strange behavior, such as incorrect results. If we are scanning compressed data, the pool gets attached to the row batch at the end of a GetNext() call and gets freed before the next GetNext() call. This is wrong because we expect the data in the partial tuple to survive between the GetNext() calls. If we are scanning non compressed data, data_buffer_pool_ never gets cleared and grows over time until the scanner finishes reading the scan range. We fix the problem by writing the varlen partial tuple data to boundary_pool_, which is where the constant length partial tuple data is written. Testing: - Ran some tests locally on ASAN build. - No new tests were added, because it is difficult to construct test cases due to the issue being non-deterministic. Change-Id: I60ba5c113aefd17f697c1888fd46a237ef396540 --- M be/src/exec/hdfs-text-scanner.cc 1 file changed, 13 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/7639/2 -- To view, visit http://gerrit.cloudera.org:8080/7639 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I60ba5c113aefd17f697c1888fd46a237ef396540 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] MPALA-5776: Write partial tuple to the correct mempool
Taras Bobrovytsky has uploaded a new patch set (#2). Change subject: MPALA-5776: Write partial tuple to the correct mempool .. MPALA-5776: Write partial tuple to the correct mempool In the text scanner, we were writing the partial tuple variable length data to data_buffer_pool_ mempool which caused strange behavior, such as incorrect results. If we are scanning compressed data, the pool gets attached to the row batch at the end of a GetNext() call and gets freed before the next GetNext() call. This is wrong because we expect the data in the partial tuple to survive between the GetNext() calls. If we are scanning non compressed data, data_buffer_pool_ never gets cleared and grows over time until the scanner finishes reading the scan range. We fix the problem by writing the varlen partial tuple data to boundary_pool_, which is where the constant length partial tuple data is written. Testing: - Ran some tests locally on ASAN build. - No new tests were added, because it is difficult to construct test cases due to the issue being non-deterministic. Change-Id: I60ba5c113aefd17f697c1888fd46a237ef396540 --- M be/src/exec/hdfs-text-scanner.cc 1 file changed, 14 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/7639/2 -- To view, visit http://gerrit.cloudera.org:8080/7639 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I60ba5c113aefd17f697c1888fd46a237ef396540 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] MPALA-5776: Write partial tuple to the correct mempool
Taras Bobrovytsky has uploaded a new change for review. http://gerrit.cloudera.org:8080/7639 Change subject: MPALA-5776: Write partial tuple to the correct mempool .. MPALA-5776: Write partial tuple to the correct mempool In the text scanner, we were writing the partial tuple variable length data to data_buffer_pool_ mempool which caused strange behavior, such as incorrect results. If we are scanning compressed data, the pool gets attached to the row batch at the end of a GetNext() call and gets freed before the next GetNext() call. This is wrong because we expect the data in the partial tuple to survive between the GetNext() calls. If we are scanning non compressed data, data_buffer_pool_ never gets cleared and grows over time until the scanner finishes reading the scan range. We fix the problem by writing the varlen partial tuple data to boundary_pool_, which is where the constant length partial tuple data is written. Testing: - Ran some tests locally on ASAN build. - No new tests were added, because it is difficult to construct test cases due to the issue being non-deterministic. Change-Id: I60ba5c113aefd17f697c1888fd46a237ef396540 --- M be/src/exec/hdfs-text-scanner.cc 1 file changed, 11 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/7639/1 -- To view, visit http://gerrit.cloudera.org:8080/7639 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I60ba5c113aefd17f697c1888fd46a237ef396540 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4737: Prevent SIGUSR1 from killing daemons when minidumps are disabled
Lars Volker has uploaded a new change for review. http://gerrit.cloudera.org:8080/7631 Change subject: IMPALA-4737: Prevent SIGUSR1 from killing daemons when minidumps are disabled .. IMPALA-4737: Prevent SIGUSR1 from killing daemons when minidumps are disabled If a user disabled minidumps before this change, we did not register the signal handler for SIGUSR1 at all. Sending SIGUSR1 to a daemon would subsequently kill it. This change registers a dummy handler to ignore the signal if minidumps are disabled. Change-Id: I13d866a2eec832500131954a7f693c33585ea51e --- M be/src/util/minidump.cc M tests/custom_cluster/test_breakpad.py 2 files changed, 33 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/7631/1 -- To view, visit http://gerrit.cloudera.org:8080/7631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I13d866a2eec832500131954a7f693c33585ea51e Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker
[Impala-ASF-CR] IMPALA-4833: Compute precise per-host reservation size
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4833: Compute precise per-host reservation size .. Patch Set 1: (3 comments) Overall I think this looks good - obviously it needs some tests but that's my main concern. I played around a bit and it seemed to work. E.g. this contrived query resulted in different reservations per node: select straight_join * from tpch_parquet.lineitem join (select o_shippriority, count(distinct o_orderkey) from tpch_parquet.orders group by 1) v on o_shippriority = l_orderkey; Per Host Min Reservation: tarmstrong-box:22000(36.12 MB) tarmstrong-box:22001(36.12 MB) tarmstrong-box:22002(1.06 MB) http://gerrit.cloudera.org:8080/#/c/7630/1/common/thrift/ImpalaInternalService.thrift File common/thrift/ImpalaInternalService.thrift: Line 484: 6: i64 min_reservation_bytes Are these optional or required? Not sure what the default is in thrift, we seem to be explicit elsewhere. http://gerrit.cloudera.org:8080/#/c/7630/1/common/thrift/Planner.thrift File common/thrift/Planner.thrift: Line 66: 7: optional i64 min_reservation_bytes It's kind-of confusing that these are optional, since the backend handles this case in exactly the same way as if they were zero. It might be clearer to make these required and set them to zero in PlanFragment.java if the profile is invalid. That way the invalid resource profile handling is more contained in PlanFragment. The profiles should only be invalid for join build sinks in parallel plans, which is explained over there. http://gerrit.cloudera.org:8080/#/c/7630/1/fe/src/main/java/org/apache/impala/planner/Planner.java File fe/src/main/java/org/apache/impala/planner/Planner.java: Line 365: // Different fragments do not synchronize their Open() and Close(), so the backend I think this comment is important for understand why we sum up the fragments. Can we reference this in coordinator-backend-state.cc briefly? -- To view, visit http://gerrit.cloudera.org:8080/7630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idbcd1e9b1be14edc4017b4907e83f9d56059fbac Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4863/IMPALA-5311: Correctly account the file type and compression codec
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4863/IMPALA-5311: Correctly account the file type and compression codec .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/7245/5/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: Line 885: ss << file_format << "/" << "Unknown"<< "(Skipped):" << it->second << " "; nit: missing space before << PS5, Line 888: << nit: align <<'s vertically (it's in the style guide somewhere). http://gerrit.cloudera.org:8080/#/c/7245/5/testdata/workloads/functional-query/queries/QueryTest/hdfs_scanner_profile.test File testdata/workloads/functional-query/queries/QueryTest/hdfs_scanner_profile.test: Line 9: QUERY I missed this in an earlier pass but we don't have a regression test for IMPALA-4863. Can we add tests for the runtime filter partition case, maybe for both text and parquet. E.g. this query works for me: set runtime_filter_wait_time_ms=10; select count(*) from tpcds.store_sales join tpcds.date_dim on ss_sold_date_sk = d_date_sk where d_qoy=1 -- To view, visit http://gerrit.cloudera.org:8080/7245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I797916505f62e568f4159e07099481b8ff571da2 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadkeGerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5764: Allow overriding packaged components .. Patch Set 1: Ping? -- To view, visit http://gerrit.cloudera.org:8080/7581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I95c2662e6f62adc924cc5de7a371202126046545 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach AmsdenGerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5768: Better developer documentation
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5768: Better developer documentation .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1027/ -- To view, visit http://gerrit.cloudera.org:8080/7350 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach AmsdenGerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5768: Better developer documentation
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5768: Better developer documentation .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7350 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach AmsdenGerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] PREVIEW: IMPALA-3208: max row size option
Tim Armstrong has posted comments on this change. Change subject: PREVIEW: IMPALA-3208: max_row_size option .. Patch Set 6: This is ready for review except for the missing end-to-end tests. -- To view, visit http://gerrit.cloudera.org:8080/7629 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic70f6dddbcef124bb4b329ffa2e42a74a1826570 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] PREVIEW: IMPALA-3208: max row size option
Tim Armstrong has uploaded a new patch set (#6). Change subject: PREVIEW: IMPALA-3208: max_row_size option .. PREVIEW: IMPALA-3208: max_row_size option This is a preview because it is missing tests. I have manually tested it and it is behaving it as expected so far. Adds support for a "max_row_size" query option that instructs Impala to reserve enough memory to process rows of the specified size. For spilling operators, the planner reserves enough memory to process rows of this size. The advantage of this compared to simply specifying larger values for min_spillable_buffer_size and default_spillable_buffer_size is that operators may be able to handler larger rows without increasing the size of all their buffers. The default value is 512KB. I picked that number because it doesn't increase minimum reservations *too* much but should be large enough for almost all reasonable workloads. This is implemented in the aggs and joins using the variable page size support added to BufferedTupleStream in an earlier commit. The synopsis is that each stream requires reservation for one default-sized page per read and write iterator, and temporarily requires reservation for a max-sized page when reading or writing larger pages. The max-sized write reservation is released immediately after the row is appended and the max-size read reservation is released after advancing to the next row. The sorter and analytic simply use max-sized buffers for all pages in the stream. Testing: Updated existing planner tests to reflect default max_row_size. Added new planner tests to test the effect of the query option. Added "set" test to check validation of query option. * TODO: Add end-to-end tests exercising all operators with large rows with and without spilling. Change-Id: Ic70f6dddbcef124bb4b329ffa2e42a74a1826570 --- M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-aggregation-node.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/runtime/buffered-tuple-stream.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/PlanNodes.thrift M common/thrift/generate_error_codes.py M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java A fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test A testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test M testdata/workloads/functional-query/queries/QueryTest/set.test M testdata/workloads/functional-query/queries/QueryTest/spilling.test 31 files changed, 690 insertions(+), 150 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/7629/6 -- To view, visit http://gerrit.cloudera.org:8080/7629 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70f6dddbcef124bb4b329ffa2e42a74a1826570 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Matthew Jacobs
[Impala-ASF-CR] PREVIEW: IMPALA-3208: max row size option
Tim Armstrong has uploaded a new patch set (#5). Change subject: PREVIEW: IMPALA-3208: max_row_size option .. PREVIEW: IMPALA-3208: max_row_size option This is a preview because it is missing tests. I have manually tested it and it is behaving it as expected so far. Adds support for a "max_row_size" query option that instructs Impala to reserve enough memory to process rows of the specified size. For spilling operators, the planner reserves enough memory to process rows of this size. The advantage of this compared to simply specifying larger values for min_spillable_buffer_size and default_spillable_buffer_size is that operators may be able to handler larger rows without increasing the size of all their buffers. This is implemented using the variable page size support added to BufferedTupleStream in an earlier commit. The synopsis is that each stream requires reservation for one default-sized page per read and write iterator, and temporarily requires reservation for a max-sized page when reading or writing larger pages. The max-sized write reservation is released immediately after the row is appended and the max-size read reservation is released after advancing to the next row. This means that in the aggs and joins we require one max-size read buffer for the read stream and one max-size write buffer that can be used to append a large value to any stream. The sorter and analytic are simpler: there we simply use the max-sized buffers for all pages in the stream. Testing: Updated existing planner tests to reflect default max_row_size. Added new planner tests to test the effect of the query option. Added "set" test to check validation of query option. * TODO: Add end-to-end tests exercising all operators with large rows with and without spilling. Change-Id: Ic70f6dddbcef124bb4b329ffa2e42a74a1826570 --- M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-aggregation-node.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/runtime/buffered-tuple-stream.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/PlanNodes.thrift M common/thrift/generate_error_codes.py M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java A fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test A testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test M testdata/workloads/functional-query/queries/QueryTest/set.test M testdata/workloads/functional-query/queries/QueryTest/spilling.test 31 files changed, 690 insertions(+), 150 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/7629/5 -- To view, visit http://gerrit.cloudera.org:8080/7629 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic70f6dddbcef124bb4b329ffa2e42a74a1826570 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-4833: Compute precise per-host reservation size
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4833: Compute precise per-host reservation size .. Patch Set 1: TODO: * more specific test case(s), validate new profile field for per-host min reservations -- To view, visit http://gerrit.cloudera.org:8080/7630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idbcd1e9b1be14edc4017b4907e83f9d56059fbac Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4833: Compute precise per-host reservation size
Matthew Jacobs has uploaded a new change for review. http://gerrit.cloudera.org:8080/7630 Change subject: IMPALA-4833: Compute precise per-host reservation size .. IMPALA-4833: Compute precise per-host reservation size Before this change, the per-host reservation size was computed by the Planner. However, scheduling happens after planning, so the Planner must assume that all fragments run on all hosts, and the reservation size is likely much larger than it needs to be. This moves the computation of the per-host reservation size to the BE where it can be computed more precisely. This also includes a number of plan/profile changes. Change-Id: Idbcd1e9b1be14edc4017b4907e83f9d56059fbac --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/query-state.cc M be/src/service/client-request-state.cc M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/Planner.java M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test M testdata/workloads/functional-query/queries/QueryTest/explain-level1.test M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test 18 files changed, 211 insertions(+), 172 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/7630/1 -- To view, visit http://gerrit.cloudera.org:8080/7630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Idbcd1e9b1be14edc4017b4907e83f9d56059fbac Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs
[Impala-ASF-CR] IMPALA-5744: Add 'use krpc' flag and create DataStream interface
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-5744: Add 'use_krpc' flag and create DataStream interface .. IMPALA-5744: Add 'use_krpc' flag and create DataStream interface This patch introduces a 'use_krpc' flag and creates an abstract interface for the DataStreamRecvr/Mgr. The 'use_krpc' flag defaults to 'false'. Cluster startup will abort with an error if the flag is switched to 'true'. The DataStreamSender implements the same virtual interface as the DataSink, so a pure virtual class for the DataStreamSender would essentially be an empty class. Therefore, it is not implemented. The new interfaces are pure virtual base classes and are named DataStream*Base. Using pure virtual base classes may result in code duplication but the current approach is to optimize for the eventual removal of the thrift implementations. Stubs for the Krpc implementations are also introduced and are named KrpcDataStream*. They currently only abort with a fatal error if they are used. Their actual implementations will be filled in a later patch. When the time comes to standardize on the KRPC implementations of DataStream*Base, we will get rid of the DataStream*Base classes and the Thrift versions of the classes and rename KrpcDataStream* to DataStream*. We will also rename all the references that the clients have to DataStream*Base to DataStream*. Also did some spurious includes cleanup. Change-Id: I5d52245154e910529a68f53049520238eca16241 Reviewed-on: http://gerrit.cloudera.org:8080/7542 Reviewed-by: Sailesh MukilTested-by: Impala Public Jenkins --- M be/src/exec/data-sink.cc M be/src/exec/exchange-node.cc M be/src/exec/exchange-node.h M be/src/runtime/CMakeLists.txt A be/src/runtime/data-stream-mgr-base.h M be/src/runtime/data-stream-mgr.cc M be/src/runtime/data-stream-mgr.h A be/src/runtime/data-stream-recvr-base.h M be/src/runtime/data-stream-recvr.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc A be/src/runtime/krpc-data-stream-mgr.cc A be/src/runtime/krpc-data-stream-mgr.h A be/src/runtime/krpc-data-stream-recvr.cc A be/src/runtime/krpc-data-stream-recvr.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/service/impala-server.cc 20 files changed, 436 insertions(+), 39 deletions(-) Approvals: Impala Public Jenkins: Verified Sailesh Mukil: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7542 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I5d52245154e910529a68f53049520238eca16241 Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] Bump Kudu version to 943b1ae, and OpenSSL to 1.0.2l
Impala Public Jenkins has posted comments on this change. Change subject: Bump Kudu version to 943b1ae, and OpenSSL to 1.0.2l .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7626 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I43a510c07b877cb18c3170a034cf9322baef16c0 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] Bump Kudu version to 943b1ae, and OpenSSL to 1.0.2l
Impala Public Jenkins has submitted this change and it was merged. Change subject: Bump Kudu version to 943b1ae, and OpenSSL to 1.0.2l .. Bump Kudu version to 943b1ae, and OpenSSL to 1.0.2l Change-Id: I43a510c07b877cb18c3170a034cf9322baef16c0 Reviewed-on: http://gerrit.cloudera.org:8080/7626 Reviewed-by: Michael HoTested-by: Impala Public Jenkins --- M bin/impala-config.sh 1 file changed, 3 insertions(+), 3 deletions(-) Approvals: Impala Public Jenkins: Verified Michael Ho: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7626 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I43a510c07b877cb18c3170a034cf9322baef16c0 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-5781: Only use TLSv1.0-compatible ciphers for tests
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5781: Only use TLSv1.0-compatible ciphers for tests .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7624 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id66508040bcc7745b7c68b62ace71ae1d394c1b4 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5781: Only use TLSv1.0-compatible ciphers for tests
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-5781: Only use TLSv1.0-compatible ciphers for tests .. IMPALA-5781: Only use TLSv1.0-compatible ciphers for tests If we ask OpenSSL to use a cipher suite that's not compatible with TLSv1.0, it will fail on machines where TLSv1.1+ is not supported (i.e. those with OpenSSL v1.0.0). Fix tests to only use TLSv1.0-compatible cipher suites, picked from https://wiki.openssl.org/index.php/Manual:Ciphers(1)#TLS_v1.0_cipher_suites. Confirmed that tests start servers with TLSv1.0 support. Before this patch, servers would be silently upgraded to TLSv1.2 only (i.e. the minimum version that supported the requested cipher suite). Change-Id: Id66508040bcc7745b7c68b62ace71ae1d394c1b4 Reviewed-on: http://gerrit.cloudera.org:8080/7624 Reviewed-by: Matthew JacobsTested-by: Impala Public Jenkins --- M be/src/rpc/thrift-server-test.cc 1 file changed, 15 insertions(+), 7 deletions(-) Approvals: Impala Public Jenkins: Verified Matthew Jacobs: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7624 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Id66508040bcc7745b7c68b62ace71ae1d394c1b4 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs
[Impala-ASF-CR] IMPALA-5417: make I/O buffer queue fixed-size
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5417: make I/O buffer queue fixed-size .. Patch Set 7: Code-Review+1 Carry +1 -- To view, visit http://gerrit.cloudera.org:8080/7408 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If7cc3f7199f5320db00b7face97a96cdadb6f83f Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] Bump Kudu version to 943b1ae, and OpenSSL to 1.0.2l
Michael Ho has posted comments on this change. Change subject: Bump Kudu version to 943b1ae, and OpenSSL to 1.0.2l .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7626 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I43a510c07b877cb18c3170a034cf9322baef16c0 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5744: Add 'use krpc' flag and create DataStream interface
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5744: Add 'use_krpc' flag and create DataStream interface .. Patch Set 13: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1026/ -- To view, visit http://gerrit.cloudera.org:8080/7542 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5d52245154e910529a68f53049520238eca16241 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5744: Add 'use krpc' flag and create DataStream interface
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-5744: Add 'use_krpc' flag and create DataStream interface .. Patch Set 13: Code-Review+2 More clang-tidy fixes. Rebase, carry +2. -- To view, visit http://gerrit.cloudera.org:8080/7542 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5d52245154e910529a68f53049520238eca16241 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5744: Add 'use krpc' flag and create DataStream interface
Hello Impala Public Jenkins, Henry Robinson, Michael Ho, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7542 to look at the new patch set (#13). Change subject: IMPALA-5744: Add 'use_krpc' flag and create DataStream interface .. IMPALA-5744: Add 'use_krpc' flag and create DataStream interface This patch introduces a 'use_krpc' flag and creates an abstract interface for the DataStreamRecvr/Mgr. The 'use_krpc' flag defaults to 'false'. Cluster startup will abort with an error if the flag is switched to 'true'. The DataStreamSender implements the same virtual interface as the DataSink, so a pure virtual class for the DataStreamSender would essentially be an empty class. Therefore, it is not implemented. The new interfaces are pure virtual base classes and are named DataStream*Base. Using pure virtual base classes may result in code duplication but the current approach is to optimize for the eventual removal of the thrift implementations. Stubs for the Krpc implementations are also introduced and are named KrpcDataStream*. They currently only abort with a fatal error if they are used. Their actual implementations will be filled in a later patch. When the time comes to standardize on the KRPC implementations of DataStream*Base, we will get rid of the DataStream*Base classes and the Thrift versions of the classes and rename KrpcDataStream* to DataStream*. We will also rename all the references that the clients have to DataStream*Base to DataStream*. Also did some spurious includes cleanup. Change-Id: I5d52245154e910529a68f53049520238eca16241 --- M be/src/exec/data-sink.cc M be/src/exec/exchange-node.cc M be/src/exec/exchange-node.h M be/src/runtime/CMakeLists.txt A be/src/runtime/data-stream-mgr-base.h M be/src/runtime/data-stream-mgr.cc M be/src/runtime/data-stream-mgr.h A be/src/runtime/data-stream-recvr-base.h M be/src/runtime/data-stream-recvr.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc A be/src/runtime/krpc-data-stream-mgr.cc A be/src/runtime/krpc-data-stream-mgr.h A be/src/runtime/krpc-data-stream-recvr.cc A be/src/runtime/krpc-data-stream-recvr.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/service/impala-server.cc 20 files changed, 436 insertions(+), 39 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/7542/13 -- To view, visit http://gerrit.cloudera.org:8080/7542 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5d52245154e910529a68f53049520238eca16241 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] Bump Kudu version to 943b1ae, and OpenSSL to 1.0.2l
Impala Public Jenkins has posted comments on this change. Change subject: Bump Kudu version to 943b1ae, and OpenSSL to 1.0.2l .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1025/ -- To view, visit http://gerrit.cloudera.org:8080/7626 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I43a510c07b877cb18c3170a034cf9322baef16c0 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] Bump Kudu version to 943b1ae, and OpenSSL to 1.0.2l
Hello Impala Public Jenkins, Thomas Tauber-Marshall, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7626 to look at the new patch set (#2). Change subject: Bump Kudu version to 943b1ae, and OpenSSL to 1.0.2l .. Bump Kudu version to 943b1ae, and OpenSSL to 1.0.2l Change-Id: I43a510c07b877cb18c3170a034cf9322baef16c0 --- M bin/impala-config.sh 1 file changed, 3 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/7626/2 -- To view, visit http://gerrit.cloudera.org:8080/7626 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I43a510c07b877cb18c3170a034cf9322baef16c0 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] Bump Kudu version to 943b1ae
Impala Public Jenkins has posted comments on this change. Change subject: Bump Kudu version to 943b1ae .. Patch Set 1: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1024/ -- To view, visit http://gerrit.cloudera.org:8080/7626 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I43a510c07b877cb18c3170a034cf9322baef16c0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] Bump Kudu version to 943b1ae
Impala Public Jenkins has posted comments on this change. Change subject: Bump Kudu version to 943b1ae .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1024/ -- To view, visit http://gerrit.cloudera.org:8080/7626 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I43a510c07b877cb18c3170a034cf9322baef16c0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5781: Only use TLSv1.0-compatible ciphers for tests
Henry Robinson has posted comments on this change. Change subject: IMPALA-5781: Only use TLSv1.0-compatible ciphers for tests .. Patch Set 1: I also tried this on a machine with OpenSSL 1.0.0. -- To view, visit http://gerrit.cloudera.org:8080/7624 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id66508040bcc7745b7c68b62ace71ae1d394c1b4 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5781: Only use TLSv1.0-compatible ciphers for tests
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5781: Only use TLSv1.0-compatible ciphers for tests .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1023/ -- To view, visit http://gerrit.cloudera.org:8080/7624 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id66508040bcc7745b7c68b62ace71ae1d394c1b4 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5708: Test failure with invalid exec summary
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-5708: Test failure with invalid exec summary .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/7627/1/tests/beeswax/impala_beeswax.py File tests/beeswax/impala_beeswax.py: PS1, Line 218: impala_shell.py Ugh, just saw this. If you take a look at shell/impala_shell.py -> shell/impala_client.py , you'll see this was already handled. Before we diverge any further, can you ping Michael Brown/David K and see if they have any thoughts for factoring this common python code into a library that we can re-use between tests and the shell? -- To view, visit http://gerrit.cloudera.org:8080/7627 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id52ac62da2b01f9e163e97cbe4590f8db6b663d2 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5781: Only use TLSv1.0-compatible ciphers for tests
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-5781: Only use TLSv1.0-compatible ciphers for tests .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7624 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id66508040bcc7745b7c68b62ace71ae1d394c1b4 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location.
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location. .. Patch Set 1: (2 comments) Looks like MJ and I collided. I think we mostly agree here. http://gerrit.cloudera.org:8080/#/c/7625/1/tests/metadata/test_partition_metadata.py File tests/metadata/test_partition_metadata.py: PS1, Line 80: avro > does this need to be avro? seems like text would work, and this test class The bug affects Seq/RC/Avro only. PS1, Line 110: note, that shortcoming on avro writer would result the inserted value as NULL, : # but the point is the second column for partition i > This isn't a clear sentence; why does the avro writer insert NULL? The Avro writer is broken. It always represents null values as the second element of the union, ignoring any schema. But the default avro schema in the frontend represents null values as the first element of the union. I guess we could flip the 0 and 1 in HdfsAvroTableWriter::AppendField() and at least make the writer usable for this test. -- To view, visit http://gerrit.cloudera.org:8080/7625 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Gabor KaszabGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location.
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location. .. Patch Set 1: (12 comments) http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/base-sequence-scanner.cc File be/src/exec/base-sequence-scanner.cc: Line 90: context->partition_descriptor()->id(),stream_->filename())); nit:space after , http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: PS1, Line 266: string Do we need to call the string constructor explicitly? Seems a bit weird. PS1, Line 634: name) { Nit: long line > 90 chars Line 648: void* HdfsScanNodeBase::GetFileMetadata( Nit: I think this line fits in 90 characters. http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: Line 197: /// Allocate a new scan range object, stored in the runtime state's object pool. For This comment needs updating since partition_id is now required. Line 360: struct pair_hash { We have some utilities like this already in util/container-util.h, let's move it there. Line 363: return std::hash{}(p.second); Let's hash both values in the pair and combine them with boost::hash_combine. The current approach probably works ok for this use case but it seems better to implement a general-purpose pair hash that can be reused, given it's not much more work. http://gerrit.cloudera.org:8080/#/c/7625/1/tests/metadata/test_partition_metadata.py File tests/metadata/test_partition_metadata.py: Line 43: """Regression test for IMPALA-597. Verifies Impala is able to properly read Can we run this test for Parquet too? TestInsertQueries has text and parquet in its test matrix and uses the "STORED AS" clause too, so we can copy its approach. Line 71: data = self.execute_scalar("select sum(i), sum(j) from %s" % FQ_TBL_NAME) Can we run this query with num_nodes=1 to verify that the same bug doesn't exist with text file? Line 80: def test_multiple_partitions_same_location_avro(self, vector, unique_database): Test looks good, I think we should just make sure we address the full test coverage gap for the other file formats too. Let's run this test for the other affected formats with an unsupported writers - SequenceFile. If we change the test matrix for this class as I suggest above, we probably want to pull out this test into a separate class with a different test matrix. PS1, Line 81: imapala Impala Line 110: # (note, that shortcoming on avro writer would result the inserted value as NULL, How about we only query the second column to avoid this complication? -- To view, visit http://gerrit.cloudera.org:8080/7625 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Gabor KaszabGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location.
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location. .. Patch Set 1: (15 comments) http://gerrit.cloudera.org:8080/#/c/7625/1//COMMIT_MSG Commit Message: PS1, Line 7: IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions : pointing to the same filesystem location. : : The maps storing file descriptors and file metadata were using filename as a key. : Multiple partitions pointing to the same filesystem location resulted that these : map entries were occasionally overwritted by the other partition poing to the same. : : As a solution the map key was enhanced to contain a pair of partition ID and file name. wrap at 60cols in git commit msgs http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/base-sequence-scanner.cc File be/src/exec/base-sequence-scanner.cc: PS1, Line 163: ,s space PS1, Line 309: 4 spaces http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 613: context_->partition_descriptor()->id(), filename()); 4 spaces http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: PS1, Line 266: string remove string constructor here PS1, Line 266: , space http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: PS1, Line 237: /// Returns nullptr if the search doesn't find the descriptor. not true; it has a dcheck PS1, Line 360: pair_hash misleading name because the hash is computed on the value. why is it necessary to provide a custom hash? PS1, Line 369: pairCan you typedef this and using that here and in the .cc file ? PS1, Line 381: pair same http://gerrit.cloudera.org:8080/#/c/7625/1/tests/metadata/test_partition_metadata.py File tests/metadata/test_partition_metadata.py: PS1, Line 80: avro does this need to be avro? seems like text would work, and this test class is using the text/none dimension (see l37) PS1, Line 81: ad as? PS1, Line 109: output 'output' isn't clear (it doesn't match up with the queries). say this is what would get returned by select i, j from %s PS1, Line 110: note, that shortcoming on avro writer would result the inserted value as NULL, : # but the point is the second column for partition i This isn't a clear sentence; why does the avro writer insert NULL? PS1, Line 122: output: : # [NULL, 1] 3 times : # [NULL, 2] 3 times again, this is confusing -- To view, visit http://gerrit.cloudera.org:8080/7625 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Bump Kudu version to 943b1ae
Thomas Tauber-Marshall has posted comments on this change. Change subject: Bump Kudu version to 943b1ae .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7626 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I43a510c07b877cb18c3170a034cf9322baef16c0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] Bump Kudu version to 943b1ae
Matthew Jacobs has uploaded a new change for review. http://gerrit.cloudera.org:8080/7626 Change subject: Bump Kudu version to 943b1ae .. Bump Kudu version to 943b1ae Change-Id: I43a510c07b877cb18c3170a034cf9322baef16c0 --- M bin/impala-config.sh 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/7626/1 -- To view, visit http://gerrit.cloudera.org:8080/7626 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I43a510c07b877cb18c3170a034cf9322baef16c0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs
[Impala-ASF-CR] IMPALA-5778: clarify --read size option.
Balazs Jeszenszky has posted comments on this change. Change subject: IMPALA-5778: clarify --read_size option. .. Patch Set 1: Code-Review+1 Looks good. -- To view, visit http://gerrit.cloudera.org:8080/7623 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3c20a9d55f89170b11f569c90b7f2949ddbe4211 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Balazs Jeszenszky Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location.
Gabor Kaszab has uploaded a new change for review. http://gerrit.cloudera.org:8080/7625 Change subject: IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location. .. IMPALA-5412 Scan returns wrong partition-column values when scanning multiple partitions pointing to the same filesystem location. The maps storing file descriptors and file metadata were using filename as a key. Multiple partitions pointing to the same filesystem location resulted that these map entries were occasionally overwritted by the other partition poing to the same. As a solution the map key was enhanced to contain a pair of partition ID and file name. Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9 --- M be/src/exec/base-sequence-scanner.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/scanner-context.cc M tests/metadata/test_partition_metadata.py 7 files changed, 109 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/7625/1 -- To view, visit http://gerrit.cloudera.org:8080/7625 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Gabor Kaszab