[Impala-ASF-CR] IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds
Hello Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8614 to look at the new patch set (#4). Change subject: IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds .. IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds Change-Id: I5084be10946d68f3ec0760c2b7e698635df26a89 --- M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java 1 file changed, 4 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8614/4 -- To view, visit http://gerrit.cloudera.org:8080/8614 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5084be10946d68f3ec0760c2b7e698635df26a89 Gerrit-Change-Number: 8614 Gerrit-PatchSet: 4 Gerrit-Owner: Kim Jin Chul Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Kim Jin Chul
[Impala-ASF-CR] IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds
Kim Jin Chul has posted comments on this change. ( http://gerrit.cloudera.org:8080/8614 ) Change subject: IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8614/3/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java: http://gerrit.cloudera.org:8080/#/c/8614/3/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java@249 PS3, Line 249: if (distinctAggExprs.get(i).getFnName().getFunction().equalsIgnoreCase( : "count")) { > In my example, it does not make sense to print an error message that uses N Okay, let me reflect your proposal. -- To view, visit http://gerrit.cloudera.org:8080/8614 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5084be10946d68f3ec0760c2b7e698635df26a89 Gerrit-Change-Number: 8614 Gerrit-PatchSet: 3 Gerrit-Owner: Kim Jin Chul Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Kim Jin Chul Gerrit-Comment-Date: Wed, 22 Nov 2017 08:01:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6225: Query profile date-time strings should have ns precision.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8611 ) Change subject: IMPALA-6225: Query profile date-time strings should have ns precision. .. Patch Set 7: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1514/ -- To view, visit http://gerrit.cloudera.org:8080/8611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2e124b9c7e0717b8dc2cdab46aea41d74c5f2fd0 Gerrit-Change-Number: 8611 Gerrit-PatchSet: 7 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Wed, 22 Nov 2017 08:20:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3703: Store query context in thread-local variables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8621 ) Change subject: IMPALA-3703: Store query context in thread-local variables .. Patch Set 1: (2 comments) Thanks for the review. I added SetThreadName(), SetQueryId(), and SetInstanceId() to InfoTable. They have their separate buffer on the stack. > "how hard will it be to visit a /threads debug page and see what all the > threads are up to?" If you have a complete core file (one that is not generated from a minidump), one can traverse all the threads and inspect the global impala::thread_info_table object and retrieve the needed information. If you have a core that is created from a minidump, it is only a little bit more complicated. During thread traversion one need to select the stack frame that has the InfoTable object (currently it is Thread::SuperviseThread()). After the needed information can be retrieved from the InfoTable object. > "Have you tried pulling one of these out in a minidump?" I used minidumps, but I converted them to core files because the tooling for minidumps are not that user friendly / feature rich. > RegisterAppMemory Yes, this can be used, but then I would need to call RegisterAppMemory on every thread creation, and UnregisterAppMemory on every thread destruction. It is not that big a deal, since I can put this logic into InfoTable's constructor/destructor pairs. I would also need to make the breakpad ExceptionHandler object accessible for InfoTable, and synchronize the accesses to it between threads. With stack allocation, everything is automatically included in the minidump. If you say we cannot afford having this object on the stack, I rewrite it to use the heap. > testing I created backend tests in thread-info-test.cc http://gerrit.cloudera.org:8080/#/c/8621/1/be/src/common/thread-info.h File be/src/common/thread-info.h: http://gerrit.cloudera.org:8080/#/c/8621/1/be/src/common/thread-info.h@70 PS1, Line 70: /// Call this global function to add information about the current thread > Should we document that you should avoid information that's sensitive here? I removed this function, now only the member AddExtraInfo() can be used to add unstructured text info to the InfoTable. I added a comment there that it is not for sensitive data. http://gerrit.cloudera.org:8080/#/c/8621/1/be/src/common/thread-info.h@73 PS1, Line 73: /// is a no-op. > It's weird to me that this can be a no-op if there's no object, but it fail I removed this function, now we cannot call InfoTable::AddExtraInfo() without an object. -- To view, visit http://gerrit.cloudera.org:8080/8621 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I566f7f1db5117c498e86e0bd05b33bdcff624609 Gerrit-Change-Number: 8621 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 22 Nov 2017 16:14:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3703: Store query context in thread-local variables
Hello Lars Volker, Philip Zeyliger, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8621 to look at the new patch set (#2). Change subject: IMPALA-3703: Store query context in thread-local variables .. IMPALA-3703: Store query context in thread-local variables This commit introduces the InfoTable class which can hold information about the current thread that can be useful in debug sessions. It needs to be allocated on the stack in order to include it to minidumps. Currently an InfoTable object is created in Thread::SuperviseThread. This object is available in all the child stack frames through the global function GetThreadInfoTable(). InfoTable has members for the thread name, query id, and instance id. These are fixed size char buffers. InfoTable also has a member called text, which is a char buffer that holds lines of strings. A line has a format of = . New data can be appended to the buffer by calling AddExtraInfo(key, value). If you have a fully-fledged core file, you can locate the InfoTable for the current thread through the global pointer impala::thread_info_table. In a core file that has been created from a minidump, we need to select the stack frame that allocated the InfoTable object in order to inspect it. It is currently allocated in Thread::SuperviseThread(). We can use printf in gdb to print the members, e.g.: printf "%s" thread_info_table.text printf "%s\n" thread_info_table.query_id Currently the thread category, thread name, query id, and instance id is stored. I created some tests in thread-info-test.cc Change-Id: I566f7f1db5117c498e86e0bd05b33bdcff624609 --- M be/src/common/CMakeLists.txt A be/src/common/thread-info-test.cc A be/src/common/thread-info.cc A be/src/common/thread-info.h M be/src/runtime/query-state.cc M be/src/util/thread.cc 6 files changed, 285 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/8621/2 -- To view, visit http://gerrit.cloudera.org:8080/8621 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I566f7f1db5117c498e86e0bd05b33bdcff624609 Gerrit-Change-Number: 8621 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-5936: operator '%' overflows on large decimals
Hello Taras Bobrovytsky, Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8574 to look at the new patch set (#3). Change subject: IMPALA-5936: operator '%' overflows on large decimals .. IMPALA-5936: operator '%' overflows on large decimals Suppose we have a large decimal number, which is greater than INT_MAX. We want to calculate the modulo of this number by 3: BIG_DECIMAL % 3 The result of this calculation can be 0, 1, or 2. This can fit into a decimal with precision 1. The in-memory representation of such small decimals are stored in int32_t in the backend. Let's call this int32_t the result type. The backend had the invalid assumption that it can do the calculation as well using the result type. This assumption is true for multiplying or adding numbers, but not for modulo. Now the backend selects the biggest type of ['return type', '1st operand type', '2nd operand type'] to do the calculation. Change-Id: I2b06c8acd5aa490943e84013faf2eaac7c26ceb4 --- M be/src/exprs/decimal-operators-ir.cc M testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test 2 files changed, 24 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/8574/3 -- To view, visit http://gerrit.cloudera.org:8080/8574 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2b06c8acd5aa490943e84013faf2eaac7c26ceb4 Gerrit-Change-Number: 8574 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-5936: operator '%' overflows on large decimals
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8574 ) Change subject: IMPALA-5936: operator '%' overflows on large decimals .. Patch Set 3: (2 comments) Thanks. http://gerrit.cloudera.org:8080/#/c/8574/2/be/src/exprs/decimal-operators-ir.cc File be/src/exprs/decimal-operators-ir.cc: http://gerrit.cloudera.org:8080/#/c/8574/2/be/src/exprs/decimal-operators-ir.cc@675 PS2, Line 675: case 4: { \ > It's probably a good idea to DCHECK here that x_size <= 4 and y_size <= 4 Done http://gerrit.cloudera.org:8080/#/c/8574/2/be/src/exprs/decimal-operators-ir.cc@684 PS2, Line 684: } \ > Maybe DCHECK that x_size <= 8 and y_size <= 8, to make sure where are not r Done -- To view, visit http://gerrit.cloudera.org:8080/8574 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2b06c8acd5aa490943e84013faf2eaac7c26ceb4 Gerrit-Change-Number: 8574 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 22 Nov 2017 16:48:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8631 Change subject: IMPALA-6227: deflake admission stress tests .. IMPALA-6227: deflake admission stress tests The problem was that, during the initial admission decision phase, some queries were initially queued then dequeued once memory came available. All of the accounting in the test implicitly relies on queries not being dequeued until queries are later explicitly ended, so if this happened, the test broke in multiple subtle ways. This happened because the query only scanned a small number of rows, which could be all buffered on the receiver side of the exchange even before the client fetched any rows from the coordinator. This means that the reserved memory on some backends could increase then decrease during the initial admission phase, resulting in a query being queued then dequeued. The fix is to increase the number of rows returned by the query so that all fragments remain active during the initial admission phase. This increased test execution time somewhat, so I also had to bump the queue wait timeout for the admission stress tests (they assume that queries don't time out in the queue). Testing: Ran the test under debug, release and ASAN builds, i.e. impala-py.test tests/custom_cluster/test_admission_controller.py \ --workload_exploration_strategy="functional-query:exhaustive" I looped the mem_limit test for a while to confirm it didn't reproduce (it reproduced reliably every 2-3 iterations before this fix). I will try looping it a bit more under different build types concurrently with the review. Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 --- M fe/src/test/resources/llama-site-test2.xml M tests/custom_cluster/test_admission_controller.py 2 files changed, 102 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/8631/1 -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-4132: Use -fno-omit-frame-pointer
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/8612 ) Change subject: IMPALA-4132: Use -fno-omit-frame-pointer .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8612/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8612/1//COMMIT_MSG@10 PS1, Line 10: As a result we expect more useful stack : traces to be produced. Just curious how well this works, could you paste a couple of thread stacks before and after this change using a stripped binary. It'd be a great change from a supportability POV if we can get reliable stacks without having to attach symbols separately. -- To view, visit http://gerrit.cloudera.org:8080/8612 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib7d0d88ba015a847356ed0274fe91017b98cb941 Gerrit-Change-Number: 8612 Gerrit-PatchSet: 1 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 22 Nov 2017 17:33:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3703: Store query context in thread-local variables
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8621 ) Change subject: IMPALA-3703: Store query context in thread-local variables .. Patch Set 2: (6 comments) Thanks for the updates. Some more comments. I'd advise you to get Tim/Lars to look at this before reacting too much to my comments, since there are a bunch of things here that I'm insufficiently familiar with to review wisely. But I like the direction it's going! http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info-test.cc File be/src/common/thread-info-test.cc: http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info-test.cc@55 PS2, Line 55: Status status = info_table.AddExtraInfo("extra", "info"); please add a comment about what's going on: // We've exhausted our allocated space by now, so AddExtraInfo should be failing. http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h File be/src/common/thread-info.h: http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@37 PS2, Line 37: /// it in minidumps. Until this object lives, it is available through the global "Until this object lives" -> "While this object is alive" http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@41 PS2, Line 41: class InfoTable : boost::noncopyable { This is the only usage of boost:noncopyable I've seen in our code base. We should check that it's the style of this sort of thing we want. I think DISALLOW_COPY_AND_ASSIGN does something very similar and is present in our code base. It's defined in be/src/gutil/macros.h. http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@43 PS2, Line 43: InfoTable() : text(), text_size(0), query_id(), instance_id(), thread_name() { We should comment that you're only allowed to create one of these per thread at a time. I had to think through how the constructor/destructor update a thread local. (I had assumed it would work the other way, with the thread_local being maintained by the cpde that's creating these , but that may just be the Java in me speaking.) http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@61 PS2, Line 61: std::copy(line.begin(), line.end(), text + text_size); Would line.copy(text + text_size, line.length()) be clearer here? I don't think anything is null-terminating these, so printf in the debugger may have a bunch of junk in the remaining buffer. We should either null-terminate the strings here, or, perhaps better, zero out the bytes when initialized. http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@108 PS2, Line 108: char query_id[TUNIQUE_ID_STRING_SIZE]; You're choosing to store this as a string intentionally? -- To view, visit http://gerrit.cloudera.org:8080/8621 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I566f7f1db5117c498e86e0bd05b33bdcff624609 Gerrit-Change-Number: 8621 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 22 Nov 2017 17:56:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning
Hello Lars Volker, Alex Behm, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8480 to look at the new patch set (#10). Change subject: IMPALA-4985: use parquet stats of nested types for dynamic pruning .. IMPALA-4985: use parquet stats of nested types for dynamic pruning Currently, parquet row-groups can be pruned at run-time using min/max stats when predicates (in, binary) are specified for column scalar types. This patch extends pruning to nested types for the same class of predicates. A nested value is an instance of a nested type (struct, array, map). A nested value consists of other nested and scalar values (as declared by its type). Predicates that can be used for row-group pruning must be applied to nested scalar values. In addition, the parent of the nested scalar must also be required, that is, not empty. The latter requirement is conservative: some filters that could be used for pruning are not used for correctness reasons. Testing: - extended nested-types-parquet-stats e2e test cases. Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe --- M be/src/exec/hdfs-parquet-scanner.h M fe/src/main/java/org/apache/impala/analysis/CollectionStructType.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test M tests/query_test/test_nested_types.py 10 files changed, 649 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/8480/10 -- To view, visit http://gerrit.cloudera.org:8080/8480 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe Gerrit-Change-Number: 8480 Gerrit-PatchSet: 10 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8480 ) Change subject: IMPALA-4985: use parquet stats of nested types for dynamic pruning .. Patch Set 10: found the issue with the planner test: mismatch between local and test environment. change includes the fix. carrying +2 -- To view, visit http://gerrit.cloudera.org:8080/8480 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe Gerrit-Change-Number: 8480 Gerrit-PatchSet: 10 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 22 Nov 2017 18:06:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. Patch Set 12: Tianyi, did you have any more comments? I'm not going to merge this regardless right now for the reasons stated in the commit message. -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 22 Nov 2017 18:17:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Hello Joe McDonnell, Tim Armstrong, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8034 to look at the new patch set (#17). Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder Currently DictDecoder class and DictEncoder class uses std::vector to store the tables mapping codeword to value and vice-versa. It is hard to detect the memory usage by these tables when they becomes very large, since this memory is not accounted by Impala's memory mangement infrastructure. This patch uses the memory tracker of HdfsScanner to track the memory used by dictionary in DictDecoder class. Similary it uses memory tracker of HdfsTableSink to track the memory used by dictionary in DictEncoder class. Memory for the dictionary, stored as std::vector is still allocated from std:allocator but the amount allocated is accounted by introducing a counter which is incremented and decremented as the memory is consumed and released by vector. Testing --- Ran all the backend and end-end tests with no failures. Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 --- M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 5 files changed, 140 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/8034/17 -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 17 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-1144: Fix exception when cancelling query in Impala-shell with CTRL-C
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/8549 ) Change subject: IMPALA-1144: Fix exception when cancelling query in Impala-shell with CTRL-C .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/8549/6/tests/shell/test_shell_commandline.py File tests/shell/test_shell_commandline.py: http://gerrit.cloudera.org:8080/#/c/8549/6/tests/shell/test_shell_commandline.py@352 PS6, Line 352: sleep(0.5) Increased this to 1.5 as RELEASE and ASAN build failed with this value. As a result RELEASE passes but ASAN still fails. Increasing this further to check where ASAN turns green. -- To view, visit http://gerrit.cloudera.org:8080/8549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6cefaf1dae78baae238289816a7cb9d210fb38e2 Gerrit-Change-Number: 8549 Gerrit-PatchSet: 6 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 22 Nov 2017 18:23:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8480 ) Change subject: IMPALA-4985: use parquet stats of nested types for dynamic pruning .. Patch Set 10: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8480 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe Gerrit-Change-Number: 8480 Gerrit-PatchSet: 10 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 22 Nov 2017 18:25:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8480 ) Change subject: IMPALA-4985: use parquet stats of nested types for dynamic pruning .. Patch Set 10: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1515/ -- To view, visit http://gerrit.cloudera.org:8080/8480 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe Gerrit-Change-Number: 8480 Gerrit-PatchSet: 10 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 22 Nov 2017 18:28:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1144: Fix exception when cancelling query in Impala-shell with CTRL-C
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8549 ) Change subject: IMPALA-1144: Fix exception when cancelling query in Impala-shell with CTRL-C .. Patch Set 6: Code-Review+1 (1 comment) Looks good to me at this point. I'll let you figure out the right timings for the test. Feel free to limit the environments in which we run it, too. http://gerrit.cloudera.org:8080/#/c/8549/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8549/6//COMMIT_MSG@9 PS6, Line 9: Issue1: When query is cancelled via CTRL-C while being executed in Impala-shell nit: "Issue 1" reads better to me. -- To view, visit http://gerrit.cloudera.org:8080/8549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6cefaf1dae78baae238289816a7cb9d210fb38e2 Gerrit-Change-Number: 8549 Gerrit-PatchSet: 6 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 22 Nov 2017 18:46:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6206: Fix data load failure with -notests
Zach Amsden has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8580 ) Change subject: IMPALA-6206: Fix data load failure with -notests .. IMPALA-6206: Fix data load failure with -notests When tests are not built, specifically with -notests, instead of just -skiptests, the be-test target is omitted by cmake, and since nothing in impalad depends on uda/udf samples, they are not built. This causes data load to fail on a clean build. Just build them anyway under the target ImpalaUdf since they build in a few seconds. This shaves a few minutes off data load testing by avoiding the time spent linking tests. Note: only empirically, not scientifically measured. Testing: Run a clean build find . -name 'libud[af]sample.so' ./be/build/debug/udf_samples/libudasample.so ./be/build/debug/udf_samples/libudfsample.so Change-Id: Idaa193989d77d56b72db05ad54e1fb0a345938fb Reviewed-on: http://gerrit.cloudera.org:8080/8580 Tested-by: Impala Public Jenkins Reviewed-by: Zach Amsden --- M bin/make_impala.sh 1 file changed, 2 insertions(+), 2 deletions(-) Approvals: Impala Public Jenkins: Verified Zach Amsden: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/8580 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Idaa193989d77d56b72db05ad54e1fb0a345938fb Gerrit-Change-Number: 8580 Gerrit-PatchSet: 5 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-6206: Fix data load failure with -notests
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8580 ) Change subject: IMPALA-6206: Fix data load failure with -notests .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8580 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idaa193989d77d56b72db05ad54e1fb0a345938fb Gerrit-Change-Number: 8580 Gerrit-PatchSet: 4 Gerrit-Owner: Zach Amsden Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 22 Nov 2017 18:49:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6225: Query profile date-time strings should have ns precision.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/8611 ) Change subject: IMPALA-6225: Query profile date-time strings should have ns precision. .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/8611/7/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/8611/7/tests/query_test/test_observability.py@133 PS7, Line 133: while (retries < 120) GVO failed because 120s doesn't seem to be a good enough timeout. 08:20:17 ] FAIL query_test/test_observability.py::TestObservability::()::test_query_profile_timestamps 08:20:17 ] === FAILURES === 08:20:17 ] ___ TestObservability.test_query_profile_timestamps 08:20:17 ] [gw13] linux2 -- Python 2.7.12 /home/ubuntu/Impala/bin/../infra/python/env/bin/python 08:20:17 ] query_test/test_observability.py:152: in test_query_profile_timestamps 08:20:17 ] assert False, dbg_str 08:20:17 ] E AssertionError: Debug profile for query not available in 120 seconds, (, ). 08:20:17 ] E assert False 08:20:17 ] - Captured stderr call - -- To view, visit http://gerrit.cloudera.org:8080/8611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2e124b9c7e0717b8dc2cdab46aea41d74c5f2fd0 Gerrit-Change-Number: 8611 Gerrit-PatchSet: 7 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Wed, 22 Nov 2017 18:58:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8447 ) Change subject: IMPALA-2181: Add query option levels for display .. Patch Set 14: (5 comments) http://gerrit.cloudera.org:8080/#/c/8447/18/be/src/service/query-options.h File be/src/service/query-options.h: http://gerrit.cloudera.org:8080/#/c/8447/18/be/src/service/query-options.h@113 PS18, Line 113: per discussion on the JIRA, I think this should be under DEVELOPMENT until it's documented. http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_client.py File shell/impala_client.py: http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_client.py@100 PS18, Line 100: _indent_level, output): This check shouldn't be necessary, right? Unless you're doing what I did and running impala-shell against the wrong thrift output because you're too lazy to rebuild Impala :). http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_shell.py@83 PS18, Line 83: correspond http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_shell.py@242 PS18, Line 242: self._print_option_group(advanced_options) This is subtle - can you add a comment? If I understand correctly, this is detecting that the impala server didn't send back any advanced query options, so we're assuming it's an old one. http://gerrit.cloudera.org:8080/#/c/8447/14/tests/hs2/test_hs2.py File tests/hs2/test_hs2.py: http://gerrit.cloudera.org:8080/#/c/8447/14/tests/hs2/test_hs2.py@53 PS14, Line 53: fetch_results_req.operationHandle = execute_statement_resp.operationHandle > This function is actually a copy paste: I moved it out from test_session_op Seems ok then. -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 14 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 22 Nov 2017 19:06:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Hello Lars Volker, Laszlo Gaal, Zoltan Borok-Nagy, Philip Zeyliger, Attila Jeges, Tim Armstrong, Csaba Ringhofer, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8447 to look at the new patch set (#19). Change subject: IMPALA-2181: Add query option levels for display .. IMPALA-2181: Add query option levels for display Four display levels are introduced for each query option: REGULAR, ADVANCED, DEVELOPMENT and DEPRECATED. When the query options are displayed in Impala shell using SET then only the REGULAR and ADVANCED options are shown. A new command called SET ALL shows all the options grouped by their option levels. When the query options are displayed through HS2 interface then the result set would contain an extra column indicating the level of each option. Similarly to Impala shell on this interface the SET command only diplays the REGULAR and ADVANCED options while SET ALL shows them all. If the Impala shell connects to an Impala daemon that predates this change then all the options would be displayed in the REGULAR group. Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 --- M be/src/service/child-query.cc M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/Frontend.thrift M common/thrift/beeswax.thrift M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/SetStmt.java M fe/src/main/java/org/apache/impala/service/Frontend.java M shell/impala_client.py M shell/impala_shell.py M testdata/workloads/functional-query/queries/QueryTest/set.test M tests/custom_cluster/test_set_and_unset.py M tests/hs2/test_hs2.py M tests/shell/test_shell_interactive.py 18 files changed, 487 insertions(+), 226 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/8447/19 -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 19 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/8447 ) Change subject: IMPALA-2181: Add query option levels for display .. Patch Set 19: (4 comments) http://gerrit.cloudera.org:8080/#/c/8447/18/be/src/service/query-options.h File be/src/service/query-options.h: http://gerrit.cloudera.org:8080/#/c/8447/18/be/src/service/query-options.h@113 PS18, Line 113: QUERY_OPT_FN(strict_mode, STRICT_MODE, TQueryOptionLevel::DEVELOPMENT)\ > per discussion on the JIRA, I think this should be under DEVELOPMENT until Done http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_client.py File shell/impala_client.py: http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_client.py@100 PS18, Line 100: hasattr(option, 'level') and > This check shouldn't be necessary, right? Unless you're doing what I did an in theory, if the shell with the feature comes from an Impala that is compiled than this is not necessary. I would still keep it be make 100% sure, that nothing breaks if you don't mind. http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_shell.py@83 PS18, Line 83: correspond > correspond Done http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_shell.py@242 PS18, Line 242: # If the shell is connected to an Impala that predates IMPALA-2181 then > This is subtle - can you add a comment? If I understand correctly, this is Done -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 19 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 22 Nov 2017 19:45:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5146: Fix inconsitent results at FROM UNIXTIME()
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8629 ) Change subject: IMPALA-5146: Fix inconsitent results at FROM_UNIXTIME() .. Patch Set 1: (1 comment) Thanks for the fix! http://gerrit.cloudera.org:8080/#/c/8629/1/be/src/exprs/timestamp-functions-ir.cc File be/src/exprs/timestamp-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/8629/1/be/src/exprs/timestamp-functions-ir.cc@84 PS1, Line 84: ToUnixTime i think it would be clearer to just call tv.HasDateAndTime() directly to show the intention of this check. (also, generally for Impala style, we don't use extraneous parenthesis around the conditional, though I see that in this file the style has diverged). -- To view, visit http://gerrit.cloudera.org:8080/8629 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie3a5e9a9cb39d32993fa2c7f725be44d8b9ce9f2 Gerrit-Change-Number: 8629 Gerrit-PatchSet: 1 Gerrit-Owner: Kim Jin Chul Gerrit-Reviewer: Dan Hecht Gerrit-Comment-Date: Wed, 22 Nov 2017 19:50:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8631 ) Change subject: IMPALA-6227: deflake admission stress tests .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/8631/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8631/1//COMMIT_MSG@11 PS1, Line 11: All of the accounting in the test implicitly relies on queries not being : dequeued until queries are later explicitly ended, so if this happened, : the test broke in multiple subtle ways. and this assumption is no longer true because of the final change for IMPALA-1575, right? It'd be good to note that explicitly. http://gerrit.cloudera.org:8080/#/c/8631/1/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: http://gerrit.cloudera.org:8080/#/c/8631/1/tests/custom_cluster/test_admission_controller.py@705 PS1, Line 705: b4 where is that used? http://gerrit.cloudera.org:8080/#/c/8631/1/tests/custom_cluster/test_admission_controller.py@792 PS1, Line 792: amount of time is this actually timing sensitive, or is it that as long as we don't fetch the results (and the rowbatches have exceeded the various buffering), the query will remain active? -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Comment-Date: Wed, 22 Nov 2017 20:06:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Hello Lars Volker, Laszlo Gaal, Zoltan Borok-Nagy, Philip Zeyliger, Attila Jeges, Tim Armstrong, Csaba Ringhofer, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8447 to look at the new patch set (#20). Change subject: IMPALA-2181: Add query option levels for display .. IMPALA-2181: Add query option levels for display Four display levels are introduced for each query option: REGULAR, ADVANCED, DEVELOPMENT and DEPRECATED. When the query options are displayed in Impala shell using SET then only the REGULAR and ADVANCED options are shown. A new command called SET ALL shows all the options grouped by their option levels. When the query options are displayed through HS2 interface then the result set would contain an extra column indicating the level of each option. Similarly to Impala shell on this interface the SET command only diplays the REGULAR and ADVANCED options while SET ALL shows them all. If the Impala shell connects to an Impala daemon that predates this change then all the options would be displayed in the REGULAR group. Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 --- M be/src/service/child-query.cc M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/Frontend.thrift M common/thrift/beeswax.thrift M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/SetStmt.java M fe/src/main/java/org/apache/impala/service/Frontend.java M shell/impala_client.py M shell/impala_shell.py M testdata/workloads/functional-query/queries/QueryTest/set.test M tests/custom_cluster/test_set_and_unset.py M tests/hs2/test_hs2.py M tests/shell/test_shell_interactive.py 18 files changed, 487 insertions(+), 226 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/8447/20 -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 20 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/8447 ) Change subject: IMPALA-2181: Add query option levels for display .. Patch Set 20: (1 comment) http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_client.py File shell/impala_client.py: http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_client.py@100 PS18, Line 100: option.level is not None: > in theory, if the shell with the feature comes from an Impala that is compi Done -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 20 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 22 Nov 2017 20:08:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8614 ) Change subject: IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8614 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5084be10946d68f3ec0760c2b7e698635df26a89 Gerrit-Change-Number: 8614 Gerrit-PatchSet: 4 Gerrit-Owner: Kim Jin Chul Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Kim Jin Chul Gerrit-Comment-Date: Wed, 22 Nov 2017 20:45:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8614 ) Change subject: IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1516/ -- To view, visit http://gerrit.cloudera.org:8080/8614 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5084be10946d68f3ec0760c2b7e698635df26a89 Gerrit-Change-Number: 8614 Gerrit-PatchSet: 4 Gerrit-Owner: Kim Jin Chul Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kim Jin Chul Gerrit-Comment-Date: Wed, 22 Nov 2017 20:46:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 3: >[...]Please do answer the question about changing > the FE tests to just use s3a so that we can remove the hackery > altogether.[...] In short: it depends on your end goal. If the end goal is to get rid of the s3n: provider and its supporting menagerie (jets3t, etc.), then yes, it can be done with a small price: a single negative test in AnalyzeStmtsTest.java will have to go. The code there in https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java#L3000 performs a test step that should fail for s3n -- but if I remove the s3n properties from core-site.xml, this line blows up complaining about missing credentials. The stack trace reveals that Impala's HdfsUri::analyze() calls into Hadoop's Path::getFileSystem() to find out what kind of filesystem the URI points to; this in turn calls down all the way to org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:74) being called from Jets3t. If your end goal is to remove dummy credentials from core-site.xml.tmpl, this will probably not be complete success: the s3a: provider behaves in a similar () though not identical) way. When the front-end tests are run without an S3 container in sight (e.g. your local dev box with no S3 access set up), AnalyzeDDLTest's s3a: URIs still require similar dummy credentials, but with s3a: these can be supplied in environment variables. This means that core-site.xml[.tmpl] could be cleaned up, but dummy credentials in envireonment variables would still be there. -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 3 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 22 Nov 2017 21:04:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4964: Fix Decimal modulo overflow
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8329 ) Change subject: IMPALA-4964: Fix Decimal modulo overflow .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1517/ -- To view, visit http://gerrit.cloudera.org:8080/8329 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5420201d4440d421e33e443df005cdcc16b8a6cd Gerrit-Change-Number: 8329 Gerrit-PatchSet: 4 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 22 Nov 2017 21:23:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 3: > Patch Set 3: > > >[...]Please do answer the question about changing > > the FE tests to just use s3a so that we can remove the hackery > > altogether.[...] > > In short: it depends on your end goal. > If the end goal is to get rid of the s3n: provider and its supporting > menagerie (jets3t, etc.), then yes, it can be done with a small price: a > single negative test in AnalyzeStmtsTest.java will have to go. The code there > in > https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java#L3000 > performs a test step that should fail for s3n -- but if I remove the s3n > properties from core-site.xml, this line blows up complaining about missing > credentials. The stack trace reveals that Impala's HdfsUri::analyze() calls > into Hadoop's Path::getFileSystem() to find out what kind of filesystem the > URI points to; this in turn calls down all the way to > org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:74) being > called from Jets3t. > > If your end goal is to remove dummy credentials from core-site.xml.tmpl, this > will probably not be complete success: the s3a: provider behaves in a similar > () though not identical) way. When the front-end tests are run without an S3 > container in sight (e.g. your local dev box with no S3 access set up), > AnalyzeDDLTest's s3a: URIs still require similar dummy credentials, but with > s3a: these can be supplied in environment variables. This means that > core-site.xml[.tmpl] could be cleaned up, but dummy credentials in > envireonment variables would still be there. Hadoop has essentially deprecated s3n and it's removed entirely in Hadoop-3.0. (See HADOOP-14738.) The distribution of Hadoop we're depending on also prefers s3a. I think we can do away with s3n entirely, but I'm happy to take that on as a separate change after yours here. -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 3 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 22 Nov 2017 21:25:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8480 ) Change subject: IMPALA-4985: use parquet stats of nested types for dynamic pruning .. Patch Set 10: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8480 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe Gerrit-Change-Number: 8480 Gerrit-PatchSet: 10 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 22 Nov 2017 22:00:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8480 ) Change subject: IMPALA-4985: use parquet stats of nested types for dynamic pruning .. IMPALA-4985: use parquet stats of nested types for dynamic pruning Currently, parquet row-groups can be pruned at run-time using min/max stats when predicates (in, binary) are specified for column scalar types. This patch extends pruning to nested types for the same class of predicates. A nested value is an instance of a nested type (struct, array, map). A nested value consists of other nested and scalar values (as declared by its type). Predicates that can be used for row-group pruning must be applied to nested scalar values. In addition, the parent of the nested scalar must also be required, that is, not empty. The latter requirement is conservative: some filters that could be used for pruning are not used for correctness reasons. Testing: - extended nested-types-parquet-stats e2e test cases. Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe Reviewed-on: http://gerrit.cloudera.org:8080/8480 Reviewed-by: Alex Behm Tested-by: Impala Public Jenkins --- M be/src/exec/hdfs-parquet-scanner.h M fe/src/main/java/org/apache/impala/analysis/CollectionStructType.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test M tests/query_test/test_nested_types.py 10 files changed, 649 insertions(+), 38 deletions(-) Approvals: Alex Behm: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8480 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe Gerrit-Change-Number: 8480 Gerrit-PatchSet: 11 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5310: Part 2: Add SAMPLED NDV() function.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/8569 ) Change subject: IMPALA-5310: Part 2: Add SAMPLED_NDV() function. .. Patch Set 2: (9 comments) I'm still looking at the tests, just want to publish what I have so far. Overall the patch (and approach) looks good to me. http://gerrit.cloudera.org:8080/#/c/8569/1/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/8569/1/be/src/exprs/aggregate-functions-ir.cc@1464 PS1, Line 1464: of a SampledNdvState with 'num_buckets' HLL bu > Good idea done. I did some more similar cleanup in this class. thanks, easier to follow. http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1443 PS2, Line 1443: count IMO 'count' sounds a little generic and overlaps with total_count. May be rename them to "bucket_row_count" and "total_row_count" or something similar? http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1459 PS2, Line 1459: double sample_perc; static? http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1465 PS2, Line 1465: static int64_t GetStateSize(int64_t num_buckets) { Given num_buckets is always a const, may be make this a static const too like BUCKET size? static const in64_t STATE_SIZE = NUM_HLL_BUCKETS * BUCKET_SIZE . http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1530 PS2, Line 1530: ndv may be call ndv_estimate, just to be clear http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1552 PS2, Line 1552: int bucket_idx = (i + j) % SampledNdvState::NUM_HLL_BUCKETS; : merged_count += *state->GetCountPtr(bucket_idx); : counts[pidx] = merged_count; : StringVal hll = StringVal(state->GetHllPtr(bucket_idx), HLL_LEN); : HllMerge(ctx, hll, &merged_hll); : ndvs[pidx] = HllFinalEstimate(merged_hll.ptr, HLL_LEN); Wondering if we can avoid duplicates in data points since hll_to_merge for (i,j) = (j,i). Do they add any additional value? We could probably optimize the loop a little further to avoid that for (int i=0, i < NUM_HLL_BUCKETS; ++i) { for (int j=i, j < NUM_HLL_BUCKETS, ++j) { } } http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1563 PS2, Line 1563: int64_t max_count = counts[num_points - 1]; not sure I understand this, what invariant guarantees counts[num_points-1] is the max_count? http://gerrit.cloudera.org:8080/#/c/8569/1/be/src/util/mpfit-util.h File be/src/util/mpfit-util.h: http://gerrit.cloudera.org:8080/#/c/8569/1/be/src/util/mpfit-util.h@50 PS1, Line 50: /// Returns true if fitting was successful, false otherwise. > Done here. I could not find another place, did I miss one? nope, just here. http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java: http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@506 PS2, Line 506: SAMPED_NDV nit: typo -- To view, visit http://gerrit.cloudera.org:8080/8569 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia51d56ee67ec6073e92f90bebb4005484138b820 Gerrit-Change-Number: 8569 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Comment-Date: Wed, 22 Nov 2017 22:36:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/8414 ) Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. Patch Set 12: > Patch Set 12: > > Tianyi, did you have any more comments? > > I'm not going to merge this regardless right now for the reasons stated in > the commit message. I don't have more comments -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 22 Nov 2017 22:40:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4132: Use -fno-omit-frame-pointer
Hello Bharath Vissapragada, Lars Volker, Tim Armstrong, Mostafa Mokhtar, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8612 to look at the new patch set (#2). Change subject: IMPALA-4132: Use -fno-omit-frame-pointer .. IMPALA-4132: Use -fno-omit-frame-pointer Using -fno-omit-frame-pointer would keep the frame pointers for functions in the register. As a result we expect more useful stack traces to be produced. For testing performance benchmark was executed on TPC-H and TPC-DS without any significant discrepancy from the baseline results. (For the specific measures check the attachments in the Jira.) Change-Id: Ib7d0d88ba015a847356ed0274fe91017b98cb941 --- M be/CMakeLists.txt 1 file changed, 2 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/8612/2 -- To view, visit http://gerrit.cloudera.org:8080/8612 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib7d0d88ba015a847356ed0274fe91017b98cb941 Gerrit-Change-Number: 8612 Gerrit-PatchSet: 2 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4132: Use -fno-omit-frame-pointer
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/8612 ) Change subject: IMPALA-4132: Use -fno-omit-frame-pointer .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/8612/1/be/CMakeLists.txt File be/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/8612/1/be/CMakeLists.txt@42 PS1, Line 42: # -fno-omit-frame-pointers: Keep frame pointer for functions in register > Have you considered using -fasynchronous-unwind-tables instead? This page r >From my side no such consideration was made. Tim, are you aware of any discussion related to the flag Lars mentions? http://gerrit.cloudera.org:8080/#/c/8612/1/be/CMakeLists.txt@42 PS1, Line 42: # -fno-omit-frame-pointers: Keep frame pointer for functions in register : SET(CXX_COMMON_FL > How did this affect the size of the resulting binaries? Done Binary sizes (output of ls, in bytes): impalad: -fno-omit-frame-pointers: 437163424 -fomit-frame-pointers: 437110928 size of the whole be/build/debug/ dir: (output of du, in kbytes) -fno-omit-frame-pointers: 45456600 -fomit-frame-pointers: 45451244 -- To view, visit http://gerrit.cloudera.org:8080/8612 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib7d0d88ba015a847356ed0274fe91017b98cb941 Gerrit-Change-Number: 8612 Gerrit-PatchSet: 2 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 22 Nov 2017 22:45:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8631 ) Change subject: IMPALA-6227: deflake admission stress tests .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/8631/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8631/1//COMMIT_MSG@11 PS1, Line 11: All of the accounting in the test implicitly relies on queries not being : dequeued until queries are later explicitly ended, so if this happened, : the test broke in multiple subtle ways. > and this assumption is no longer true because of the final change for IMPAL I don't think it was ever true for the mem_limit tests. Even before IMPALA-1575 it depended implicitly on the memory staying reserved on all backends. >From what I can tell, the bug was always there, it was a tweak to the >statestore frequency that threw it out of balance. http://gerrit.cloudera.org:8080/#/c/8631/1/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: http://gerrit.cloudera.org:8080/#/c/8631/1/tests/custom_cluster/test_admission_controller.py@705 PS1, Line 705: b4 > where is that used? Leftover debugging code that I missed removing. http://gerrit.cloudera.org:8080/#/c/8631/1/tests/custom_cluster/test_admission_controller.py@792 PS1, Line 792: amount of time > is this actually timing sensitive, or is it that as long as we don't fetch It shouldn't be timing sensitive. Updated the comment. -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 22 Nov 2017 22:45:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Hello Bikramjeet Vig, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8631 to look at the new patch set (#2). Change subject: IMPALA-6227: deflake admission stress tests .. IMPALA-6227: deflake admission stress tests The problem was that, during the initial admission decision phase, some queries were initially queued then dequeued once memory came available. All of the accounting in the test implicitly relies on queries not being dequeued until queries are later explicitly ended, so if this happened, the test broke in multiple subtle ways. This happened because the query only scanned a small number of rows, which could be all buffered on the receiver side of the exchange even before the client fetched any rows from the coordinator. This means that the reserved memory on some backends could increase then decrease during the initial admission phase, resulting in a query being queued then dequeued. The fix is to increase the number of rows returned by the query so that all fragments remain active during the initial admission phase. This increased test execution time somewhat, so I also had to bump the queue wait timeout for the admission stress tests (they assume that queries don't time out in the queue). Testing: Ran the test under debug, release and ASAN builds, i.e. impala-py.test tests/custom_cluster/test_admission_controller.py \ --workload_exploration_strategy="functional-query:exhaustive" I looped the mem_limit test for a while to confirm it didn't reproduce (it reproduced reliably every 2-3 iterations before this fix). I will try looping it a bit more under different build types concurrently with the review. Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 --- M fe/src/test/resources/llama-site-test2.xml M tests/custom_cluster/test_admission_controller.py 2 files changed, 89 insertions(+), 47 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/8631/2 -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Hello Bikramjeet Vig, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8631 to look at the new patch set (#3). Change subject: IMPALA-6227: deflake admission stress tests .. IMPALA-6227: deflake admission stress tests The problem was that, during the initial admission decision phase, some queries were initially queued then dequeued once memory came available. All of the accounting in the test implicitly relies on queries not being dequeued until queries are later explicitly ended, so if this happened, the test broke in multiple subtle ways. This happened because the query only scanned a small number of rows, which could be all buffered on the receiver side of the exchange even before the client fetched any rows from the coordinator. This means that the reserved memory on some backends could increase then decrease during the initial admission phase, resulting in a query being queued then dequeued. The fix is to increase the number of rows returned by the query so that all fragments remain active during the initial admission phase. This increased test execution time somewhat, so I also had to bump the queue wait timeout for the admission stress tests (they assume that queries don't time out in the queue). Testing: Ran the test under debug, release and ASAN builds, i.e. impala-py.test tests/custom_cluster/test_admission_controller.py \ --workload_exploration_strategy="functional-query:exhaustive" I looped the mem_limit test for a while to confirm it didn't reproduce (it reproduced reliably every 2-3 iterations before this fix). I will try looping it a bit more under different build types concurrently with the review. Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 --- M fe/src/test/resources/llama-site-test2.xml M tests/custom_cluster/test_admission_controller.py 2 files changed, 111 insertions(+), 47 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/8631/3 -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8631 ) Change subject: IMPALA-6227: deflake admission stress tests .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8631/3/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: http://gerrit.cloudera.org:8080/#/c/8631/3/tests/custom_cluster/test_admission_controller.py@706 PS3, Line 706: sleep(0.2) what's that about? does the QUERY_TIMEOUT comment above need updating? -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 22 Nov 2017 22:57:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Hello Bikramjeet Vig, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8631 to look at the new patch set (#4). Change subject: IMPALA-6227: deflake admission stress tests .. IMPALA-6227: deflake admission stress tests The problem was that, during the initial admission decision phase, some queries were initially queued then dequeued once memory came available. All of the accounting in the test implicitly relies on queries not being dequeued until queries are later explicitly ended, so if this happened, the test broke in multiple subtle ways. This happened because the query only scanned a small number of rows, which could be all buffered on the receiver side of the exchange even before the client fetched any rows from the coordinator. This means that the reserved memory on some backends could increase then decrease during the initial admission phase, resulting in a query being queued then dequeued. The fix is to increase the number of rows returned by the query so that all fragments remain active during the initial admission phase. This increased test execution time somewhat, so I also had to bump the queue wait timeout for the admission stress tests (they assume that queries don't time out in the queue). Testing: Ran the test under debug, release and ASAN builds, i.e. impala-py.test tests/custom_cluster/test_admission_controller.py \ --workload_exploration_strategy="functional-query:exhaustive" I looped the mem_limit test for a while to confirm it didn't reproduce (it reproduced reliably every 2-3 iterations before this fix). It still reproduces every 5-10 runs with exhaustive+release, so I need to do further work to make it more robust. Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 --- M fe/src/test/resources/llama-site-test2.xml M tests/custom_cluster/test_admission_controller.py 2 files changed, 114 insertions(+), 47 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/8631/4 -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8631 ) Change subject: IMPALA-6227: deflake admission stress tests .. Patch Set 3: (1 comment) Just to summarise where this is at, it's a lot less flaky but I'm still seeing occasional flakes that I think are the result of the test seeing inconsistent metrics (i.e. admitted, queued, dequeued, etc are out of sync). http://gerrit.cloudera.org:8080/#/c/8631/3/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: http://gerrit.cloudera.org:8080/#/c/8631/3/tests/custom_cluster/test_admission_controller.py@706 PS3, Line 706: sleep(0.2) > what's that about? does the QUERY_TIMEOUT comment above need updating? I was experimenting with ways to reduce test runtime and reduce the chances of it hitting various timeouts. Factored out a constant to make the relationship clearer. -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 22 Nov 2017 23:03:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8631 ) Change subject: IMPALA-6227: deflake admission stress tests .. Patch Set 4: Code-Review+2 (1 comment) Seems like an improvement if you want to commit it here even if it doesn't address all flakiness. http://gerrit.cloudera.org:8080/#/c/8631/4/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: http://gerrit.cloudera.org:8080/#/c/8631/4/tests/custom_cluster/test_admission_controller.py@698 PS4, Line 698: every : # second shouldn't that say "at least every second" or something? -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 22 Nov 2017 23:14:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Hello Bikramjeet Vig, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8631 to look at the new patch set (#5). Change subject: IMPALA-6227: deflake admission stress tests .. IMPALA-6227: deflake admission stress tests The problem was that, during the initial admission decision phase, some queries were initially queued then dequeued once memory came available. All of the accounting in the test implicitly relies on queries not being dequeued until queries are later explicitly ended, so if this happened, the test broke in multiple subtle ways. This happened because the query only scanned a small number of rows, which could be all buffered on the receiver side of the exchange even before the client fetched any rows from the coordinator. This means that the reserved memory on some backends could increase then decrease during the initial admission phase, resulting in a query being queued then dequeued. The fix is to increase the number of rows returned by the query so that all fragments remain active during the initial admission phase. This increased test execution time somewhat, so I also had to bump the queue wait timeout for the admission stress tests (they assume that queries don't time out in the queue). Testing: Ran the test under debug, release and ASAN builds, i.e. impala-py.test tests/custom_cluster/test_admission_controller.py \ --workload_exploration_strategy="functional-query:exhaustive" I looped the mem_limit test for a while to confirm it didn't reproduce (it reproduced reliably every 2-3 iterations before this fix). It still reproduces every 5-10 runs with exhaustive+release, so I need to do further work to make it more robust. Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 --- M fe/src/test/resources/llama-site-test2.xml M tests/custom_cluster/test_admission_controller.py 2 files changed, 116 insertions(+), 49 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/8631/5 -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8631 ) Change subject: IMPALA-6227: deflake admission stress tests .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/8631/4/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: http://gerrit.cloudera.org:8080/#/c/8631/4/tests/custom_cluster/test_admission_controller.py@698 PS4, Line 698: every : # second > shouldn't that say "at least every second" or something? Done -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 22 Nov 2017 23:16:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8631 ) Change subject: IMPALA-6227: deflake admission stress tests .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1518/ -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 22 Nov 2017 23:17:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8631 ) Change subject: IMPALA-6227: deflake admission stress tests .. Patch Set 5: Code-Review+2 Carry -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 22 Nov 2017 23:17:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6092: avoid drop/create function interactions in e2e tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8593 ) Change subject: IMPALA-6092: avoid drop/create function interactions in e2e tests .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8593/2/tests/query_test/test_udfs.py File tests/query_test/test_udfs.py: http://gerrit.cloudera.org:8080/#/c/8593/2/tests/query_test/test_udfs.py@422 PS2, Line 422: check_call(["hadoop", "fs", "-put", "-f", src_udf_path, tgt_udf_path]) > it does get left around (as with the other tests). perhaps a better pattern changed the paths so that these tmp jar get cleaned up. a new dir is created for each test's db. I put the jar in that db dir (as discussed) and verified that the dirs + files are removed (useful for dev env where we re-use the state). -- To view, visit http://gerrit.cloudera.org:8080/8593 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ica3538788b1d2ab5e361261e2ade62780b838e65 Gerrit-Change-Number: 8593 Gerrit-PatchSet: 2 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 22 Nov 2017 23:18:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6092: avoid drop/create function interactions in e2e tests
Hello Bharath Vissapragada, Alex Behm, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8593 to look at the new patch set (#3). Change subject: IMPALA-6092: avoid drop/create function interactions in e2e tests .. IMPALA-6092: avoid drop/create function interactions in e2e tests The e2e unit tests for udfs can interact via the backend lib_cache, causing test flakes. IMPALA-6215 explains a race between the lib_cache and UdfExecutor in the frontend which is the likely the root cause. Two e2e tests use the same jar (test_java_udfs and test_udf_invalid_symbol), test_udf_invalid_symbol drops a function from that jar, which causes the use of that jar to fail in the test_java_udfs test. Since the state of lib_cache is per process, its state causes these interactions across unit tests. This change avoids the interactions by using separate jars for the separate tests. Change-Id: Ica3538788b1d2ab5e361261e2ade62780b838e65 --- M be/src/runtime/lib-cache.h M fe/src/main/java/org/apache/impala/hive/executor/UdfExecutor.java M tests/query_test/test_udfs.py 3 files changed, 16 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/8593/3 -- To view, visit http://gerrit.cloudera.org:8080/8593 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ica3538788b1d2ab5e361261e2ade62780b838e65 Gerrit-Change-Number: 8593 Gerrit-PatchSet: 3 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6092: avoid drop/create function interactions in e2e tests
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8593 ) Change subject: IMPALA-6092: avoid drop/create function interactions in e2e tests .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8593 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ica3538788b1d2ab5e361261e2ade62780b838e65 Gerrit-Change-Number: 8593 Gerrit-PatchSet: 3 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 22 Nov 2017 23:20:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6092: avoid drop/create function interactions in e2e tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8593 ) Change subject: IMPALA-6092: avoid drop/create function interactions in e2e tests .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1519/ -- To view, visit http://gerrit.cloudera.org:8080/8593 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ica3538788b1d2ab5e361261e2ade62780b838e65 Gerrit-Change-Number: 8593 Gerrit-PatchSet: 3 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 22 Nov 2017 23:21:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4927: Impala should handle invalid input from Sentry
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8588 ) Change subject: IMPALA-4927: Impala should handle invalid input from Sentry .. Patch Set 10: Code-Review+2 I'm not entirely convinced that ImpalaException is the right thing to catch, but there are upstream Sentry changes coming anyway that are going to change this again, so arguing against it doesn't seem productive. -- To view, visit http://gerrit.cloudera.org:8080/8588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I781411018d580854d80a9cad81a1ded7ca16af8b Gerrit-Change-Number: 8588 Gerrit-PatchSet: 10 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 22 Nov 2017 23:56:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4927: Impala should handle invalid input from Sentry
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8588 ) Change subject: IMPALA-4927: Impala should handle invalid input from Sentry .. Patch Set 10: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1520/ -- To view, visit http://gerrit.cloudera.org:8080/8588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I781411018d580854d80a9cad81a1ded7ca16af8b Gerrit-Change-Number: 8588 Gerrit-PatchSet: 10 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 22 Nov 2017 23:58:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5310: Part 2: Add SAMPLED NDV() function.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/8569 ) Change subject: IMPALA-5310: Part 2: Add SAMPLED_NDV() function. .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@2063 PS2, Line 2063: 0.0 Wondering if 0.0 is a valid sample %. What do we scan in that case? Do we assign some default ndvs? http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@2066 PS2, Line 2066: for (Column col: allScalarTypes.getColumns()) { -ve test cases on non scalar columns, just for completeness? -- To view, visit http://gerrit.cloudera.org:8080/8569 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia51d56ee67ec6073e92f90bebb4005484138b820 Gerrit-Change-Number: 8569 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Comment-Date: Thu, 23 Nov 2017 00:02:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5936: operator '%' overflows on large decimals
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8574 ) Change subject: IMPALA-5936: operator '%' overflows on large decimals .. Patch Set 3: Code-Review+2 Looks good! -- To view, visit http://gerrit.cloudera.org:8080/8574 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2b06c8acd5aa490943e84013faf2eaac7c26ceb4 Gerrit-Change-Number: 8574 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 23 Nov 2017 00:20:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8614 ) Change subject: IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8614 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5084be10946d68f3ec0760c2b7e698635df26a89 Gerrit-Change-Number: 8614 Gerrit-PatchSet: 4 Gerrit-Owner: Kim Jin Chul Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kim Jin Chul Gerrit-Comment-Date: Thu, 23 Nov 2017 00:21:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8614 ) Change subject: IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds .. IMPALA-2250: Make multiple COUNT(DISTINCT) message state workarounds Change-Id: I5084be10946d68f3ec0760c2b7e698635df26a89 Reviewed-on: http://gerrit.cloudera.org:8080/8614 Reviewed-by: Alex Behm Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java 1 file changed, 4 insertions(+), 1 deletion(-) Approvals: Alex Behm: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8614 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I5084be10946d68f3ec0760c2b7e698635df26a89 Gerrit-Change-Number: 8614 Gerrit-PatchSet: 5 Gerrit-Owner: Kim Jin Chul Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kim Jin Chul
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8447 ) Change subject: IMPALA-2181: Add query option levels for display .. Patch Set 20: Code-Review+1 Looks good to me. Did anyone else on the long list of reviewers want to take another look? -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 20 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 23 Nov 2017 00:28:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 ) Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. Patch Set 17: This is still on my radar, need to get back to it. -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 17 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 23 Nov 2017 00:29:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5237: Support a quoted string in date/time format
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8508 ) Change subject: IMPALA-5237: Support a quoted string in date/time format .. Patch Set 3: I need to get back to this one after the thanksgiving break - it's on my radar. @Attila does the latest iteration look good to you? -- To view, visit http://gerrit.cloudera.org:8080/8508 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178 Gerrit-Change-Number: 8508 Gerrit-PatchSet: 3 Gerrit-Owner: Kim Jin Chul Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 23 Nov 2017 00:31:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4964: Fix Decimal modulo overflow
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8329 ) Change subject: IMPALA-4964: Fix Decimal modulo overflow .. IMPALA-4964: Fix Decimal modulo overflow The modulo operation between two decimals should never overflow. Before this patch, there would be an overflow if the scale difference between the two decimals was large. We would try to scale up the one with the smaller scale, so that the scales matched, which could result in an overflow. We fix the problem by first checking if the scaled up value would fit into 128 bits by estimating the number of leading zeros in it. If we detect that 128 bits is not enough, we convert both numbers to int256, do the operation, then convert back to 128 bits. Testing: - Added some expr tests that excercise the new code path. Change-Id: I5420201d4440d421e33e443df005cdcc16b8a6cd Reviewed-on: http://gerrit.cloudera.org:8080/8329 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-test.cc M be/src/runtime/decimal-value.inline.h 3 files changed, 129 insertions(+), 29 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8329 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I5420201d4440d421e33e443df005cdcc16b8a6cd Gerrit-Change-Number: 8329 Gerrit-PatchSet: 5 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Reviewer: anujphadke
[Impala-ASF-CR] IMPALA-4964: Fix Decimal modulo overflow
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8329 ) Change subject: IMPALA-4964: Fix Decimal modulo overflow .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8329 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5420201d4440d421e33e443df005cdcc16b8a6cd Gerrit-Change-Number: 8329 Gerrit-PatchSet: 4 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Thu, 23 Nov 2017 00:55:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8447 ) Change subject: IMPALA-2181: Add query option levels for display .. Patch Set 20: (6 comments) Looks good to me, though I didn't look at the tests very carefully myself. Just some minor comments. http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.h@541 PS20, Line 541: his function nit: delete (Obvious that the comment is talking about this function). http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.h@544 PS20, Line 544: beeswax::ConfigVariable& option if that's modified, you should use ConfigVariable *. We prefer to use pointers rather than references for things that get modified, so that it's clear at the callsite. (i.e. only use const references). http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.h@545 PS20, Line 545: option_key that's not explained. and is it even needed -- why not just use the key from 'option'? http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.cc@1236 PS20, Line 1236: config.__set_level(TQueryOptionLevel::ADVANCED); does query_option_levels_ not contain all keys? i.e. should line 1233 be a DCHECK? http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/query-options.h File be/src/service/query-options.h: http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/query-options.h@34 PS20, Line 34: Introduced comments shouldn't talk about current code vs. older code. They should just talk about the code. So, reword this: Maps from query option name to option level used ... or similar http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/query-options.h@175 PS20, Line 175: QueryOptionLevels& use pointer -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 20 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 23 Nov 2017 00:55:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8447 ) Change subject: IMPALA-2181: Add query option levels for display .. Patch Set 20: (2 comments) (Also didn't look too much at the python shell code.) http://gerrit.cloudera.org:8080/#/c/8447/20//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8447/20//COMMIT_MSG@14 PS20, Line 14: HS2 interface I think that should say "the SET SQL statement" or similar, since SET can be executed via HS2 or beeswax. It's just that the shell doesn't use it today. http://gerrit.cloudera.org:8080/#/c/8447/20/common/thrift/beeswax.thrift File common/thrift/beeswax.thrift: http://gerrit.cloudera.org:8080/#/c/8447/20/common/thrift/beeswax.thrift@98 PS20, Line 98: maybe we should be super explicit here and say: // Impala extension: ... to point out that this isn't part of the "standard" beeswax interface. I don't think anything else in this file was customized for Impala besides this. (The customization usually happens in ImpalaService.thrift. We could even add this functionality there instead, but I don't feel strongly given that this seems like a backwards compatible modification and beeswax will go away. -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 20 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 23 Nov 2017 01:03:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5310: Part 2: Add SAMPLED NDV() function.
Hello Bharath Vissapragada, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8569 to look at the new patch set (#3). Change subject: IMPALA-5310: Part 2: Add SAMPLED_NDV() function. .. IMPALA-5310: Part 2: Add SAMPLED_NDV() function. Adds a new SAMPLED_NDV() aggregate function that is intended to be used in COMPUTE STATS TABLESAMPLE. This patch only adds the function itself. Integration with COMPUTE STATS will come in a separate patch. SAMPLED_NDV() estimates the number of distinct values (NDV) based on a sample of data and the corresponding sampling rate. The main idea is to collect several x/y data points where x is the number of rows and y is the corresponding NDV estimate. These data points are used to fit an objective function to the data such that the true NDV can be extrapolated. The aggregate function maintains a fixed number of HyperLogLog intermediates to compute the x/y points. Several objective functions are fit and the best-fit one is used for extrapolation. Adds the MPFIT C library to perform curve fitting: https://www.physics.wisc.edu/~craigm/idl/cmpfit.html The library is a C port from Fortran. Scipy uses the Fortran version of the library for curve fitting. Testing: - added functional tests - core/hdfs run passed Change-Id: Ia51d56ee67ec6073e92f90bebb4005484138b820 --- M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/aggregate-functions.h A be/src/thirdparty/mpfit/DISCLAIMER A be/src/thirdparty/mpfit/README A be/src/thirdparty/mpfit/mpfit.c A be/src/thirdparty/mpfit/mpfit.h M be/src/util/CMakeLists.txt A be/src/util/mpfit-util.cc A be/src/util/mpfit-util.h M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M tests/query_test/test_aggregation.py 15 files changed, 3,897 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/8569/3 -- To view, visit http://gerrit.cloudera.org:8080/8569 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia51d56ee67ec6073e92f90bebb4005484138b820 Gerrit-Change-Number: 8569 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada
[Impala-ASF-CR] IMPALA-5310: Part 2: Add SAMPLED NDV() function.
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8569 ) Change subject: IMPALA-5310: Part 2: Add SAMPLED_NDV() function. .. Patch Set 2: (9 comments) http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1443 PS2, Line 1443: count > IMO 'count' sounds a little generic and overlaps with total_count. May be r Done http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1459 PS2, Line 1459: double sample_perc; > static? This is the local state of a single aggregation instance, so static does not make sense. http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1465 PS2, Line 1465: static int64_t GetStateSize(int64_t num_buckets) { > Given num_buckets is always a const, may be make this a static const too li Can't do that because sizeof(SampledNdvState) fails since SampledNdvState is considered to be (size not known). This is the error: error: invalid application of ‘sizeof’ to incomplete type ‘impala::SampledNdvState’ sizeof(SampledNdvState) + NUM_HLL_BUCKETS * BUCKET_SIZE; http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1530 PS2, Line 1530: ndv > may be call ndv_estimate, just to be clear Done http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1552 PS2, Line 1552: int bucket_idx = (i + j) % SampledNdvState::NUM_HLL_BUCKETS; : merged_count += *state->GetCountPtr(bucket_idx); : counts[pidx] = merged_count; : StringVal hll = StringVal(state->GetHllPtr(bucket_idx), HLL_LEN); : HllMerge(ctx, hll, &merged_hll); : ndvs[pidx] = HllFinalEstimate(merged_hll.ptr, HLL_LEN); > Wondering if we can avoid duplicates in data points since hll_to_merge for I expanded the comment here to explain and justify. Not sure I follow your point generation method. I think in my method only the last data point is guaranteed to be a duplicate (which is a good thing, now explained in comment). Others may or may not be duplicates (probably not duplicates though). Consider this example input and my point generation method: input counts: 1, 2, 3, 4 counts of 1st rolling window: 1, 3, 6, 10 counts of 2nd rolling window 2, 5, 9, 10 counts of 3rd rolling window 3, 7, 8, 10 counts of 4th rolling window 4, 5, 7, 10 For each "count" value above, we mere the corresponding HLL intermediates to get an NDV estimate for that subset of the data. Note that the merged NDVs corresponding to the same "count" value are likely to be different because they represent different subsets of the data. http://gerrit.cloudera.org:8080/#/c/8569/2/be/src/exprs/aggregate-functions-ir.cc@1563 PS2, Line 1563: int64_t max_count = counts[num_points - 1]; > not sure I understand this, what invariant guarantees counts[num_points-1] Added comment. http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java: http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@506 PS2, Line 506: SAMPED_NDV > nit: typo Done http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@2063 PS2, Line 2063: 0.0 > Wondering if 0.0 is a valid sample %. What do we scan in that case? Do we a The value passed to this agg fn is independent of how much data is scanned. You can think of the sample percent as an "extrapolation factor". We will eventually combine TABLESAMPLE and SAMPLED_NDV() in COMPUTE STATS and there the effective sampling rate will be plugged into the SAMPLED_NDV() function. But for now, let's view SAMPLED_NDV() as a standalone aggregate function. I think 0.0 is a valid value. It will give you the estimated NDV at x=0. Not sure how useful that is, but I don't think it's invalid (0 is a valid "sampling ratio"). I'm not sure it's worth special casing 0, what do you think? http://gerrit.cloudera.org:8080/#/c/8569/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@2066 PS2, Line 2066: for (Column col: allScalarTypes.getColumns()) { > -ve test cases on non scalar columns, just for completeness? Maybe I'm misunderstanding but we don't support Exprs on non-scalar types. Even if we have nested collections, we must explode them to scalars. -- To view, visit http://gerrit.cloudera.org:8080/8569 To unsubscribe, visit http://gerrit.cloudera.org:808
[Impala-ASF-CR] IMPALA-3703: Store query context in thread-local variables
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8621 ) Change subject: IMPALA-3703: Store query context in thread-local variables .. Patch Set 2: (7 comments) http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info-test.cc File be/src/common/thread-info-test.cc: http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info-test.cc@55 PS2, Line 55: Status status = info_table.AddExtraInfo("extra", "info"); > please add a comment about what's going on: I agree it would be helpful to have some more comments. Maybe at least a comment on each test to explain what it is testing at a high-level, then possible comments for any tricky details of the tests themselves. http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h File be/src/common/thread-info.h: http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@41 PS2, Line 41: class InfoTable : boost::noncopyable { > This is the only usage of boost:noncopyable I've seen in our code base. We Yeah DISALLOW_COPY_AND_ASSIGN is the canonical way to do it in Impala (for better or worse). http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@42 PS2, Line 42: public: We've generally been moving to using the new inline initialisation syntax for member variables if they're constant, or we're just calling the default constructor. I.e. text_size = 0, etc. http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@101 PS2, Line 101: size_t We generally use int64_t for byte sizes in the codebase. There are arguments for either int64_t or size_t, but we've settled on int64_t in Impala because it's large enough to hold any practical size and we avoid any hard-to-reason-about edge cases with mixed signed/unsigned expressions. http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@105 PS2, Line 105: char text[MAX_TEXT_SIZE]; The naming convention for member variables is to include a _ suffix. In this case we're following the google style guide: https://google.github.io/styleguide/cppguide.html#Variable_Names http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/common/thread-info.h@115 PS2, Line 115: InfoTable& GetThreadInfoTable(); By convention we generally use pointers if arguments or return values are meant to be mutable. http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/runtime/query-state.cc@378 PS2, Line 378: thread_info_table.SetQueryId(fis->query_id()); We should also do this for other threads running in the context of the query/fragment. The fragment thread can spawn off threads in a couple of places: BlockingJoinNode::ProcessBuildInputAndOpenProbe() and HdfsScanNode::ThreadTokenAvailableCb(). FragmentInstanceState::Prepare() also spawns off a per-fragment thread, although I believe at some point that will become a per-query thread. I guess we could also automatically inherit query_id and fragment_id from the parent thread, if that works out sanely, or maybe we should have a special Thread::Create() for fragment/query threads that takes additional arguments. Mainly it would be nice to have a pattern that makes it harder to accidentally avoid adding the appropriate info. -- To view, visit http://gerrit.cloudera.org:8080/8621 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I566f7f1db5117c498e86e0bd05b33bdcff624609 Gerrit-Change-Number: 8621 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 23 Nov 2017 01:20:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2248: Make idle session timeout a query option
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/8490 ) Change subject: IMPALA-2248: Make idle_session_timeout a query option .. Patch Set 9: (7 comments) http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/impala-server.h@441 PS9, Line 441: friend class ClientRequestState; This is required so that ClientRequestState can call UnregisterSessionTimeout() and RegisterSessionTimeout(), right? Would it be possible to make Register- and UnregisterTimeout() public and get rid of the friend class? http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/impala-server.cc@1240 PS9, Line 1240: std::to_string(this->session_timeout), I have no knowledge about how threading works around this part of the code. Do you know if a race-condition is possible around this->session_timeout? http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/impala-server.cc@1816 PS9, Line 1816: session_timeout_cv_.NotifyOne(); In RegisterSessionTimeout() the notify_one call is also protected by the lock. For my benefit, could you explain me why it was not required here? http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/query-options.h File be/src/service/query-options.h: http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/query-options.h@99 PS9, Line 99: QUERY_OPT_FN(idle_session_timeout, IDLE_SESSION_TIMEOUT)\ For the sake of IMPALA-2181, could you find out which of the 4 new option group this one should belong to? (regular, advanced, development, deprecated). Guess some of the regular of advanced groups, right? http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/8490/9/be/src/service/query-options.cc@578 PS9, Line 578: "Only positive numbers are allowed.", value)); I think the '4 spaces for indentation' rule applies here as well. I would either add 2 more spaces or indent the beginning of the string to match the one above (if that didn't make the line too long). http://gerrit.cloudera.org:8080/#/c/8490/9/tests/custom_cluster/test_session_expiration.py File tests/custom_cluster/test_session_expiration.py: http://gerrit.cloudera.org:8080/#/c/8490/9/tests/custom_cluster/test_session_expiration.py@62 PS9, Line 62: sleep(3) Could you please re-work these sleep times a little bit to reduce the runtimes of these tests? These 2 tests spend 14 secs just on sleep. http://gerrit.cloudera.org:8080/#/c/8490/9/tests/custom_cluster/test_session_expiration.py@70 PS9, Line 70: sleep(8) I learnt this week that running the test suite on RELEASE and ASAN build might give you tricky results. Once re-worked these numbers, could you please verify that the tests pass also on those build? (For me a query that took like 0,5 sec to start fetching results apparently succeeded in a DEBUG build but on RELEASE and ASAN failed even if modified the sleep from 0,5 to 1.5 sec) -- To view, visit http://gerrit.cloudera.org:8080/8490 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32e2775f80da387b0df4195fe2c5435b3f8e585e Gerrit-Change-Number: 8490 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 23 Nov 2017 01:51:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5146: Fix inconsitent results at FROM UNIXTIME()
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8629 to look at the new patch set (#2). Change subject: IMPALA-5146: Fix inconsitent results at FROM_UNIXTIME() .. IMPALA-5146: Fix inconsitent results at FROM_UNIXTIME() The FROM_UNIXTIME(epoch) and FROM_UNIXTIME(epoch, format) produce different results when epoch is out of range of TimestampValue. The former produces an empty string, while the latter gives NULL. The fix is to harmonize the results to NULL. Testing: Add unit tests to ExprTest.TimestampFunctions. Change-Id: Ie3a5e9a9cb39d32993fa2c7f725be44d8b9ce9f2 --- M be/src/exprs/expr-test.cc M be/src/exprs/timestamp-functions-ir.cc 2 files changed, 5 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/8629/2 -- To view, visit http://gerrit.cloudera.org:8080/8629 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie3a5e9a9cb39d32993fa2c7f725be44d8b9ce9f2 Gerrit-Change-Number: 8629 Gerrit-PatchSet: 2 Gerrit-Owner: Kim Jin Chul Gerrit-Reviewer: Dan Hecht
[Impala-ASF-CR] IMPALA-5146: Fix inconsitent results at FROM UNIXTIME()
Kim Jin Chul has posted comments on this change. ( http://gerrit.cloudera.org:8080/8629 ) Change subject: IMPALA-5146: Fix inconsitent results at FROM_UNIXTIME() .. Patch Set 1: (1 comment) Thanks for your comment http://gerrit.cloudera.org:8080/#/c/8629/1/be/src/exprs/timestamp-functions-ir.cc File be/src/exprs/timestamp-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/8629/1/be/src/exprs/timestamp-functions-ir.cc@84 PS1, Line 84: ToUnixTime > i think it would be clearer to just call tv.HasDateAndTime() directly to sh Done -- To view, visit http://gerrit.cloudera.org:8080/8629 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie3a5e9a9cb39d32993fa2c7f725be44d8b9ce9f2 Gerrit-Change-Number: 8629 Gerrit-PatchSet: 1 Gerrit-Owner: Kim Jin Chul Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Kim Jin Chul Gerrit-Comment-Date: Thu, 23 Nov 2017 01:58:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5501: [DOCS] Clarify 'binaries' w.r.t. installation procedure
John Russell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8637 Change subject: IMPALA-5501: [DOCS] Clarify 'binaries' w.r.t. installation procedure .. IMPALA-5501: [DOCS] Clarify 'binaries' w.r.t. installation procedure Change the wording to match up with the earlier instructions saying to build from source. Change-Id: Ib12b8920e352ee086ce3fbc2601fbea4310a683c --- M docs/topics/impala_install.xml 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/8637/1 -- To view, visit http://gerrit.cloudera.org:8080/8637 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib12b8920e352ee086ce3fbc2601fbea4310a683c Gerrit-Change-Number: 8637 Gerrit-PatchSet: 1 Gerrit-Owner: John Russell
[Impala-ASF-CR] IMPALA-5053: [SECURITY] Make KRPC work with Kerberos
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/8270 ) Change subject: IMPALA-5053: [SECURITY] Make KRPC work with Kerberos .. Patch Set 5: (14 comments) Sorry for the delay. Some more comments. http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/auth-provider.h File be/src/rpc/auth-provider.h: http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/auth-provider.h@27 PS5, Line 27: #include "util/thread.h" What is this for ? http://gerrit.cloudera.org:8080/#/c/8270/4/be/src/rpc/auth-provider.h File be/src/rpc/auth-provider.h: http://gerrit.cloudera.org:8080/#/c/8270/4/be/src/rpc/auth-provider.h@78 PS4, Line 78: /// Wrap the client transport with a new TSaslClientTransport. This is only for : /// internal connections. Since, as a daemon, we only do Kerberos and not LDAP, : /// we can go straight to Kerberos. : virtual Status WrapClientTransport(const std::string& hostname, : boost::shared_ptr raw_transport, : const std::string& service_name, : boost::shared_ptr* wrapped_transport); Is this only used for thrift connections ? May be it helps to state it now that we have two different RPC mechanisms in our code. http://gerrit.cloudera.org:8080/#/c/8270/4/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/8270/4/be/src/rpc/authentication.cc@643 PS4, Line 643: kudu::rpc::DisableSaslInitialization() Can you please add a comment as to why this is necessary and how is SASL initialized when KRPC is enabled ? http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/authentication.cc@643 PS5, Line 643: KUDU_RETURN_IF_ERROR(kudu::rpc::DisableSaslInitialization(), : "Unable to disable Kudu RPC SASL initialization."); Please add a small comment that this overlaps with the initialization below. http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/authentication.cc@658 PS5, Line 658: // We assume all impala processes are both server and client. : sasl::TSaslServer::SaslInit(GENERAL_CALLBACKS, appname); : sasl::TSaslClient::SaslInit(GENERAL_CALLBACKS); Once we completely move away from Thrift, do we rely on Kudu to initialize the Sasl library ? http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/authentication.cc@1034 PS5, Line 1034: RETURN_IF_ERROR(GetInternalKerberosPrincipal(&kerberos_internal_principal)); : RETURN_IF_ERROR(GetExternalKerberosPrincipal(&kerberos_external_principal)); The internal and external principals look like good candidates to be stored in ExecEnv and should be initialized once. http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/authentication.cc@1036 PS5, Line 1036: nit: blank line not needed. http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/rpc-mgr.cc File be/src/rpc/rpc-mgr.cc: http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/rpc-mgr.cc@63 PS5, Line 63: !FLAGS_principal.empty() IsKerberosEnabled() http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/rpc/rpc-mgr.cc@70 PS5, Line 70: bld.set_sasl_proto_name(service_name); It may be better to set FLAGS_rpc_authentication (defined by Kudu) to "required" to make sure that Kerberos must be enabled if it's enabled via FLAGS_principal. http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/testutil/mini-kdc-wrapper.h File be/src/testutil/mini-kdc-wrapper.h: http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/testutil/mini-kdc-wrapper.h@53 PS5, Line 53: // nit: /// http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/testutil/mini-kdc-wrapper.h@86 PS5, Line 86: Status StopKdc(); Please add comments about what clean up it may do. http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/testutil/mini-kdc-wrapper.cc File be/src/testutil/mini-kdc-wrapper.cc: http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/testutil/mini-kdc-wrapper.cc@67 PS5, Line 67: > != seems to make less assumption about the ordering of the ENUM values. http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/testutil/mini-kdc-wrapper.cc@88 PS5, Line 88: > != http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/util/auth-util.cc File be/src/util/auth-util.cc: http://gerrit.cloudera.org:8080/#/c/8270/5/be/src/util/auth-util.cc@86 PS5, Line 86: const Status BAD_FORMAT_STATUS = Status(strings::Substitute( : "Kerberos principal should be of the form: /@ - got: $0", : principal)) Should this live in generate_error_codes.py ? Also, our existing implementation of Status has the unfortunate effect of creating a backtrace and logging to log file when creating a Status object. May be it's not preferable to pre-fabricate one here or it may confuse users to see this error message in
[Impala-ASF-CR] IMPALA-5501: [DOCS] Clarify 'binaries' w.r.t. installation procedure
John Russell has abandoned this change. ( http://gerrit.cloudera.org:8080/8637 ) Change subject: IMPALA-5501: [DOCS] Clarify 'binaries' w.r.t. installation procedure .. Abandoned Made this gerrit based on master, rather than a private branch, by mistake. -- To view, visit http://gerrit.cloudera.org:8080/8637 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: Ib12b8920e352ee086ce3fbc2601fbea4310a683c Gerrit-Change-Number: 8637 Gerrit-PatchSet: 1 Gerrit-Owner: John Russell
[Impala-ASF-CR] IMPALA-5501: [DOCS] Clarify 'binaries' w.r.t. installation procedure
John Russell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8638 Change subject: IMPALA-5501: [DOCS] Clarify 'binaries' w.r.t. installation procedure .. IMPALA-5501: [DOCS] Clarify 'binaries' w.r.t. installation procedure Change the wording to match up with the earlier instructions saying to build from source. Change-Id: I1ea14e50b2d9a952091a316b637d7ea02c2b6ffd --- M docs/topics/impala_install.xml 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/8638/1 -- To view, visit http://gerrit.cloudera.org:8080/8638 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I1ea14e50b2d9a952091a316b637d7ea02c2b6ffd Gerrit-Change-Number: 8638 Gerrit-PatchSet: 1 Gerrit-Owner: John Russell
[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
Pranay Singh has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 ) Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder .. Patch Set 17: > This is still on my radar, need to get back to it. Thanks Tim -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 17 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 23 Nov 2017 02:23:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8631 ) Change subject: IMPALA-6227: deflake admission stress tests .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1518/ -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 23 Nov 2017 02:48:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4927: Impala should handle invalid input from Sentry
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8588 ) Change subject: IMPALA-4927: Impala should handle invalid input from Sentry .. IMPALA-4927: Impala should handle invalid input from Sentry Impala requests a list of roles from Sentry and then asks for privileges for each role. If Sentry returns a non existent role in the first step, then there will be a Java exception in Impala in the second step and the communication with Sentry is aborted. The issue is fixed by handling the exception if an invalid role is found and continue with getting permissions for the rest of the roles. Testing: --- Since invalid role could not be created through impala-shell/Hue interface the code was instrumented to have an invalid Role " ", and SHOW ROLES statement was executed from impala shell to see how the condition is handled. Change-Id: I781411018d580854d80a9cad81a1ded7ca16af8b Reviewed-on: http://gerrit.cloudera.org:8080/8588 Reviewed-by: Alex Behm Reviewed-by: Zach Amsden Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/util/SentryProxy.java 1 file changed, 10 insertions(+), 3 deletions(-) Approvals: Alex Behm: Looks good to me, but someone else must approve Zach Amsden: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I781411018d580854d80a9cad81a1ded7ca16af8b Gerrit-Change-Number: 8588 Gerrit-PatchSet: 11 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4927: Impala should handle invalid input from Sentry
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8588 ) Change subject: IMPALA-4927: Impala should handle invalid input from Sentry .. Patch Set 10: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I781411018d580854d80a9cad81a1ded7ca16af8b Gerrit-Change-Number: 8588 Gerrit-PatchSet: 10 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 23 Nov 2017 03:25:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Hello Bikramjeet Vig, Dan Hecht, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8631 to look at the new patch set (#6). Change subject: IMPALA-6227: deflake admission stress tests .. IMPALA-6227: deflake admission stress tests The problem was that, during the initial admission decision phase, some queries were initially queued then dequeued once memory came available. All of the accounting in the test implicitly relies on queries not being dequeued until queries are later explicitly ended, so if this happened, the test broke in multiple subtle ways. This happened because the query only scanned a small number of rows, which could be all buffered on the receiver side of the exchange even before the client fetched any rows from the coordinator. This means that the reserved memory on some backends could increase then decrease during the initial admission phase, resulting in a query being queued then dequeued. The fix is to increase the number of rows returned by the query so that all fragments remain active during the initial admission phase. This increased test execution time somewhat, so I also had to bump the queue wait timeout for the admission stress tests (they assume that queries don't time out in the queue). Testing: Ran the test under debug, release and ASAN builds, i.e. impala-py.test tests/custom_cluster/test_admission_controller.py \ --workload_exploration_strategy="functional-query:exhaustive" I looped the mem_limit test for a while to confirm it didn't reproduce (it reproduced reliably every 2-3 iterations before this fix). It still reproduces every 5-10 runs with exhaustive+release, so I need to do further work to make it more robust. Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 --- M fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java M fe/src/test/resources/llama-site-test2.xml M tests/custom_cluster/test_admission_controller.py 3 files changed, 117 insertions(+), 50 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/8631/6 -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8631 ) Change subject: IMPALA-6227: deflake admission stress tests .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1521/ -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 23 Nov 2017 04:19:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8631 ) Change subject: IMPALA-6227: deflake admission stress tests .. Patch Set 6: Code-Review+2 Trivial test fix -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 23 Nov 2017 04:19:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Hello Lars Volker, Laszlo Gaal, Zoltan Borok-Nagy, Philip Zeyliger, Attila Jeges, Tim Armstrong, Csaba Ringhofer, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8447 to look at the new patch set (#21). Change subject: IMPALA-2181: Add query option levels for display .. IMPALA-2181: Add query option levels for display Four display levels are introduced for each query option: REGULAR, ADVANCED, DEVELOPMENT and DEPRECATED. When the query options are displayed in Impala shell using SET then only the REGULAR and ADVANCED options are shown. A new command called SET ALL shows all the options grouped by their option levels. When the query options are displayed through the SET SQL statement then the result set would contain an extra column indicating the level of each option. Similarly to Impala shell here the SET command only diplays the REGULAR and ADVANCED options while SET ALL shows them all. If the Impala shell connects to an Impala daemon that predates this change then all the options would be displayed in the REGULAR group. Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 --- M be/src/service/child-query.cc M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/Frontend.thrift M common/thrift/beeswax.thrift M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/SetStmt.java M fe/src/main/java/org/apache/impala/service/Frontend.java M shell/impala_client.py M shell/impala_shell.py M testdata/workloads/functional-query/queries/QueryTest/set.test M tests/custom_cluster/test_set_and_unset.py M tests/hs2/test_hs2.py M tests/shell/test_shell_interactive.py 18 files changed, 486 insertions(+), 226 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/8447/21 -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 21 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/8447 ) Change subject: IMPALA-2181: Add query option levels for display .. Patch Set 21: (8 comments) http://gerrit.cloudera.org:8080/#/c/8447/20//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8447/20//COMMIT_MSG@14 PS20, Line 14: the SET SQL s > I think that should say "the SET SQL statement" or similar, since SET can Done http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.h@541 PS20, Line 541: ets the optio > nit: delete (Obvious that the comment is talking about this function). Done http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.h@544 PS20, Line 544: query option. > if that's modified, you should use ConfigVariable *. We prefer to use poin Done http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.h@545 PS20, Line 545: lToConfig( > that's not explained. and is it even needed -- why not just use the key fro I didn't want to make this function dependent on whether the key field of the ConfigVariable was set by the caller or not. I found it safer this way. I'll mention this in the comment. http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/impala-server.cc@1236 PS20, Line 1236: > does query_option_levels_ not contain all keys? i.e. should line 1233 be a In fact, a DCHECK is enough here. (Changing this to a DCHECK actually caught a bug: Where support_start_over was added to the options map the name should have been added with uppercase letters.) http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/query-options.h File be/src/service/query-options.h: http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/query-options.h@34 PS20, Line 34: Maps query > comments shouldn't talk about current code vs. older code. They should just Done http://gerrit.cloudera.org:8080/#/c/8447/20/be/src/service/query-options.h@175 PS20, Line 175: QueryOptionLevels* > use pointer Done http://gerrit.cloudera.org:8080/#/c/8447/20/common/thrift/beeswax.thrift File common/thrift/beeswax.thrift: http://gerrit.cloudera.org:8080/#/c/8447/20/common/thrift/beeswax.thrift@98 PS20, Line 98: > maybe we should be super explicit here and say: Done -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 21 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 23 Nov 2017 05:56:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8631 ) Change subject: IMPALA-6227: deflake admission stress tests .. IMPALA-6227: deflake admission stress tests The problem was that, during the initial admission decision phase, some queries were initially queued then dequeued once memory came available. All of the accounting in the test implicitly relies on queries not being dequeued until queries are later explicitly ended, so if this happened, the test broke in multiple subtle ways. This happened because the query only scanned a small number of rows, which could be all buffered on the receiver side of the exchange even before the client fetched any rows from the coordinator. This means that the reserved memory on some backends could increase then decrease during the initial admission phase, resulting in a query being queued then dequeued. The fix is to increase the number of rows returned by the query so that all fragments remain active during the initial admission phase. This increased test execution time somewhat, so I also had to bump the queue wait timeout for the admission stress tests (they assume that queries don't time out in the queue). Testing: Ran the test under debug, release and ASAN builds, i.e. impala-py.test tests/custom_cluster/test_admission_controller.py \ --workload_exploration_strategy="functional-query:exhaustive" I looped the mem_limit test for a while to confirm it didn't reproduce (it reproduced reliably every 2-3 iterations before this fix). It still reproduces every 5-10 runs with exhaustive+release, so I need to do further work to make it more robust. Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Reviewed-on: http://gerrit.cloudera.org:8080/8631 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins --- M fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java M fe/src/test/resources/llama-site-test2.xml M tests/custom_cluster/test_admission_controller.py 3 files changed, 117 insertions(+), 50 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8631 ) Change subject: IMPALA-6227: deflake admission stress tests .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4 Gerrit-Change-Number: 8631 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 23 Nov 2017 07:48:17 + Gerrit-HasComments: No