[Impala-ASF-CR] IMPALA-5538: Use explicit catalog versions for deleted objects
Dimitris Tsirogiannis has uploaded a new patch set (#2). Change subject: IMPALA-5538: Use explicit catalog versions for deleted objects .. IMPALA-5538: Use explicit catalog versions for deleted objects This commit changes the way deletions are handled in the catalog and disseminated to the impalad nodes through the statestore. Previously, deletions of catalog objects were not explicitly annotated with the catalog version in which the deletion occured and the impalads were using the max catalog version in a catalog update in order to decide whether a deletion should be applied to the local catalog cache or not. This works correctly under the assumption that all the changes that occurred in the catalog between an update's min and max catalog version are included in that update, i.e. no gaps or missing updates. With the upcoming fix for IMPALA-5058, that constraint will be relaxed, thus allowing for gaps in the catalog updates. To avoid breaking the existing behavior, this patch introduced the following changes: * Deletions in the catalog are explicitly recorded in a log with the catalog version in which they occurred. As before, added and deleted catalog objects are sent to the statestore. * Topic entries associated with deleted catalog objects have non-empty values (besided keys) that contain minimal object metadata including the catalog version. * Statestore is no longer using the existence or not of topic entry values in order to identify deleted topic entries. Deleted topic entries should be explicitly marked as such by the statestore subscribers that produce them. * Statestore subscribers now use the 'deleted' flag to determine if a topic entry corresponds to a deleted item. * Impalads use the deleted objects' catalog versions when updating the local catalog cache from a catalog update and not the update's maximum catalog version. Testing: - No new tests were added as these paths are already exercised by existing tests. - Run all core tests. Change-Id: I93cb7a033dc8f0d3e0339394b36affe14523274c --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler.cc M be/src/service/impala-server.cc M be/src/statestore/statestore.cc M be/src/statestore/statestore.h M common/thrift/CatalogInternalService.thrift M common/thrift/StatestoreService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M tests/statestore/test_statestore.py 16 files changed, 438 insertions(+), 331 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/7731/2 -- To view, visit http://gerrit.cloudera.org:8080/7731 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I93cb7a033dc8f0d3e0339394b36affe14523274c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis
[Impala-ASF-CR] [PREVIEW] IMPALA-5538: Use explicit catalog versions for deleted objects
Dimitris Tsirogiannis has posted comments on this change. Change subject: [PREVIEW] IMPALA-5538: Use explicit catalog versions for deleted objects .. Patch Set 1: (28 comments) http://gerrit.cloudera.org:8080/#/c/7731/1/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: Line 288: BuildTopicUpdates(catalog_objects.updated_objects, false); > Is there a requirement/assumption that the deletions must come after the up No, it doesn't matter. Line 300: void CatalogServer::BuildTopicUpdates(const vector& catalog_objects, > IMO, it is more readable if the signature of BuildTopicUpdates takes TGetAl These two "do something" blocks are almost identical. I refactored the code a bit to make it a bit less flaky. Line 333: << catalog_object.catalog_version; > indent Done http://gerrit.cloudera.org:8080/#/c/7731/1/be/src/catalog/catalog-server.h File be/src/catalog/catalog-server.h: Line 154: /// determine items that have been deleted, it saves the set of topic entry keys sent > Update comment Done Line 164: bool topic_deletions); > What does this param do? Updated the comment. http://gerrit.cloudera.org:8080/#/c/7731/1/be/src/scheduling/scheduler-test-util.cc File be/src/scheduling/scheduler-test-util.cc: Line 481: item.key = host.ip; > Use __set fn like above to be consistent Done http://gerrit.cloudera.org:8080/#/c/7731/1/be/src/scheduling/scheduler.cc File be/src/scheduling/scheduler.cc: Line 141: if (delta.is_delta && delta.topic_entries.empty()) { > single line Done http://gerrit.cloudera.org:8080/#/c/7731/1/be/src/service/impala-server.cc File be/src/service/impala-server.cc: Line 1323: TCatalogObject catalog_object; > Please move this right before DeserializeThriftMsg() so we can see where it Done Line 1343: << item.key << " is " > indentation Done Line 1367: LOG(INFO) << "Deleted item: " << item.key << " version: " << catalog_object.catalog_version; > That's a lot of logging, sure we need this at INFO level? Sorry, that was for debugging. Line 1398: if (existing_object.catalog_version <= catalog_object.catalog_version) { > Shouldn't this be < and not <=? Done PS1, Line 1406: batch_size_bytes > Looks like we didn't account this earlier for deletions. This logic makes m Done Line 1527: if (item.deleted) { > Is it possible that within a single list of topic_entries there is both an No, it's not possible to have both an addition and a deletion in the same list of topic_entries. http://gerrit.cloudera.org:8080/#/c/7731/1/be/src/statestore/statestore.cc File be/src/statestore/statestore.cc: Line 485: item.value.empty() ? Statestore::TopicEntry::NULL_VALUE : item.value, > Why not just 'item.value'? The value gets copied. Good point. Done Line 527: deleted_key_size_bytes -= itr->first.size(); > += ? Done http://gerrit.cloudera.org:8080/#/c/7731/1/common/thrift/StatestoreService.thrift File common/thrift/StatestoreService.thrift: Line 92 > - I kind of like this separation of updates and deletions in different list By switching to TTopicItem for topic deletions there is now significant overlap between operations performed on both added and deleted items. We could try to extract most of the common logic in separate functions and keep the existing logic in place, but for now it seems simple enough to have one loop over one list of items and separate the logic with an if statement when needed. If we had to add many of these if statements in multiple places, I'd agree that splitting the logic would have been better. Line 97: 5: required bool is_delta; > fix member numbering Done http://gerrit.cloudera.org:8080/#/c/7731/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java File fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java: Line 56: private Map removedCatalogObjectVersionsByKey_ = Maps.newHashMap(); > I might be wrong, but I think we can end up having the same object deleted Good catch, that was a bug indeed. Removed the second map and simplified the logic around handling the deleted objects. Line 70:* key 'key'. > update comment No longer applicable. Function is removed. Line 90: public synchronized void garbageCollect(long currentCatalogVersion) { > Although these methods are synchronized, the underlying maps aren't thread That's not correct. The essence of having synchronized functions is exactly that, to ensure proper synchronization even though the underlying collections are not thread safe. Two threads calling garbageCollect and addRemoved cannot both proceed, one will grab the lock on the CatalogDeltaLog object and make progress while the other will simply block. In any case, this class is simplified and it only contains one collection. Line 154: private String toCatalogObjectKey(TCatalogObject catalogObject) {
[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates
Alex Behm has posted comments on this change. Change subject: IMPALA-5881: Use native allocation while building catalog updates .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/7955/10/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: Line 151: if (serializedByteSize >= SERIALIZED_TOPIC_SIZE_WARNING_LIMIT_BYTES) { Dimitris and I discussed this warning. If we decide to issue a warning, then I think we should warn when close to the limit, e.g. at 3GB. The warning message should pretty-print the number of bytes and also print the hard limit. Also the message should be more assertive, e.g. "potentially" is vague and should be removed (we know the exact serialized size). For inspiration, let me propose this msg: Serialized size of catalog update approaching hard limit. Current size: 3GB. Hard limit: 4GB. Catalog updates that exceed the limit will be ignored and will result in stale metadata on coordinator nodes. -- To view, visit http://gerrit.cloudera.org:8080/7955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()
Lars Volker has posted comments on this change. Change subject: IMPALA-4513: Promote integer types for ABS() .. Patch Set 3: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5867: Fix bugs parsing 2-digit year
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5867: Fix bugs parsing 2-digit year .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7910 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia4f430caea88b6c33f8050a1984ee0ee32ecb0a1 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5867: Fix bugs parsing 2-digit year
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-5867: Fix bugs parsing 2-digit year .. IMPALA-5867: Fix bugs parsing 2-digit year This patch fixes several bugs parsing 1 or 2-digit year formats. Existing code is broken in several ways: 1. With 1 or 2-digit year format and month/day missing, ParseDateTime() throws an uncaught exception. 2. If now() is 02/29 in a leap year but (now() - 80 years) isn't, DateTimeFormatContext::SetCenturyBreak() throws an uncaught exception. 3. If the year parsed is 02/29 in a leap year but it isn't a leap year 100 years ago, TimestampParser::Parse() will consider the date as invalid though it isn't. This patch fixes above bugs and adds a few test cases in be/src/runtime/timestamp-test.cc The behaviors after change is: 1. A date without month or day is considered invalid. This is a pre-existing difference from Hive, which defaults missing month/day to 01/01. 2. Century break would be set to 02/28 80 years ago. 3. If parsed date is 00/02/29 but 1900/02/29 does not exist, treat it as 03/01 when comparing to century break. Change-Id: Ia4f430caea88b6c33f8050a1984ee0ee32ecb0a1 Reviewed-on: http://gerrit.cloudera.org:8080/7910 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins --- M be/src/runtime/timestamp-parse-util.cc M be/src/runtime/timestamp-parse-util.h M be/src/runtime/timestamp-test.cc 3 files changed, 104 insertions(+), 50 deletions(-) Approvals: Impala Public Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7910 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ia4f430caea88b6c33f8050a1984ee0ee32ecb0a1 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()
Zach Amsden has uploaded a new patch set (#3). Change subject: IMPALA-4513: Promote integer types for ABS() .. IMPALA-4513: Promote integer types for ABS() The internal representation of the most negative number in two's complement requires 1 more bit to represent the positive version. This means ABS() must promote integer types to the next highest width. Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77 --- M be/src/exprs/expr-test.cc M be/src/exprs/math-functions-ir.cc M be/src/exprs/math-functions.h M common/function-registry/impala_functions.py 4 files changed, 31 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8004/3 -- To view, visit http://gerrit.cloudera.org:8080/8004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()
Zach Amsden has posted comments on this change. Change subject: IMPALA-4513: Promote integer types for ABS() .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/8004/2/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: Line 4216: TestValue("abs(-128)", TYPE_SMALLINT, 128); > What does this do with abs(-100 - 27)? How about abs(-200 + 100)? The types may be promoted or not depending on operator precedence (I believe we have some weirdness with unary minus) but the end result is now the same: [localhost:21000] > select abs(-100 - 27); Query: select abs(-100 - 27) Query submitted at: 2017-09-07 19:30:10 (Coordinator: http://zach-pa:25000) Query progress can be monitored at: http://zach-pa:25000/query_plan?query_id=c42cf1a096e10f3:48ac2186 ++ | abs(-100 - 27) | ++ | 127| ++ Fetched 1 row(s) in 0.01s [localhost:21000] > select abs(-200 + 100); Query: select abs(-200 + 100) Query submitted at: 2017-09-07 19:30:22 (Coordinator: http://zach-pa:25000) Query progress can be monitored at: http://zach-pa:25000/query_plan?query_id=2548f7f5364aa478:70dc0e3e +-+ | abs(-200 + 100) | +-+ | 100 | +-+ Fetched 1 row(s) in 0.01s http://gerrit.cloudera.org:8080/#/c/8004/2/be/src/exprs/math-functions-ir.cc File be/src/exprs/math-functions-ir.cc: Line 65: ctx->AddWarning("Expression overflowed, returning NULL"); > Do we have a test for this case? Not sure it is possible to test the warning message without adding more machinery, but there is a test for the case. Line 73: ONE_ARG_MATH_FN(Abs, BigIntVal, IntVal, llabs); > Maybe mention the upcast in a comment here? Otherwise it seems easy to miss Done -- To view, visit http://gerrit.cloudera.org:8080/8004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()
Lars Volker has posted comments on this change. Change subject: IMPALA-4513: Promote integer types for ABS() .. Patch Set 2: (3 comments) Looks good to me, but I'm no expert in this part of the codebase. http://gerrit.cloudera.org:8080/#/c/8004/2/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: Line 4216: TestValue("abs(-128)", TYPE_SMALLINT, 128); What does this do with abs(-100 - 27)? How about abs(-200 + 100)? http://gerrit.cloudera.org:8080/#/c/8004/2/be/src/exprs/math-functions-ir.cc File be/src/exprs/math-functions-ir.cc: Line 65: ctx->AddWarning("Expression overflowed, returning NULL"); Do we have a test for this case? Line 73: ONE_ARG_MATH_FN(Abs, BigIntVal, IntVal, llabs); Maybe mention the upcast in a comment here? Otherwise it seems easy to miss. -- To view, visit http://gerrit.cloudera.org:8080/8004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5199: prevent hang on empty row batch exchange
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5199: prevent hang on empty row batch exchange .. Patch Set 2: That seems pretty rare but possible. In the common case where the limit is at the top level of the query, I think this would be fine, because the query would hit the limit on the coordinator, return successfully, and ignore any future errors. Those timeout errors would have to arrive after STREAM_EXPIRATION_TIME_MS (since if the closed stream was in the cache, we wouldn't report an error before that time). So I think in general the timeline would need to be: * Limit is hit at t1 * The eos message is received at t2 >= t1 + STREAM_EXPIRATION_TIME_MS and an error is sent to the coordinator * The query fails, but it would have succeeded at t3 > t2 if the error hadn't been received If the limit is not at the top level, it seems possible but unlikely that the error would cancel a query that was on the path to succeeding. I'm not sure if it changes the odds of queries succeeding enough to worry about, given that the cluster is already unhealthy if these timeouts are occurring. -- To view, visit http://gerrit.cloudera.org:8080/8005 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates
Bharath Vissapragada has uploaded a new patch set (#10). Change subject: IMPALA-5881: Use native allocation while building catalog updates .. IMPALA-5881: Use native allocation while building catalog updates This patch moves the allocation of thrift structures for serializing the output of GetAllCatalogObjects() call into the native side. This is done to prevent the Catalog from hitting JVM array limitations (2GB maximum size) at scale. Additionally this patch also extends the --compact_catalog_top=true to apply TCompactProtocol for the above serialization to reduce the in-memory footprint. This patch also caps the native allocations to not go beyond 4GB due to Thrift library limitations. (Thrift internally uses uint32_t datatype to represent the message size which limits the size to ~4.2GB). Testing: Passed ASAN + HDFS core and DEBUG + HDFS exhaustive. Deployed the patch on a 16 node cluster and tested it on a Catalog-update topic of 3.5GB (uncompressed) or 780MB (compressed). Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 --- M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M be/src/rpc/jni-thrift-util.h M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/service/JniCatalog.java A fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java A fe/src/main/java/org/apache/impala/thrift/TNativeSerializer.java A fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java 10 files changed, 485 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/7955/10 -- To view, visit http://gerrit.cloudera.org:8080/7955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-5881: Use native allocation while building catalog updates .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/7955/9/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java: PS9, Line 159: public void writeTo(OutputStream out) { : throw new IllegalStateException("Not implemented."); : } : : public byte toByteArray()[] { : throw new IllegalStateException("Not implemented."); : } : : public int size() { : throw new IllegalStateException("Not implemented."); : } > I get that, but you're not extending the ByteArrayOutputStream class, you'r Done -- To view, visit http://gerrit.cloudera.org:8080/7955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-5881: Use native allocation while building catalog updates .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/7955/9/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java: PS9, Line 159: public void writeTo(OutputStream out) { : throw new IllegalStateException("Not implemented."); : } : : public byte toByteArray()[] { : throw new IllegalStateException("Not implemented."); : } : : public int size() { : throw new IllegalStateException("Not implemented."); : } > These are from ByteArrayOutputStream. Given we mimic it, someone might expe I get that, but you're not extending the ByteArrayOutputStream class, you're extending OutputStream. So, either change that or remove anything that is not defined in the parent class. -- To view, visit http://gerrit.cloudera.org:8080/7955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()
Zach Amsden has posted comments on this change. Change subject: IMPALA-4513: Promote integer types for ABS() .. Patch Set 2: As per Greg's request, add overflow check for BIGINT. All other types just get promoted. -- To view, visit http://gerrit.cloudera.org:8080/8004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()
Zach Amsden has uploaded a new patch set (#2). Change subject: IMPALA-4513: Promote integer types for ABS() .. IMPALA-4513: Promote integer types for ABS() The internal representation of the most negative number in 2's complement requires 1 more bit to represent the positive version. This means ABS() must promote integer types to the next highest width. Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77 --- M be/src/exprs/expr-test.cc M be/src/exprs/math-functions-ir.cc M be/src/exprs/math-functions.h M common/function-registry/impala_functions.py 4 files changed, 26 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8004/2 -- To view, visit http://gerrit.cloudera.org:8080/8004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-5199: prevent hang on empty row batch exchange
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-5199: prevent hang on empty row batch exchange .. Patch Set 2: The fix looks good to me, and should work fine for most cases. One case I'm worried about is that, under heavy load, if a receiver deregisters itself early (due to a LIMIT, etc.), and the sender sends the 'eos' after STREAM_EXPIRATION_TIME_MS, the query would have previously succeeded, even though the sender couldn't find the receiver in CloseSender(). And the query would be right to succeed in this case since the 'eos' here is spurious as the receiver is already gone. Although, I know this counts as succeeding "by mistake". Now, we would be failing these queries. And given that we see the 'DATASTREAM_SENDER_TIMEOUT' fairly often, that would regress a few workloads. What do you think? -- To view, visit http://gerrit.cloudera.org:8080/8005 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-5881: Use native allocation while building catalog updates .. Patch Set 9: (14 comments) Redo'ing the test a little, will update the new PS shortly. http://gerrit.cloudera.org:8080/#/c/7955/8/common/thrift/JniCatalog.thrift File common/thrift/JniCatalog.thrift: PS8, Line 603: // Struct containing eight strings, used for testing. > Mention what are we testing with this weird looking struct. Also, leave a T Done http://gerrit.cloudera.org:8080/#/c/7955/8/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: PS8, Line 51: import org.apache.impala.thrift.TNativeByteBuffer; > dup (L49) Done http://gerrit.cloudera.org:8080/#/c/7955/9/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java: PS9, Line 67: . > nit: remove Done Line 87: private NativeByteArrayOutputStream(long initialSize) { > Merge into the public constructor to remove some code and never mention a c Done PS9, Line 94: reAllocateMemory > Unsafe.reallocateMemory() Done Line 99: Preconditions.checkState(bufferPtr_ >= 0); > needs to go above the try, otherwise we'll try to free an invalid ptr in L1 Done Line 104: throw new RuntimeException(BUFFER_LIMIT_EXCEEDED_MSG + ": " + len); > IllegalArgumentException? Done PS9, Line 159: public void writeTo(OutputStream out) { : throw new IllegalStateException("Not implemented."); : } : : public byte toByteArray()[] { : throw new IllegalStateException("Not implemented."); : } : : public int size() { : throw new IllegalStateException("Not implemented."); : } > These are not defined in OutputStream, so what is the point of defining the These are from ByteArrayOutputStream. Given we mimic it, someone might expect to call them. I'm fine with keeping/removing them. http://gerrit.cloudera.org:8080/#/c/7955/9/fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java File fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java: PS9, Line 38: @SuppressWarnings("restriction") > explain? Also add a brief description of what this class tests. IDE magic, removed PS9, Line 78: huge > "huge" doesn't really convey much information here. Maybe a bit more explic Done PS9, Line 85: char[] chars = new char[128 * 1024 * 1024]; : Arrays.fill(chars, 'a'); : String hugeString = new String(chars); > nit: see if you can use Guava's Strings.repeat() here, may be cheaper. Done PS9, Line 99: > nit: extra space Some kind of visual effect, don't think there are two spaces. PS9, Line 99: Create a huge string of size 512MB. The combined size of the test object : // crosses 4GB. > Maybe say something like "create an object with a serialization size which Done PS9, Line 105: never > ha, famous last words :) lol :D -- To view, visit http://gerrit.cloudera.org:8080/7955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4939, IMPALA-4939: Decimal V2 multiplication
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-4939, IMPALA-4939: Decimal V2 multiplication .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/7438/4//COMMIT_MSG Commit Message: Line 75: DECIMAL_V2 disabled: 2.65s > So we go down this path for multiplies that overflow, or are close to overf Running perf top showed that most of the time is spent in ScaleDownAndRound(). It looks like it's the division (of one int256 by another int256) that is taking the most time. I also noticed that disabling the codegen slows down the query in the BEFORE case by ~10x. So the performance difference before my patch and after is huge only when codegen is enabled. Otherwise, it's like a 3x difference. (Do you think it's most likely due to the scanner being much faster in the codegen case?) -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2107: [DOCS] Document base64*code() functions
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-2107: [DOCS] Document base64*code() functions .. IMPALA-2107: [DOCS] Document base64*code() functions base64decode() base64encode() Change-Id: I5251e368ad36756c19a7b97e5ef6f232f616189b Reviewed-on: http://gerrit.cloudera.org:8080/7963 Reviewed-by: Jim Apple Tested-by: Impala Public Jenkins --- M docs/impala_keydefs.ditamap M docs/shared/impala_common.xml M docs/topics/impala_string_functions.xml 3 files changed, 173 insertions(+), 0 deletions(-) Approvals: Impala Public Jenkins: Verified Jim Apple: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I5251e368ad36756c19a7b97e5ef6f232f616189b Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell
[Impala-ASF-CR] IMPALA-2107: [DOCS] Document base64*code() functions
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-2107: [DOCS] Document base64*code() functions .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5251e368ad36756c19a7b97e5ef6f232f616189b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5199: prevent hang on empty row batch exchange
Tim Armstrong has uploaded a new patch set (#2). Change subject: IMPALA-5199: prevent hang on empty row batch exchange .. IMPALA-5199: prevent hang on empty row batch exchange The error path where delivery of "eos" fails now behaves the same as if delivery of a row batch fails. Testing: Added a timeout test where the leaf fragments return 0 rows. Before the change this reproduced the hang. Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645 --- M be/src/runtime/data-stream-mgr.cc M testdata/workloads/functional-query/queries/QueryTest/exchange-delays.test 2 files changed, 23 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/8005/2 -- To view, visit http://gerrit.cloudera.org:8080/8005 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-5199: prevent hang on empty row batch exchange
Tim Armstrong has uploaded a new change for review. http://gerrit.cloudera.org:8080/8005 Change subject: IMPALA-5199: prevent hang on empty row batch exchange .. IMPALA-5199: prevent hang on empty row batch exchange The error path where delivery of "eos" fails now behaves the same as if delivery of a row batch fails. Testing: Added a timeout test where the leaf fragments return 0 rows. Before the change this reproduced the hang. Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645 --- M be/src/runtime/data-stream-mgr.cc M testdata/workloads/functional-query/queries/QueryTest/exchange-delays.test 2 files changed, 20 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/8005/1 -- To view, visit http://gerrit.cloudera.org:8080/8005 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-2107: [DOCS] Document base64*code() functions
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-2107: [DOCS] Document base64*code() functions .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-docs-submit/152/ -- To view, visit http://gerrit.cloudera.org:8080/7963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5251e368ad36756c19a7b97e5ef6f232f616189b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2107: [DOCS] Document base64*code() functions
Jim Apple has posted comments on this change. Change subject: IMPALA-2107: [DOCS] Document base64*code() functions .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5251e368ad36756c19a7b97e5ef6f232f616189b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5867: Fix bugs parsing 2-digit year
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5867: Fix bugs parsing 2-digit year .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7910 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia4f430caea88b6c33f8050a1984ee0ee32ecb0a1 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5867: Fix bugs parsing 2-digit year
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5867: Fix bugs parsing 2-digit year .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1208/ -- To view, visit http://gerrit.cloudera.org:8080/7910 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia4f430caea88b6c33f8050a1984ee0ee32ecb0a1 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] Bump Kudu version to a71ecfd
Impala Public Jenkins has submitted this change and it was merged. Change subject: Bump Kudu version to a71ecfd .. Bump Kudu version to a71ecfd Change-Id: Ie23d852f0d630f9484d8ae4f772af6bba13ea24f Reviewed-on: http://gerrit.cloudera.org:8080/8000 Reviewed-by: Thomas Tauber-Marshall Tested-by: Impala Public Jenkins --- M bin/impala-config.sh 1 file changed, 2 insertions(+), 2 deletions(-) Approvals: Impala Public Jenkins: Verified Thomas Tauber-Marshall: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/8000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ie23d852f0d630f9484d8ae4f772af6bba13ea24f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] Bump Kudu version to a71ecfd
Impala Public Jenkins has posted comments on this change. Change subject: Bump Kudu version to a71ecfd .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie23d852f0d630f9484d8ae4f772af6bba13ea24f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5867: Fix bugs parsing 2-digit year
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5867: Fix bugs parsing 2-digit year .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7910 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia4f430caea88b6c33f8050a1984ee0ee32ecb0a1 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()
Zach Amsden has posted comments on this change. Change subject: IMPALA-4513: Promote integer types for ABS() .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8004/1/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: Line 4219: TestValue("abs(-1 * cast(pow(2, 31) as int))", TYPE_BIGINT, 2147483648); Obviously there is going to be some debate about what to do about BIGINT types. Let's do that in the JIRA. -- To view, visit http://gerrit.cloudera.org:8080/8004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()
Zach Amsden has uploaded a new change for review. http://gerrit.cloudera.org:8080/8004 Change subject: IMPALA-4513: Promote integer types for ABS() .. IMPALA-4513: Promote integer types for ABS() The internal representation of the most negative number in 2's complement requires 1 more bit to represent the positive version. This means ABS() must promote integer types to the next highest width. Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77 --- M be/src/exprs/expr-test.cc M be/src/exprs/math-functions-ir.cc M be/src/exprs/math-functions.h M common/function-registry/impala_functions.py 4 files changed, 15 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8004/1 -- To view, visit http://gerrit.cloudera.org:8080/8004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden
[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates
Alex Behm has posted comments on this change. Change subject: IMPALA-5881: Use native allocation while building catalog updates .. Patch Set 9: (3 comments) Code looks good to me, still need to flesh out tests. http://gerrit.cloudera.org:8080/#/c/7955/9/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java: Line 87: private NativeByteArrayOutputStream(long initialSize) { Merge into the public constructor to remove some code and never mention a custom initial capacity (to keep things simple). Line 99: Preconditions.checkState(bufferPtr_ >= 0); needs to go above the try, otherwise we'll try to free an invalid ptr in L113 Line 104: throw new RuntimeException(BUFFER_LIMIT_EXCEEDED_MSG + ": " + len); IllegalArgumentException? -- To view, visit http://gerrit.cloudera.org:8080/7955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5425: Add test for validating input when setting query options
Tianyi Wang has uploaded a new patch set (#6). Change subject: IMPALA-5425: Add test for validating input when setting query options .. IMPALA-5425: Add test for validating input when setting query options This patch adds multiple query option validation testcases to be/src/service/query-options-test.cc The test cases include parsing edge cases, bondary values, special cases for some options and some testcases moved from testdata/workloads/functional-query/queries/QueryTest/set.test This patch also fixes a bug generating wrong error message for query option RUNTIME_FILTER_WAIT_TIME_MS. Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M testdata/workloads/functional-query/queries/QueryTest/set.test 3 files changed, 236 insertions(+), 150 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/7805/6 -- To view, visit http://gerrit.cloudera.org:8080/7805 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tianyi Wang
[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala
Csaba Ringhofer has posted comments on this change. Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala .. Patch Set 2: (11 comments) http://gerrit.cloudera.org:8080/#/c/7954/1//COMMIT_MSG Commit Message: PS1, Line 7: Impala > nit: Capitalize Impala (it's a name) Done PS1, Line 9: FromSubsecondUnixTime > We usually refer to functions as Function(). Done PS1, Line 11: 1400-01-01 00:00:00 > You could write dates in the more standard format (1400-01-01 00:00:00). Done PS1, Line 13: The maximum : case, -12-31 59:59:59 is a bit different, because as I understand, : with nanosecond precision posix times, the maximum value is actually : 1-01-01. 00:00:00 minus 1 nanosec. > Can you add tests with sub-second deltas around the upper bound, too? Done http://gerrit.cloudera.org:8080/#/c/7954/1/be/src/runtime/timestamp-test.cc File be/src/runtime/timestamp-test.cc: Line 637: // FromUnixTime functions incorrectly accepted the last second of 1399 as valid, > Please wrap lines at 90 characters. Please follow the surrounding code for Done PS1, Line 637: he last second of 1399 as valid, > Please outline in the comment briefly why the last second needs special att Done PS1, Line 637: ns i > Extra word? Done Line 640: MIN_DATE_AS_UNIX_TIME - 0.1).HasDate()); > While you are here, can you also add tests for the exact beginning of the l Done http://gerrit.cloudera.org:8080/#/c/7954/1/be/src/runtime/timestamp-value.h File be/src/runtime/timestamp-value.h: PS1, Line 80: s_special() && UNLI > Can you add a comment explaining why this check is necessary? Done Line 80: if(!date_.is_special() && UNLIKELY(!IsValidDate())){ > Please use spaces instead of tabs. Done http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h File be/src/runtime/timestamp-value.h: Line 84: time_ = boost::posix_time::time_duration(boost::posix_time::not_a_date_time); The first patch was incorrect - it did not set time_ to not_a_date_time, but 00:00:00 instead. This caused the last second of year 1399 to be inconsistent with other dates before year 1400. -- To view, visit http://gerrit.cloudera.org:8080/7954 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-5881: Use native allocation while building catalog updates .. Patch Set 9: (9 comments) http://gerrit.cloudera.org:8080/#/c/7955/9/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java: PS9, Line 67: . nit: remove PS9, Line 94: reAllocateMemory Unsafe.reallocateMemory() PS9, Line 159: public void writeTo(OutputStream out) { : throw new IllegalStateException("Not implemented."); : } : : public byte toByteArray()[] { : throw new IllegalStateException("Not implemented."); : } : : public int size() { : throw new IllegalStateException("Not implemented."); : } These are not defined in OutputStream, so what is the point of defining them here? Am I missing something? http://gerrit.cloudera.org:8080/#/c/7955/9/fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java File fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java: PS9, Line 38: @SuppressWarnings("restriction") explain? Also add a brief description of what this class tests. PS9, Line 78: huge "huge" doesn't really convey much information here. Maybe a bit more explicit what we're testing (>1GB, < 4GB)? Also, you may want to test multiple sizes to exercise all paths in the reallocation logic (e.g. 500MB, 1GB, 1.5GB, 2GB). You can decide which ones make more sense. PS9, Line 85: char[] chars = new char[128 * 1024 * 1024]; : Arrays.fill(chars, 'a'); : String hugeString = new String(chars); nit: see if you can use Guava's Strings.repeat() here, may be cheaper. PS9, Line 99: nit: extra space PS9, Line 99: Create a huge string of size 512MB. The combined size of the test object : // crosses 4GB. Maybe say something like "create an object with a serialization size which is larger than 4GB" PS9, Line 105: never ha, famous last words :) -- To view, visit http://gerrit.cloudera.org:8080/7955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Bump Kudu version to a71ecfd
Thomas Tauber-Marshall has posted comments on this change. Change subject: Bump Kudu version to a71ecfd .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie23d852f0d630f9484d8ae4f772af6bba13ea24f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] Bump Kudu version to a71ecfd
Impala Public Jenkins has posted comments on this change. Change subject: Bump Kudu version to a71ecfd .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1207/ -- To view, visit http://gerrit.cloudera.org:8080/8000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie23d852f0d630f9484d8ae4f772af6bba13ea24f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] Bump Kudu version to a71ecfd
Matthew Jacobs has uploaded a new change for review. http://gerrit.cloudera.org:8080/8000 Change subject: Bump Kudu version to a71ecfd .. Bump Kudu version to a71ecfd Change-Id: Ie23d852f0d630f9484d8ae4f772af6bba13ea24f --- M bin/impala-config.sh 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8000/1 -- To view, visit http://gerrit.cloudera.org:8080/8000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ie23d852f0d630f9484d8ae4f772af6bba13ea24f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs
[Impala-ASF-CR] IMPALA-2107: [DOCS] Document base64*code() functions
John Russell has posted comments on this change. Change subject: IMPALA-2107: [DOCS] Document base64*code() functions .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/7963/1/docs/shared/impala_common.xml File docs/shared/impala_common.xml: Line 802: MAX(), and MIN() than when > Are you sure about MAX and MIN? They might preserve ordering. I don't know. Done. That was a close call that it does appear to be correct. In a sample of about a dozen words and phrases, I found almost all were sorted the same way as the unencoded values, but the 'w' entries came before the 't' ones: | world | d29ybGQ= | | whirled | d2hpcmxlZA== | | whorled | d2hvcmxlZA== | | the rain in spain | dGhlIHJhaW4gaW4gc3BhaW4= | | the time has come | dGhlIHRpbWUgaGFzIGNvbWU= | PS1, Line 808: All argument values : supplied to base64decode() must also be a : multiple of 4 bytes in length. > I don't think this is right. The example you give below, for instance, is ' Right, the constraint (must be) is on base64decode argument, and the guarantee (is always) is on the base64encode return value. That was a finger fumble to say base64encode in both places. http://gerrit.cloudera.org:8080/#/c/7963/1/docs/topics/impala_string_functions.xml File docs/topics/impala_string_functions.xml: Line 88: > This was fixed in 2.6.0, according to the ticket, not 2.9.0 Done. Changed for both base64decode and base64encode tags. -- To view, visit http://gerrit.cloudera.org:8080/7963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5251e368ad36756c19a7b97e5ef6f232f616189b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2107: [DOCS] Document base64*code() functions
John Russell has uploaded a new patch set (#2). Change subject: IMPALA-2107: [DOCS] Document base64*code() functions .. IMPALA-2107: [DOCS] Document base64*code() functions base64decode() base64encode() Change-Id: I5251e368ad36756c19a7b97e5ef6f232f616189b --- M docs/impala_keydefs.ditamap M docs/shared/impala_common.xml M docs/topics/impala_string_functions.xml 3 files changed, 173 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/7963/2 -- To view, visit http://gerrit.cloudera.org:8080/7963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5251e368ad36756c19a7b97e5ef6f232f616189b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Jim Apple
[Impala-ASF-CR] [DOCS] Tighten up advice about first COMPUTE INCREMENTAL STATS
John Russell has uploaded a new change for review. http://gerrit.cloudera.org:8080/7999 Change subject: [DOCS] Tighten up advice about first COMPUTE INCREMENTAL STATS .. [DOCS] Tighten up advice about first COMPUTE INCREMENTAL STATS Explain how doing COMPUTE INCREMENTAL STATS for the first time starts over and discards any previous stats from COMPUTE STATS. As a consequence, moved some wording and examples into impala_common.xml so that content could be used in multiple places. Also made a new subtopic on the "Partitioning" page because I saw COMPUTE INCREMENTAL STATS wasn't mentioned there. Change-Id: Ia53a6518ce5541e5c9a2cd896856ce042a599b03 --- M docs/shared/impala_common.xml M docs/topics/impala_compute_stats.xml M docs/topics/impala_partitioning.xml M docs/topics/impala_perf_stats.xml 4 files changed, 157 insertions(+), 107 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/7999/1 -- To view, visit http://gerrit.cloudera.org:8080/7999 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ia53a6518ce5541e5c9a2cd896856ce042a599b03 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell
[Impala-ASF-CR] Tighten up advice about first COMPUTE INCREMENTAL STATS
John Russell has abandoned this change. Change subject: Tighten up advice about first COMPUTE INCREMENTAL STATS .. Abandoned Accidentally created this review based on master rather than a private branch. Will do some git fixup and then resubmit. -- To view, visit http://gerrit.cloudera.org:8080/7945 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: Id5c09edb90668b7b8b8ba7d63d0aefe42aee23b5 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Silvius Rus
[Impala-ASF-CR] [DOCS] Explain Boost setting needed for 96-bit timestamps
John Russell has posted comments on this change. Change subject: [DOCS] Explain Boost setting needed for 96-bit timestamps .. Patch Set 2: Patch set 2 is a no-op from a review perspective. I got rid of a commit that got into my local master branch by accident. -- To view, visit http://gerrit.cloudera.org:8080/7983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4b67cd7762f682c3a054e0d9641080aa51801c83 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: John Russell Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-5881: Use native allocation while building catalog updates .. Patch Set 8: (2 comments) Flushing minor comments from patch #8 and switching to patch #9. http://gerrit.cloudera.org:8080/#/c/7955/8/common/thrift/JniCatalog.thrift File common/thrift/JniCatalog.thrift: PS8, Line 603: // Struct containing eight strings, used for testing. Mention what are we testing with this weird looking struct. Also, leave a TODO to move this out of here. http://gerrit.cloudera.org:8080/#/c/7955/8/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: PS8, Line 51: import org.apache.impala.thrift.TNativeByteBuffer; dup (L49) -- To view, visit http://gerrit.cloudera.org:8080/7955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates
Bharath Vissapragada has uploaded a new patch set (#9). Change subject: IMPALA-5881: Use native allocation while building catalog updates .. IMPALA-5881: Use native allocation while building catalog updates This patch moves the allocation of thrift structures for serializing the output of GetAllCatalogObjects() call into the native side. This is done to prevent the Catalog from hitting JVM array limitations (2GB maximum size) at scale. Additionally this patch also extends the --compact_catalog_top=true to apply TCompactProtocol for the above serialization to reduce the in-memory footprint. This patch also caps the native allocations to not go beyond 4GB due to Thrift library limitations. (Thrift internally uses uint32_t datatype to represent the message size which limits the size to ~4.2GB). Testing: Passed ASAN + HDFS core and DEBUG + HDFS exhaustive. Deployed the patch on a 16 node cluster and tested it on a Catalog-update topic of 3.5GB (uncompressed) or 780MB (compressed). Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 --- M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M be/src/rpc/jni-thrift-util.h M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/service/JniCatalog.java A fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java A fe/src/main/java/org/apache/impala/thrift/TNativeSerializer.java A fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java 10 files changed, 498 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/7955/9 -- To view, visit http://gerrit.cloudera.org:8080/7955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-5881: Use native allocation while building catalog updates .. Patch Set 8: (8 comments) http://gerrit.cloudera.org:8080/#/c/7955/8/be/src/catalog/catalog.cc File be/src/catalog/catalog.cc: Line 116: DCHECK(len < std::numeric_limits::max()); > DCHECK_LE Done Line 170: LOG(ERROR) << "Failed to free buffers. Potential memory leak of: " > Failed to free native buffer through JNI. Leaked N bytes. Done http://gerrit.cloudera.org:8080/#/c/7955/8/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: Line 295: TException { > Does this really throw TException? good catch, no! http://gerrit.cloudera.org:8080/#/c/7955/8/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java: Line 94: private void tryAllocate(long len, boolean reAllocate) { > Can we simplify this to just always use reallocateMemory()? Realloc from a Good point, checked the JVM sources, it already includes the check to call malloc/realloc depending on whether we pass 0. http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/87ee5ee27509/src/share/vm/prims/unsafe.cpp#l613 Line 95: try { > Preconditions.checkState(bufferPtr_ >= 0); Done Line 113: } catch (Exception e) { > catch Throwable for extra paranoia Done Line 139: if (newBufferSize >= BUFFER_MAX_SIZE_LIMIT) { > || newBufferSize < 0 (overflow) Don't think thats possible given we cap at BUFFER_MAX_SIZE_LIMIT which is <<< long_max. I'm still including it to be on the safe side. Line 155: public synchronized void reset() {} > What's this for? Is it an @Override? Not an ovveride, removed, this was actually being called from serializer, but does nothing. Ideally useful to reset the state of NBOAS, but given we don't allow calling serialize twice, I don't think that is necessary. -- To view, visit http://gerrit.cloudera.org:8080/7955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates
Alex Behm has posted comments on this change. Change subject: IMPALA-5881: Use native allocation while building catalog updates .. Patch Set 8: (8 comments) http://gerrit.cloudera.org:8080/#/c/7955/8/be/src/catalog/catalog.cc File be/src/catalog/catalog.cc: Line 116: DCHECK(len < std::numeric_limits::max()); DCHECK_LE #include Line 170: LOG(ERROR) << "Failed to free buffers. Potential memory leak of: " Failed to free native buffer through JNI. Leaked N bytes. (We definitely leaked at this point). Also include the status error message. http://gerrit.cloudera.org:8080/#/c/7955/8/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: Line 295: TException { Does this really throw TException? http://gerrit.cloudera.org:8080/#/c/7955/8/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java: Line 94: private void tryAllocate(long len, boolean reAllocate) { Can we simplify this to just always use reallocateMemory()? Realloc from a 0 ptr should be ok I think. Line 95: try { Preconditions.checkState(bufferPtr_ >= 0); for clarity why the free in L114 is always ok Line 113: } catch (Exception e) { catch Throwable for extra paranoia Line 139: if (newBufferSize >= BUFFER_MAX_SIZE_LIMIT) { || newBufferSize < 0 (overflow) Line 155: public synchronized void reset() {} What's this for? Is it an @Override? -- To view, visit http://gerrit.cloudera.org:8080/7955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala
Csaba Ringhofer has uploaded a new patch set (#2). Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala .. IMPALA-5664: Unix time to timestamp conversions may crash Impala TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros() are incorrect only in case of the last second of 1399, because these sub-second values are rounded first towards 1400-01-01 00:00:00, which is accepted as a valid date, and the sub-second part is subtracted afterwards, leading to a date outside the valid interval. The maximum case, -12-31 59:59:59 is a bit different, because as I understand, with nanosecond precision posix times, the maximum value is actually 1-01-01. 00:00:00 minus 1 nanosec. TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues both <1400 and 1<=. These timestamps can cause problems, because most code assumes that if HasDate/HasTime is true, then it really is a valid timestamp. To fix this, the posix times are checked in the constructor of TimestampValue, and if it is outside the valid interval,both time_ and date_ are set to not_a_date_time. Test: select cast(-17987443200-0.1 as timestamp); This query no longer crashes, but returns NULL, similarly to other < 1400 timestamps. Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff --- M be/src/runtime/timestamp-test.cc M be/src/runtime/timestamp-value.h 2 files changed, 34 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/2 -- To view, visit http://gerrit.cloudera.org:8080/7954 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates
Bharath Vissapragada has uploaded a new patch set (#8). Change subject: IMPALA-5881: Use native allocation while building catalog updates .. IMPALA-5881: Use native allocation while building catalog updates This patch moves the allocation of thrift structures for serializing the output of GetAllCatalogObjects() call into the native side. This is done to prevent the Catalog from hitting JVM array limitations (2GB maximum size) at scale. Additionally this patch also extends the --compact_catalog_top=true to apply TCompactProtocol for the above serialization to reduce the in-memory footprint. This patch also caps the native allocations to not go beyond 4GB due to Thrift library limitations. (Thrift internally uses uint32_t datatype to represent the message size which limits the size to ~4.2GB). Testing: Passed ASAN + HDFS core and DEBUG + HDFS exhaustive. Deployed the patch on a 16 node cluster and tested it on a Catalog-update topic of 3.5GB (uncompressed) or 780MB (compressed). Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 --- M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M be/src/rpc/jni-thrift-util.h M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/service/JniCatalog.java A fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java A fe/src/main/java/org/apache/impala/thrift/TNativeSerializer.java A fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java 10 files changed, 502 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/7955/8 -- To view, visit http://gerrit.cloudera.org:8080/7955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-5881: Use native allocation while building catalog updates .. Patch Set 7: (13 comments) Addressed comments other than the test file. http://gerrit.cloudera.org:8080/#/c/7955/7/be/src/catalog/catalog.cc File be/src/catalog/catalog.cc: Line 114: uint32_t len = static_cast(nbuffer.buffer_len); > DCHECK that the buffer_len is smaller than max uint32_t to make it clear th Done Line 119: JniUtil::CallJniMethod(catalog_, free_native_byte_buffer_id_, nbuffer); > Let's check the status of this call and if non-ok emit LOG(ERROR) that n by Done http://gerrit.cloudera.org:8080/#/c/7955/7/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: Line 298: if (buffer.buffer_ptr <= 0) return; > IIRC you said 0 should work. Convert to Preconditions check for < 0. Done http://gerrit.cloudera.org:8080/#/c/7955/7/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java: Line 26: * allocations happen on the native side and this class tracks the > all allocations use native memory and this class tracks the corresponding p Done Line 37: * BUFFER_RESIZE_INCREMENTS; > end sentence with "." (not ";") Done Line 77: tryAllocate(initialSize, false); > For simplicity, let's only allow the default initial size and not a custom Done Line 87: throw new IllegalArgumentException("Current size: " + bufferLen_ + > Isn't this a potential memory leak? Who frees the native memory? Redid the logic a little. Should cover this. Line 101: UnsafeUtil.UNSAFE.freeMemory(bufferPtr_); > This doesn't make sense because bufferPtr_ is already 0. You need to use a Ouch, good catch. Line 103: "Failed to (re)allocateMemory() " + len + " bytes"); > also print the value of reAllocate Done Line 108:* Write a byte array 'b' of length 'len at offset 'offset'. Resizes the buffer > Not really accurate. Suggest: Done Line 115: throw new IndexOutOfBoundsException( > Isn't this a potential memory leak? Who frees the native memory? Done Line 122: newBufferSize += BUFFER_RESIZE_INCREMENTS; > Infinite loop if overflow? Same question for L124. Done http://gerrit.cloudera.org:8080/#/c/7955/7/fe/src/main/java/org/apache/impala/thrift/TNativeSerializer.java File fe/src/main/java/org/apache/impala/thrift/TNativeSerializer.java: Line 62:* Currently that restriction is not enforced by this class and is the responsibility > Seems easy to enforce this with a flag that gets checked at the beginning o Done -- To view, visit http://gerrit.cloudera.org:8080/7955 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes