[Impala-ASF-CR] IMPALA-5538: Use explicit catalog versions for deleted objects

2017-09-07 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has uploaded a new patch set (#2).

Change subject: IMPALA-5538: Use explicit catalog versions for deleted objects
..

IMPALA-5538: Use explicit catalog versions for deleted objects

This commit changes the way deletions are handled in the catalog and
disseminated to the impalad nodes through the statestore. Previously,
deletions of catalog objects were not explicitly annotated with the
catalog version in which the deletion occured and the impalads were
using the max catalog version in a catalog update in order to decide
whether a deletion should be applied to the local catalog cache or not.
This works correctly under the assumption that
all the changes that occurred in the catalog between an update's min
and max catalog version are included in that update, i.e. no gaps or
missing updates. With the upcoming fix for IMPALA-5058, that constraint
will be relaxed, thus allowing for gaps in the catalog updates.

To avoid breaking the existing behavior, this patch introduced the
following changes:
* Deletions in the catalog are explicitly recorded in a log with
the catalog version in which they occurred. As before, added and deleted
catalog objects are sent to the statestore.
* Topic entries associated with deleted catalog objects have non-empty
values (besided keys) that contain minimal object metadata including the
catalog version.
* Statestore is no longer using the existence or not of
topic entry values in order to identify deleted topic entries. Deleted
topic entries should be explicitly marked as such by the statestore
subscribers that produce them.
* Statestore subscribers now use the 'deleted' flag to determine if a
topic entry corresponds to a deleted item.
* Impalads use the deleted objects' catalog versions when updating the
local catalog cache from a catalog update and not the update's maximum
catalog version.

Testing:
- No new tests were added as these paths are already exercised by
existing tests.
- Run all core tests.

Change-Id: I93cb7a033dc8f0d3e0339394b36affe14523274c
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler.cc
M be/src/service/impala-server.cc
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M common/thrift/CatalogInternalService.thrift
M common/thrift/StatestoreService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/statestore/test_statestore.py
16 files changed, 438 insertions(+), 331 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/7731/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7731
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I93cb7a033dc8f0d3e0339394b36affe14523274c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 


[Impala-ASF-CR] [PREVIEW] IMPALA-5538: Use explicit catalog versions for deleted objects

2017-09-07 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: [PREVIEW] IMPALA-5538: Use explicit catalog versions for 
deleted objects
..


Patch Set 1:

(28 comments)

http://gerrit.cloudera.org:8080/#/c/7731/1/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

Line 288: BuildTopicUpdates(catalog_objects.updated_objects, false);
> Is there a requirement/assumption that the deletions must come after the up
No, it doesn't matter.


Line 300: void CatalogServer::BuildTopicUpdates(const vector& 
catalog_objects,
> IMO, it is more readable if the signature of BuildTopicUpdates takes TGetAl
These two "do something" blocks are almost identical. I refactored the code a 
bit to make it a bit less flaky.


Line 333: << catalog_object.catalog_version;
> indent
Done


http://gerrit.cloudera.org:8080/#/c/7731/1/be/src/catalog/catalog-server.h
File be/src/catalog/catalog-server.h:

Line 154:   /// determine items that have been deleted, it saves the set of 
topic entry keys sent
> Update comment
Done


Line 164:   bool topic_deletions);
> What does this param do?
Updated the comment.


http://gerrit.cloudera.org:8080/#/c/7731/1/be/src/scheduling/scheduler-test-util.cc
File be/src/scheduling/scheduler-test-util.cc:

Line 481:   item.key = host.ip;
> Use __set fn like above to be consistent
Done


http://gerrit.cloudera.org:8080/#/c/7731/1/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

Line 141:   if (delta.is_delta && delta.topic_entries.empty()) {
> single line
Done


http://gerrit.cloudera.org:8080/#/c/7731/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

Line 1323:   TCatalogObject catalog_object;
> Please move this right before DeserializeThriftMsg() so we can see where it
Done


Line 1343:  << item.key << " is "
> indentation
Done


Line 1367: LOG(INFO) << "Deleted item: " << item.key << " version: " << 
catalog_object.catalog_version;
> That's a lot of logging, sure we need this at INFO level?
Sorry, that was for debugging.


Line 1398: if (existing_object.catalog_version <= 
catalog_object.catalog_version) {
> Shouldn't this be < and not <=?
Done


PS1, Line 1406: batch_size_bytes
> Looks like we didn't account this earlier for deletions. This logic makes m
Done


Line 1527:   if (item.deleted) {
> Is it possible that within a single list of topic_entries there is both an 
No, it's not possible to have both an addition and a deletion in the same list 
of topic_entries.


http://gerrit.cloudera.org:8080/#/c/7731/1/be/src/statestore/statestore.cc
File be/src/statestore/statestore.cc:

Line 485: item.value.empty() ? Statestore::TopicEntry::NULL_VALUE : 
item.value,
> Why not just 'item.value'? The value gets copied.
Good point. Done


Line 527:   deleted_key_size_bytes -= itr->first.size();
> += ?
Done


http://gerrit.cloudera.org:8080/#/c/7731/1/common/thrift/StatestoreService.thrift
File common/thrift/StatestoreService.thrift:

Line 92
> - I kind of like this separation of updates and deletions in different list
By switching to TTopicItem for topic deletions there is now significant overlap 
between operations performed on both added and deleted items. We could try to 
extract most of the common logic in separate functions and keep the existing 
logic in place, but for now it seems simple enough to have one loop over one 
list of items and separate the logic with an if statement when needed. If we 
had to add many of these if statements in multiple places, I'd agree that 
splitting the logic would have been better.


Line 97:   5: required bool is_delta;
> fix member numbering
Done


http://gerrit.cloudera.org:8080/#/c/7731/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java:

Line 56:   private Map removedCatalogObjectVersionsByKey_ = 
Maps.newHashMap();
> I might be wrong, but I think we can end up having the same object deleted 
Good catch, that was a bug indeed. Removed the second map and simplified the 
logic around handling the deleted objects.


Line 70:* key 'key'.
> update comment
No longer applicable. Function is removed.


Line 90:   public synchronized void garbageCollect(long currentCatalogVersion) {
> Although these methods are synchronized, the underlying maps aren't thread 
That's not correct. The essence of having synchronized functions is exactly 
that, to ensure proper synchronization even though the underlying collections 
are not thread safe. Two threads calling garbageCollect and addRemoved cannot 
both proceed, one will grab the lock on the CatalogDeltaLog object and make 
progress while the other will simply block. In any case, this class is 
simplified and it only contains one collection.


Line 154:   private String toCatalogObjectKey(TCatalogObject catalogObject) {

[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-07 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7955/10/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

Line 151: if (serializedByteSize >= 
SERIALIZED_TOPIC_SIZE_WARNING_LIMIT_BYTES) {
Dimitris and I discussed this warning. If we decide to issue a warning, then I 
think we should warn when close to the limit, e.g. at 3GB. The warning message 
should pretty-print the number of bytes and also print the hard limit. Also the 
message should be more assertive, e.g. "potentially" is vague and should be 
removed (we know the exact serialized size).

For inspiration, let me propose this msg:

Serialized size of catalog update approaching hard limit. Current size: 3GB. 
Hard limit: 4GB. Catalog updates that exceed the limit will be ignored and will 
result in stale metadata on coordinator nodes.


-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-07 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4513: Promote integer types for ABS()
..


Patch Set 3: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5867: Fix bugs parsing 2-digit year

2017-09-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5867: Fix bugs parsing 2-digit year
..


Patch Set 6: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7910
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4f430caea88b6c33f8050a1984ee0ee32ecb0a1
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5867: Fix bugs parsing 2-digit year

2017-09-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5867: Fix bugs parsing 2-digit year
..


IMPALA-5867: Fix bugs parsing 2-digit year

This patch fixes several bugs parsing 1 or 2-digit year formats.
Existing code is broken in several ways:
1. With 1 or 2-digit year format and month/day missing, ParseDateTime()
   throws an uncaught exception.
2. If now() is 02/29 in a leap year but (now() - 80 years) isn't,
   DateTimeFormatContext::SetCenturyBreak() throws an uncaught
   exception.
3. If the year parsed is 02/29 in a leap year but it isn't a leap year
   100 years ago, TimestampParser::Parse() will consider the date as
   invalid though it isn't.
This patch fixes above bugs and adds a few test cases in
be/src/runtime/timestamp-test.cc
The behaviors after change is:
1. A date without month or day is considered invalid. This is a
   pre-existing difference from Hive, which defaults missing month/day
   to 01/01.
2. Century break would be set to 02/28 80 years ago.
3. If parsed date is 00/02/29 but 1900/02/29 does not exist, treat
   it as 03/01 when comparing to century break.

Change-Id: Ia4f430caea88b6c33f8050a1984ee0ee32ecb0a1
Reviewed-on: http://gerrit.cloudera.org:8080/7910
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
M be/src/runtime/timestamp-test.cc
3 files changed, 104 insertions(+), 50 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7910
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ia4f430caea88b6c33f8050a1984ee0ee32ecb0a1
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-07 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#3).

Change subject: IMPALA-4513: Promote integer types for ABS()
..

IMPALA-4513: Promote integer types for ABS()

The internal representation of the most negative number
in two's complement requires 1 more bit to represent the
positive version.  This means ABS() must promote integer
types to the next highest width.

Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
4 files changed, 31 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8004/3
-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-07 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4513: Promote integer types for ABS()
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8004/2/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 4216:   TestValue("abs(-128)", TYPE_SMALLINT, 128);
> What does this do with abs(-100 - 27)? How about abs(-200 + 100)?
The types may be promoted or not depending on operator precedence (I believe we 
have some weirdness with unary minus) but the end result is now the same:

[localhost:21000] > select abs(-100 - 27);
Query: select abs(-100 - 27)
Query submitted at: 2017-09-07 19:30:10 (Coordinator: http://zach-pa:25000)
Query progress can be monitored at: 
http://zach-pa:25000/query_plan?query_id=c42cf1a096e10f3:48ac2186
++
| abs(-100 - 27) |
++
| 127|
++
Fetched 1 row(s) in 0.01s
[localhost:21000] > select abs(-200 + 100);
Query: select abs(-200 + 100)
Query submitted at: 2017-09-07 19:30:22 (Coordinator: http://zach-pa:25000)
Query progress can be monitored at: 
http://zach-pa:25000/query_plan?query_id=2548f7f5364aa478:70dc0e3e
+-+
| abs(-200 + 100) |
+-+
| 100 |
+-+
Fetched 1 row(s) in 0.01s


http://gerrit.cloudera.org:8080/#/c/8004/2/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 65:   ctx->AddWarning("Expression overflowed, returning NULL");
> Do we have a test for this case?
Not sure it is possible to test the warning message without adding more 
machinery, but there is a test for the case.


Line 73: ONE_ARG_MATH_FN(Abs, BigIntVal, IntVal, llabs);
> Maybe mention the upcast in a comment here? Otherwise it seems easy to miss
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-07 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4513: Promote integer types for ABS()
..


Patch Set 2:

(3 comments)

Looks good to me, but I'm no expert in this part of the codebase.

http://gerrit.cloudera.org:8080/#/c/8004/2/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 4216:   TestValue("abs(-128)", TYPE_SMALLINT, 128);
What does this do with abs(-100 - 27)? How about abs(-200 + 100)?


http://gerrit.cloudera.org:8080/#/c/8004/2/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 65:   ctx->AddWarning("Expression overflowed, returning NULL");
Do we have a test for this case?


Line 73: ONE_ARG_MATH_FN(Abs, BigIntVal, IntVal, llabs);
Maybe mention the upcast in a comment here? Otherwise it seems easy to miss.


-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5199: prevent hang on empty row batch exchange

2017-09-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5199: prevent hang on empty row batch exchange
..


Patch Set 2:

That seems pretty rare but possible.

In the common case where the limit is at the top level of the query, I think 
this would be fine, because the query would hit the limit on the coordinator, 
return successfully, and ignore any future errors. Those timeout errors would 
have to arrive after STREAM_EXPIRATION_TIME_MS (since if the closed stream was 
in the cache, we wouldn't report an error before that time). 

So I think in general the timeline would need to be:
* Limit is hit at t1
* The eos message is received at t2 >= t1 + STREAM_EXPIRATION_TIME_MS and an 
error is sent to the coordinator
* The query fails, but it would have succeeded at t3 > t2 if the error hadn't 
been received

If the limit is not at the top level, it seems possible but unlikely that the 
error would cancel a query that was on the path to succeeding. I'm not sure if 
it changes the odds of queries succeeding enough to worry about, given that the 
cluster is already unhealthy if these timeouts are occurring.

-- 
To view, visit http://gerrit.cloudera.org:8080/8005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-07 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded a new patch set (#10).

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..

IMPALA-5881: Use native allocation while building catalog updates

This patch moves the allocation of thrift structures for serializing
the output of GetAllCatalogObjects() call into the native side. This
is done to prevent the Catalog from hitting JVM array limitations
(2GB maximum size) at scale. Additionally this patch also extends the
--compact_catalog_top=true to apply TCompactProtocol for the above
serialization to reduce the in-memory footprint.

This patch also caps the native allocations to not go beyond 4GB due
to Thrift library limitations. (Thrift internally uses uint32_t
datatype to represent the message size which limits the size to
~4.2GB).

Testing: Passed ASAN + HDFS core and DEBUG + HDFS exhaustive.

Deployed the patch on a 16 node cluster and tested it on a
Catalog-update topic of 3.5GB (uncompressed) or 780MB (compressed).

Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
---
M be/src/catalog/catalog.cc
M be/src/catalog/catalog.h
M be/src/rpc/jni-thrift-util.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
A fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java
A fe/src/main/java/org/apache/impala/thrift/TNativeSerializer.java
A fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java
10 files changed, 485 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/7955/10
-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-07 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7955/9/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java
File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java:

PS9, Line 159: public void writeTo(OutputStream out) {
 : throw new IllegalStateException("Not implemented.");
 :   }
 : 
 :   public byte toByteArray()[] {
 : throw new IllegalStateException("Not implemented.");
 :   }
 : 
 :   public int size() {
 : throw new IllegalStateException("Not implemented.");
 :   }
> I get that, but you're not extending the ByteArrayOutputStream class, you'r
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-07 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7955/9/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java
File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java:

PS9, Line 159: public void writeTo(OutputStream out) {
 : throw new IllegalStateException("Not implemented.");
 :   }
 : 
 :   public byte toByteArray()[] {
 : throw new IllegalStateException("Not implemented.");
 :   }
 : 
 :   public int size() {
 : throw new IllegalStateException("Not implemented.");
 :   }
> These are from ByteArrayOutputStream. Given we mimic it, someone might expe
I get that, but you're not extending the ByteArrayOutputStream class, you're 
extending OutputStream. So, either change that or remove anything that is not 
defined in the parent class.


-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-07 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4513: Promote integer types for ABS()
..


Patch Set 2:

As per Greg's request, add overflow check for BIGINT.  All other types just get 
promoted.

-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-07 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#2).

Change subject: IMPALA-4513: Promote integer types for ABS()
..

IMPALA-4513: Promote integer types for ABS()

The internal representation of the most negative number
in 2's complement requires 1 more bit to represent the
positive version.  This means ABS() must promote integer
types to the next highest width.

Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
4 files changed, 26 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8004/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5199: prevent hang on empty row batch exchange

2017-09-07 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5199: prevent hang on empty row batch exchange
..


Patch Set 2:

The fix looks good to me, and should work fine for most cases.

One case I'm worried about is that, under heavy load, if a receiver deregisters 
itself early (due to a LIMIT, etc.), and the sender sends the 'eos' after 
STREAM_EXPIRATION_TIME_MS, the query would have previously succeeded, even 
though the sender couldn't find the receiver in CloseSender(). And the query 
would be right to succeed in this case since the 'eos' here is spurious as the 
receiver is already gone. Although, I know this counts as succeeding "by 
mistake".

Now, we would be failing these queries. And given that we see the 
'DATASTREAM_SENDER_TIMEOUT' fairly often, that would regress a few workloads.

What do you think?

-- 
To view, visit http://gerrit.cloudera.org:8080/8005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-07 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..


Patch Set 9:

(14 comments)

Redo'ing the test a little, will update the new PS shortly.

http://gerrit.cloudera.org:8080/#/c/7955/8/common/thrift/JniCatalog.thrift
File common/thrift/JniCatalog.thrift:

PS8, Line 603: // Struct containing eight strings, used for testing.
> Mention what are we testing with this weird looking struct. Also, leave a T
Done


http://gerrit.cloudera.org:8080/#/c/7955/8/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

PS8, Line 51: import org.apache.impala.thrift.TNativeByteBuffer;
> dup (L49)
Done


http://gerrit.cloudera.org:8080/#/c/7955/9/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java
File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java:

PS9, Line 67: .
> nit: remove
Done


Line 87:   private NativeByteArrayOutputStream(long initialSize) {
> Merge into the public constructor to remove some code and never mention a c
Done


PS9, Line 94: reAllocateMemory
> Unsafe.reallocateMemory()
Done


Line 99:   Preconditions.checkState(bufferPtr_ >= 0);
> needs to go above the try, otherwise we'll try to free an invalid ptr in L1
Done


Line 104: throw new RuntimeException(BUFFER_LIMIT_EXCEEDED_MSG + ": " + 
len);
> IllegalArgumentException?
Done


PS9, Line 159: public void writeTo(OutputStream out) {
 : throw new IllegalStateException("Not implemented.");
 :   }
 : 
 :   public byte toByteArray()[] {
 : throw new IllegalStateException("Not implemented.");
 :   }
 : 
 :   public int size() {
 : throw new IllegalStateException("Not implemented.");
 :   }
> These are not defined in OutputStream, so what is the point of defining the
These are from ByteArrayOutputStream. Given we mimic it, someone might expect 
to call them. I'm fine with keeping/removing them.


http://gerrit.cloudera.org:8080/#/c/7955/9/fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java
File fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java:

PS9, Line 38: @SuppressWarnings("restriction")
> explain? Also add a brief description of what this class tests.
IDE magic, removed


PS9, Line 78: huge
> "huge" doesn't really convey much information here. Maybe a bit more explic
Done


PS9, Line 85: char[] chars = new char[128 * 1024 * 1024];
: Arrays.fill(chars, 'a');
: String hugeString = new String(chars);
> nit: see if you can use Guava's Strings.repeat() here, may be cheaper.
Done


PS9, Line 99:   
> nit: extra space
Some kind of visual effect, don't think there are two spaces.


PS9, Line 99: Create a huge string of size 512MB. The combined size of the  
test object
: // crosses 4GB.
> Maybe say something like "create an object with a serialization size which 
Done


PS9, Line 105: never
> ha, famous last words :)
lol :D


-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4939: Decimal V2 multiplication

2017-09-07 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4939, IMPALA-4939: Decimal V2 multiplication
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7438/4//COMMIT_MSG
Commit Message:

Line 75: DECIMAL_V2 disabled: 2.65s
> So we go down this path for multiplies that overflow, or are close to overf
Running perf top showed that most of the time is spent in 
ScaleDownAndRound(). It looks like it's the division (of 
one int256 by another int256) that is taking the most time.

I also noticed that disabling the codegen slows down the query in the BEFORE 
case by ~10x. So the performance difference before my patch and after is huge 
only when codegen is enabled. Otherwise, it's like a 3x difference. (Do you 
think it's most likely due to the scanner being much faster in the codegen 
case?)


-- 
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2107: [DOCS] Document base64*code() functions

2017-09-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-2107: [DOCS] Document base64*code() functions
..


IMPALA-2107: [DOCS] Document base64*code() functions

base64decode()
base64encode()

Change-Id: I5251e368ad36756c19a7b97e5ef6f232f616189b
Reviewed-on: http://gerrit.cloudera.org:8080/7963
Reviewed-by: Jim Apple 
Tested-by: Impala Public Jenkins
---
M docs/impala_keydefs.ditamap
M docs/shared/impala_common.xml
M docs/topics/impala_string_functions.xml
3 files changed, 173 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Jim Apple: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I5251e368ad36756c19a7b97e5ef6f232f616189b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: John Russell 


[Impala-ASF-CR] IMPALA-2107: [DOCS] Document base64*code() functions

2017-09-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-2107: [DOCS] Document base64*code() functions
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5251e368ad36756c19a7b97e5ef6f232f616189b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: John Russell 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5199: prevent hang on empty row batch exchange

2017-09-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-5199: prevent hang on empty row batch exchange
..

IMPALA-5199: prevent hang on empty row batch exchange

The error path where delivery of "eos" fails now behaves
the same as if delivery of a row batch fails.

Testing:
Added a timeout test where the leaf fragments return 0 rows. Before
the change this reproduced the hang.

Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645
---
M be/src/runtime/data-stream-mgr.cc
M testdata/workloads/functional-query/queries/QueryTest/exchange-delays.test
2 files changed, 23 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/8005/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5199: prevent hang on empty row batch exchange

2017-09-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/8005

Change subject: IMPALA-5199: prevent hang on empty row batch exchange
..

IMPALA-5199: prevent hang on empty row batch exchange

The error path where delivery of "eos" fails now behaves
the same as if delivery of a row batch fails.

Testing:
Added a timeout test where the leaf fragments return 0 rows. Before
the change this reproduced the hang.

Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645
---
M be/src/runtime/data-stream-mgr.cc
M testdata/workloads/functional-query/queries/QueryTest/exchange-delays.test
2 files changed, 20 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/8005/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2107: [DOCS] Document base64*code() functions

2017-09-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-2107: [DOCS] Document base64*code() functions
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-docs-submit/152/

-- 
To view, visit http://gerrit.cloudera.org:8080/7963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5251e368ad36756c19a7b97e5ef6f232f616189b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: John Russell 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2107: [DOCS] Document base64*code() functions

2017-09-07 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-2107: [DOCS] Document base64*code() functions
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5251e368ad36756c19a7b97e5ef6f232f616189b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: John Russell 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5867: Fix bugs parsing 2-digit year

2017-09-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5867: Fix bugs parsing 2-digit year
..


Patch Set 6: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7910
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4f430caea88b6c33f8050a1984ee0ee32ecb0a1
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5867: Fix bugs parsing 2-digit year

2017-09-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5867: Fix bugs parsing 2-digit year
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1208/

-- 
To view, visit http://gerrit.cloudera.org:8080/7910
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4f430caea88b6c33f8050a1984ee0ee32ecb0a1
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Bump Kudu version to a71ecfd

2017-09-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: Bump Kudu version to a71ecfd
..


Bump Kudu version to a71ecfd

Change-Id: Ie23d852f0d630f9484d8ae4f772af6bba13ea24f
Reviewed-on: http://gerrit.cloudera.org:8080/8000
Reviewed-by: Thomas Tauber-Marshall 
Tested-by: Impala Public Jenkins
---
M bin/impala-config.sh
1 file changed, 2 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Thomas Tauber-Marshall: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/8000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ie23d852f0d630f9484d8ae4f772af6bba13ea24f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] Bump Kudu version to a71ecfd

2017-09-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: Bump Kudu version to a71ecfd
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/8000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie23d852f0d630f9484d8ae4f772af6bba13ea24f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5867: Fix bugs parsing 2-digit year

2017-09-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5867: Fix bugs parsing 2-digit year
..


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7910
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4f430caea88b6c33f8050a1984ee0ee32ecb0a1
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-07 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4513: Promote integer types for ABS()
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8004/1/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 4219:   TestValue("abs(-1 * cast(pow(2, 31) as int))", TYPE_BIGINT, 
2147483648);
Obviously there is going to be some debate about what to do about BIGINT types. 
 Let's do that in the JIRA.


-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4513: Promote integer types for ABS()

2017-09-07 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/8004

Change subject: IMPALA-4513: Promote integer types for ABS()
..

IMPALA-4513: Promote integer types for ABS()

The internal representation of the most negative number
in 2's complement requires 1 more bit to represent the
positive version.  This means ABS() must promote integer
types to the next highest width.

Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
4 files changed, 15 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8004/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I86cc880e78258d5f90471bd8af4caeb4305eed77
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 


[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-07 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..


Patch Set 9:

(3 comments)

Code looks good to me, still need to flesh out tests.

http://gerrit.cloudera.org:8080/#/c/7955/9/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java
File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java:

Line 87:   private NativeByteArrayOutputStream(long initialSize) {
Merge into the public constructor to remove some code and never mention a 
custom initial capacity (to keep things simple).


Line 99:   Preconditions.checkState(bufferPtr_ >= 0);
needs to go above the try, otherwise we'll try to free an invalid ptr in L113


Line 104: throw new RuntimeException(BUFFER_LIMIT_EXCEEDED_MSG + ": " + 
len);
IllegalArgumentException?


-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5425: Add test for validating input when setting query options

2017-09-07 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#6).

Change subject: IMPALA-5425: Add test for validating input when setting query 
options
..

IMPALA-5425: Add test for validating input when setting query options

This patch adds multiple query option validation testcases to
be/src/service/query-options-test.cc
The test cases include parsing edge cases, bondary values, special
cases for some options and some testcases moved from
testdata/workloads/functional-query/queries/QueryTest/set.test
This patch also fixes a bug generating wrong error message for
query option RUNTIME_FILTER_WAIT_TIME_MS.

Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7
---
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M testdata/workloads/functional-query/queries/QueryTest/set.test
3 files changed, 236 insertions(+), 150 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/7805/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7805
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I510e02bb0776673d8cbfc22b903831882c6908d7
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-07 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 2:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/7954/1//COMMIT_MSG
Commit Message:

PS1, Line 7: Impala
> nit: Capitalize Impala (it's a name)
Done


PS1, Line 9: FromSubsecondUnixTime
> We usually refer to functions as Function().
Done


PS1, Line 11: 1400-01-01 00:00:00
> You could write dates in the more standard format (1400-01-01 00:00:00).
Done


PS1, Line 13: The maximum
: case, -12-31 59:59:59 is a bit different, because as I 
understand,
: with nanosecond precision posix times, the maximum value is 
actually
: 1-01-01. 00:00:00 minus 1 nanosec.
> Can you add tests with sub-second deltas around the upper bound, too?
Done


http://gerrit.cloudera.org:8080/#/c/7954/1/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

Line 637:   // FromUnixTime functions incorrectly accepted the last second of 
1399 as valid,
> Please wrap lines at 90 characters. Please follow the surrounding code for 
Done


PS1, Line 637: he last second of 1399 as valid,
> Please outline in the comment briefly why the last second needs special att
Done


PS1, Line 637: ns i
> Extra word?
Done


Line 640:   MIN_DATE_AS_UNIX_TIME - 0.1).HasDate());
> While you are here, can you also add tests for the exact beginning of the l
Done


http://gerrit.cloudera.org:8080/#/c/7954/1/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS1, Line 80: s_special() && UNLI
> Can you add a comment explaining why this check is necessary?
Done


Line 80: if(!date_.is_special() && UNLIKELY(!IsValidDate())){
> Please use spaces instead of tabs.
Done


http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

Line 84:   time_ = 
boost::posix_time::time_duration(boost::posix_time::not_a_date_time);
The first patch was incorrect - it did not set time_ to not_a_date_time, but 
00:00:00 instead. This caused the last second of year 1399 to be inconsistent 
with other dates before year 1400.


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-07 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..


Patch Set 9:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/7955/9/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java
File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java:

PS9, Line 67: .
nit: remove


PS9, Line 94: reAllocateMemory
Unsafe.reallocateMemory()


PS9, Line 159: public void writeTo(OutputStream out) {
 : throw new IllegalStateException("Not implemented.");
 :   }
 : 
 :   public byte toByteArray()[] {
 : throw new IllegalStateException("Not implemented.");
 :   }
 : 
 :   public int size() {
 : throw new IllegalStateException("Not implemented.");
 :   }
These are not defined in OutputStream, so what is the point of defining them 
here? Am I missing something?


http://gerrit.cloudera.org:8080/#/c/7955/9/fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java
File fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java:

PS9, Line 38: @SuppressWarnings("restriction")
explain? Also add a brief description of what this class tests.


PS9, Line 78: huge
"huge" doesn't really convey much information here. Maybe a bit more explicit 
what we're testing (>1GB, < 4GB)? Also, you may want to test multiple sizes to 
exercise all paths in the reallocation logic (e.g. 500MB, 1GB, 1.5GB, 2GB). You 
can decide which ones make more sense.


PS9, Line 85: char[] chars = new char[128 * 1024 * 1024];
: Arrays.fill(chars, 'a');
: String hugeString = new String(chars);
nit: see if you can use Guava's Strings.repeat() here, may be cheaper.


PS9, Line 99:   
nit: extra space


PS9, Line 99: Create a huge string of size 512MB. The combined size of the  
test object
: // crosses 4GB.
Maybe say something like "create an object with a serialization size which is 
larger than 4GB"


PS9, Line 105: never
ha, famous last words :)


-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Bump Kudu version to a71ecfd

2017-09-07 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: Bump Kudu version to a71ecfd
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/8000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie23d852f0d630f9484d8ae4f772af6bba13ea24f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] Bump Kudu version to a71ecfd

2017-09-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: Bump Kudu version to a71ecfd
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1207/

-- 
To view, visit http://gerrit.cloudera.org:8080/8000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie23d852f0d630f9484d8ae4f772af6bba13ea24f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] Bump Kudu version to a71ecfd

2017-09-07 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/8000

Change subject: Bump Kudu version to a71ecfd
..

Bump Kudu version to a71ecfd

Change-Id: Ie23d852f0d630f9484d8ae4f772af6bba13ea24f
---
M bin/impala-config.sh
1 file changed, 2 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8000/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie23d852f0d630f9484d8ae4f772af6bba13ea24f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-2107: [DOCS] Document base64*code() functions

2017-09-07 Thread John Russell (Code Review)
John Russell has posted comments on this change.

Change subject: IMPALA-2107: [DOCS] Document base64*code() functions
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7963/1/docs/shared/impala_common.xml
File docs/shared/impala_common.xml:

Line 802: MAX(), and MIN() than when
> Are you sure about MAX and MIN? They might preserve ordering. I don't know.
Done. That was a close call that it does appear to be correct. In a sample of 
about a dozen words and phrases, I found almost all were sorted the same way as 
the unencoded values, but the 'w' entries came before the 't' ones:

| world | d29ybGQ=  
   |
| whirled   | d2hpcmxlZA==  
   |
| whorled   | d2hvcmxlZA==  
   |
| the rain in spain | dGhlIHJhaW4gaW4gc3BhaW4=  
   |
| the time has come | dGhlIHRpbWUgaGFzIGNvbWU=  
   |


PS1, Line 808: All argument values
 : supplied to base64decode() must also be 
a
 : multiple of 4 bytes in length.
> I don't think this is right. The example you give below, for instance, is '
Right, the constraint (must be) is on base64decode argument, and the guarantee 
(is always) is on the base64encode return value. That was a finger fumble to 
say base64encode in both places.


http://gerrit.cloudera.org:8080/#/c/7963/1/docs/topics/impala_string_functions.xml
File docs/topics/impala_string_functions.xml:

Line 88:   
> This was fixed in 2.6.0, according to the ticket, not 2.9.0
Done. Changed for both base64decode and base64encode  tags.


-- 
To view, visit http://gerrit.cloudera.org:8080/7963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5251e368ad36756c19a7b97e5ef6f232f616189b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: John Russell 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2107: [DOCS] Document base64*code() functions

2017-09-07 Thread John Russell (Code Review)
John Russell has uploaded a new patch set (#2).

Change subject: IMPALA-2107: [DOCS] Document base64*code() functions
..

IMPALA-2107: [DOCS] Document base64*code() functions

base64decode()
base64encode()

Change-Id: I5251e368ad36756c19a7b97e5ef6f232f616189b
---
M docs/impala_keydefs.ditamap
M docs/shared/impala_common.xml
M docs/topics/impala_string_functions.xml
3 files changed, 173 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/7963/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5251e368ad36756c19a7b97e5ef6f232f616189b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] [DOCS] Tighten up advice about first COMPUTE INCREMENTAL STATS

2017-09-07 Thread John Russell (Code Review)
John Russell has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7999

Change subject: [DOCS] Tighten up advice about first COMPUTE INCREMENTAL STATS
..

[DOCS] Tighten up advice about first COMPUTE INCREMENTAL STATS

Explain how doing COMPUTE INCREMENTAL STATS for the first time
starts over and discards any previous stats from COMPUTE STATS.

As a consequence, moved some wording and examples into
impala_common.xml so that content could be used in
multiple places. Also made a new subtopic on the "Partitioning"
page because I saw COMPUTE INCREMENTAL STATS wasn't mentioned
there.

Change-Id: Ia53a6518ce5541e5c9a2cd896856ce042a599b03
---
M docs/shared/impala_common.xml
M docs/topics/impala_compute_stats.xml
M docs/topics/impala_partitioning.xml
M docs/topics/impala_perf_stats.xml
4 files changed, 157 insertions(+), 107 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/7999/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7999
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia53a6518ce5541e5c9a2cd896856ce042a599b03
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 


[Impala-ASF-CR] Tighten up advice about first COMPUTE INCREMENTAL STATS

2017-09-07 Thread John Russell (Code Review)
John Russell has abandoned this change.

Change subject: Tighten up advice about first COMPUTE INCREMENTAL STATS
..


Abandoned

Accidentally created this review based on master rather than a private branch. 
Will do some git fixup and then resubmit.

-- 
To view, visit http://gerrit.cloudera.org:8080/7945
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: Id5c09edb90668b7b8b8ba7d63d0aefe42aee23b5
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Silvius Rus 


[Impala-ASF-CR] [DOCS] Explain Boost setting needed for 96-bit timestamps

2017-09-07 Thread John Russell (Code Review)
John Russell has posted comments on this change.

Change subject: [DOCS] Explain Boost setting needed for 96-bit timestamps
..


Patch Set 2:

Patch set 2 is a no-op from a review perspective. I got rid of a commit that 
got into my local master branch by accident.

-- 
To view, visit http://gerrit.cloudera.org:8080/7983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4b67cd7762f682c3a054e0d9641080aa51801c83
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: John Russell 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-07 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..


Patch Set 8:

(2 comments)

Flushing minor comments from patch #8 and switching to patch #9.

http://gerrit.cloudera.org:8080/#/c/7955/8/common/thrift/JniCatalog.thrift
File common/thrift/JniCatalog.thrift:

PS8, Line 603: // Struct containing eight strings, used for testing.
Mention what are we testing with this weird looking struct. Also, leave a TODO 
to move this out of here.


http://gerrit.cloudera.org:8080/#/c/7955/8/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

PS8, Line 51: import org.apache.impala.thrift.TNativeByteBuffer;
dup (L49)


-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-07 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded a new patch set (#9).

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..

IMPALA-5881: Use native allocation while building catalog updates

This patch moves the allocation of thrift structures for serializing
the output of GetAllCatalogObjects() call into the native side. This
is done to prevent the Catalog from hitting JVM array limitations
(2GB maximum size) at scale. Additionally this patch also extends the
--compact_catalog_top=true to apply TCompactProtocol for the above
serialization to reduce the in-memory footprint.

This patch also caps the native allocations to not go beyond 4GB due
to Thrift library limitations. (Thrift internally uses uint32_t
datatype to represent the message size which limits the size to
~4.2GB).

Testing: Passed ASAN + HDFS core and DEBUG + HDFS exhaustive.

Deployed the patch on a 16 node cluster and tested it on a
Catalog-update topic of 3.5GB (uncompressed) or 780MB (compressed).

Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
---
M be/src/catalog/catalog.cc
M be/src/catalog/catalog.h
M be/src/rpc/jni-thrift-util.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
A fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java
A fe/src/main/java/org/apache/impala/thrift/TNativeSerializer.java
A fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java
10 files changed, 498 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/7955/9
-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-07 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..


Patch Set 8:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/7955/8/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

Line 116:   DCHECK(len < std::numeric_limits::max());
> DCHECK_LE
Done


Line 170: LOG(ERROR) << "Failed to free buffers. Potential memory leak of: "
> Failed to free native buffer through JNI. Leaked N bytes.
Done


http://gerrit.cloudera.org:8080/#/c/7955/8/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

Line 295:   TException {
> Does this really throw TException?
good catch, no!


http://gerrit.cloudera.org:8080/#/c/7955/8/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java
File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java:

Line 94:   private void tryAllocate(long len, boolean reAllocate) {
> Can we simplify this to just always use reallocateMemory()? Realloc from a 
Good point, checked the JVM sources, it already includes the check to call 
malloc/realloc depending on whether we pass 0.

http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/87ee5ee27509/src/share/vm/prims/unsafe.cpp#l613


Line 95: try {
> Preconditions.checkState(bufferPtr_ >= 0);
Done


Line 113: } catch (Exception e) {
> catch Throwable for extra paranoia
Done


Line 139: if (newBufferSize >= BUFFER_MAX_SIZE_LIMIT) {
> || newBufferSize < 0 (overflow)
Don't think thats possible given we cap at BUFFER_MAX_SIZE_LIMIT which is <<< 
long_max. I'm still including it to be on the safe side.


Line 155:   public synchronized void reset() {}
> What's this for? Is it an @Override?
Not an ovveride, removed, this was actually being called from serializer, but 
does nothing.

Ideally useful to reset the state of NBOAS, but given we don't allow calling 
serialize twice, I don't think that is necessary.


-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-07 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..


Patch Set 8:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/7955/8/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

Line 116:   DCHECK(len < std::numeric_limits::max());
DCHECK_LE


#include 


Line 170: LOG(ERROR) << "Failed to free buffers. Potential memory leak of: "
Failed to free native buffer through JNI. Leaked N bytes.

(We definitely leaked at this point).

Also include the status error message.


http://gerrit.cloudera.org:8080/#/c/7955/8/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

Line 295:   TException {
Does this really throw TException?


http://gerrit.cloudera.org:8080/#/c/7955/8/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java
File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java:

Line 94:   private void tryAllocate(long len, boolean reAllocate) {
Can we simplify this to just always use reallocateMemory()? Realloc from a 0 
ptr should be ok I think.


Line 95: try {
Preconditions.checkState(bufferPtr_ >= 0);

for clarity why the free in L114 is always ok


Line 113: } catch (Exception e) {
catch Throwable for extra paranoia


Line 139: if (newBufferSize >= BUFFER_MAX_SIZE_LIMIT) {
|| newBufferSize < 0 (overflow)


Line 155:   public synchronized void reset() {}
What's this for? Is it an @Override?


-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-07 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has uploaded a new patch set (#2).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, -12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
1-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 1<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
2 files changed, 34 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-07 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded a new patch set (#8).

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..

IMPALA-5881: Use native allocation while building catalog updates

This patch moves the allocation of thrift structures for serializing
the output of GetAllCatalogObjects() call into the native side. This
is done to prevent the Catalog from hitting JVM array limitations
(2GB maximum size) at scale. Additionally this patch also extends the
--compact_catalog_top=true to apply TCompactProtocol for the above
serialization to reduce the in-memory footprint.

This patch also caps the native allocations to not go beyond 4GB due
to Thrift library limitations. (Thrift internally uses uint32_t
datatype to represent the message size which limits the size to
~4.2GB).

Testing: Passed ASAN + HDFS core and DEBUG + HDFS exhaustive.

Deployed the patch on a 16 node cluster and tested it on a
Catalog-update topic of 3.5GB (uncompressed) or 780MB (compressed).

Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
---
M be/src/catalog/catalog.cc
M be/src/catalog/catalog.h
M be/src/rpc/jni-thrift-util.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
A fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java
A fe/src/main/java/org/apache/impala/thrift/TNativeSerializer.java
A fe/src/test/java/org/apache/impala/thrift/TNativeSerializerTest.java
10 files changed, 502 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/7955/8
-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5881: Use native allocation while building catalog updates

2017-09-07 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
..


Patch Set 7:

(13 comments)

Addressed comments other than the test file.

http://gerrit.cloudera.org:8080/#/c/7955/7/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

Line 114:   uint32_t len = static_cast(nbuffer.buffer_len);
> DCHECK that the buffer_len is smaller than max uint32_t to make it clear th
Done


Line 119:   JniUtil::CallJniMethod(catalog_, free_native_byte_buffer_id_, 
nbuffer);
> Let's check the status of this call and if non-ok emit LOG(ERROR) that n by
Done


http://gerrit.cloudera.org:8080/#/c/7955/7/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

Line 298: if (buffer.buffer_ptr <= 0) return;
> IIRC you said 0 should work. Convert to Preconditions check for < 0.
Done


http://gerrit.cloudera.org:8080/#/c/7955/7/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java
File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java:

Line 26:  * allocations happen on the native side and this class tracks the
> all allocations use native memory and this class tracks the corresponding p
Done


Line 37:  * BUFFER_RESIZE_INCREMENTS;
> end sentence with "." (not ";")
Done


Line 77: tryAllocate(initialSize, false);
> For simplicity, let's only allow the default initial size and not a custom 
Done


Line 87:   throw new IllegalArgumentException("Current size: " + bufferLen_ 
+
> Isn't this a potential memory leak? Who frees the native memory?
Redid the logic a little. Should cover this.


Line 101:   UnsafeUtil.UNSAFE.freeMemory(bufferPtr_);
> This doesn't make sense because bufferPtr_ is already 0. You need to use a 
Ouch, good catch.


Line 103:   "Failed to (re)allocateMemory() " + len + " bytes");
> also print the value of reAllocate
Done


Line 108:* Write a byte array 'b' of length 'len at offset 'offset'. 
Resizes the buffer
> Not really accurate. Suggest:
Done


Line 115:   throw new IndexOutOfBoundsException(
> Isn't this a potential memory leak? Who frees the native memory?
Done


Line 122:   newBufferSize += BUFFER_RESIZE_INCREMENTS;
> Infinite loop if overflow? Same question for L124.
Done


http://gerrit.cloudera.org:8080/#/c/7955/7/fe/src/main/java/org/apache/impala/thrift/TNativeSerializer.java
File fe/src/main/java/org/apache/impala/thrift/TNativeSerializer.java:

Line 62:* Currently that restriction is not enforced by this class and is 
the responsibility
> Seems easy to enforce this with a flag that gets checked at the beginning o
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes