[Impala-ASF-CR] Make build scripts more friendly to Ubuntu 16.04

2017-10-11 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8262 )

Change subject: Make build scripts more friendly to Ubuntu 16.04
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8262/1/bin/bootstrap_build.sh
File bin/bootstrap_build.sh:

http://gerrit.cloudera.org:8080/#/c/8262/1/bin/bootstrap_build.sh@23
PS1, Line 23: # 1. At least 8GB of free disk space
> Is this still true? I vaguely recollect running out of space in a vm that h
Yes, check the df output: 
https://jenkins.impala.io/view/Experimental/job/ubuntu-16.04-build-only/4/consoleFull

You may have been loading data, building static (and building all tests), or 
other developer activities.



--
To view, visit http://gerrit.cloudera.org:8080/8262
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8196a2a87bce5893a349a1b290c3f3d04fd80317
Gerrit-Change-Number: 8262
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Thu, 12 Oct 2017 05:00:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-11 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc@42
PS5, Line 42: DEFINE_int32(max_s3_parts_parallel_load, 10,
I would be more aggressive with this parameter and put it at 20.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@787
PS5, Line 787: int threadPoolSize = 
FileSystemUtil.supportsStorageIds(tableFs) ?
What is the expected behavior for tables with mixed FSs?
As a mix of S3 and HDFS partitions.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@801
PS5, Line 801: getLoadingThreadPoolSize
> can different partitions have different number of files? if so, work across
Parallelization of metadata loading is done on per partition granularity.



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 5
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 12 Oct 2017 04:51:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Make build scripts more friendly to Ubuntu 16.04

2017-10-11 Thread Zoram Thanga (Code Review)
Zoram Thanga has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8262 )

Change subject: Make build scripts more friendly to Ubuntu 16.04
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8262/1/bin/bootstrap_build.sh
File bin/bootstrap_build.sh:

http://gerrit.cloudera.org:8080/#/c/8262/1/bin/bootstrap_build.sh@23
PS1, Line 23: # 1. At least 8GB of free disk space
Is this still true? I vaguely recollect running out of space in a vm that had 
32GB of disk space.



--
To view, visit http://gerrit.cloudera.org:8080/8262
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8196a2a87bce5893a349a1b290c3f3d04fd80317
Gerrit-Change-Number: 8262
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Thu, 12 Oct 2017 04:48:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Make build scripts more friendly to Ubuntu 16.04

2017-10-11 Thread Jim Apple (Code Review)
Jim Apple has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8262


Change subject: Make build scripts more friendly to Ubuntu 16.04
..

Make build scripts more friendly to Ubuntu 16.04

This commit bundles two changes.

The first extracts from bootstrap_development.sh the commands to
prepare a system to build-and-test without actually doing so. This
enables custom build commands that might not load the test data, for
instance.

The second changes bootstrap_build.sh to work with Ubuntu 16.04. It
should still work with Ubuntu 14.04, but I don't anticipate that being
part of the Jenkins pre-merge job anymore, so I removed that from the
comment at the top of the file explaining what it does.

Change-Id: I8196a2a87bce5893a349a1b290c3f3d04fd80317
---
M bin/bootstrap_build.sh
M bin/bootstrap_development.sh
A bin/bootstrap_system.sh
3 files changed, 227 insertions(+), 174 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/8262/1
--
To view, visit http://gerrit.cloudera.org:8080/8262
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8196a2a87bce5893a349a1b290c3f3d04fd80317
Gerrit-Change-Number: 8262
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple 


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 3:

(34 comments)

Another batch of comments...

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@63
PS3, Line 63: ongoing transmission from a client to a server as a 'stream'
I don't think that's accurate. see questions in krpc-data-stream-recvr.h about 
stream definition.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@93
PS3, Line 93: process it immediately, add it to a fixed-size 'batch queue' for 
later processing
XXX check whether these are really different


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@164
PS3, Line 164: deferred processing
is that because the recvr hasn't showed up yet, or because the stream's queue 
is full, or both?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@235
PS3, Line 235: node_id
dest_node_id?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@239
PS3, Line 239: Ownership of the receiver is shared between this DataStream mgr 
instance and the
 :   /// caller.
that seems unnecessary but don't change it now.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@246
PS3, Line 246:
'proto_batch'?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@248
PS3, Line 248: .
'request'.

Also document what 'response' and 'context' are.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@266
PS3, Line 266: Notifies the receiver
is this an RPC handler? I think we should just be explicit about which of these 
methods are RPC handlers.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@267
PS3, Line 267: The RPC
what RPC is this talking about? If this is a handler, then it's clear.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@274
PS3, Line 274: Closes
Does it close or cancel? (or is there no difference?)


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@284
PS3, Line 284: RPCs which were buffered
To be consistent with terminology used in class comment, maybe say "deferred 
RPCs"


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@340
PS3, Line 340: ragment instance id, 0
what is that saying? is that a misplaced comma or am I reading this wrong?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@341
PS3, Line 341: instance id changes
I don't understand this.  it kinda sounds like we're trying to be able to find 
all instances of this fragment, but then wouldn't we iterate until the fragment 
id changes (not the instance id)?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@349
PS3, Line 349:   struct EarlySendersList {
hmm, I guess we need this now that we can't block the RPC thread?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@358
PS3, Line 358: Time
Monotonic time


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@374
PS3, Line 374: time
monotonic time


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@382
PS3, Line 382:   boost::unordered_set closed_stream_cache_;
all this parallel startup stuff really needs to be revisited (but not for this 
change). it's too complex and brittle.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@386
PS3, Line 386: Deserialize
maybe call it DeserializeDeferred() or DeserializeWorker() to make it clearer 
that this is only for the deferred (slow) path, since the normal path will also 
have to deserialize (but doesn't use this code).


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@404
PS3, Line 404:   void EnqueueRowBatch(DeserializeWorkItem&& payload);
how about grouping this with Deferred function above since it's related. Also, 
I think the name should be less generic. Like maybe EnqueueDeferredBatch() or 
EnqueueDeferredRpc() (does the response happen before or after this deferred 
work?)


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@413
PS3, Line 413: block
what's that?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@414
PS3, Line 414: Status
I think that status is not getting checked by the caller. I thought Tim made 
Status warn on unused result -- why is it not catching this? (Or do we still 
need to annotate each method?).


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-da

[Impala-ASF-CR] IMPALA-6040: skip test multi compression types where Hive isn't supported

2017-10-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8259 )

Change subject: IMPALA-6040: skip test_multi_compression_types where Hive isn't 
supported
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8259
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e
Gerrit-Change-Number: 8259
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 12 Oct 2017 02:16:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6040: skip test multi compression types where Hive isn't supported

2017-10-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8259 )

Change subject: IMPALA-6040: skip test_multi_compression_types where Hive isn't 
supported
..

IMPALA-6040: skip test_multi_compression_types where Hive isn't supported

A recent commit "IMPALA-5448: fix invalid number of splits reported in
Parquet scan node" neglected to account for the fact that in some
environments, Impala runs without Hive. The typical pattern for tests
that use Hive is skip them if they are executed against such
environments.

Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e
Reviewed-on: http://gerrit.cloudera.org:8080/8259
Reviewed-by: Michael Brown 
Tested-by: Impala Public Jenkins
---
M tests/query_test/test_scanners.py
1 file changed, 4 insertions(+), 0 deletions(-)

Approvals:
  Michael Brown: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8259
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e
Gerrit-Change-Number: 8259
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.

2017-10-11 Thread Tim Wood (Code Review)
Tim Wood has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8102 )

Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for 
Impala.
..


Patch Set 17:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q26.test
File testdata/workloads/tpcds/queries/tpcds-q26.test:

http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q26.test@6
PS17, Line 6: truncate(avg(cs_quantity) * 10.0) / 10 agg1,
> The query text in this file now reliably produces the results in this file,
I set the scale adjustment to 10^5 in this line and adjusted 3 result lines so 
the approximation of +1/3 (.3) would be clearer.



--
To view, visit http://gerrit.cloudera.org:8080/8102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66
Gerrit-Change-Number: 8102
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Thu, 12 Oct 2017 02:09:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()

2017-10-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8255 )

Change subject: IMPALA-5940: Avoid log spew by using Status::Expected()
..

IMPALA-5940: Avoid log spew by using Status::Expected()

This change converts two more callers of Status() to Status::Expected()
to avoid log spew and the overhead of stack crawling. These two call
sites can easily fill the log when there is any network issue.

Testing done:

Ran concurrent TPC-DS workload with 256 users and verified the log spew is not 
there.

Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
Reviewed-on: http://gerrit.cloudera.org:8080/8255
Reviewed-by: Michael Ho 
Tested-by: Impala Public Jenkins
---
M be/src/rpc/thrift-client.cc
M be/src/service/impala-server.cc
2 files changed, 9 insertions(+), 4 deletions(-)

Approvals:
  Michael Ho: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
Gerrit-Change-Number: 8255
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()

2017-10-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8255 )

Change subject: IMPALA-5940: Avoid log spew by using Status::Expected()
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
Gerrit-Change-Number: 8255
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 12 Oct 2017 01:34:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5789: Add always false flag in bloom filter

2017-10-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8170 )

Change subject: IMPALA-5789: Add always_false flag in bloom filter
..


Patch Set 4:

(3 comments)

I need to do a proper pass over this. Did a quick pass now and had some 
questions.

http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/runtime/coordinator.cc@a20
PS4, Line 20:
Thanks for doing cleanup here. How did you determine which includes were 
needed? Is there a good tool to use?


http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/runtime/runtime-filter-ir.cc
File be/src/runtime/runtime-filter-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/runtime/runtime-filter-ir.cc@27
PS4, Line 27:   if (UNLIKELY(bloom_filter_->AlwaysFalse())) return false;
Won't these branches be likely in some cases? IMO best to only use these 
annotations when it's very strongly unlikely, e.g. error paths.


http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/util/bloom-filter.h
File be/src/util/bloom-filter.h:

http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/util/bloom-filter.h@200
PS4, Line 200:   if (UNLIKELY(always_false_)) return false;
Doesn't this mean that RuntimeFilter::Eval() checks always_false_ twice? Maybe 
best to only check it in here.



--
To view, visit http://gerrit.cloudera.org:8080/8170
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If680240a3cd4583fc97c3192177d86d9567c4f8d
Gerrit-Change-Number: 8170
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 12 Oct 2017 01:07:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2494: Support for byte array encoded decimals in Parquet scanner

2017-10-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7822 )

Change subject: IMPALA-2494: Support for byte array encoded decimals in Parquet 
scanner
..


Patch Set 4:

(12 comments)

Did a first pass over it - thought I should flush out comments since you've 
been waiting quite a while for feedback here.

http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-column-readers.cc@1277
PS4, Line 1277: switch (node.element->type) {
This case is only going to get bigger with the follow on patches - it would be 
best to make it a separate function.


http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-common.h
File be/src/exec/parquet-common.h:

http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-common.h@213
PS4, Line 213:   template 
Logical type and physical type I think aren't quite right, since e.g. 
"StringValue" isn't a logical type. It's really decoding/encoding between 
Parquet physical types and Impala's internal representation.

Maybe INTERNAL_TYPE and PARQUET_PHYSICAL_TYPE? Or PARQUET_TYPE seems ok since 
that's what parquet calls it.


http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-common.h@376
PS4, Line 376: inline int ParquetPlainEncoder::Decode(
I think you can avoid stamping this out if you leave it still parameterized by 
the internal Impala type, since the logic is the same in all cases. There may 
be an opportunity to reduce the redundancy above too, since there are functions 
above with identical bodies.

I tried this out on a dummy program and it looks like it's possible:

  #include 

  template 
  int foo(T* a, S* b) {
return 0;
  }

  template <>
  int foo(int* a, int* b) { return 1; }

  template 
  int foo(T* a, double* b) { return 2; }

  int main(int argc, char**argv) {
int a, b, c;
double x, y, z;
printf("%d\n", foo(&a, &b));
printf("%d\n", foo(&a, &x));
  }


http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-metadata-utils.cc
File be/src/exec/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-metadata-utils.cc@66
PS4, Line 66: bool IsSupportedPhysicalType(PrimitiveType impala_type,
let's make it static - we don't want to export the symbol to be linked with 
code outside this file.


http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-metadata-utils.cc@70
PS4, Line 70: (
unnecessary parens


http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-metadata-utils.cc@173
PS4, Line 173: // TODO: Is this check too strict?
I don't see why this shouldn't match the file metadata, this seems valid to me.


http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-metadata-utils.cc@182
PS4, Line 182:  parquet::SchemaElement*
Why not pass by const ref like above?


http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-metadata-utils.cc@209
PS4, Line 209: stringstream ss;
Can you convert to using Substitute() while we're here? We've been very 
gradually converting to using that for cases like this.


http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-plain-test.cc
File be/src/exec/parquet-plain-test.cc:

http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-plain-test.cc@176
PS4, Line 176:   TestType(Decimal4Value(test_val),
It would be nice to combine the FIXED_LEN_BYTE_ARRAY and BYTE_ARRAY tests. 
Mostly the tests are the same except for the expected size, right? Could we 
make them table-driven so that we test all the same cases but just have 
different expected output?


http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-plain-test.cc@193
PS4, Line 193: int32_t
Any reason to use sizeof(int..) instead of sizeof(Decimal*Value) like above? Or 
if this is the better way to express it, should we change it above?


http://gerrit.cloudera.org:8080/#/c/7822/4/be/src/exec/parquet-plain-test.cc@212
PS4, Line 212:   for (int i = 1; i <=16; ++i) {
nit: missing space


http://gerrit.cloudera.org:8080/#/c/7822/4/testdata/data/binary_decimal_dictionary.parquet
File testdata/data/binary_decimal_dictionary.parquet:

PS4:
Can you update the README to describe how these files were generated.



--
To view, visit http://gerrit.cloudera.org:8080/7822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2c0e881045109f337fecba53fec21f9cfb9e619e
Gerrit-Change-Number: 7822
Gerrit-PatchSet: 4
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 12 Oct 2017 00:52:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2494: Support for byte array encoded decimals in Parquet scanner

2017-10-11 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7822 )

Change subject: IMPALA-2494: Support for byte array encoded decimals in Parquet 
scanner
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7822/2/be/src/exec/parquet-column-stats.inline.h
File be/src/exec/parquet-column-stats.inline.h:

http://gerrit.cloudera.org:8080/#/c/7822/2/be/src/exec/parquet-column-stats.inline.h@89
PS2, Line 89:   switch(parquet_type){
> Do you mean to have a Decode wrapper around the templatized Decode methods?
The former, so that the interface of Decode() is simpler. How this is 
implemented seems more a concern of the decoder than the column stats.


http://gerrit.cloudera.org:8080/#/c/7822/2/be/src/exec/parquet-common.h
File be/src/exec/parquet-common.h:

http://gerrit.cloudera.org:8080/#/c/7822/2/be/src/exec/parquet-common.h@237
PS2, Line 237: ByteSize
> Do you mean to have something like
I think this particular call here will always return sizeof(int32_t) (line 
220). I'd just put that here, since your change explicitly documents that as a 
template parameter.


http://gerrit.cloudera.org:8080/#/c/7822/2/be/src/exec/parquet-common.h@338
PS2, Line 338: template <>
> thats kinda difficult because DecimalUtil::DecodeFromFixedLenByteArray is a
I'm not sure I'm following: it looks like the next three methods are exactly 
the same. Couldn't you move them into a new method DecodeDecimalValue(const 
uint8_t* buffer, const uint8_t* buffer_end, int fixed_len_size, T* v) and then 
call it and return in one line here? I may be missing something though :)



--
To view, visit http://gerrit.cloudera.org:8080/7822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2c0e881045109f337fecba53fec21f9cfb9e619e
Gerrit-Change-Number: 7822
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 12 Oct 2017 00:33:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu

2017-10-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7793 )

Change subject: IMPALA-4252: Min-max runtime filters for Kudu
..


Patch Set 6:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/7793/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/7793/6//COMMIT_MSG@15
PS6, Line 15: them into the Kudu client. Because the Kudu client doesn't 
provide a
Is there any documentation for Kudu about what ordering Kudu uses for 
comparisons for different types? I guess we already push filters so we're 
already depending on the orderings being the same as Impala, but I'd be 
interested to know.

We had a similar discussion about Parquet and the spec got clarified a lot as a 
result (https://github.com/apache/parquet-format/blob/master/LogicalTypes.md 
now specifies it). Parquet-MR also had various odd behaviour that got shaken 
out as a result of this.


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/exec/filter-context.h
File be/src/exec/filter-context.h:

http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/exec/filter-context.h@106
PS6, Line 106: bloom_filter
has_bloom_filter() ? At callsites it reads like this should return a filter 
rather than a bool.


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/exec/filter-context.h@109
PS6, Line 109: min_max_filter
has_min_max_filter?


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/coordinator.cc@1220
PS6, Line 1220:   MinMaxFilter::Or(src_type(), params.min_max_filter, 
min_max_filter_.get());
I mentioned this in a later comment, but we should consider what happens if the 
min/max values are very large - maybe thre's some memory management similar to 
above.


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/runtime-filter-bank.cc
File be/src/runtime/runtime-filter-bank.cc:

http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/runtime-filter-bank.cc@235
PS6, Line 235:   MinMaxFilter* min_max_filter = MinMaxFilter::Create(type, 
&obj_pool_);
(nit) - could combine into one line.


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/runtime-filter-ir.cc
File be/src/runtime/runtime-filter-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/runtime/runtime-filter-ir.cc@27
PS6, Line 27:   // to a) the atomicity of / pointer assignments and b) the x86 
TSO memory model.
Comment not relevant any more since we're using actual atomics?


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter-ir.cc
File be/src/util/min-max-filter-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter-ir.cc@53
PS6, Line 53:   min_str = string(value->ptr, value->len);
We should think through the behaviour with very large strings. It may make 
sense to disable the filters if the strings are too large instead of allocating 
arbitrarily large amounts of memory.

Maybe we could also do some trickery with truncating the strings to avoid 
handling the filters not existing. E.g. if we have a 4 byte limit on min/max, 
replace min="aaa" with min="" and max="zz" with max="zzz(". 
Interested to hear what you think - might be too clever but seems like it could 
work.

It would be good to also allocate tracked memory for the string. The Parquet 
ColumnStats class solves a similar problem using StringBuffers that allocate 
from a MemPool.


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.h
File be/src/util/min-max-filter.h:

http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.h@83
PS6, Line 83: #define NUMERIC_MIN_MAX_FILTER(NAME, TYPE)
 \
Yeah, the macro seems like it may be the least of the evils.


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc
File be/src/util/min-max-filter.cc:

http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@77
PS6, Line 77:   std::string NAME##MinMaxFilter::DebugString() const {   
 \
nit: don't need std:: prefix in .cc files.


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@139
PS6, Line 139:   out->min.__set_string_val(std::min(in.min.string_val, 
out->min.string_val));
I feel like we should use our StringValue::Compare() function instead of 
relying on std::string's comparison function. I'm not sure if they are exactly 
the same but if we compare these the same way as we compare StringValue's it's 
easier to reason about.


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@291
PS6, Line 291:   BigIntMinMaxFilter::Or(in, out);
Shouldn't this be calling the timestamp method?


http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@327
PS6, Line 327:   }
Timestamp?


http://gerrit.cloudera.org:8080/#/c/7793/6/fe/src/main/java/org/apache/impala/planner/RuntimeFilt

[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.

2017-10-11 Thread Tim Wood (Code Review)
Tim Wood has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8102 )

Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for 
Impala.
..


Patch Set 17:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q26.test
File testdata/workloads/tpcds/queries/tpcds-q26.test:

http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q26.test@6
PS17, Line 6: truncate(avg(cs_quantity) * 10.0) / 10 agg1,
> What is the reason behind changing the query text along with the results?
The query text in this file now reliably produces the results in this file, and 
the results match what I received as TPC-DS reference results from Greg.



--
To view, visit http://gerrit.cloudera.org:8080/8102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66
Gerrit-Change-Number: 8102
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Thu, 12 Oct 2017 00:05:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.

2017-10-11 Thread Tim Wood (Code Review)
Tim Wood has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8102 )

Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for 
Impala.
..


Patch Set 16:

Passing run of gerrit-verify-dryrun-external: 
https://jenkins.impala.io/job/gerrit-verify-dryrun-external/17/


--
To view, visit http://gerrit.cloudera.org:8080/8102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66
Gerrit-Change-Number: 8102
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Thu, 12 Oct 2017 00:01:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node

2017-10-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8242 )

Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a 
node isn't a coordinator node
..

IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a 
coordinator node

Since we introduced the FLAGS_is_coordinator, we've forgotten to
disable the coordinator specific thread pools on nodes that have
only the executor role.

This patch fixes that.

Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62
Reviewed-on: http://gerrit.cloudera.org:8080/8242
Reviewed-by: Dan Hecht 
Tested-by: Impala Public Jenkins
---
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
2 files changed, 16 insertions(+), 6 deletions(-)

Approvals:
  Dan Hecht: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62
Gerrit-Change-Number: 8242
Gerrit-PatchSet: 4
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node

2017-10-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8242 )

Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a 
node isn't a coordinator node
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62
Gerrit-Change-Number: 8242
Gerrit-PatchSet: 3
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 22:32:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6040: skip test multi compression types where Hive isn't supported

2017-10-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8259 )

Change subject: IMPALA-6040: skip test_multi_compression_types where Hive isn't 
supported
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1333/


--
To view, visit http://gerrit.cloudera.org:8080/8259
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e
Gerrit-Change-Number: 8259
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 11 Oct 2017 22:27:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6040: skip test multi compression types where Hive isn't supported

2017-10-11 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8259 )

Change subject: IMPALA-6040: skip test_multi_compression_types where Hive isn't 
supported
..


Patch Set 2: Code-Review+2

Hit IMPALA-6027; rebased to include its fix and carry +2.


--
To view, visit http://gerrit.cloudera.org:8080/8259
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e
Gerrit-Change-Number: 8259
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 11 Oct 2017 22:26:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6040: skip test multi compression types where Hive isn't supported

2017-10-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8259 )

Change subject: IMPALA-6040: skip test_multi_compression_types where Hive isn't 
supported
..


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1328/


--
To view, visit http://gerrit.cloudera.org:8080/8259
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e
Gerrit-Change-Number: 8259
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 11 Oct 2017 22:15:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-11 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 5:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@215
PS5, Line 215:   // Block metadata loading stats for a single HDFS path.
nit: File/block (since we're also loading/refreshing files). Also, you may want 
to change the name of the private class to reflect that, e.g. 
PathMetadataLoadingStats or something like that.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@217
PS5, Line 217: loadedFiles_
You may want to add a comment here. What is loaded vs refreshed? Is the one 
superset of the other. Good to clarify to avoid confusion.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@218
PS5, Line 218: _
I believe the convention is that we don't use '_' for public members.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@231
PS5, Line 231: Runnable
I don't know how this is used later on, but alternatively you can make 
PathBlockMetadataLoadRequest implement Callable, hence 
returning the stats when calling call(). Now you seem to access the stats only 
through the debugString() function.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@247
PS5, Line 247: Blocks on the loadBlockMetadata() call.
Not following this comment. run() either calls refreshBlockMetadata() or 
loadBlockMetadata(), so I can't really interpret what this comment is saying.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@333
PS5, Line 333: loadBlockMetadata
I know I am guilty for some of these names but maybe rename this to 
"resetAndLoadBlockMetadata"? Then it is more clear what the differences are 
between this function and rerfreshBlockMetadata().


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@363
PS5, Line 363: numUnknownDiskIds
Are you overriding or incrementing the value of numUnknownDiskIds in the 
create() function?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@368
PS5, Line 368: newFileDescs
Hm, that doesn't seem particularly safe (i.e. using the same list for every 
partition). Are we certain that any other partition modification operation 
(e.g. alter partition set location) will not try to override the file 
descriptors, thereby affecting all the other partitions?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@380
PS5, Line 380: changed
 :* mtime
It's not just the changed mtime that we're looking for in order to determiner 
modification, so you may want to either remove this or mention all the criteria 
we use. I prefer the former.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@383
PS5, Line 383: The initial table load still uses the listFiles()
 :* on the data directory that fetches both the FileStatus as 
well as BlockLocations in
 :* a single call.
remove


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@398
PS5, Line 398: get(0)
Comment why we take the file descriptors of one partition and that it doesn't 
really matter which one we choose.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@399
PS5, Line 399: public String apply(FileDescriptor desc) {
 :   return desc.getFileName();
 : }
Add @Override and make it a single line (if it fits)


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@448
PS5, Line 448: (fd == null) || (fd.getFileLength() != status.getLen()) ||
 :   (fd.getModificationTime() != status.getModificationTime());
I think we were also checking if the partition was cached. Isn't this check 
needed anymore?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@455
PS5, Line 455: refreshFileMetadata
Maybe call it refreshPartitionStorageMetadata? Overall, it may make sense to 
replace "File/Block" with "Storage" whenever it makes sense. Thoughts?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@694
PS5, Line 694: Exception
Why this change?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe

[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-10-11 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7938 )

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..


Patch Set 7:

I think this is really close. Please consider adding more tests as suggested in 
the comments.


--
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-Change-Number: 7938
Gerrit-PatchSet: 7
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 22:06:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork

2017-10-11 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7938 )

Change subject: IMPALA-5129: Use KRPC's Kinit code to avoid expensive fork
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7938/7/be/src/rpc/thrift-server-test.cc
File be/src/rpc/thrift-server-test.cc:

http://gerrit.cloudera.org:8080/#/c/7938/7/be/src/rpc/thrift-server-test.cc@96
PS7, Line 96: int kdc_port = GetServerPort();
static


http://gerrit.cloudera.org:8080/#/c/7938/7/be/src/rpc/thrift-server-test.cc@247
PS7, Line 247:
As discussed offline, may be it'd help to test with short renewal period to 
exercise the renewal path too if possible.



--
To view, visit http://gerrit.cloudera.org:8080/7938
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9cea56cc6e7412d87f4c2e92399a2f91ea6af6c7
Gerrit-Change-Number: 7938
Gerrit-PatchSet: 7
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 22:06:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors

2017-10-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8233 )

Change subject: IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker 
errors
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8233/3/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/8233/3/be/src/codegen/llvm-codegen.cc@1628
PS3, Line 1628: LOG(INFO) << diagnostic_handler->error_str_;
Include the query id for context, which should be accessible via state_? Will 
be hard to track down where the log message is coming from otherwise.


http://gerrit.cloudera.org:8080/#/c/8233/3/be/src/codegen/llvm-codegen.cc@1634
PS3, Line 1634: (std::
Should be able to drop the std:: - we import it into the namespace in 
common/names.h (there is other code in the codebase that is inconsistent about 
this).


http://gerrit.cloudera.org:8080/#/c/8233/3/be/src/codegen/llvm-codegen.cc@1635
PS3, Line 1635: error_str_.clear();
> I might be confused about the C++11 stuff, but given that you just std::mov
Yup


http://gerrit.cloudera.org:8080/#/c/8233/1/tests/query_test/test_codegen.py
File tests/query_test/test_codegen.py:

http://gerrit.cloudera.org:8080/#/c/8233/1/tests/query_test/test_codegen.py@45
PS1, Line 45:   def test_codegen_diagnostic_handler(self, vector):
> I don't know LLVM well enough to suggest how to get at these errors.
Yeah I think if you directly create a function or global variable with a 
conflicting name in the module you could induce an error that way.



--
To view, visit http://gerrit.cloudera.org:8080/8233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff
Gerrit-Change-Number: 8233
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 11 Oct 2017 21:51:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6027: Retry downloading toolchain components.

2017-10-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8258 )

Change subject: IMPALA-6027: Retry downloading toolchain components.
..

IMPALA-6027: Retry downloading toolchain components.

We've seen intermittent 500 errors when downloading the toolchain from
S3 over the HTTPS URLs. As a first stab, this commit retries 3 times,
with some jitter.

I also changed the threadpool introduced previously to have a limit
of 4 threads, because that's sufficient to get the speed improvement.
The 500 errors have been observed both before and after the threadpool
change.

For testing, I ran the straight-forward case directly. I introduced
a broken version string to observe that retries would happen on
any error from wget.

Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b
Reviewed-on: http://gerrit.cloudera.org:8080/8258
Reviewed-by: Thomas Tauber-Marshall 
Reviewed-by: Michael Brown 
Tested-by: Impala Public Jenkins
---
M bin/bootstrap_toolchain.py
1 file changed, 15 insertions(+), 4 deletions(-)

Approvals:
  Thomas Tauber-Marshall: Looks good to me, but someone else must approve
  Michael Brown: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b
Gerrit-Change-Number: 8258
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-6027: Retry downloading toolchain components.

2017-10-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8258 )

Change subject: IMPALA-6027: Retry downloading toolchain components.
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b
Gerrit-Change-Number: 8258
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Wed, 11 Oct 2017 21:45:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors

2017-10-11 Thread anujphadke (Code Review)
anujphadke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8233 )

Change subject: IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker 
errors
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/8233/2/be/src/codegen/llvm-codegen.cc@1621
PS2, Line 1621: DiagnosticHandler* diagnostic_handler = 
reinterpret_cast(context);
> I dont think there is any scenario as far as how we are using this handler,
You are right. I confused this with something I read about LLVMContext usage 
online. I don't think we need to check for a null here.



--
To view, visit http://gerrit.cloudera.org:8080/8233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff
Gerrit-Change-Number: 8233
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 11 Oct 2017 21:43:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-10-11 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 3:

(23 comments)

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/rpc/rpc-mgr.cc
File be/src/rpc/rpc-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/rpc/rpc-mgr.cc@77
PS3, Line 77:
Should we add a log message stating which services we registered with KRPC?

It might be useful later on as we add more services, while trying to debug user 
issues to know which services are on KRPC and which are on thrift. Granted 
there are other ways to find that, but this is easily accessible and 
straightforward.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.h@314
PS3, Line 314: w
susper-nit: Capital 'W'


http://gerrit.cloudera.org:8080/#/c/8023/1/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/1/be/src/runtime/krpc-data-stream-mgr.cc@198
PS1, Line 198: deseria
> Deserialization pool's purpose is to avoid executing deserialization in lin
Hmm, this seems like it would be a nice thing to have. Is the absence of a 
MemTracker the only hindrance to early deserialization?

Is there some way we could add this to the process MemTracker if we can't 
attribute it to a query? If it's too complicated for now, let's track this with 
a JIRA and write down some ideas there for the next KRPC milestone.


http://gerrit.cloudera.org:8080/#/c/8023/1/be/src/runtime/krpc-data-stream-mgr.cc@227
PS1, Line 227: // to make sure that the close operation is performed so add 
to per-recvr list of
 : // pending closes. It's possible for a sender to issue EOS 
RPC without sending any
 : // rows if no rows are m
> This may be a bit subtle but this is equivalent to the logic in the non-KRP
Ah you're right. Case 2 is what changed in IMPALA-5199, but it looks like 
that's automatically fixed here.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.cc@112
PS3, Line 112:   for (unique_ptr& ctx : 
early_senders.waiting_sender_ctxs) {
 :  EnqueueRowBatch({recvr->fragment_instance_id(), move(ctx)});
 :  num_senders_waiting_->Increment(-1);
 :   }
 :   for (unique_ptr& ctx : 
early_senders.closed_sender_ctxs) {
 :  recvr->RemoveSender(ctx->request->sender_id());
 :  Status::OK().ToProto(ctx->response->mutable_status());
 :  ctx->context->RespondSuccess();
 :  num_senders_waiting_->Increment(-1);
 :   }
It's not possible for the same sender to be in waiting_senders_ctxs and 
closed_sender_ctxs for a given receiver right?

Because if it would, it would make more sense to service the 
'closed_sender_ctxs' before the 'waiting_sender_ctxs' since we may as well 
close the receiver instead of wasting CPU processing those RPCs for a while.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.cc@140
PS3, Line 140:   while (range.first != range.second) {
 : shared_ptr recvr = range.first->second;
 : if (recvr->fragment_instance_id() == finst_id &&
 : recvr->dest_node_id() == node_id) {
 :   return recvr;
 : }
 : ++range.first;
 :   }
I'm thinking it makes sense to prioritize finding the receiver with the 
assumption that we will find it in the receiver_map_, rather than assume that 
it most likely will already be unregistered.

In other words, I think it may be more beneficial CPU-wise for general 
workloads to look in the 'receiver_map_' before looking into the 
'closed_stream_cache_'.

What do you think?


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.cc@151
PS3, Line 151: KrpcDataStreamMgr::AddEarlySender
We could merge the implementations of AddEarlySender() and 
AddEarlyClosedSender() by using templates and some extra params, but maybe the 
code complexity isn't worth it.


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.cc@174
PS3, Line 174: // that it is still preparing, so add payload to 
per-receiver list.
Add comment "In the worst case, this RPC is so late that the receiver is 
already unregistered and removed from the closed_stream_cache_, in which case 
it will be responded to by the Maintenance thread after 
FLAGS_datastream_sender_timeout_ms."


http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-mgr.cc@242
PS3, Line 242:  {
   

[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()

2017-10-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8255 )

Change subject: IMPALA-5940: Avoid log spew by using Status::Expected()
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1332/


--
To view, visit http://gerrit.cloudera.org:8080/8255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
Gerrit-Change-Number: 8255
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 21:34:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()

2017-10-11 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8255 )

Change subject: IMPALA-5940: Avoid log spew by using Status::Expected()
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
Gerrit-Change-Number: 8255
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 21:18:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()

2017-10-11 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8255

to look at the new patch set (#3).

Change subject: IMPALA-5940: Avoid log spew by using Status::Expected()
..

IMPALA-5940: Avoid log spew by using Status::Expected()

This change converts two more callers of Status() to Status::Expected()
to avoid log spew and the overhead of stack crawling. These two call
sites can easily fill the log when there is any network issue.

Testing done:

Ran concurrent TPC-DS workload with 256 users and verified the log spew is not 
there.

Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
---
M be/src/rpc/thrift-client.cc
M be/src/service/impala-server.cc
2 files changed, 9 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/8255/3
--
To view, visit http://gerrit.cloudera.org:8080/8255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
Gerrit-Change-Number: 8255
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.

2017-10-11 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8102 )

Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for 
Impala.
..


Patch Set 17:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q26.test
File testdata/workloads/tpcds/queries/tpcds-q26.test:

http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q26.test@6
PS17, Line 6: truncate(avg(cs_quantity) * 10.0) / 10 agg1,
What is the reason behind changing the query text along with the results?



--
To view, visit http://gerrit.cloudera.org:8080/8102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66
Gerrit-Change-Number: 8102
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Wed, 11 Oct 2017 19:32:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.

2017-10-11 Thread Tim Wood (Code Review)
Hello Matthew Mulder, Michael Brown, David Knupp, Alex Behm, Mostafa Mokhtar, 
Michael Ho,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8102

to look at the new patch set (#17).

Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for 
Impala.
..

IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.

Main source for TPCDS query and result definitions: 
https://github.com/gregrahn/tpcds-kit.
TPC-DS v2.5.0 qualification queries from G. Rahn, Cloudera, Inc.
Data set constructed in mini-cluster using $IMPALA_HOME/buildall.sh 
-testdata
This commit continues previous work on IMPALA-5376 in the ASF Impala repo
and the Cloudera Gerrit service.

This commit splits multi-query tests in the TPC-DS suite definition into one
query and result set per test file, as the test framework requires.  Names for
such files have -1, -2... inner suffixes.

The portion of the TPC-DS test suite in this commit passes.
It contains no failures, as reflected by runs of
$IMPALA_HOME/tests/run-tests.py query_test/test_tpcds_queries.py ...

IMPALA-6007 addresses the TPC-DS cases that require skipping (because we don't
support them or they flap) or expected-failure (xfail, because we support them
but they fail due to bugs.)  These require some added tooling for non-Pytest
frameworks like the stress test to avoid attempting them until they work.
Tests that flap are marked to skip, with a bug ID, since they don't reliably 
pass or xfail.

Expected result sets come from the TPC-DS kit.  Some TPC-DS test cases
in this commit have been modified in sematically-neutral ways so as to pass
on Impala.

The tests/query_test/test_tpcds_queries.py driver file is authoritative for the
active/skip/xfail status for each case and a brief reason.  The following list
describes the current status as:
--- test-name
deviance from TPC-DS spec
changes made

--- tpcds-q22a.test
RESULT MISMATCH in LSD of AVG() values
Fixed AVG()s
--- tpcds-q26.test
RESULT MISMATCH in LSD of AVG() values
-- ADDED TRUNCATE()s TO AVG()s IN SELECT COLUMNS
--- tpcds-q30.test
UNRECOGNIZED CHARACTER
ABSENT, IMPALA-5961.
--- tpcds-q31.test
RESULT MISMATCH in LSD of DECIMAL values
ABSENT, IMPALA-5956.
--- tpcds-q35a.test
RESULT MISMATCH
ABSENT, IMPALA-5950.
--- tpcds-q36a.test
RESULT MISMATCH
ABSENT, IMPALA-4741
--- tpcds-q47.test
RESULT MISMATCH in LSD of DECIMAL values
ADDED TRUNCATE(2) TO 8th COLUMN OF WITH TABLE, TAKE ACTUAL RESULT AS EXPECTED.
--- tpcds-q48.test
RESULT MISMATCH in scalar value
ABSENT, IMPALA-5950.
--- tpcds-q49.test
RESULT MISMATCH in LSD of DECIMAL values
ABSENT, IMPALA-5945
--- tpcds-q57.test
RESULT MISMATCH, excess scale in DECIMAL values
FIXED, ADDED TRUNCATE(2) AROUND 6th COLUMN.
--- tpcds-q58.test
RESULT MISMATCH in DECIMAL values
ABSENT, IMPALA-5946
--- tpcds-q59.test
RESULT MISMATCH, excess scale in DECIMAL values
FIXED, ADDED TRUNCATE(2) AROUND 4th-10th COLUMNS.
--- tpcds-q61.test
RESULT MISMATCH in DECIMAL value
FIXED. CAST RESULT QUOTIENT TO DECIMAL(15, 4), TAKE ACTUAL RESULT AS EXPECTED
--- tpcds-q63.test
RESULT MISMATCH, excess scale in DECIMAL values
ADDED TRUNCATE(2) TO 3rd COLUMN
--- tpcds-q64.test
RESULT MISMATCH
ADDED ORDER BY COLUMNS.
--- tpcds-q66.test
RESULT MISMATCH
ABSENT, IMPALA-4741
--- tpcds-q77a.test
RESULT MISMATCH
FIXED. TAKE ACTUAL RESULT AS EXPECTED
--- tpcds-q78.test
RESULT MISMATCH
FIXED. TAKE ACTUAL RESULT AS EXPECTED
--- tpcds-q83.test
RESULT MISMATCH
ABSENT, IMPALA-5945.
--- tpcds-q85.test
MISSING TABLE "reason"
ABSENT, IMPALA-5960
--- tpcds-q86a.test
RESULT MISMATCH
FIXED. TAKE ACTUAL RESULT AS EXPECTED
--- tpcds-q89.test
RESULT MISMATCH, DECIMAL values flap
ABSENT, ADDED ROUND(2) TO 8th COLUMN, TAKE ACTUAL RESULTS AS EXPECTED, 
IMPALA-5956.
--- tpcds-q90.test
RESULT MISMATCH
ABSENT, IMPALA-5945.
--- tpcds-q93.test
MISSING TABLE "reason"
ABSENT, IMPALA-5960
--- tpcds-q98.test
RESULT MISMATCH
FIXED, ADDED ROUND() TO LAST COLUMN

Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66
---
A testdata/workloads/tpcds/queries/tpcds-q10a.test
A testdata/workloads/tpcds/queries/tpcds-q11.test
A testdata/workloads/tpcds/queries/tpcds-q12.test
A testdata/workloads/tpcds/queries/tpcds-q13.test
A testdata/workloads/tpcds/queries/tpcds-q15.test
A testdata/workloads/tpcds/queries/tpcds-q16.test
A testdata/workloads/tpcds/queries/tpcds-q17.test
A testdata/workloads/tpcds/queries/tpcds-q18a.test
A testdata/workloads/tpcds/queries/tpcds-q20.test
A testdata/workloads/tpcds/queries/tpcds-q21.test
A testdata/workloads/tpcds/queries/tpcds-q22a.test
D testdata/workloads/tpcds/queries/tpcds-q23-1.test
D testdata/workloads/tpcds/queries/tpcds-q23-2.test
A testdata/workloads/tpcds/queries/tpcds-q25.test
A testdata/workloads/tpcds/queries/tpcds-q26.test
D testdata/workloads/tpcds/queries/tpcds-q27.test
D testdata/workloads/tpcds/queries/tpcds-q27a.test
M testdata/workloads/tpcds/queries/tpcds-q28.test
A testda

[Impala-ASF-CR] IMPALA-5789: Add always false flag in bloom filter

2017-10-11 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8170 )

Change subject: IMPALA-5789: Add always_false flag in bloom filter
..


Patch Set 4: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8170
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If680240a3cd4583fc97c3192177d86d9567c4f8d
Gerrit-Change-Number: 8170
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 11 Oct 2017 19:22:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-10-11 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@644
PS3, Line 644: for (int i = 0; i < channels_.size(); ++i) {
The RowBatch is serialized once per channel which is very wasteful.
IMPALA-6041.
Compare to 
https://github.com/michaelhkw/incubator-impala/blob/krpc-testing-hung/be/src/runtime/data-stream-sender.cc#L429



--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-Change-Number: 8023
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 19:14:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8255 )

Change subject: IMPALA-5940: Avoid log spew by using Status::Expected()
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
Gerrit-Change-Number: 8255
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 19:08:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()

2017-10-11 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8255 )

Change subject: IMPALA-5940: Avoid log spew by using Status::Expected()
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8255/2/be/src/rpc/thrift-client.cc
File be/src/rpc/thrift-client.cc:

http://gerrit.cloudera.org:8080/#/c/8255/2/be/src/rpc/thrift-client.cc@74
PS2, Line 74: are
typo: aren't



--
To view, visit http://gerrit.cloudera.org:8080/8255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
Gerrit-Change-Number: 8255
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 19:00:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()

2017-10-11 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8255 )

Change subject: IMPALA-5940: Avoid log spew by using Status::Expected()
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/rpc/thrift-client.cc
File be/src/rpc/thrift-client.cc:

http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/rpc/thrift-client.cc@71
PS1, Line 71: }
> I think we should make it clear why we use Expected() when we do (since we
Done


http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/service/impala-server.cc@1173
PS1, Line 1173: INTERNAL_ERROR
> hmm, if this is expected (as the comment claims), then this shouldn't be an
Fixed. Was just preserving the previous logic.



--
To view, visit http://gerrit.cloudera.org:8080/8255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
Gerrit-Change-Number: 8255
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 18:59:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()

2017-10-11 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8255

to look at the new patch set (#2).

Change subject: IMPALA-5940: Avoid log spew by using Status::Expected()
..

IMPALA-5940: Avoid log spew by using Status::Expected()

This change converts two more callers of Status() to Status::Expected()
to avoid log spew and the overhead of stack crawling. These two call
sites can easily fill the log when there is any network issue.

Testing done:

Ran concurrent TPC-DS workload with 256 users and verified the log spew is not 
there.

Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
---
M be/src/rpc/thrift-client.cc
M be/src/service/impala-server.cc
2 files changed, 9 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/8255/2
--
To view, visit http://gerrit.cloudera.org:8080/8255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
Gerrit-Change-Number: 8255
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-11 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 5:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc@35
PS5, Line 35: DEFINE_int32(num_metadata_loading_threads, 16,
: "(Advanced) The number of metadata loading threads (degree of 
parallelism) to use "
: "when loading catalog metadata.");
I'm confused by the commit message which talks about not loading from hms using 
multiple threads and this flag which indicates that hms is loaded using 
multiple threads.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@217
PS5, Line 217: public int loadedFiles_ = 0;
 : public int refreshedFiles_ = 0;
 : public int ignoredFiles_ = 0;
add comments for these-- see the question regarding refreshedFiles below, for 
example.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@368
PS5, Line 368: for (HdfsPartition partition: partitions) 
partition.setFileDescriptors(
Am I misreading this or does each partition get set to the same list of newly 
found descriptors?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@426
PS5, Line 426: new Reference(Long.valueOf(0)
why not use numUnknownDiskIds here?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@431
PS5, Line 431: ++loadStats.refreshedFiles_;
does refreshedFiles mean "file blocks reloaded" or "file checked for reload and 
possibly reloaded"?

would be good to track how many times the if-block on L418 was entered since 
this method is intended to be used when few changes are present.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@433
PS5, Line 433: for (HdfsPartition partition: partitions) 
partition.setFileDescriptors(n
same question as in the load method.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@773
PS5, Line 773: HDFS and S3
just to clarify, HdfsTable covers both hdfs table metadata as well as metadata 
needed for s3?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@783
PS5, Line 783: getFileSystem(CONF)
I noticed that this is called in many places in this class-- is it bc a given 
table can be stored on multiple filesystems?


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@801
PS5, Line 801: getLoadingThreadPoolSize
can different partitions have different number of files? if so, work across 
threads may vary. what's costly here: per file call, per partition call, or 
number of blocks per file?



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 5
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 11 Oct 2017 18:51:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8242 )

Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a 
node isn't a coordinator node
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62
Gerrit-Change-Number: 8242
Gerrit-PatchSet: 3
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 18:31:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals

2017-10-11 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8051 )

Change subject: IMPALA-5668: Fix cast(X as timestamp) for negative subsecond 
Decimals
..


Patch Set 4: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0
Gerrit-Change-Number: 8051
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Wed, 11 Oct 2017 18:36:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals

2017-10-11 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8051 )

Change subject: IMPALA-5668: Fix cast(X as timestamp) for negative subsecond 
Decimals
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators-ir.cc
File be/src/exprs/decimal-operators-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators-ir.cc@613
PS3, Line 613:   if (seconds < numeric_limits::min() ||
 :   seconds > numeric_limits::max()) {
 : // TimeStampVal() takes int64_t.
 : return TimestampVal::null();
> Good question - my guess is that the compiler is smart enough to eliminate
Looks like it should, but it may be worth confirming, e.g. with the compiler 
explorer.



--
To view, visit http://gerrit.cloudera.org:8080/8051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0
Gerrit-Change-Number: 8051
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Wed, 11 Oct 2017 18:36:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node

2017-10-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8242 )

Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a 
node isn't a coordinator node
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1329/


--
To view, visit http://gerrit.cloudera.org:8080/8242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62
Gerrit-Change-Number: 8242
Gerrit-PatchSet: 3
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 18:31:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5736: Add impala-shell argument to set default query options

2017-10-11 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8038 )

Change subject: IMPALA-5736: Add impala-shell argument to set default query 
options
..


Patch Set 4:

(10 comments)

Thanks! I think this is much clearer than the previous iteration. All of my 
comments are either nits or requests to add a few more comments.

http://gerrit.cloudera.org:8080/#/c/8038/4/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/8038/4/shell/impala_shell.py@686
PS4, Line 686: tmp_set = {}
Add a comment:

# Use a temporary to avoid changing set_query_options during iteration.

(An alternative is to use del, but use .items() instead of .iteritems())


http://gerrit.cloudera.org:8080/#/c/8038/4/shell/impala_shell.py@1330
PS4, Line 1330: if __name__ == "__main__":
Perhaps we need a big comment:

"""
There are two types of options: shell options and query_options. You can set 
these in the shell itself, which overrides settings on the command line, which 
override the defaults in impala_shell_config_defaults.py. Query options have no 
defaults within the impala-shell, but they do have defaults on the server.
"""


http://gerrit.cloudera.org:8080/#/c/8038/4/shell/impala_shell.py@1350
PS4, Line 1350:   # default options loaded in from 
impala_shell_config_defaults.py
Let's take the time to update this comment by disambiguating "shell options" vs 
"query options" here?


http://gerrit.cloudera.org:8080/#/c/8038/4/shell/impala_shell.py@1354
PS4, Line 1354: impala_shell_defaults.update(loaded_shell_options)
I think "impala_query_options_default" is empty, but this assymetry between 
impala_shell_defaults and query_options is weird. It's weird that you're 
updating impala_shell_defaults, rather than updating "shell_options".


http://gerrit.cloudera.org:8080/#/c/8038/4/shell/impala_shell.py@1437
PS4, Line 1437:   query_options.update(
Add comment:

Override query_options with those specified on the command line.


http://gerrit.cloudera.org:8080/#/c/8038/4/shell/option_parser.py
File shell/option_parser.py:

http://gerrit.cloudera.org:8080/#/c/8038/4/shell/option_parser.py@36
PS4, Line 36: def convert_loaded_shell_option(option, value, default_options, 
config_filename):
Please document config_filename. I'm unclear how it's used.


http://gerrit.cloudera.org:8080/#/c/8038/4/shell/option_parser.py@47
PS4, Line 47: # if the option is not set to either true or false, use 
the default
Do we need to log about the ignored value here?


http://gerrit.cloudera.org:8080/#/c/8038/4/shell/option_parser.py@36
PS4, Line 36: def convert_loaded_shell_option(option, value, default_options, 
config_filename):
: """Converts some values based on their type in default_options
: """
: if default_options[option] in [True, False]:
:   # validate the option if it can only be a boolean value
:   # the only choice for these options is true or false
:   if value.lower() == "true":
: return True
:   elif value.lower() == 'false':
: return False
:   else:
: # if the option is not set to either true or false, use 
the default
: return default_options[option]
: elif value.lower() == "none":
:   return None
: elif option.lower() == "config_file":
:   return config_filename
This function is mixing 2-space indent and 4-space indent. Please clean up.


http://gerrit.cloudera.org:8080/#/c/8038/4/shell/option_parser.py@164
PS4, Line 164:   "Only specify this as a option in the 
commandline."
s/as a option/as an option/


http://gerrit.cloudera.org:8080/#/c/8038/4/shell/option_parser.py@183
PS4, Line 183: help="Sets default query options."
Add: "May be specified multiple times." Unless it's clear from the usage?



--
To view, visit http://gerrit.cloudera.org:8080/8038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26a3b67230c80a99bd246b6af205d558fec9a986
Gerrit-Change-Number: 8038
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 11 Oct 2017 18:42:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node

2017-10-11 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8242 )

Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a 
node isn't a coordinator node
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8242/2/be/src/runtime/exec-env.h
File be/src/runtime/exec-env.h:

http://gerrit.cloudera.org:8080/#/c/8242/2/be/src/runtime/exec-env.h@186
PS2, Line 186: Thread
> I think this is meant to be a thread pool just for ExecQueryFInstances RPC
You're right. Done.



--
To view, visit http://gerrit.cloudera.org:8080/8242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62
Gerrit-Change-Number: 8242
Gerrit-PatchSet: 3
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 18:20:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node

2017-10-11 Thread Sailesh Mukil (Code Review)
Hello Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8242

to look at the new patch set (#3).

Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a 
node isn't a coordinator node
..

IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a 
coordinator node

Since we introduced the FLAGS_is_coordinator, we've forgotten to
disable the coordinator specific thread pools on nodes that have
only the executor role.

This patch fixes that.

Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62
---
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
2 files changed, 16 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/8242/3
--
To view, visit http://gerrit.cloudera.org:8080/8242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62
Gerrit-Change-Number: 8242
Gerrit-PatchSet: 3
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6040: skip test multi compression types where Hive isn't supported

2017-10-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8259 )

Change subject: IMPALA-6040: skip test_multi_compression_types where Hive isn't 
supported
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1328/


--
To view, visit http://gerrit.cloudera.org:8080/8259
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e
Gerrit-Change-Number: 8259
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 11 Oct 2017 18:16:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] Allow the SASL protocol service name to be configurable

2017-10-11 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8230 )

Change subject: Allow the SASL protocol service name to be configurable
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/client_negotiation.cc
File be/src/kudu/rpc/client_negotiation.cc:

http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/client_negotiation.cc@111
PS1, Line 111: const boost::optional& authn_token,
> Should we just take that patch to make (this and) future cherry picks easie
That patch is a tidy cleanup that adds the 'modernize-pass-by-value' rule, 
which we don't use ourselves.

Looking over the patch, it looks like not having it could cause a few more 
conflicts, but they can all be trivially resolved just as I did here.

Also, we will keep running into this issue of cherry-picked patches that 
conflict due to not having the entire history. So it's something we need to 
solve on a case by case basis.

My take is, let's only cherry-pick the patches that we really need. So as for 
this patch, since it doesn't help us a lot, I think we can leave it out.



--
To view, visit http://gerrit.cloudera.org:8080/8230
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e
Gerrit-Change-Number: 8230
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 18:12:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8242 )

Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a 
node isn't a coordinator node
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8242/2/be/src/runtime/exec-env.h
File be/src/runtime/exec-env.h:

http://gerrit.cloudera.org:8080/#/c/8242/2/be/src/runtime/exec-env.h@186
PS2, Line 186: General
I think this is meant to be a thread pool just for ExecQueryFInstances RPC (and 
that's what's meant by the "exec_rpc" in the name).



--
To view, visit http://gerrit.cloudera.org:8080/8242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62
Gerrit-Change-Number: 8242
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 18:06:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6040: skip test multi compression types where Hive isn't supported

2017-10-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8259 )

Change subject: IMPALA-6040: skip test_multi_compression_types where Hive isn't 
supported
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8259
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e
Gerrit-Change-Number: 8259
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 11 Oct 2017 18:03:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node

2017-10-11 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8242 )

Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a 
node isn't a coordinator node
..


Patch Set 2:

> It'd be good to update the header comment for these fields to say
 > that they are set only for coordinator role.

Done.


--
To view, visit http://gerrit.cloudera.org:8080/8242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62
Gerrit-Change-Number: 8242
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 18:01:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node

2017-10-11 Thread Sailesh Mukil (Code Review)
Hello Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8242

to look at the new patch set (#2).

Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a 
node isn't a coordinator node
..

IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a 
coordinator node

Since we introduced the FLAGS_is_coordinator, we've forgotten to
disable the coordinator specific thread pools on nodes that have
only the executor role.

This patch fixes that.

Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62
---
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
2 files changed, 16 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/8242/2
--
To view, visit http://gerrit.cloudera.org:8080/8242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62
Gerrit-Change-Number: 8242
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 


[Impala-ASF-CR] Allow the SASL protocol service name to be configurable

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8230 )

Change subject: Allow the SASL protocol service name to be configurable
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/client_negotiation.cc
File be/src/kudu/rpc/client_negotiation.cc:

http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/client_negotiation.cc@111
PS1, Line 111: const boost::optional& authn_token,
> There was a conflict here since we don't have this patch from the kudu code
Should we just take that patch to make (this and) future cherry picks easier?



--
To view, visit http://gerrit.cloudera.org:8080/8230
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e
Gerrit-Change-Number: 8230
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 17:57:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Allow the SASL protocol service name to be configurable

2017-10-11 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8230 )

Change subject: Allow the SASL protocol service name to be configurable
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/server_negotiation.cc
File be/src/kudu/rpc/server_negotiation.cc:

http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/server_negotiation.cc@128
PS1, Line 128: const security::TokenVerifier* token_verifier,
> Similarly, there was a conflict here since we don't have that same patch fr
Oops, sorry this wasn't a conflict. I misspoke. It was just in 
client_negotiation.



--
To view, visit http://gerrit.cloudera.org:8080/8230
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e
Gerrit-Change-Number: 8230
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 17:55:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Allow the SASL protocol service name to be configurable

2017-10-11 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8230 )

Change subject: Allow the SASL protocol service name to be configurable
..


Patch Set 1:

(2 comments)

> Is this a conflict free cherry-pick from something already in Kudu
 > codebase?

Yes this is a cherry-pick from the following kudu commit:
https://github.com/apache/kudu/commit/31d58522b50aea948f977c7cbc8b64f1b849f323

There were a couple of conflicts, but with simple resolutions.

http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/client_negotiation.cc
File be/src/kudu/rpc/client_negotiation.cc:

http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/client_negotiation.cc@111
PS1, Line 111: const boost::optional& authn_token,
There was a conflict here since we don't have this patch from the kudu code 
base:
https://github.com/apache/kudu/commit/50c7d3249ab5ca19fb4d3c0c8748a4a1c5945a12#diff-ced439e8f22cae7f007af6cb1daf945a

The above patch changes removes the const ref on this variable.

But the resolution was simple.


http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/server_negotiation.cc
File be/src/kudu/rpc/server_negotiation.cc:

http://gerrit.cloudera.org:8080/#/c/8230/1/be/src/kudu/rpc/server_negotiation.cc@128
PS1, Line 128: const security::TokenVerifier* token_verifier,
Similarly, there was a conflict here since we don't have that same patch from 
the kudu code base:
https://github.com/apache/kudu/commit/50c7d3249ab5ca19fb4d3c0c8748a4a1c5945a12#diff-ced439e8f22cae7f007af6cb1daf945a

The above patch changes removes the const ref on this variable.

But the resolution was simple.



--
To view, visit http://gerrit.cloudera.org:8080/8230
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e
Gerrit-Change-Number: 8230
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 17:54:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6030: Don't start coordinator specific thread pools if a node isn't a coordinator node

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8242 )

Change subject: IMPALA-6030: Don't start coordinator specific thread pools if a 
node isn't a coordinator node
..


Patch Set 1:

It'd be good to update the header comment for these fields to say that they are 
set only for coordinator role.


--
To view, visit http://gerrit.cloudera.org:8080/8242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cd046dbe5f10fa51fcc37338e48b94843891b62
Gerrit-Change-Number: 8242
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Comment-Date: Wed, 11 Oct 2017 17:53:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6027: Retry downloading toolchain components.

2017-10-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8258 )

Change subject: IMPALA-6027: Retry downloading toolchain components.
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1327/


--
To view, visit http://gerrit.cloudera.org:8080/8258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b
Gerrit-Change-Number: 8258
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Wed, 11 Oct 2017 17:47:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker errors

2017-10-11 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8233 )

Change subject: IMPALA-6002: Add a LLVM diagnostic handler for LLVM linker 
errors
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8233/3/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/8233/3/be/src/codegen/llvm-codegen.cc@308
PS3, Line 308:   }
nit: This might look like "... to main module.Bla bla." I.e., no space after 
the period.

You might want to add '<< " "' in here to make it look nicer.


http://gerrit.cloudera.org:8080/#/c/8233/3/be/src/codegen/llvm-codegen.cc@1635
PS3, Line 1635:
I might be confused about the C++11 stuff, but given that you just std::move'd 
it, I don't think you need to clear it.


http://gerrit.cloudera.org:8080/#/c/8233/1/tests/query_test/test_codegen.py
File tests/query_test/test_codegen.py:

http://gerrit.cloudera.org:8080/#/c/8233/1/tests/query_test/test_codegen.py@45
PS1, Line 45:   def test_codegen_diagnostic_handler(self, vector):
> I tried moving this to llvm-codegen-test.cc but it turns out  that a lot ug
I don't know LLVM well enough to suggest how to get at these errors.

That said, LoadModuleFromFile() takes a regular file, I think, and not one on 
HDFS.  You might be able to just cp ./llvm-ir/test-loop.bc to a temporary file 
to induce this. (Or create a second test-loop that has a link-time conflict: 
presumably name collisions cause errors?)



--
To view, visit http://gerrit.cloudera.org:8080/8233
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I907ff1f8975c02d422b4a669afbfc2e536ddf1ff
Gerrit-Change-Number: 8233
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 11 Oct 2017 17:43:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Allow the SASL protocol service name to be configurable

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8230 )

Change subject: Allow the SASL protocol service name to be configurable
..


Patch Set 1:

Is this a conflict free cherry-pick from something already in Kudu codebase?


--
To view, visit http://gerrit.cloudera.org:8080/8230
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e30fe4461893b67527333259579e2304b19af1e
Gerrit-Change-Number: 8230
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 17:44:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected()

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8255 )

Change subject: IMPALA-5940: Avoid log spew by using Status::Expected()
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/rpc/thrift-client.cc
File be/src/rpc/thrift-client.cc:

http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/rpc/thrift-client.cc@71
PS1, Line 71: }
I think we should make it clear why we use Expected() when we do (since we 
don't want to normally use it). So how about a quick comment about that 
explaining under what condition this error is expected.


http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8255/1/be/src/service/impala-server.cc@1173
PS1, Line 1173: INTERNAL_ERROR
hmm, if this is expected (as the comment claims), then this shouldn't be an 
INTERNAL_ERROR. INTERNAL_ERROR means we've hit a bug of some sort (i.e. some 
invariant was violated). That's not this case.

Anyway, you don't have to fix that now if you don't want.



--
To view, visit http://gerrit.cloudera.org:8080/8255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87493394074a7fdb4e912f4ab2436dd7050d45b3
Gerrit-Change-Number: 8255
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 17:39:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6040: skip test multi compression types where Hive isn't supported

2017-10-11 Thread Michael Brown (Code Review)
Michael Brown has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8259


Change subject: IMPALA-6040: skip test_multi_compression_types where Hive isn't 
supported
..

IMPALA-6040: skip test_multi_compression_types where Hive isn't supported

A recent commit "IMPALA-5448: fix invalid number of splits reported in
Parquet scan node" neglected to account for the fact that in some
environments, Impala runs without Hive. The typical pattern for tests
that use Hive is skip them if they are executed against such
environments.

Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e
---
M tests/query_test/test_scanners.py
1 file changed, 4 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/8259/1
--
To view, visit http://gerrit.cloudera.org:8080/8259
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3ad4b72839f8ac3bcb824287d02dd6964eea3e3e
Gerrit-Change-Number: 8259
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Brown 


[Impala-ASF-CR] IMPALA-2758: Change BufferedTupleStream::GetRows to returning multi batches

2017-10-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8226 )

Change subject: IMPALA-2758: Change BufferedTupleStream::GetRows to returning 
multi batches
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8226/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8226/1//COMMIT_MSG@10
PS1, Line 10: which returns multiple batches instead of one.
> I'm ok with the stream staying pinned, the main goal was to remove the GetR
WFM, agree the important step here is to eliminate GetRows().



--
To view, visit http://gerrit.cloudera.org:8080/8226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3831c38994da2b69775a9809ff01de5d23584414
Gerrit-Change-Number: 8226
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 11 Oct 2017 16:55:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6027: Retry downloading toolchain components.

2017-10-11 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8258 )

Change subject: IMPALA-6027: Retry downloading toolchain components.
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b
Gerrit-Change-Number: 8258
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Wed, 11 Oct 2017 16:53:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6027: Retry downloading toolchain components.

2017-10-11 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8258 )

Change subject: IMPALA-6027: Retry downloading toolchain components.
..


Patch Set 2: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b
Gerrit-Change-Number: 8258
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Wed, 11 Oct 2017 16:48:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6027: Retry downloading toolchain components.

2017-10-11 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/8258 )

Change subject: IMPALA-6027: Retry downloading toolchain components.
..

IMPALA-6027: Retry downloading toolchain components.

We've seen intermittent 500 errors when downloading the toolchain from
S3 over the HTTPS URLs. As a first stab, this commit retries 3 times,
with some jitter.

I also changed the threadpool introduced previously to have a limit
of 4 threads, because that's sufficient to get the speed improvement.
The 500 errors have been observed both before and after the threadpool
change.

For testing, I ran the straight-forward case directly. I introduced
a broken version string to observe that retries would happen on
any error from wget.

Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b
---
M bin/bootstrap_toolchain.py
1 file changed, 15 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/8258/2
--
To view, visit http://gerrit.cloudera.org:8080/8258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b
Gerrit-Change-Number: 8258
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-6027: Retry downloading toolchain components.

2017-10-11 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8258


Change subject: IMPALA-6027: Retry downloading toolchain components.
..

IMPALA-6027: Retry downloading toolchain components.

We've seen intermittent 500 errors when downloading the toolchain from
S3 over the HTTPS URLs. As a first stab, this commit retries 3 times,
with some jitter.

I also changed the threadpool introduced previously to have a limit
of 4 threads, because that's sufficient to get the speed improvement.
The 500 errors have been observed both before and after the threadpool
change.

For testing, I ran the straight-forward case directly. I introduced
a broken version string to observe that retries would happen on
any error from wget.

Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b
---
M bin/bootstrap_toolchain.py
1 file changed, 15 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/8258/1
--
To view, visit http://gerrit.cloudera.org:8080/8258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7669c7d41240aa0eb43c30d5bf2bd5c01b66180b
Gerrit-Change-Number: 8258
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-4704: Disallow client connections to imapalad until catalog is received.

2017-10-11 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8202 )

Change subject: IMPALA-4704: Disallow client connections to imapalad until 
catalog is received.
..


Patch Set 7:

A Frontend test was asserting an error message removed by this change.
Latest patch fixes that test. All tests pass with this latest patch.


--
To view, visit http://gerrit.cloudera.org:8080/8202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
Gerrit-Change-Number: 8202
Gerrit-PatchSet: 7
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 11 Oct 2017 16:33:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4704: Disallow client connections to imapalad until catalog is received.

2017-10-11 Thread Vuk Ercegovac (Code Review)
Hello Philip Zeyliger, Balazs Jeszenszky, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8202

to look at the new patch set (#7).

Change subject: IMPALA-4704: Disallow client connections to imapalad until 
catalog is received.
..

IMPALA-4704: Disallow client connections to imapalad until catalog is received.

Currently, impalad starts beeswax and hs2 servers even if the catalog has not 
yet
been received. As a result, client connections see an error message stating that
the impalad is not yet ready.

This patch changes the impalad startup sequence to wait until the catalog is
received before opening beeswax and hs2 ports and starting their servers.

Testing:
- python e2e tests that start a cluster without a catalog and check that
  client connections are rejected as expected.

Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
---
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/testutil/in-process-servers.cc
M bin/start-impala-cluster.py
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_cluster.py
M tests/common/impala_service.py
A tests/custom_cluster/test_catalog_wait.py
M tests/custom_cluster/test_coordinators.py
13 files changed, 157 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/8202/7
--
To view, visit http://gerrit.cloudera.org:8080/8202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
Gerrit-Change-Number: 8202
Gerrit-PatchSet: 7
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals

2017-10-11 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8051 )

Change subject: IMPALA-5668: Fix cast(X as timestamp) for negative subsecond 
Decimals
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators-ir.cc
File be/src/exprs/decimal-operators-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators-ir.cc@613
PS3, Line 613:   if (seconds < numeric_limits::min() ||
 :   seconds > numeric_limits::max()) {
 : // TimeStampVal() takes int64_t.
 : return TimestampVal::null();
> Is this branch now also evaluated for Decimal[4/8]Values? If so, will it ha
Good question - my guess is that the compiler is smart enough to eliminate this 
branch for 4/8.


http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators.h
File be/src/exprs/decimal-operators.h:

http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators.h@166
PS3, Line 166: //
> nit: Keep comment formatting consistent with the rest of the file.
Done


http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators.h@168
PS3, Line 168: TDecimal
> Did you follow some other example h
No, I just wanted to highlight that this T is a kind of decimal.



--
To view, visit http://gerrit.cloudera.org:8080/8051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0
Gerrit-Change-Number: 8051
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Wed, 11 Oct 2017 15:44:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals

2017-10-11 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8051

to look at the new patch set (#4).

Change subject: IMPALA-5668: Fix cast(X as timestamp) for negative subsecond 
Decimals
..

IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals

The timestamp conversion from negative fractional Decimal types
(interpreted as unix timestamp) was wrong - the whole part
was rounded toward zero, and fractional part was being added
instead of being subtracted. This is fixed by subtracting the
fractional part in case of negative decimals.

Example for the wrong behaviour:
+-+
| cast(cast(-0.1 as decimal(18,10)) as timestamp) |
+-+
| 1970-01-01 00:00:00.1   |
+-+
while casting to double works correctly:
+-+
| cast(cast(-0.1 as double) as timestamp) |
+-+
| 1969-12-31 23:59:59.9   |
+-+

Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0
---
M be/src/exprs/decimal-operators-ir.cc
M be/src/exprs/decimal-operators.h
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.h
4 files changed, 44 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/8051/4
--
To view, visit http://gerrit.cloudera.org:8080/8051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0
Gerrit-Change-Number: 8051
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals

2017-10-11 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8051 )

Change subject: IMPALA-5668: Fix cast(X as timestamp) for negative subsecond 
Decimals
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators-ir.cc
File be/src/exprs/decimal-operators-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators-ir.cc@613
PS3, Line 613:   if (seconds < numeric_limits::min() ||
 :   seconds > numeric_limits::max()) {
 : // TimeStampVal() takes int64_t.
 : return TimestampVal::null();
Is this branch now also evaluated for Decimal[4/8]Values? If so, will it have a 
perf impact?


http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators.h
File be/src/exprs/decimal-operators.h:

http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators.h@166
PS3, Line 166: //
nit: Keep comment formatting consistent with the rest of the file.


http://gerrit.cloudera.org:8080/#/c/8051/3/be/src/exprs/decimal-operators.h@168
PS3, Line 168: TDecimal
This naming convention usually indicates Thrift structures throughout the 
codebase. Did you follow some other example here? Otherwise "const T& 
decimal_value" seems more consistent.



--
To view, visit http://gerrit.cloudera.org:8080/8051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0
Gerrit-Change-Number: 8051
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Wed, 11 Oct 2017 15:21:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals

2017-10-11 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8051 )

Change subject: IMPALA-5668: Fix cast(X as timestamp) for negative subsecond 
Decimals
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8051/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8051/2//COMMIT_MSG@9
PS2, Line 9: The timestamp conversion from negative fractional Decimal types
   : (interpreted as unix timestamp) was wrong - the whole part
   : was rounded toward zero, and fractional part was being added
   : instead of being subtracted.
> The commit message should include a brief description of how change fixes i
Done


http://gerrit.cloudera.org:8080/#/c/8051/2//COMMIT_MSG@14
PS2, Line 14: Example for the wrong behaivour:
> Spelling. Please consider setting up an automatic spell checker in your tex
I am still trying different linux text editors.


http://gerrit.cloudera.org:8080/#/c/8051/2/be/src/exprs/decimal-operators-ir.cc
File be/src/exprs/decimal-operators-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8051/2/be/src/exprs/decimal-operators-ir.cc@617
PS2, Line 617:   if(dv.is_negative()) nanoseconds *= -1;
> I think it might be easier to read if you extract the sign in in line 610 a
I refactored this part a bit, now this "if" is no longer duplicated.


http://gerrit.cloudera.org:8080/#/c/8051/2/be/src/exprs/decimal-operators-ir.cc@625
PS2, Line 625: int64_t
> why is this 64bit?
You are right, a 32 bit int is always enough for the nanoseconds part. I have 
changed to code accordingly.


http://gerrit.cloudera.org:8080/#/c/8051/2/be/src/exprs/decimal-operators-ir.cc@640
PS2, Line 640: int128_t
> why is this 128bit?
Done


http://gerrit.cloudera.org:8080/#/c/8051/2/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8051/2/be/src/exprs/expr-test.cc@2363
PS2, Line 2363:   TestTimestampValue("cast(cast(-123.45 as decimal(9,2)) as 
timestamp)",
> If you include a negative example in the commit message, it should also be
I have added a very similar test and changed the commit message to contain the 
same example. The new test is useful because there are more than 9 fractional 
digits, which exercises a different branch in 
DecimalOperators::ConvertToNanoseconds.



--
To view, visit http://gerrit.cloudera.org:8080/8051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0
Gerrit-Change-Number: 8051
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Wed, 11 Oct 2017 14:43:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals

2017-10-11 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8051

to look at the new patch set (#3).

Change subject: IMPALA-5668: Fix cast(X as timestamp) for negative subsecond 
Decimals
..

IMPALA-5668: Fix cast(X as timestamp) for negative subsecond Decimals

The timestamp conversion from negative fractional Decimal types
(interpreted as unix timestamp) was wrong - the whole part
was rounded toward zero, and fractional part was being added
instead of being subtracted. This is fixed by subtracting the
fractional part in case of negative decimals.

Example for the wrong behaviour:
+-+
| cast(cast(-0.1 as decimal(18,10)) as timestamp) |
+-+
| 1970-01-01 00:00:00.1   |
+-+
while casting to double works correctly:
+-+
| cast(cast(-0.1 as double) as timestamp) |
+-+
| 1969-12-31 23:59:59.9   |
+-+

Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0
---
M be/src/exprs/decimal-operators-ir.cc
M be/src/exprs/decimal-operators.h
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.h
4 files changed, 43 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/8051/3
--
To view, visit http://gerrit.cloudera.org:8080/8051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8216f4c0f100c1bd68891cd6048236bfe4c205f0
Gerrit-Change-Number: 8051
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker