[Impala-ASF-CR] IMPALA-6193: Track memory of incoming data streams

2018-01-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8914 )

Change subject: IMPALA-6193: Track memory of incoming data streams
..


Patch Set 13: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1811/


--
To view, visit http://gerrit.cloudera.org:8080/8914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2df1204d2483313a8a18e5e3be6cec9e402614c4
Gerrit-Change-Number: 8914
Gerrit-PatchSet: 13
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 27 Jan 2018 05:40:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5528: Bump total thread cache size when KRPC is enabled

2018-01-26 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9058 )

Change subject: IMPALA-5528: Bump total thread cache size when KRPC is enabled
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8407528942051fb19a0222491347c9090d4b4b8d
Gerrit-Change-Number: 9058
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 27 Jan 2018 03:17:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6444: CTAS STORED AS KUDU not supporting reordering of columns

2018-01-26 Thread Pranay Singh (Code Review)
Pranay Singh has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9147


Change subject: IMPALA-6444: CTAS STORED AS KUDU not supporting reordering of columns
..

IMPALA-6444: CTAS STORED AS KUDU not supporting reordering of columns

In KuduTable.isPrimaryKeyColumn(), the lookup in primaryKeyColumnNames_ is
case-sensitive, so a column name that differs only in case is not matched
and primaryKeyExprs_ ends up empty. We then hit an assertion in
InsertStmt.prepareExpressions(), which generates the exception reported in
the JIRA.

The problem is fixed by having an ignoreCase match of the column names.
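
A minimal sketch of the case-insensitive match described above, written in
Python for brevity. The real change is in KuduTable.java; the function name
and the sample column names here are hypothetical and only illustrate the idea.

```python
def is_primary_key_column(column_name, primary_key_column_names):
    """Return True if column_name matches a key column, ignoring case."""
    wanted = column_name.casefold()
    return any(wanted == key.casefold() for key in primary_key_column_names)


# Hypothetical key list: the stored names and the looked-up name differ in case.
primary_keys = ["id", "region"]
matches = [name for name in ("ID", "Region", "name")
           if is_primary_key_column(name, primary_keys)]
```

With an exact, case-sensitive comparison neither "ID" nor "Region" would be
found, which corresponds to the empty key-expression list described above.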

Change-Id: Ica1c8ec1544339e9e80733a7a0c78594e0a727d2
Testing: The fix is verified against the test case in JIRA that fails.
---
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
1 file changed, 10 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/9147/1
--
To view, visit http://gerrit.cloudera.org:8080/9147
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ica1c8ec1544339e9e80733a7a0c78594e0a727d2
Gerrit-Change-Number: 9147
Gerrit-PatchSet: 1
Gerrit-Owner: Pranay Singh


[Impala-ASF-CR] IMPALA-6193: Track memory of incoming data streams

2018-01-26 Thread Lars Volker (Code Review)
Hello Michael Ho, Tim Armstrong, Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8914

to look at the new patch set (#13).

Change subject: IMPALA-6193: Track memory of incoming data streams
..

IMPALA-6193: Track memory of incoming data streams

This change adds memory tracking to incoming transmit data RPCs when
using KRPC. We track memory against a global tracker called "Data Stream
Service" until it is handed over to the stream manager. There we track
it in a global tracker called "Data Stream Queued RPC Calls" until a
receiver registers and takes over the early sender RPCs. Inside the
receiver, memory for deferred RPCs is tracked against the fragment
instance's memtracker until we unpack the batches and add them to the
row batch queue.

The DCHECK in MemTracker::Close() ensures that all memory consumed by a
tracker gets released eventually. In addition to that, this change adds a
custom cluster test that makes sure that queued memory gets tracked by
inspecting the peak consumption of the new memtrackers.
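
The hand-off between trackers described above can be sketched roughly as
follows. This is a toy Python model, not Impala's MemTracker: the class, the
transfer() helper, the flat parent/child layout and the 4096-byte payload are
all assumptions made for illustration.

```python
class MemTracker:
    """Toy tracker: consumption propagates to the parent, peak is remembered."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.consumption = 0
        self.peak = 0

    def consume(self, num_bytes):
        tracker = self
        while tracker is not None:
            tracker.consumption += num_bytes
            tracker.peak = max(tracker.peak, tracker.consumption)
            tracker = tracker.parent

    def release(self, num_bytes):
        tracker = self
        while tracker is not None:
            tracker.consumption -= num_bytes
            tracker = tracker.parent

    def close(self):
        # Same idea as the DCHECK: everything consumed must have been released.
        assert self.consumption == 0, f"{self.label} leaked {self.consumption} bytes"


def transfer(num_bytes, src, dst):
    """Hand the accounting for a payload from one tracker to another."""
    dst.consume(num_bytes)
    src.release(num_bytes)


process = MemTracker("Process")
service = MemTracker("Data Stream Service", process)
queued = MemTracker("Data Stream Queued RPC Calls", process)
fragment = MemTracker("Fragment instance", process)

payload = 4096                       # hypothetical RPC payload size in bytes
service.consume(payload)             # RPC arrives at the service
transfer(payload, service, queued)   # handed over to the stream manager
transfer(payload, queued, fragment)  # receiver registers and takes over
fragment.release(payload)            # batches unpacked into the row batch queue

for tracker in (service, queued, fragment, process):
    tracker.close()
```

After the run every tracker is back at zero, while the peak of the "queued"
tracker still records that the payload passed through it. That is the kind of
signal a test can inspect.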

Change-Id: I2df1204d2483313a8a18e5e3be6cec9e402614c4
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/runtime/exec-env.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/mem-tracker.h
M be/src/util/memory-metrics.h
M common/protobuf/data_stream_service.proto
A tests/custom_cluster/test_krpc_mem_usage.py
A tests/verifiers/mem_usage_verifier.py
13 files changed, 279 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8914/13
--
To view, visit http://gerrit.cloudera.org:8080/8914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2df1204d2483313a8a18e5e3be6cec9e402614c4
Gerrit-Change-Number: 8914
Gerrit-PatchSet: 13
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6193: Track memory of incoming data streams

2018-01-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8914 )

Change subject: IMPALA-6193: Track memory of incoming data streams
..


Patch Set 13:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1811/


--
To view, visit http://gerrit.cloudera.org:8080/8914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2df1204d2483313a8a18e5e3be6cec9e402614c4
Gerrit-Change-Number: 8914
Gerrit-PatchSet: 13
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 27 Jan 2018 02:28:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6193: Track memory of incoming data streams

2018-01-26 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8914 )

Change subject: IMPALA-6193: Track memory of incoming data streams
..


Patch Set 13: Code-Review+2

(1 comment)

Rebased the change, carrying Michael's +2.

http://gerrit.cloudera.org:8080/#/c/8914/7/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/8914/7/be/src/runtime/exec-env.cc@343
PS7, Line 343: new MemTracker(-1, "Data Stream Queued RPC Calls", 
mem_tracker_.get()));
 : RETURN_IF_ERROR(KrpcStreamMgr()->Init(stream_mgr_tracker, 
data_svc_tracker));
 :
 : unique_ptr data_svc(new 
DataStreamService(rpc_mgr_.get()));
 : i
> Thanks for the heads up, I'll double check.
Double checked that they are removed.



--
To view, visit http://gerrit.cloudera.org:8080/8914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2df1204d2483313a8a18e5e3be6cec9e402614c4
Gerrit-Change-Number: 8914
Gerrit-PatchSet: 13
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 27 Jan 2018 02:28:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6193: Track memory of incoming data streams

2018-01-26 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8914 )

Change subject: IMPALA-6193: Track memory of incoming data streams
..


Patch Set 11:

(3 comments)

Thanks for the review. I addressed the remaining comments in PS12 and will 
rebase the change next.

http://gerrit.cloudera.org:8080/#/c/8914/11/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/8914/11/be/src/runtime/exec-env.cc@332
PS11, Line 332: // deferred RPC calls.
> buffered in the stream manager
Done, though I'm not entirely sure whether I captured your intent correctly.


http://gerrit.cloudera.org:8080/#/c/8914/11/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8914/11/be/src/runtime/krpc-data-stream-mgr.h@294
PS11, Line 294:   /// specific receiver. Not owned.
> Used only to track payloads of deferred RPCs (e.g. early senders)
Done


http://gerrit.cloudera.org:8080/#/c/8914/11/be/src/runtime/krpc-data-stream-mgr.h@299
PS11, Line 299: the service
> The wording seems a bit confusing. It sounds as if DataStreamService is tra
Done



--
To view, visit http://gerrit.cloudera.org:8080/8914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2df1204d2483313a8a18e5e3be6cec9e402614c4
Gerrit-Change-Number: 8914
Gerrit-PatchSet: 11
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 27 Jan 2018 01:58:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6193: Track memory of incoming data streams

2018-01-26 Thread Lars Volker (Code Review)
Hello Michael Ho, Tim Armstrong, Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8914

to look at the new patch set (#12).

Change subject: IMPALA-6193: Track memory of incoming data streams
..

IMPALA-6193: Track memory of incoming data streams

This change adds memory tracking to incoming transmit data RPCs when
using KRPC. We track memory against a global tracker called "Data Stream
Service" until it is handed over to the stream manager. There we track
it in a global tracker called "Data Stream Queued RPC Calls" until a
receiver registers and takes over the early sender RPCs. Inside the
receiver, memory for deferred RPCs is tracked against the fragment
instance's memtracker until we unpack the batches and add them to the
row batch queue.

The DCHECK in MemTracker::Close() ensures that all memory consumed by a
tracker gets released eventually. In addition to that, this change adds a
custom cluster test that makes sure that queued memory gets tracked by
inspecting the peak consumption of the new memtrackers.

Change-Id: I2df1204d2483313a8a18e5e3be6cec9e402614c4
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/runtime/exec-env.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/mem-tracker.h
M be/src/util/memory-metrics.h
M common/protobuf/data_stream_service.proto
A tests/custom_cluster/test_krpc_mem_usage.py
A tests/verifiers/mem_usage_verifier.py
13 files changed, 284 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8914/12
--
To view, visit http://gerrit.cloudera.org:8080/8914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2df1204d2483313a8a18e5e3be6cec9e402614c4
Gerrit-Change-Number: 8914
Gerrit-PatchSet: 12
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6441 addendum: fix reading rows from HS2 via Impyla

2018-01-26 Thread David Knupp (Code Review)
David Knupp has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9146 )

Change subject: IMPALA-6441 addendum: fix reading rows from HS2 via Impyla
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id26d1db15c22a971dc1a346ad6d1df758306c3c5
Gerrit-Change-Number: 9146
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Comment-Date: Sat, 27 Jan 2018 00:52:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6441 addendum: fix reading rows from HS2 via Impyla

2018-01-26 Thread Michael Brown (Code Review)
Michael Brown has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9146


Change subject: IMPALA-6441 addendum: fix reading rows from HS2 via Impyla
..

IMPALA-6441 addendum: fix reading rows from HS2 via Impyla

When fetching explain output from HS2 using Impyla, rows come back in
lists of 1-tuples.
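
As an illustration of the row shape mentioned above, a small sketch that
flattens a list of 1-tuples into plain strings. The helper name and the sample
rows are made up; this is not the code in concurrent_select.py.

```python
def first_column(rows):
    """Flatten rows shaped like [(value,), (value,), ...] into [value, ...]."""
    return [row[0] for row in rows]


# Hypothetical explain output, shaped as a list of 1-tuples.
rows = [("line one of the plan",), ("line two of the plan",)]
explain_text = "\n".join(first_column(rows))
```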

This patch exhibits the need to do end-to-end testing when the case
warrants. In this case, although the unit test for
http://gerrit.cloudera.org:8080/9141 passes, I neglected to make sure
this worked in the stress test. Mea culpa. It works now.

Change-Id: Id26d1db15c22a971dc1a346ad6d1df758306c3c5
---
M tests/stress/concurrent_select.py
1 file changed, 2 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/9146/1
--
To view, visit http://gerrit.cloudera.org:8080/9146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id26d1db15c22a971dc1a346ad6d1df758306c3c5
Gerrit-Change-Number: 9146
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Brown 


[native-toolchain-CR](cdh6.x) Bump Kudu version to c6beta-impala-toolchain-tag1

2018-01-26 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9145 )

Change subject: Bump Kudu version to c6beta-impala-toolchain-tag1
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9145
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: cdh6.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I409cf30dc1554630bdebca1e831c22fc36e024d0
Gerrit-Change-Number: 9145
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Sat, 27 Jan 2018 00:19:47 +
Gerrit-HasComments: No


[native-toolchain-CR](cdh6.x) Bump Kudu version to c6beta-impala-toolchain-tag1

2018-01-26 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9145 )

Change subject: Bump Kudu version to c6beta-impala-toolchain-tag1
..

Bump Kudu version to c6beta-impala-toolchain-tag1

Change-Id: I409cf30dc1554630bdebca1e831c22fc36e024d0
---
M buildall.sh
1 file changed, 2 insertions(+), 1 deletion(-)

Approvals:
  Philip Zeyliger: Looks good to me, approved
  Thomas Tauber-Marshall: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9145
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: cdh6.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I409cf30dc1554630bdebca1e831c22fc36e024d0
Gerrit-Change-Number: 9145
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-5528: Bump total thread cache size when KRPC is enabled

2018-01-26 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9058 )

Change subject: IMPALA-5528: Bump total thread cache size when KRPC is enabled
..


Patch Set 1:

(1 comment)

The changes to the BE tests weren't strictly needed until we upgrade to 2.6.3 
when the default value of aggressive-decommit changes. However, we are already 
setting that flag in the initialization code so we may as well fix those tests 
to pick up the correct value of that flag.

http://gerrit.cloudera.org:8080/#/c/9058/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9058/1//COMMIT_MSG@30
PS1, Line 30: TCMALLOC_TRANSFER_NUM_OBJ will be tuned in a separate change. 
Previous attempt
> Nit: GPerfTools/TCMalloc needs to be upgraded to 2.6.3 to pick up the
Done



--
To view, visit http://gerrit.cloudera.org:8080/9058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8407528942051fb19a0222491347c9090d4b4b8d
Gerrit-Change-Number: 9058
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 26 Jan 2018 23:56:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5528: Bump total thread cache size when KRPC is enabled

2018-01-26 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9058 )

Change subject: IMPALA-5528: Bump total thread cache size when KRPC is enabled
..


Patch Set 2: Code-Review+1

Carry +1


--
To view, visit http://gerrit.cloudera.org:8080/9058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8407528942051fb19a0222491347c9090d4b4b8d
Gerrit-Change-Number: 9058
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 26 Jan 2018 23:56:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5528: Bump total thread cache size when KRPC is enabled

2018-01-26 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9058

to look at the new patch set (#2).

Change subject: IMPALA-5528: Bump total thread cache size when KRPC is enabled
..

IMPALA-5528: Bump total thread cache size when KRPC is enabled

KRPC in general tends to put more pressure on the thread
caches due to allocations of more small objects (i.e. <1MB).
While some of them are being addressed in KUDU-1865, it's shown
that the following TCMalloc workarounds will provide reasonable
performance with KRPC:

- TCMALLOC_TRANSFER_NUM_OBJ:
   - maximum number of objects per class type to transfer between
     thread and central caches.
   - the default value of 512 in 2.5.2 seems to cause the spin lock
     in the central cache to be held for too long with KRPC. 2.5.90
     and later revert this value to 32 by default.

- TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES
  - total amount of memory allocated to all thread caches in bytes
  - the default value is 32MB. We need to bump it to 1GB which is the
    internal cap in TCMalloc.

This change bumps the thread cache sizes to 1GB when KRPC is enabled and
FLAGS_TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES has the default value of 0.
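
The rule in the paragraph above can be sketched as a small decision function.
This is a Python illustration only: the function name is hypothetical, and the
behaviour for a non-zero flag (the explicit value is kept) is an assumption
rather than something stated in this message.

```python
ONE_GB = 1 << 30  # 1GB, the internal cap mentioned above


def effective_thread_cache_bytes(flag_value, krpc_enabled):
    """Return the total thread cache size to request, or None for TCMalloc's default."""
    if flag_value != 0:
        return flag_value  # assumption: an explicit setting wins
    if krpc_enabled:
        return ONE_GB      # flag left at its default of 0 and KRPC is on
    return None            # leave TCMalloc's own default in place
```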

GPerfTools/TCMalloc needs to be upgraded to 2.5.90 or above to pick up
the change of the default value of TCMALLOC_TRANSFER_NUM_OBJ. A previous
attempt to upgrade GPerfTools to 2.6.3 failed on certain platforms
(IMPALA-6414).

Also fixes a couple of BE tests to initialize the test environment properly.

Change-Id: I8407528942051fb19a0222491347c9090d4b4b8d
---
M be/src/runtime/bufferpool/buffer-allocator-test.cc
M be/src/runtime/bufferpool/free-list-test.cc
M be/src/runtime/bufferpool/suballocator-test.cc
M be/src/runtime/exec-env.cc
4 files changed, 43 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/9058/2
--
To view, visit http://gerrit.cloudera.org:8080/9058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8407528942051fb19a0222491347c9090d4b4b8d
Gerrit-Change-Number: 9058
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6346: Potential deadlock in KrpcDataStreamMgr

2018-01-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8950 )

Change subject: IMPALA-6346: Potential deadlock in KrpcDataStreamMgr
..


Patch Set 9: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1810/


--
To view, visit http://gerrit.cloudera.org:8080/8950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib7d1a8f12a4821092ca61ccc8a6f20c0404d56c7
Gerrit-Change-Number: 8950
Gerrit-PatchSet: 9
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 26 Jan 2018 23:31:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5054: [SECURITY] Enable KRPC w/ TLS in Impala

2018-01-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8439 )

Change subject: IMPALA-5054: [SECURITY] Enable KRPC w/ TLS in Impala
..


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1809/


--
To view, visit http://gerrit.cloudera.org:8080/8439
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9a14a44fdea9ab668f3714eb69fdb188bce38f5a
Gerrit-Change-Number: 8439
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 26 Jan 2018 23:20:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6356: Reduce amount of logging from RpczStore::LogTrace()

2018-01-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9125 )

Change subject: IMPALA-6356: Reduce amount of logging from RpczStore::LogTrace()
..

IMPALA-6356: Reduce amount of logging from RpczStore::LogTrace()

This change bumps the threshold of RPC duration above which an RPC
is logged. It's increased from 1 second to 2 minutes, which is
a conservative value, in order to reduce the amount of logging from
RpczStore::LogTrace() when an Impala daemon is busy.
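
A rough sketch of the thresholding idea, in Python. The constant and function
names are hypothetical; only the 1 second and 2 minute values come from the
description above.

```python
OLD_THRESHOLD_SECONDS = 1
NEW_THRESHOLD_SECONDS = 2 * 60


def should_log_trace(duration_seconds, threshold_seconds=NEW_THRESHOLD_SECONDS):
    """Log an RPC trace only when the RPC ran longer than the threshold."""
    return duration_seconds > threshold_seconds
```

An RPC that took 5 seconds would have been logged under the old threshold but
is skipped under the new one.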

Change-Id: I347b0dea641368e10ba84bc40ec250c26a4f43b2
Reviewed-on: http://gerrit.cloudera.org:8080/9125
Reviewed-by: Michael Ho 
Reviewed-by: Mostafa Mokhtar 
Tested-by: Impala Public Jenkins
---
M be/src/rpc/rpc-mgr.cc
1 file changed, 6 insertions(+), 0 deletions(-)

Approvals:
  Michael Ho: Looks good to me, but someone else must approve
  Mostafa Mokhtar: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9125
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I347b0dea641368e10ba84bc40ec250c26a4f43b2
Gerrit-Change-Number: 9125
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6356: Reduce amount of logging from RpczStore::LogTrace()

2018-01-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9125 )

Change subject: IMPALA-6356: Reduce amount of logging from RpczStore::LogTrace()
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9125
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I347b0dea641368e10ba84bc40ec250c26a4f43b2
Gerrit-Change-Number: 9125
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 26 Jan 2018 23:07:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6410: compare branches: use looser expression

2018-01-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9135 )

Change subject: IMPALA-6410: compare_branches: use looser expression
..

IMPALA-6410: compare_branches: use looser expression

We've already got one use of "Cherry-pick:" instead of "Cherry-picks:"
in master, so I'm loosening the regular expression a bit. (And
converting the string search into a case-insensitive regexp search.)
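
For illustration, a loosened, case-insensitive search of this kind could look
like the sketch below. The pattern is an assumption and is not copied from
bin/compare_branches.py.

```python
import re

# Assumed pattern: optional trailing "s", matched case-insensitively at the
# start of any line of the commit message.
CHERRY_PICK_RE = re.compile(r"^cherry-picks?:", re.IGNORECASE | re.MULTILINE)


def has_cherry_pick_marker(commit_message):
    """Return True if the message contains a Cherry-pick(s) marker line."""
    return CHERRY_PICK_RE.search(commit_message) is not None
```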

I tested this by running it manually and inspecting results.

Change-Id: Ie3f75d9e01d2760571547b1a1a5f42bbc8455a05
Reviewed-on: http://gerrit.cloudera.org:8080/9135
Reviewed-by: Taras Bobrovytsky 
Tested-by: Impala Public Jenkins
---
M bin/compare_branches.py
1 file changed, 7 insertions(+), 6 deletions(-)

Approvals:
  Taras Bobrovytsky: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie3f75d9e01d2760571547b1a1a5f42bbc8455a05
Gerrit-Change-Number: 9135
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-6410: compare branches: use looser expression

2018-01-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9135 )

Change subject: IMPALA-6410: compare_branches: use looser expression
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie3f75d9e01d2760571547b1a1a5f42bbc8455a05
Gerrit-Change-Number: 9135
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Comment-Date: Fri, 26 Jan 2018 22:45:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2642: Fix a potential deadlock in statestore

2018-01-26 Thread Zoram Thanga (Code Review)
Hello Bharath Vissapragada, Michael Ho, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9038

to look at the new patch set (#3).

Change subject: IMPALA-2642: Fix a potential deadlock in statestore
..

IMPALA-2642: Fix a potential deadlock in statestore

The statestored can deadlock if the number of subscribers has
reached STATESTORE_MAX_SUBSCRIBERS, because the DoSubscriberUpdate()
method calls OfferUpdate() while holding subscribers_lock_, and
OfferUpdate() also tries to take the same lock in this situation.

Fix the issue by moving the acquisition of subscribers_lock_ out of
OfferUpdate() and relying on the callers to take it. We also make
the maximum number of statestore subscribers a start-up time tunable,
to allow us to test the limit more easily.
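
The locking pattern behind the fix can be sketched in Python as follows. This
is a simplified model, not the statestore code: the class, the method names and
the rejection rule are assumptions chosen to show a helper that expects its
caller to hold the lock.

```python
import threading


def reacquire_would_block(lock):
    """Return True if taking `lock` again from the same thread would block."""
    with lock:
        got_it = lock.acquire(blocking=False)
        if got_it:
            lock.release()
        return not got_it


class Statestore:
    def __init__(self, max_subscribers):
        self.max_subscribers = max_subscribers
        self.subscribers = set()
        self.subscribers_lock = threading.Lock()  # non-reentrant

    def _offer_update_locked(self, subscriber_id):
        """Caller must already hold subscribers_lock; no locking in here."""
        assert self.subscribers_lock.locked()
        if (subscriber_id not in self.subscribers
                and len(self.subscribers) >= self.max_subscribers):
            return False  # limit reached: reject instead of deadlocking
        self.subscribers.add(subscriber_id)
        return True

    def do_subscriber_update(self, subscriber_id):
        with self.subscribers_lock:
            return self._offer_update_locked(subscriber_id)


store = Statestore(max_subscribers=2)
results = [store.do_subscriber_update(name) for name in ("a", "b", "c")]
```

reacquire_would_block() shows why the original shape hangs: a plain mutex
cannot be taken twice by the same thread. Making the helper rely on its caller
for locking removes the second acquisition.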

Testing: The problem is easily reproduced by lowering the value of
STATESTORE_MAX_SUBSCRIBERS to 3, and then launching a mini cluster
with 3 impalads. Without the fix, the statestored becomes completely
deadlocked.

A new EE test has been added to exercise this scenario. The test
verifies that statestored correctly rejects new subscription
requests when the limit is reached.

Change-Id: I5d49dede221ce1f50ec299643b5532c61f93f0c6
---
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
A tests/custom_cluster/test_custom_statestore.py
3 files changed, 111 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/9038/3
--
To view, visit http://gerrit.cloudera.org:8080/9038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5d49dede221ce1f50ec299643b5532c61f93f0c6
Gerrit-Change-Number: 9038
Gerrit-PatchSet: 3
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Zoram Thanga 


[Impala-ASF-CR] IMPALA-6346: Potential deadlock in KrpcDataStreamMgr

2018-01-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8950 )

Change subject: IMPALA-6346: Potential deadlock in KrpcDataStreamMgr
..


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1810/


--
To view, visit http://gerrit.cloudera.org:8080/8950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib7d1a8f12a4821092ca61ccc8a6f20c0404d56c7
Gerrit-Change-Number: 8950
Gerrit-PatchSet: 9
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 26 Jan 2018 19:46:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6346: Potential deadlock in KrpcDataStreamMgr

2018-01-26 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8950 )

Change subject: IMPALA-6346: Potential deadlock in KrpcDataStreamMgr
..


Patch Set 9: Code-Review+2

Fixed clang tidy issue. Rebase, Carry +2.


--
To view, visit http://gerrit.cloudera.org:8080/8950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib7d1a8f12a4821092ca61ccc8a6f20c0404d56c7
Gerrit-Change-Number: 8950
Gerrit-PatchSet: 9
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 26 Jan 2018 19:46:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6346: Potential deadlock in KrpcDataStreamMgr

2018-01-26 Thread Sailesh Mukil (Code Review)
Hello Michael Ho, Lars Volker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8950

to look at the new patch set (#9).

Change subject: IMPALA-6346: Potential deadlock in KrpcDataStreamMgr
..

IMPALA-6346: Potential deadlock in KrpcDataStreamMgr

In KrpcDataStreamMgr::CreateRecvr() we take the lock_ and
then call recvr->TakeOverEarlySender() for all contexts.
recvr->TakeOverEarlySender() then calls
recvr_->mgr_->EnqueueDeserializeTask(), which can block if the
deserialize pool queue is full. The next thread to become available
in that queue will also have to acquire lock_, thus leading to a
deadlock.

We fix this by moving the EarlySendersList out of the
EarlySendersMap and dropping the lock before taking any actions on
the RPC contexts in the EarlySendersList. All functions called after
dropping 'lock_' do not require the lock to protect them as they are
thread safe.
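
The locking pattern described above can be sketched roughly as follows. This is a minimal illustration and not the code from the patch: StreamMgrSketch, RpcContext and the integer receiver ids are hypothetical stand-ins, and EnqueueDeserializeTask() here only takes the lock briefly instead of blocking on a real queue.

```cpp
#include <cstddef>
#include <mutex>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins; not the actual Impala types.
struct RpcContext { int id; };
using EarlySendersList = std::vector<RpcContext>;

class StreamMgrSketch {
 public:
  // Records a sender that arrived before its receiver was created.
  void AddEarlySender(int recvr_id, RpcContext ctx) {
    std::lock_guard<std::mutex> l(lock_);
    early_senders_[recvr_id].push_back(ctx);
  }

  // Hands all early senders over to the new receiver and returns their count.
  int CreateRecvr(int recvr_id) {
    EarlySendersList pending;
    {
      std::lock_guard<std::mutex> l(lock_);
      auto it = early_senders_.find(recvr_id);
      if (it != early_senders_.end()) {
        // Move the list out of the map while holding the lock.
        pending = std::move(it->second);
        early_senders_.erase(it);
      }
    }  // The lock is released here, before any call that may need it.
    int handed_over = 0;
    for (const RpcContext& ctx : pending) {
      EnqueueDeserializeTask(ctx);
      ++handed_over;
    }
    return handed_over;
  }

  // Number of receiver ids that still have early senders waiting.
  std::size_t NumPendingReceivers() {
    std::lock_guard<std::mutex> l(lock_);
    return early_senders_.size();
  }

 private:
  // Stand-in for work that also needs lock_. Calling it while lock_ is
  // already held by the same thread would deadlock.
  void EnqueueDeserializeTask(const RpcContext& ctx) {
    std::lock_guard<std::mutex> l(lock_);
    processed_.push_back(ctx.id);
  }

  std::mutex lock_;
  std::unordered_map<int, EarlySendersList> early_senders_;
  std::vector<int> processed_;
};
```

Because the list is moved out of the map inside the critical section, no other thread can observe or modify it afterwards, so the loop can safely run without the lock.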

Additionally modified the BE test data-stream-test to work with KRPC
as well.

Testing: Added a new test to data-stream-test to verify that the
deadlock does not happen. Also, I verified that this test hangs
without the fix.

Change-Id: Ib7d1a8f12a4821092ca61ccc8a6f20c0404d56c7
---
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-mgr.cc
2 files changed, 257 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/8950/9
--
To view, visit http://gerrit.cloudera.org:8080/8950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib7d1a8f12a4821092ca61ccc8a6f20c0404d56c7
Gerrit-Change-Number: 8950
Gerrit-PatchSet: 9
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5054: [SECURITY] Enable KRPC w/ TLS in Impala

2018-01-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8439 )

Change subject: IMPALA-5054: [SECURITY] Enable KRPC w/ TLS in Impala
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1809/


--
To view, visit http://gerrit.cloudera.org:8080/8439
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9a14a44fdea9ab668f3714eb69fdb188bce38f5a
Gerrit-Change-Number: 8439
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 26 Jan 2018 19:36:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5054: [SECURITY] Enable KRPC w/ TLS in Impala

2018-01-26 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8439 )

Change subject: IMPALA-5054: [SECURITY] Enable KRPC w/ TLS in Impala
..


Patch Set 6: Code-Review+2

(2 comments)

Rebase. Carry +2.

Ran secure stress tests to confirm that there are no regressions. Also fixed a 
minor incorrect API usage bug.

http://gerrit.cloudera.org:8080/#/c/8439/5/be/src/rpc/rpc-mgr-test.cc
File be/src/rpc/rpc-mgr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8439/5/be/src/rpc/rpc-mgr-test.cc@279
PS5, Line 279:
 : TEST_F(RpcMgrTest, MultipleServices) {
> delete
Done


http://gerrit.cloudera.org:8080/#/c/8439/5/be/src/rpc/rpc-mgr.cc
File be/src/rpc/rpc-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8439/5/be/src/rpc/rpc-mgr.cc@100
PS5, Line 100: }
> Makes me wonder if we should have a test case in which remote nodes somehow
We don't support that configuration, so I think we can leave it out for now. 
Also, in the worst case, it will be caught during connection negotiation.



--
To view, visit http://gerrit.cloudera.org:8080/8439
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9a14a44fdea9ab668f3714eb69fdb188bce38f5a
Gerrit-Change-Number: 8439
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 26 Jan 2018 19:36:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5054: [SECURITY] Enable KRPC w/ TLS in Impala

2018-01-26 Thread Sailesh Mukil (Code Review)
Hello Michael Ho, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8439

to look at the new patch set (#6).

Change subject: IMPALA-5054: [SECURITY] Enable KRPC w/ TLS in Impala
..

IMPALA-5054: [SECURITY] Enable KRPC w/ TLS in Impala

KRPC has some flags that turn on TLS. This patch sets those to enable
TLS communication.

Tests are added to rpc-mgr-test.

Change-Id: I9a14a44fdea9ab668f3714eb69fdb188bce38f5a
---
M be/src/catalog/catalogd-main.cc
M be/src/rpc/authentication-test.cc
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/thrift-server.cc
M be/src/rpc/thrift-server.h
M be/src/runtime/exec-env.cc
M be/src/service/impala-server.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore.cc
M be/src/statestore/statestored-main.cc
M be/src/testutil/in-process-servers.cc
M be/src/util/openssl-util.cc
M be/src/util/openssl-util.h
15 files changed, 329 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/8439/6
--
To view, visit http://gerrit.cloudera.org:8080/8439
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9a14a44fdea9ab668f3714eb69fdb188bce38f5a
Gerrit-Change-Number: 8439
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6356: Reduce amount of logging from RpczStore::LogTrace()

2018-01-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9125 )

Change subject: IMPALA-6356: Reduce amount of logging from RpczStore::LogTrace()
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1808/


--
To view, visit http://gerrit.cloudera.org:8080/9125
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I347b0dea641368e10ba84bc40ec250c26a4f43b2
Gerrit-Change-Number: 9125
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 26 Jan 2018 19:26:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6193: Track memory of incoming data streams

2018-01-26 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8914 )

Change subject: IMPALA-6193: Track memory of incoming data streams
..


Patch Set 11: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8914/11/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/8914/11/be/src/runtime/exec-env.cc@332
PS11, Line 332: // deferred RPC calls.
buffered in the stream manager


http://gerrit.cloudera.org:8080/#/c/8914/11/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8914/11/be/src/runtime/krpc-data-stream-mgr.h@294
PS11, Line 294:   /// specific receiver. Not owned.
Used only to track payloads of deferred RPCs (e.g. early senders)


http://gerrit.cloudera.org:8080/#/c/8914/11/be/src/runtime/krpc-data-stream-mgr.h@299
PS11, Line 299: the service
The wording seems a bit confusing. It sounds as if DataStreamService is 
transferring memory to itself. So it may help to clarify what service here means. 
How about replacing "the service" with "datastream mgr / receiver" ?



--
To view, visit http://gerrit.cloudera.org:8080/8914
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2df1204d2483313a8a18e5e3be6cec9e402614c4
Gerrit-Change-Number: 8914
Gerrit-PatchSet: 11
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 26 Jan 2018 19:17:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6410: compare_branches: use looser expression

2018-01-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9135 )

Change subject: IMPALA-6410: compare_branches: use looser expression
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1807/


--
To view, visit http://gerrit.cloudera.org:8080/9135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie3f75d9e01d2760571547b1a1a5f42bbc8455a05
Gerrit-Change-Number: 9135
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Comment-Date: Fri, 26 Jan 2018 19:06:25 +
Gerrit-HasComments: No


[native-toolchain-CR](cdh6.x) Bump Kudu version to c6beta-impala-toolchain-tag1

2018-01-26 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9145 )

Change subject: Bump Kudu version to c6beta-impala-toolchain-tag1
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9145
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: cdh6.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I409cf30dc1554630bdebca1e831c22fc36e024d0
Gerrit-Change-Number: 9145
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 26 Jan 2018 18:47:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls

2018-01-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8523 )

Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls
..


Patch Set 8:

I probably won't have a chance to look through this in detail, but I think Lars 
should have a look at the scheduler changes - he's the most familiar with that 
code.


--
To view, visit http://gerrit.cloudera.org:8080/8523
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
Gerrit-Change-Number: 8523
Gerrit-PatchSet: 8
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 26 Jan 2018 18:20:05 +
Gerrit-HasComments: No


[native-toolchain-CR] Bump Kudu version to c6beta-impala-toolchain-tag1

2018-01-26 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has abandoned this change. ( 
http://gerrit.cloudera.org:8080/9086 )

Change subject: Bump Kudu version to c6beta-impala-toolchain-tag1
..


Abandoned

see https://gerrit.cloudera.org/#/c/9145/
--
To view, visit http://gerrit.cloudera.org:8080/9086
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I409cf30dc1554630bdebca1e831c22fc36e024d0
Gerrit-Change-Number: 9086
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-6113: Skip row groups with predicates on NULL columns

2018-01-26 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9140 )

Change subject: IMPALA-6113: Skip row groups with predicates on NULL columns
..


Patch Set 3:

(4 comments)

Thanks for taking a look, Zoli!

http://gerrit.cloudera.org:8080/#/c/9140/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9140/2//COMMIT_MSG@7
PS2, Line 7: IMPALA-6113
> it's IMPALA-6113
Nice, thx :)


http://gerrit.cloudera.org:8080/#/c/9140/2/be/src/exec/parquet-column-stats.h
File be/src/exec/parquet-column-stats.h:

http://gerrit.cloudera.org:8080/#/c/9140/2/be/src/exec/parquet-column-stats.h@78
PS2, Line 78: returned
> Currently it returns a pointer to a member of ColumnChunk. This means that
In this case the ColumnChunk lives way longer than the code where I use the 
stats (ColumnChunk is part of the file_metadata_ in HDFSParquetScanner).

I'd prefer to write a comment that warns future users of this function about 
this restriction.


http://gerrit.cloudera.org:8080/#/c/9140/2/be/src/exec/parquet-column-stats.cc
File be/src/exec/parquet-column-stats.cc:

http://gerrit.cloudera.org:8080/#/c/9140/2/be/src/exec/parquet-column-stats.cc@132
PS2, Line 132: const int64_t* ColumnStatsBase::ReadNullCountStat(const 
parquet::ColumnChunk& col_chunk) {
 :   if (!(col_chunk.__isset.meta_data && col_c
> nit: can fit into one line
Done


http://gerrit.cloudera.org:8080/#/c/9140/2/be/src/exec/parquet-column-stats.cc@139
PS2, Line 139: }
 :
 : Sta
> nit: can be a one-liner
Done



--
To view, visit http://gerrit.cloudera.org:8080/9140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I141317af0e0df30da8f220b29b0bfba364f40ddf
Gerrit-Change-Number: 9140
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 26 Jan 2018 14:53:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6113: Skip row groups with predicates on NULL columns

2018-01-26 Thread Gabor Kaszab (Code Review)
Hello Lars Volker, Zoltan Borok-Nagy, anujphadke, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9140

to look at the new patch set (#3).

Change subject: IMPALA-6113: Skip row groups with predicates on NULL columns
..

IMPALA-6113: Skip row groups with predicates on NULL columns

Based on the existing Parquet column chunk level statistics null_count,
Impala's Parquet scanner is enhanced to skip an entire row group if the
null_count statistics indicate that all the values under the predicated
column are NULL as we wouldn't get any result rows from that row group
anyway.
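
The skip condition described above can be sketched as follows. This is an illustration only, not the scanner code from the patch: ColumnChunkStatsSketch and CanSkipRowGroup are hypothetical names, and the sketch assumes the predicate can never evaluate to true for a NULL value (which would not hold for a predicate such as IS NULL).

```cpp
#include <cstdint>

// Hypothetical, simplified view of the metadata of one column chunk.
struct ColumnChunkStatsSketch {
  int64_t num_values;   // total number of values in the chunk
  bool has_null_count;  // false if the writer recorded no null_count
  int64_t null_count;   // only meaningful if has_null_count is true
};

// Returns true if the row group can be skipped for a predicate on this
// column. Without a null_count statistic nothing can be concluded.
bool CanSkipRowGroup(const ColumnChunkStatsSketch& col) {
  if (!col.has_null_count) return false;
  return col.null_count == col.num_values;
}
```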

Change-Id: I141317af0e0df30da8f220b29b0bfba364f40ddf
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/parquet-column-stats.cc
M be/src/exec/parquet-column-stats.h
M testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
M tests/query_test/test_parquet_stats.py
5 files changed, 56 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/9140/3
--
To view, visit http://gerrit.cloudera.org:8080/9140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I141317af0e0df30da8f220b29b0bfba364f40ddf
Gerrit-Change-Number: 9140
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-5237: Support a quoted string in date/time format

2018-01-26 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8508 )

Change subject: IMPALA-5237: Support a quoted string in date/time format
..


Patch Set 7:

(5 comments)

Thanks for getting back to this review! :)

One general comment: Try to avoid sending a code rebase and new changes in the 
same patch set, as it makes reviewing the changes between the current and the 
previous patch set more difficult.

http://gerrit.cloudera.org:8080/#/c/8508/7/be/src/runtime/timestamp-parse-util.h
File be/src/runtime/timestamp-parse-util.h:

http://gerrit.cloudera.org:8080/#/c/8508/7/be/src/runtime/timestamp-parse-util.h@189
PS7, Line 189:   //  This function returns three kinds of values via the output 
parameters.
This sentence is not needed in my opinion.


http://gerrit.cloudera.org:8080/#/c/8508/7/be/src/runtime/timestamp-parse-util.h@190
PS7, Line 190:   //  str is to point to the opening quote when the function is 
called. Once the function
str is assumed to point to...
This seems more natural to me. What do you think?


http://gerrit.cloudera.org:8080/#/c/8508/7/be/src/runtime/timestamp-parse-util.h@192
PS7, Line 192:   //  position_of_string_literal is start position of the string 
literal.
Some separator between the param name and its description would be great, e.g. 
check the function comments below.


http://gerrit.cloudera.org:8080/#/c/8508/7/be/src/runtime/timestamp-parse-util.cc
File be/src/runtime/timestamp-parse-util.cc:

http://gerrit.cloudera.org:8080/#/c/8508/7/be/src/runtime/timestamp-parse-util.cc@161
PS7, Line 161: const char*& str, int& position_of_string_literal, int& 
length_of_string_literal,
One thing I learned recently is that in Impala, out parameters are preferred 
as pointers rather than references.

Tim, can you confirm?
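
If that convention applies here, the signature could look roughly like the sketch below. FindQuotedLiteral is a hypothetical helper written only to show pointer out parameters; it is not the function from the patch, and its parameter names are assumptions.

```cpp
#include <cstring>

// Hypothetical helper: 'str' must point to an opening single quote.
// On success, writes the offset and length of the text between the quotes
// through the two out parameters and returns true.
bool FindQuotedLiteral(const char* str, int* literal_start, int* literal_len) {
  if (str == nullptr || *str != '\'') return false;
  const char* close = std::strchr(str + 1, '\'');
  if (close == nullptr) return false;  // no closing quote
  *literal_start = 1;                  // the literal begins right after the quote
  *literal_len = static_cast<int>(close - (str + 1));
  return true;
}
```

With pointers, a call site reads FindQuotedLiteral(s, &start, &len), so it is visible at the call which arguments may be modified.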


http://gerrit.cloudera.org:8080/#/c/8508/7/be/src/runtime/timestamp-parse-util.cc@203
PS7, Line 203:   ++str;
It might be worth a comment here that incrementing this results in str 
pointing to the first char after the closing quote. (I know it's mentioned in 
the comment for GetStringLiteralBetweenQuotes, but repeating it here might 
improve readability.)



--
To view, visit http://gerrit.cloudera.org:8080/8508
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178
Gerrit-Change-Number: 8508
Gerrit-PatchSet: 7
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 26 Jan 2018 14:22:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6311: Skip row groups with predicates on NULL columns

2018-01-26 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9140 )

Change subject: IMPALA-6311: Skip row groups with predicates on NULL columns
..


Patch Set 2:

(4 comments)

LGTM, had minor comments

http://gerrit.cloudera.org:8080/#/c/9140/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9140/2//COMMIT_MSG@7
PS2, Line 7: IMPALA-6311
it's IMPALA-6113


http://gerrit.cloudera.org:8080/#/c/9140/2/be/src/exec/parquet-column-stats.h
File be/src/exec/parquet-column-stats.h:

http://gerrit.cloudera.org:8080/#/c/9140/2/be/src/exec/parquet-column-stats.h@78
PS2, Line 78: int64_t*
Currently it returns a pointer to a member of ColumnChunk. This means that the 
returned pointer can easily become dangling if the corresponding ColumnChunk 
object dies.
Maybe you could add a comment about it.

Or, it could return an int64_t, and -1 would indicate that there are no 
null_count statistics.
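
The second option could be sketched like this. StatisticsSketch and ReadNullCountStatOrMinusOne are hypothetical names that only stand in for the Thrift-generated struct and the function under review.

```cpp
#include <cstdint>

// Hypothetical stand-in for the Thrift-generated statistics struct.
struct StatisticsSketch {
  bool has_null_count;  // mirrors an __isset-style flag
  int64_t null_count;
};

// Returns the null count by value, or -1 if the statistic is not set.
// The caller keeps no pointer into the metadata, so nothing can dangle.
int64_t ReadNullCountStatOrMinusOne(const StatisticsSketch& stats) {
  return stats.has_null_count ? stats.null_count : -1;
}
```

The trade-off is that every caller has to remember to check for the -1 sentinel before using the value.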


http://gerrit.cloudera.org:8080/#/c/9140/2/be/src/exec/parquet-column-stats.cc
File be/src/exec/parquet-column-stats.cc:

http://gerrit.cloudera.org:8080/#/c/9140/2/be/src/exec/parquet-column-stats.cc@132
PS2, Line 132: const int64_t* ColumnStatsBase::ReadNullCountStat(
 : const parquet::ColumnChunk& col_chunk) {
nit: can fit into one line


http://gerrit.cloudera.org:8080/#/c/9140/2/be/src/exec/parquet-column-stats.cc@139
PS2, Line 139:   if (stats.__isset.null_count) {
 :     return &stats.null_count;
 :   }
nit: can be a one-liner



--
To view, visit http://gerrit.cloudera.org:8080/9140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I141317af0e0df30da8f220b29b0bfba364f40ddf
Gerrit-Change-Number: 9140
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 26 Jan 2018 11:19:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5717: Support for ORC data files

2018-01-26 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9134 )

Change subject: IMPALA-5717: Support for ORC data files
..


Patch Set 2:

Here is a document about this patch: 
https://docs.google.com/document/d/1Lg-MmZIis-ZbmMf6cD8YJq4x2tM0UXYPyzf0AYqe6Gc


--
To view, visit http://gerrit.cloudera.org:8080/9134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7b6ae4ce3b9ee8125b21993702faa87537790a4
Gerrit-Change-Number: 9134
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 26 Jan 2018 11:17:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5717: Support for ORC data files

2018-01-26 Thread Quanlong Huang (Code Review)
Quanlong Huang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9134


Change subject: IMPALA-5717: Support for ORC data files
..

IMPALA-5717: Support for ORC data files

This patch integrates the orc-reader into Impala and implements
HdfsOrcScanner as a middle layer between them. The HdfsOrcScanner
supplies input needed from the orc-reader, tracks memory consumption of
the reader and transfers the reader's output (orc::ColumnVectorBatch)
into impala::RowBatch.

Instead of being linked as a third-party library, the orc-reader is
integrated at the source level, leaving room for further optimizations
such as predicate pushdown and code generation. Currently, we haven't
changed any of the orc-reader code. It lives in folder be/src/exec/orc.

Currently, we only support reading primitive types. Writing into ORC
tables is not supported either.

Tests
Most of the end-to-end tests can run on the ORC format, and all of
them pass.

Change-Id: Ia7b6ae4ce3b9ee8125b21993702faa87537790a4
---
M be/CMakeLists.txt
M be/src/exec/CMakeLists.txt
A be/src/exec/hdfs-orc-scanner-test.cc
A be/src/exec/hdfs-orc-scanner.cc
A be/src/exec/hdfs-orc-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-mt.cc
A be/src/orc/Adaptor.hh
A be/src/orc/Adaptor.hh.in
A be/src/orc/ByteRLE.cc
A be/src/orc/ByteRLE.hh
A be/src/orc/C09Adapter.cc
A be/src/orc/CMakeLists.txt
A be/src/orc/ColumnPrinter.cc
A be/src/orc/ColumnPrinter.hh
A be/src/orc/ColumnReader.cc
A be/src/orc/ColumnReader.hh
A be/src/orc/Compression.cc
A be/src/orc/Compression.hh
A be/src/orc/Exceptions.cc
A be/src/orc/Exceptions.hh
A be/src/orc/Int128.cc
A be/src/orc/Int128.hh
A be/src/orc/LzoDecompressor.cc
A be/src/orc/LzoDecompressor.hh
A be/src/orc/MemoryPool.cc
A be/src/orc/MemoryPool.hh
A be/src/orc/OrcFile.cc
A be/src/orc/OrcFile.hh
A be/src/orc/RLE.cc
A be/src/orc/RLE.hh
A be/src/orc/RLEv1.cc
A be/src/orc/RLEv1.hh
A be/src/orc/RLEv2.cc
A be/src/orc/RLEv2.hh
A be/src/orc/Reader.cc
A be/src/orc/Reader.hh
A be/src/orc/Timezone.cc
A be/src/orc/Timezone.hh
A be/src/orc/Type.hh
A be/src/orc/TypeImpl.cc
A be/src/orc/TypeImpl.hh
A be/src/orc/Vector.cc
A be/src/orc/Vector.hh
A be/src/orc/orc-config.hh
A be/src/orc/orc-config.hh.in
A be/src/orc/orc_proto.proto
A be/src/orc/wrap/coded-stream-wrapper.h
A be/src/orc/wrap/gmock.h
A be/src/orc/wrap/gtest-wrapper.h
A be/src/orc/wrap/orc-proto-wrapper.cc
A be/src/orc/wrap/orc-proto-wrapper.hh
A be/src/orc/wrap/snappy-wrapper.h
A be/src/orc/wrap/zero-copy-stream-wrapper.h
M common/thrift/CatalogObjects.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
M fe/src/main/java/org/apache/impala/catalog/HdfsStorageDescriptor.java
M fe/src/main/jflex/sql-scanner.flex
M testdata/bin/generate-schema-statements.py
M testdata/bin/run-hive-server.sh
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/functional-query_core.csv
M testdata/workloads/functional-query/functional-query_dimensions.csv
M testdata/workloads/functional-query/functional-query_exhaustive.csv
M testdata/workloads/functional-query/functional-query_pairwise.csv
M tests/common/test_dimensions.py
M tests/comparison/cli_options.py
M tests/query_test/test_decimal_queries.py
M tests/query_test/test_scanners.py
70 files changed, 15,389 insertions(+), 8 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/9134/2
--
To view, visit http://gerrit.cloudera.org:8080/9134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia7b6ae4ce3b9ee8125b21993702faa87537790a4
Gerrit-Change-Number: 9134
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-6311: Skip row groups with predicates on NULL columns

2018-01-26 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9140 )

Change subject: IMPALA-6311: Skip row groups with predicates on NULL columns
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9140/1/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
File testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test:

http://gerrit.cloudera.org:8080/#/c/9140/1/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test@463
PS1, Line 463: insert into table_for_null_count_test values (1, NULL), (2, 
NULL), (3, NULL);
> Can we modify these create table statements and move them
Hey Anuj, thanks for taking a look!
Sure, I can move the create table statements to the .py file.

For my benefit, could you please let me know what we gain if we moved these 
there? Can't I use the $DATABASE variable in the .test file to use the 
unique_database?



--
To view, visit http://gerrit.cloudera.org:8080/9140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I141317af0e0df30da8f220b29b0bfba364f40ddf
Gerrit-Change-Number: 9140
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 26 Jan 2018 11:11:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6311: Skip row groups with predicates on NULL columns

2018-01-26 Thread Gabor Kaszab (Code Review)
Hello Lars Volker, Zoltan Borok-Nagy, anujphadke, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9140

to look at the new patch set (#2).

Change subject: IMPALA-6311: Skip row groups with predicates on NULL columns
..

IMPALA-6311: Skip row groups with predicates on NULL columns

Based on the existing Parquet column chunk level statistics null_count,
Impala's Parquet scanner is enhanced to skip an entire row group if the
null_count statistics indicate that all the values under the predicated
column are NULL as we wouldn't get any result rows from that row group
anyway.

Change-Id: I141317af0e0df30da8f220b29b0bfba364f40ddf
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/parquet-column-stats.cc
M be/src/exec/parquet-column-stats.h
M testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
M tests/query_test/test_parquet_stats.py
5 files changed, 58 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/9140/2
--
To view, visit http://gerrit.cloudera.org:8080/9140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I141317af0e0df30da8f220b29b0bfba364f40ddf
Gerrit-Change-Number: 9140
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: anujphadke