[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

2016-12-01 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
..


IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

The bug was that HdfsScanNodeMt::Close() did not properly
clean up all in-flight resources when called through the
query cancellation path.

The main change is to clean up all resources when passing
a NULL batch into HdfsparquetScanner::Close() which also
needed similar changes in the scanner context.

Testing: Ran test_cancellation.py, test_scanners.py and
test_nested_types.py with MT_DOP=3. Added a test query
with a limit that was failing before.
A regular private hdfs/core test run succeeded.

Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
Reviewed-on: http://gerrit.cloudera.org:8080/5274
Reviewed-by: Alex Behm 
Tested-by: Internal Jenkins
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test
6 files changed, 53 insertions(+), 35 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

2016-12-01 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
..


Patch Set 6: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

2016-11-30 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
..


Patch Set 6: Code-Review+2

trivial fix due to conflict after rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/5274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

2016-11-30 Thread Alex Behm (Code Review)
Hello Marcel Kornacker, Internal Jenkins, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5274

to look at the new patch set (#6).

Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
..

IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

The bug was that HdfsScanNodeMt::Close() did not properly
clean up all in-flight resources when called through the
query cancellation path.

The main change is to clean up all resources when passing
a NULL batch into HdfsparquetScanner::Close() which also
needed similar changes in the scanner context.

Testing: Ran test_cancellation.py, test_scanners.py and
test_nested_types.py with MT_DOP=3. Added a test query
with a limit that was failing before.
A regular private hdfs/core test run succeeded.

Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test
6 files changed, 53 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/5274/6
-- 
To view, visit http://gerrit.cloudera.org:8080/5274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

2016-11-30 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
..


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

2016-11-30 Thread Alex Behm (Code Review)
Hello Marcel Kornacker, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5274

to look at the new patch set (#5).

Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
..

IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

The bug was that HdfsScanNodeMt::Close() did not properly
clean up all in-flight resources when called through the
query cancellation path.

The main change is to clean up all resources when passing
a NULL batch into HdfsparquetScanner::Close() which also
needed similar changes in the scanner context.

Testing: Ran test_cancellation.py, test_scanners.py and
test_nested_types.py with MT_DOP=3. Added a test query
with a limit that was failing before.
A regular private hdfs/core test run succeeded.

Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test
6 files changed, 52 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/5274/5
-- 
To view, visit http://gerrit.cloudera.org:8080/5274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

2016-11-30 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5274/4/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

PS4, Line 110: bufDesc
> Wrong case for C++ code. Maybe just buffer?
Oops, sorry. Done.


http://gerrit.cloudera.org:8080/#/c/5274/4/testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test
File testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test:

Line 12: from tpch_nested_parquet.customer c, c.c_orders o
> out of curiosity, why does this require a nested table to invoke the cancel
I don't think it's required but I had a hard time finding another query on our 
existing test data that could reliably reproduce the issue. The cancellation 
tests repro the problem every time, but I wanted to add another "simpler" test 
here as well.


-- 
To view, visit http://gerrit.cloudera.org:8080/5274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

2016-11-30 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
..


Patch Set 4: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5274/4/testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test
File testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test:

Line 12: from tpch_nested_parquet.customer c, c.c_orders o
out of curiosity, why does this require a nested table to invoke the 
cancellation path?


-- 
To view, visit http://gerrit.cloudera.org:8080/5274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

2016-11-30 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#4).

Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
..

IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

The bug was that HdfsScanNodeMt::Close() did not properly
clean up all in-flight resources when called through the
query cancellation path.

The main change is to clean up all resources when passing
a NULL batch into HdfsparquetScanner::Close() which also
needed similar changes in the scanner context.

Testing: Ran test_cancellation.py, test_scanners.py and
test_nested_types.py with MT_DOP=3. Added a test query
with a limit that was failing before.
A regular private hdfs/core test run succeeded.

Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test
6 files changed, 53 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/5274/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

2016-11-30 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5274/2/be/src/exec/parquet-column-readers.h
File be/src/exec/parquet-column-readers.h:

Line 232:   virtual void Close(RowBatch* row_batch) = 0;
> Doc that row_batch can be NULL.
Missed this the first time around, sorry. Done.


-- 
To view, visit http://gerrit.cloudera.org:8080/5274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

2016-11-30 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#3).

Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
..

IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

The bug was that HdfsScanNodeMt::Close() did not properly
clean up all in-flight resources when called through the
query cancellation path.

The main change is to clean up all resources when passing
a NULL batch into HdfsparquetScanner::Close() which also
needed similar changes in the scanner context.

Testing: Ran test_cancellation.py, test_scanners.py and
test_nested_types.py with MT_DOP=3. Added a test query
with a limit that was failing before.
A regular private hdfs/core test run succeeded.

Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test
6 files changed, 51 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/5274/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

2016-11-30 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
..


Patch Set 2:

(5 comments)

Agree completely about doing more stress/ASAN runs.

http://gerrit.cloudera.org:8080/#/c/5274/2/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

PS2, Line 240: .get()
> .get() is unnecessary
Done


Line 251:   DCHECK_EQ(template_tuple_pool_.get()->total_allocated_bytes(), 0);
> .get() is unnecessary here too
Done


Line 278:   if (level_cache_pool_.get() != nullptr) {
> Move this up to where we free the other pools for consistency
Done


http://gerrit.cloudera.org:8080/#/c/5274/2/be/src/exec/parquet-column-readers.h
File be/src/exec/parquet-column-readers.h:

Line 346: if (decompressed_data_pool_.get() != nullptr) {
> This check looks unnecessary since it should never be non-NULL - it's initi
Done


http://gerrit.cloudera.org:8080/#/c/5274/2/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

Line 258: /// pool to batch.  If 'done' is set, releases the completed 
resources.
> Mention that attaching only happens when both contains_tuple_data_ is true 
Good point, adjusted comment.


-- 
To view, visit http://gerrit.cloudera.org:8080/5274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

2016-11-30 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
..


Patch Set 2:

(6 comments)

The change looks good aside from some minor comments. I think we should try 
running a local stress test under ASAN with mt_dop set just to flush out 
remaining issues. I'm happy to do that or help with that.

http://gerrit.cloudera.org:8080/#/c/5274/2/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

PS2, Line 240: .get()
.get() is unnecessary


Line 251:   DCHECK_EQ(template_tuple_pool_.get()->total_allocated_bytes(), 0);
.get() is unnecessary here too


Line 278:   if (level_cache_pool_.get() != nullptr) {
Move this up to where we free the other pools for consistency


http://gerrit.cloudera.org:8080/#/c/5274/2/be/src/exec/parquet-column-readers.h
File be/src/exec/parquet-column-readers.h:

Line 232:   virtual void Close(RowBatch* row_batch) = 0;
Doc that row_batch can be NULL.


Line 346: if (decompressed_data_pool_.get() != nullptr) {
This check looks unnecessary since it should never be non-NULL - it's 
initialised in the constructor and then never reset.


http://gerrit.cloudera.org:8080/#/c/5274/2/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

Line 258: /// pool to batch.  If 'done' is set, releases the completed 
resources.
Mention that attaching only happens when both contains_tuple_data_ is true and 
done is false?


-- 
To view, visit http://gerrit.cloudera.org:8080/5274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

2016-11-30 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#2).

Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
..

IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

The bug was that HdfsScanNodeMt::Close() did not properly
clean up all in-flight resources when called through the
query cancellation path.

The main change is to clean up all resources when passing
a NULL batch into HdfsparquetScanner::Close() which also
needed similar changes in the scanner context.

Testing: Ran test_cancellation.py, test_scanners.py and
test_nested_types.py with MT_DOP=3. Added a test query
with a limit that was failing before.
A regular private hdfs/core test run succeeded.

Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test
6 files changed, 42 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/5274/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm