[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.

The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
  page, etc instead of relying on BufferedBlockMgr to do it. A
  wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
  similar counters in the BufferPool client are more useful.
* Changed the tuple null indicators so that they are allocated
  before each tuple, rather than in a block at the start of the
  page. This slightly reduces the memory density, but greatly
  simplifies much logic. In particular, it avoids problems with
  RowIdx and larger pages with offsets that don't fit in 16 bits.
* Buffer management of read/write streams uses the new pin-counting
  support to separate pinning of the read and write pages.
  This means that the reservation usage of an unpinned read/write
  stream does not fluctuate and the read/write iterators can always
  advance without requiring additional reservation.

Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.

The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
  TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
  it requires global changes to operators that accumulate memory.

Testing:
All of the BufferedTupleStream unit tests are ported to the new
implementation, except for ones specifically testing the small
buffer functionality.

Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Reviewed-on: http://gerrit.cloudera.org:8080/5811
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/exec/hash-table-test.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
23 files changed, 3,012 insertions(+), 105 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 17: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 17
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 17:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/386/

-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 17
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-16 Thread Tim Armstrong (Code Review)
Hello Impala Public Jenkins, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5811

to look at the new patch set (#17).

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..

IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.

The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
  page, etc instead of relying on BufferedBlockMgr to do it. A
  wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
  similar counters in the BufferPool client are more useful.
* Changed the tuple null indicators so that they are allocated
  before each tuple, rather than in a block at the start of the
  page. This slightly reduces the memory density, but greatly
  simplifies much logic. In particular, it avoids problems with
  RowIdx and larger pages with offsets that don't fit in 16 bits.
* Buffer management of read/write streams uses the new pin-counting
  support to separate pinning of the read and write pages.
  This means that the reservation usage of an unpinned read/write
  stream does not fluctuate and the read/write iterators can always
  advance without requiring additional reservation.

Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.

The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
  TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
  it requires global changes to operators that accumulate memory.

Testing:
All of the BufferedTupleStream unit tests are ported to the new
implementation, except for ones specifically testing the small
buffer functionality.

Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/exec/hash-table-test.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
23 files changed, 3,012 insertions(+), 105 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/17
-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 17
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 17: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 17
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 16: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/385/

-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 16:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/385/

-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 16: Code-Review+2

Fix a couple of clang-tidy issues

-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-16 Thread Tim Armstrong (Code Review)
Hello Impala Public Jenkins, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5811

to look at the new patch set (#16).

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..

IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.

The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
  page, etc instead of relying on BufferedBlockMgr to do it. A
  wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
  similar counters in the BufferPool client are more useful.
* Changed the tuple null indicators so that they are allocated
  before each tuple, rather than in a block at the start of the
  page. This slightly reduces the memory density, but greatly
  simplifies much logic. In particular, it avoids problems with
  RowIdx and larger pages with offsets that don't fit in 16 bits.
* Buffer management of read/write streams uses the new pin-counting
  support to separate pinning of the read and write pages.
  This means that the reservation usage of an unpinned read/write
  stream does not fluctuate and the read/write iterators can always
  advance without requiring additional reservation.

Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.

The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
  TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
  it requires global changes to operators that accumulate memory.

Testing:
All of the BufferedTupleStream unit tests are ported to the new
implementation, except for ones specifically testing the small
buffer functionality.

Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/exec/hash-table-test.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
23 files changed, 3,012 insertions(+), 105 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/16
-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 15: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/384/

-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 15:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/384/

-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-15 Thread Tim Armstrong (Code Review)
Hello Impala Public Jenkins, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5811

to look at the new patch set (#15).

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..

IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.

The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
  page, etc instead of relying on BufferedBlockMgr to do it. A
  wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
  similar counters in the BufferPool client are more useful.
* Changed the tuple null indicators so that they are allocated
  before each tuple, rather than in a block at the start of the
  page. This slightly reduces the memory density, but greatly
  simplifies much logic. In particular, it avoids problems with
  RowIdx and larger pages with offsets that don't fit in 16 bits.
* Buffer management of read/write streams uses the new pin-counting
  support to separate pinning of the read and write pages.
  This means that the reservation usage of an unpinned read/write
  stream does not fluctuate and the read/write iterators can always
  advance without requiring additional reservation.

Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.

The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
  TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
  it requires global changes to operators that accumulate memory.

Testing:
All of the BufferedTupleStream unit tests are ported to the new
implementation, except for ones specifically testing the small
buffer functionality.

Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/exec/hash-table-test.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
23 files changed, 3,012 insertions(+), 105 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/15
-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 15: Code-Review+2

Fix some test failures from the rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 14: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/382/

-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 14:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/382/

-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 14: Code-Review+2

Carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 13: Code-Review+2

Rebased onto https://gerrit.cloudera.org/#/c/6313/

-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-15 Thread Tim Armstrong (Code Review)
Hello Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5811

to look at the new patch set (#13).

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..

IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.

The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
  page, etc instead of relying on BufferedBlockMgr to do it. A
  wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
  similar counters in the BufferPool client are more useful.
* Changed the tuple null indicators so that they are allocated
  before each tuple, rather than in a block at the start of the
  page. This slightly reduces the memory density, but greatly
  simplifies much logic. In particular, it avoids problems with
  RowIdx and larger pages with offsets that don't fit in 16 bits.
* Buffer management of read/write streams uses the new pin-counting
  support to separate pinning of the read and write pages.
  This means that the reservation usage of an unpinned read/write
  stream does not fluctuate and the read/write iterators can always
  advance without requiring additional reservation.

Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.

The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
  TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
  it requires global changes to operators that accumulate memory.

Testing:
All of the BufferedTupleStream unit tests are ported to the new
implementation, except for ones specifically testing the small
buffer functionality.

Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
---
M be/src/exec/hash-table-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
20 files changed, 3,002 insertions(+), 100 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/13
-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 12: Code-Review+2

Carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 11:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/5811/11/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

Line 176:   CHECK_CONSISTENCY();
> maybe move these CHECK_CONSISTECY() to the end of the functions so they val
We don't check at the exit points of all functions so I wanted to at least 
check at the entry to some of these functions before they start changing state.


Line 270:   RETURN_IF_ERROR(PinPageIfNeeded(write_page_, pinned_));
> wouldn't it be better to move this to AdvanceWritePage()?  The other caller
Done


Line 299:   // May need to pin the new page for both reading and writing.
> Add: "See ExpectedPinCount()."
Done


Line 374:   pinned_ || 
buffer_pool_client_->reservation()->IncreaseReservationToFit(page_len_);
> let's add a comment, something like:
Done


Line 443: MemTracker* tracker, scoped_ptr* batch, bool* got_rows) 
{
> you don't have to do it now, but I kinda think this function should be in P
Yeah, I think we should really remove it: IMPALA-2758


http://gerrit.cloudera.org:8080/#/c/5811/11/be/src/runtime/buffered-tuple-stream-v2.h
File be/src/runtime/buffered-tuple-stream-v2.h:

PS11, Line 498: and to pin it for reading if the stream's current
  :   /// state requires it. 
> see comment in cc file on this function. if you take that suggestion, then 
Done


PS11, Line 518: write iterator is
> iterators are
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-15 Thread Tim Armstrong (Code Review)
Hello Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5811

to look at the new patch set (#12).

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..

IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.

The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
  page, etc instead of relying on BufferedBlockMgr to do it. A
  wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
  similar counters in the BufferPool client are more useful.
* Changed the tuple null indicators so that they are allocated
  before each tuple, rather than in a block at the start of the
  page. This slightly reduces the memory density, but greatly
  simplifies much logic. In particular, it avoids problems with
  RowIdx and larger pages with offsets that don't fit in 16 bits.
* Buffer management of read/write streams uses the new pin-counting
  support to separate pinning of the read and write pages.
  This means that the reservation usage of an unpinned read/write
  stream does not fluctuate and the read/write iterators can always
  advance without requiring additional reservation.

Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.

The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
  TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
  it requires global changes to operators that accumulate memory.

Testing:
All of the BufferedTupleStream unit tests are ported to the new
implementation, except for ones specifically testing the small
buffer functionality.

Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
---
M be/src/exec/hash-table-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
20 files changed, 3,012 insertions(+), 101 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/12
-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 11: Code-Review+2

(7 comments)

http://gerrit.cloudera.org:8080/#/c/5811/11/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

Line 176:   CHECK_CONSISTENCY();
maybe move these CHECK_CONSISTECY() to the end of the functions so they 
validate the resulting state, but feel free to ignore.


Line 270:   RETURN_IF_ERROR(PinPageIfNeeded(write_page_, pinned_));
wouldn't it be better to move this to AdvanceWritePage()?  The other callers 
handle it for their specific case (i.e. by calling PrepareForReadInternal(), or 
not needing it), so this is really just for when we're adding a new write page 
to an existing stream, no?


Line 299:   // May need to pin the new page for both reading and writing.
Add: "See ExpectedPinCount()."
since that's where we really document how we manage reservations/pins.


Line 374:   pinned_ || 
buffer_pool_client_->reservation()->IncreaseReservationToFit(page_len_);
let's add a comment, something like:
// If already pinned, no additional pin is needed (see ExpectedPinCount()).


Line 443: MemTracker* tracker, scoped_ptr* batch, bool* got_rows) 
{
you don't have to do it now, but I kinda think this function should be in PHJ. 
It's really just a composition of buffered-tuple-stream operations, and it's 
not something that we should be doing generally.


http://gerrit.cloudera.org:8080/#/c/5811/11/be/src/runtime/buffered-tuple-stream-v2.h
File be/src/runtime/buffered-tuple-stream-v2.h:

PS11, Line 498: and to pin it for reading if the stream's current
  :   /// state requires it. 
see comment in cc file on this function. if you take that suggestion, then 
update this.


PS11, Line 518: write iterator is
iterators are


-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 10:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/5811/9/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

PS9, Line 322: 
> !has_read_iterator() ?
Done


http://gerrit.cloudera.org:8080/#/c/5811/10/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

Line 101:   for (const Page& page : pages_) {
> will this be too expensive, even in debug builds (and we should make sure t
We were already doing a pass for the list in one of the Calc* methods, so it's 
not likely to be significantly worse. I removed the consistency check from 
GetNextInternal(), which is likely to be the most commonly-called.

I added a CHECK_CONSISTENCY macro to guarantee the calls get removed in release 
builds.


PS10, Line 125: read_page_ == pages_.end()
> has_read_iterator() and switch clauses?
Done


Line 129:   }
> does it make sense to print the write page as well?
It's printed on line 123


PS10, Line 149:   // This must be the first call to PrepareForWrite().
> Maybe say, this must be the first iterator? i.e. you can't call this functi
Done


PS10, Line 158:   // This must be the first call to PrepareForWrite().
> same
Done


PS10, Line 205: if
> when
Done


PS10, Line 211: If
> When
Done


PS10, Line 258: get_extra_reservation
> it's unfortunate we need this. i wonder if moving the reservation logic (an
Done


PS10, Line 272:  bool pin_for_read = has_read_iterator() && pinned_;
  :   get_extra_reservation |= pin_for_read;
> this is still pretty confusing. if you don't move the reservation logic com
I restructured this code and move the reservation logic into the callers. It 
adds a bit more boilerplate but seems less magical.


Line 283:   if (pin_for_read) RETURN_IF_ERROR(PinPage(&new_page));
> see comment above.
Done


Line 298: int64_t row_size, bool get_extra_reservation, bool* got_page) 
noexcept {
> this wrapper now seems unnecessary and not helpful for readability because 
Done


PS10, Line 339: read_page_ == pages_.end()
> !has_read_iterator()?
Done


Line 379:   ResetReadPage();
> this seems to make more sense to do in PrepareForRead() since that's the on
Done


http://gerrit.cloudera.org:8080/#/c/5811/10/be/src/runtime/buffered-tuple-stream-v2.h
File be/src/runtime/buffered-tuple-stream-v2.h:

PS10, Line 217: got_buffer
> maybe this should be 'got_reservation' now, and explained in terms of reser
Works for me - done.


PS10, Line 228: got_buffer
> same
Done


PS10, Line 496: get_extra_reservation
> this needs explanation, but first see comments in the cc file about it.
Reworked this.


Line 503:   bool* got_page) noexcept WARN_UNUSED_RESULT;
> seems like we can put this code in NewWritePage() now.
This ended up being restructured a bit - some of the code is in a new function 
AdvanceWritePage().


PS10, Line 509: got_buffer
> got_page?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#11).

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..

IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.

The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
  page, etc instead of relying on BufferedBlockMgr to do it. A
  wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
  similar counters in the BufferPool client are more useful.
* Changed the tuple null indicators so that they are allocated
  before each tuple, rather than in a block at the start of the
  page. This slightly reduces the memory density, but greatly
  simplifies much logic. In particular, it avoids problems with
  RowIdx and larger pages with offsets that don't fit in 16 bits.
* Buffer management of read/write streams uses the new pin-counting
  support to separate pinning of the read and write pages.
  This means that the reservation usage of an unpinned read/write
  stream does not fluctuate and the read/write iterators can always
  advance without requiring additional reservation.

Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.

The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
  TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
  it requires global changes to operators that accumulate memory.

Testing:
All of the BufferedTupleStream unit tests are ported to the new
implementation, except for ones specifically testing the small
buffer functionality.

Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
---
M be/src/exec/hash-table-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
20 files changed, 3,010 insertions(+), 101 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/11
-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 10:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/5811/9/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

PS9, Line 322: 
!has_read_iterator() ?


http://gerrit.cloudera.org:8080/#/c/5811/10/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

Line 101:   for (const Page& page : pages_) {
will this be too expensive, even in debug builds (and we should make sure this 
loop is compiled out of release builds)?


PS10, Line 125: read_page_ == pages_.end()
has_read_iterator() and switch clauses?


Line 129:   }
does it make sense to print the write page as well?


PS10, Line 149:   // This must be the first call to PrepareForWrite().
Maybe say, this must be the first iterator? i.e. you can't call this function 
after starting a read iteration or read/write iteration, either, right?


PS10, Line 158:   // This must be the first call to PrepareForWrite().
same


PS10, Line 205: if
when
(sorry, i wrote the comment but 'if' sounds confusing since it's not currently 
pinned)


PS10, Line 211: If
When


PS10, Line 258: get_extra_reservation
it's unfortunate we need this. i wonder if moving the reservation logic (and 
the ResetWritePage() call) to the callers would be simpler?

In any case, if we leave it here, how about calling this parameter 
'get_read_reservation'? also see next comment.


PS10, Line 272:  bool pin_for_read = has_read_iterator() && pinned_;
  :   get_extra_reservation |= pin_for_read;
this is still pretty confusing. if you don't move the reservation logic 
completely into the callers, it may still make sense to move the calculation 
'pinned_ && has_read_iterator()' into AddRowSlow(). The other callsites are 
known to be either static true or false. 

THen you would also move the PinPage() call into the caller (using 
PinPageIfNeeded()), which seems to make sense since in the case of 
PrepareForReadWrite(), you really want PrepareForReadInternal() to do that 
anyway, no?  Or at least you could move this after the write iterator is set up 
and still use PinPageIfNeeded().


Line 283:   if (pin_for_read) RETURN_IF_ERROR(PinPage(&new_page));
see comment above.


Line 298: int64_t row_size, bool get_extra_reservation, bool* got_page) 
noexcept {
this wrapper now seems unnecessary and not helpful for readability because 
NewWritePage() has no other callers.


PS10, Line 339: read_page_ == pages_.end()
!has_read_iterator()?


Line 379:   ResetReadPage();
this seems to make more sense to do in PrepareForRead() since that's the only 
case you can possible have a read iterator, right? and that assumption is built 
into the reservation management code.


http://gerrit.cloudera.org:8080/#/c/5811/10/be/src/runtime/buffered-tuple-stream-v2.h
File be/src/runtime/buffered-tuple-stream-v2.h:

PS10, Line 217: got_buffer
maybe this should be 'got_reservation' now, and explained in terms of 
reservation. or at least got_page?


PS10, Line 228: got_buffer
same


PS10, Line 496: get_extra_reservation
this needs explanation, but first see comments in the cc file about it.


Line 503:   bool* got_page) noexcept WARN_UNUSED_RESULT;
seems like we can put this code in NewWritePage() now.


PS10, Line 509: got_buffer
got_page?


-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 10:

Made changes discussed offline - PrepareForReadWrite() and re-oriented the pin 
counting logic to be driven by an ExpectedPinCount() method.

-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#10).

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..

IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.

The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
  page, etc instead of relying on BufferedBlockMgr to do it. A
  wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
  similar counters in the BufferPool client are more useful.
* Changed the tuple null indicators so that they are allocated
  before each tuple, rather than in a block at the start of the
  page. This slightly reduces the memory density, but greatly
  simplifies much logic. In particular, it avoids problems with
  RowIdx and larger pages with offsets that don't fit in 16 bits.
* Buffer management of read/write streams uses the new pin-counting
  support to separate pinning of the read and write pages.
  This means that the reservation usage of an unpinned read/write
  stream does not fluctuate and the read/write iterators can always
  advance without requiring additional reservation.

Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.

The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
  TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
  it requires global changes to operators that accumulate memory.

Testing:
All of the BufferedTupleStream unit tests are ported to the new
implementation, except for ones specifically testing the small
buffer functionality.

Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
---
M be/src/exec/hash-table-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
20 files changed, 3,002 insertions(+), 101 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/10
-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#9).

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..

IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.

The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
  page, etc instead of relying on BufferedBlockMgr to do it. A
  wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
  similar counters in the BufferPool client are more useful.
* Changed the tuple null indicators so that they are allocated
  before each tuple, rather than in a block at the start of the
  page. This slightly reduces the memory density, but greatly
  simplifies much logic. In particular, it avoids problems with
  RowIdx and larger pages with offsets that don't fit in 16 bits.
* Buffer management of read/write streams uses the new pin-counting
  support to separate pinning of the read and write pages.
  This means that the reservation usage of an unpinned read/write
  stream does not fluctuate and the read/write iterators can always
  advance without requiring additional reservation.

Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.

The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
  TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
  it requires global changes to operators that accumulate memory.

Testing:
All of the BufferedTupleStream unit tests are ported to the new
implementation, except for ones specifically testing the small
buffer functionality.

Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
---
M be/src/exec/hash-table-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
20 files changed, 2,941 insertions(+), 101 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/9
-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 8:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/5811/8/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

PS8, Line 108: if
> iff, just to be super clear.
Done


PS8, Line 110: read_page_ != pages_.end()
> why is this condition needed? what does it mean to have read_page_ == pages
This means that there's a read iterator active.

read_page_ == pages_.end() && pinned_ means the stream is pinned but there is 
no read iterator.

I clarified in the comment of read_page_ and restructured this condition so the 
meanings of subexpressions were more explicit.


Line 116:   }
> fine with me to keep, but isn't this the same as line 109?
Done


Line 141: ss << &*read_page_;
> will it be possible to identify which page in the list printed below this i
Yes, you can match up the addresses. I made this change to reduce the amount of 
redundancy in the debug output.


Line 166:   DCHECK(read_page_ == pages_.end());
> DCHECK_EQ x2
We don't generally use DCHECK_EQ for null pointer comparison (I did switch this 
to nullptr for consistency though).

We also can't use it for iterators since they're not printable.


PS8, Line 232: read_page_ != pages_.end()
> what does this condition mean? why isn't it just: if the stream is pinned f
It means there's an active read iterator. I actually updated the 
has_read_page() method to be called has_read_iterator() and used that instead - 
I think it's clear.

If there's only a write iterator we don't need to double-pin (because it's not 
really a read/write stream).

I rewrote things in terms of the presence of read/write iterators instead of 
the read_write mode flag. Now read_write_ mode only directly influences whether 
PrepareForRead() invalidates the write iterator.


Line 241: RETURN_IF_ERROR(buffer_pool_->Pin(buffer_pool_client_, 
&new_page.handle));
> does this purposely not go through PinPage()?
I was experimenting a bit with different flows here so that related operations 
were grouped together. Seems like it was confusing so I switched back to 
PinPage().


PS8, Line 250: (double_pin ? 2 : 1)
> i guess because of this. maybe move this to be after the CreatePage() call 
Done


Line 348: *pinned = true;
> why was this needed?
Previously PinStream() was idempotent - I wanted to preserve that behaviour.


PS8, Line 356: We need to double-pin the current write page if we also have a 
read iterator.
> why? or maybe you mean just that we need to take a pin-count for reading ev
Tried to reword to be clearer.


PS8, Line 358: write_page_ == &page && read_page_ == pages_.end()
> why? i thought we now do take a separate pin count for read and for write?
This essentially implements that idea. Some of the complication is that on a 
non-read-write stream where there is only ever one iterator active, we don't 
need to do the double-pinning of the current write page.


Line 387:   if (pinned_) UnpinPage(&page);
> do we allow you to call UnpinStream() on an already unpinned stream? why?
I restructured this code to hopefully be clearer. As a bonus, calling 
UnpinStream() on an already-unpinned stream is now O(1) instead of O(n)


PS8, Line 388: continue
> i think this would be clearer as:
I think the new version is easier to follow.


http://gerrit.cloudera.org:8080/#/c/5811/8/be/src/runtime/buffered-tuple-stream-v2.h
File be/src/runtime/buffered-tuple-stream-v2.h:

PS8, Line 68: .
> maybe add: "for reading"
Done


PS8, Line 69: access
> read access?
We do update some tuples in place in the agg so that might be more misleading. 
I changed this so that it's clearer that it means that pointers to rows can be 
saved (since we don't really support random access via the stream interface).


PS8, Line 75: event
> even
Done


PS8, Line 319: bytes_pinned
> i guess should be BytesPinned() since not a simple accessor
Done


PS8, Line 385: including any pages already deleted in 'delete_on_read'
 :   /// mode.
> that's not intuitive. is there a good reason why we do that rather than mak
Currently those kind of invariants aren't maintained when reading in 
delete_on_read mode - that has to be the last pass over the stream.  num_rows_ 
also isn't updated.

I don't see a use case for why client code would depend on these values being 
updated while the stream is destructively read so the value of adding 
additional logic to keep this in sync would be limited.

I already put some effort into updating the documentation and DCHECKs so that 
it was clearer that delete_on_read was destructive and you couldn't re-read a 
stream afterwards.


PS8, Line 392: num_rows_returned_
> rows_returned_? (or rename that member, which might be better).
Updated the comment. I would rename the member in other circumstances but I 
don't want to make t

[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5811/8/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

PS8, Line 110: read_page_ != pages_.end()
why is this condition needed? what does it mean to have read_page_ == 
pages_.end() && pinned_?


-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 8:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/5811/8/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

PS8, Line 108: if
iff, just to be super clear.


Line 116:   }
fine with me to keep, but isn't this the same as line 109?


Line 141: ss << &*read_page_;
will it be possible to identify which page in the list printed below this is?


Line 166:   DCHECK(read_page_ == pages_.end());
DCHECK_EQ x2


PS8, Line 219: If the stream is pinned, the previous write page will remain 
pinned
 :   // with pin count 1 as a read page. If a read/write stream is 
pinned, the write
 :   // page has pin count 2, so we need to unpin once
these two sentences seem to both be talking about pinned streams, but i'm not 
sure what they are trying to conclude. i also don't understand this if-stmt 
that follows.

Why isn't it simply:

if (write_page_ != nullptr) UnpinPage(write_page_); ?

i.e. we need to unpin *for write* the page regardless of the state of the 
read-iterator, no?  i.e. can't we separate out reasoning of the read and write 
iterators?


PS8, Line 232: read_page_ != pages_.end()
what does this condition mean? why isn't it just: if the stream is pinned for 
reading, need to take a pin count for both read and write, otherwise need a pin 
count for only write?


Line 241: RETURN_IF_ERROR(buffer_pool_->Pin(buffer_pool_client_, 
&new_page.handle));
does this purposely not go through PinPage()?


PS8, Line 250: (double_pin ? 2 : 1)
i guess because of this. maybe move this to be after the CreatePage() call and 
just add 1 in there and then let PinPage() do its job?


Line 348: *pinned = true;
why was this needed?


PS8, Line 356: We need to double-pin the current write page if we also have a 
read iterator.
why? or maybe you mean just that we need to take a pin-count for reading even 
if it's already pinned for write? if that's the case, I think we should just 
talk about "read pinning" and "write pinning" separately (and this routine 
shouldn't have to muck with write pinning, right?)


PS8, Line 358: write_page_ == &page && read_page_ == pages_.end()
why? i thought we now do take a separate pin count for read and for write?


Line 387:   if (pinned_) UnpinPage(&page);
do we allow you to call UnpinStream() on an already unpinned stream? why?


PS8, Line 388: continue
i think this would be clearer as:

if (is_write_page || is_read_page) {

} else {

}

rather than the "continue" flow.


http://gerrit.cloudera.org:8080/#/c/5811/8/be/src/runtime/buffered-tuple-stream-v2.h
File be/src/runtime/buffered-tuple-stream-v2.h:

PS8, Line 68: .
maybe add: "for reading"


PS8, Line 69: access
read access?


Line 71: /// unpinned and therefore do not use the client's reservation and can 
be spilled to disk.
comments in this paragraph are meant to further clarify that pinned vs. 
unpinned stream really only applies to the read-iterator.


PS8, Line 75: event
even


PS8, Line 319: bytes_pinned
i guess should be BytesPinned() since not a simple accessor


PS8, Line 385: including any pages already deleted in 'delete_on_read'
 :   /// mode.
that's not intuitive. is there a good reason why we do that rather than make it 
always be the size of the stream? and if so, seems like it should be called 
highwater_byte_size or something.


PS8, Line 392: num_rows_returned_
rows_returned_? (or rename that member, which might be better).


-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#8).

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..

IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.

The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
  page, etc instead of relying on BufferedBlockMgr to do it. A
  wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
  similar counters in the BufferPool client are more useful.
* Changed the tuple null indicators so that they are allocated
  before each tuple, rather than in a block at the start of the
  page. This slightly reduces the memory density, but greatly
  simplifies much logic. In particular, it avoids problems with
  RowIdx and larger pages with offsets that don't fit in 16 bits.
* Buffer management of read/write streams uses the new pin-counting
  support to separate pinning of the read and write pages.
  This means that the reservation usage of an unpinned read/write
  stream does not fluctuate and the read/write iterators can always
  advance without requiring additional reservation.

Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.

The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
  TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
  it requires global changes to operators that accumulate memory.

Testing:
All of the BufferedTupleStream unit tests are ported to the new
implementation, except for ones specifically testing the small
buffer functionality.

Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
---
M be/src/exec/hash-table-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
20 files changed, 2,931 insertions(+), 101 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/8
-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/5811/6/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

PS6, Line 112: DCHECK_LE(read_ptr_, read_end_ptr);
> would be nice to write this and line 106 the same to make it a little faste
Done


Line 252:   if (!read_page_->is_pinned()) {
> why does this not have to check if read and write page are the same?
I ended up reworking the buffer management as discussed to take advantage of 
the new pin-counting support. There are still some tricky corner cases but I 
think the invariants are clearer now, and we can provide stronger guarantees 
about being able to iterate through unpinned streams.


Line 276: if (!pinned_ && write_page_ != &pages_.front()) {
> this could use a comment:
Done


Line 332: && (&page == write_page_ || (read_write_ && &page == 
&*read_page_))) {
> why is this right to unpin the read_page_ if !read_write_?
This got reworked with the other changes


http://gerrit.cloudera.org:8080/#/c/5811/6/be/src/runtime/buffered-tuple-stream-v2.h
File be/src/runtime/buffered-tuple-stream-v2.h:

PS6, Line 517: CalcNumPinned
> this looks dead
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#7).

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..

IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.

The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
  page, etc instead of relying on BufferedBlockMgr to do it. A
  wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
  similar counters in the BufferPool client are more useful.
* Changed the tuple null indicators so that they are allocated
  before each tuple, rather than in a block at the start of the
  page. This slightly reduces the memory density, but greatly
  simplifies much logic. In particular, it avoids problems with
  RowIdx and larger pages with offsets that don't fit in 16 bits.
* Buffer management of read/write streams uses the new pin-counting
  support to separate pinning of the read and write pages.
  This means that the reservation usage of an unpinned read/write
  stream does not fluctuate and the read/write iterators can always
  advance without requiring additional reservation.

Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.

The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
  TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
  it requires global changes to operators that accumulate memory.

Testing:
All of the BufferedTupleStream unit tests are ported to the new
implementation, except for ones specifically testing the small
buffer functionality.

Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
---
M be/src/exec/hash-table-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
20 files changed, 2,935 insertions(+), 101 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/7
-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-02 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 6:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

Line 256: // actually fail once we have variable-length pages.
> I just kept this logic from the old BufferedTupleStream.
How would the stream use more reservation after UnpinStream()? Wouldn't we 
spend a reservation on the read and write pages both when the stream is pinned 
and then unpinned?

I need to read the code through some more, but it seems to that a fair amount 
of complexity comes from sharing the reservation when they overlap.

If we do end up sticking with this scheme, how about verifying the pin count is 
exactly 1 for the read/write page(s)?


http://gerrit.cloudera.org:8080/#/c/5811/6/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

PS6, Line 112: DCHECK_LE(read_ptr_, read_end_ptr);
would be nice to write this and line 106 the same to make it a little faster to 
read


Line 252:   if (!read_page_->is_pinned()) {
why does this not have to check if read and write page are the same?


Line 276: if (!pinned_ && write_page_ != &pages_.front()) {
this could use a comment:
// pages_.front() will become the read_page_.


Line 332: && (&page == write_page_ || (read_write_ && &page == 
&*read_page_))) {
why is this right to unpin the read_page_ if !read_write_?


http://gerrit.cloudera.org:8080/#/c/5811/6/be/src/runtime/buffered-tuple-stream-v2.h
File be/src/runtime/buffered-tuple-stream-v2.h:

PS6, Line 517: CalcNumPinned
this looks dead


-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#6).

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..

IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.

The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
  page, etc instead of relying on BufferedBlockMgr to do it. A
  wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
  similar counters in the BufferPool client are more useful.
* Changed the tuple null indicators so that they are allocated
  before each tuple, rather than in a block at the start of the
  page. This slightly reduces the memory density, but greatly
  simplifies much logic. In particular, it avoids problems with
  RowIdx and larger pages with offsets that don't fit in 16 bits.

Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.

The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
  TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
  it requires global changes to operators that accumulate memory.

Testing:
All of the BufferedTupleStream unit tests are ported to the new
implementation, except for ones specifically testing the small
buffer functionality.

Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
---
M be/src/exec/hash-table-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
20 files changed, 2,822 insertions(+), 101 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/6
-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-03-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 5:

(39 comments)

http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

PS5, Line 181: write_page_->is_pinned()
> when do we have an unpinned write page? what does that mean?
I documented that they have to be pinned and added some consistency checks. If 
I set write_page_ = null earlier that keeps it in a consistent state.


Line 189:   *got_page = 
buffer_pool_client_->reservation()->IncreaseReservationToFit(page_len);
> shouldn't this come with IMPALA-3208? or do we need this already?
We need it already for the first page or if got_page was false on an earlier 
call and we're retrying.


Line 198:   write_end_ptr_ = new_page.data() + page_len;
> would be slightly clearer to compute this based on write_page_
Done


Line 234: DCHECK(&*read_page_ != write_page_);
> here and above: DCHECK_NE/EQ
It doesn't work for the first one since the values of iterators aren't 
printable.


Line 246:   bytes_in_mem_ -= read_page_->len();
> UnpinPage()?
Done


Line 256:   if (!read_page_->is_pinned()) {
> is this check sufficient? couldn't the page by pinned due to it also being 
I just kept this logic from the old BufferedTupleStream.

Added a comment to 'write_page_' to explain.

I think it's easiest to leave as-is for now. Having a pin per iterator makes a 
lot of sense when we have a separate iterator abstraction that manages its own 
reservation, but it interacts strangely with the current UnpinStream() API - if 
you UnpinStream() a read-write stream with a single page then it could actually 
use more reservation. The only current user -AnalyticEvalNode - shouldn't do 
that but it adds a weird edge case to be wary of.

I'm open to changing it but it wasn't a strict improvement in my eyes.


Line 300: bytes_in_mem_ += pages_.front().len();
> maybe factor this into PinPage() to match UnpinPage()
Done


PS5, Line 324: is_pinned
> i guess this use is probably okay
I think we'd need to change it if we started pinned pages multiple times - we 
might need to unpin a page here if it was pinned multiple times.


http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/buffered-tuple-stream-v2.h
File be/src/runtime/buffered-tuple-stream-v2.h:

Line 65: //
> ///
Done


PS5, Line 66: In the pinned mode all pages are
: /// pinned and the data is fully in memory.
> seems redundant with the next sentence
Done


PS5, Line 74: PrepareForWrite
> shouldn't that be PrepareForRead()?
Done


PS5, Line 74: the
> to
Done


Line 75: /// pass over the stream.
> this should go away soon after switching to BufferPool, right? i.e. buffers
I think the concept of a destructive read pass will continue to  exist, but the 
pages would be destroyed and buffers attached instead of the current dance with 
NeedsDeepCopy().

There will probably be other related changes though, e.g. around buffer 
management for non-destructive read passes. To support non-destructive unpinned 
read passes we might actually need a new BufferPool API to extract a buffer 
from a page without destroying the page.


Line 121: ///  <---12b---> <---12b---> <-5b>
> i guess these examples all assume non-nullable?
Added a nulllable example


Line 127: ///   caller does not need to copy if it is streaming.
> are these listed in order of precedence? i.e. if a stream is both delete on
Reworked this and the memory lifetime section to avoid ambiguity.


Line 139: ///   or free up other memory before retrying.
> nit:indentation of some of these paragraphs is inconsistent
Done


Line 152: /// layout described above.
> maybe a TODO/JIRA for what we had discussed with unifying AddRow() and Allo
Done


Line 199:   ///Returns false and sets 'status' to an error.
> nit: indentations of these bullets are inconsistent
Done


Line 262:   /// rows will remain valid until the stream is unpinned, destroyed, 
etc.
> let's also put TODO IMPALA-4179 to make it easier to keep straight of where
Done


Line 267:   /// stream remains pinned.
> does this interface require the stream to be pinned? if so, let's be explic
Done


PS5, Line 370: offset
> this isn't really the offset.
Done


Line 374:   uint8_t* read_end_ptr_;
> given we store number of rows in the page, what is this used for? oh, i see
Done


Line 380:   uint8_t* write_end_ptr_;
> same, and this one looks dead anyway.
This one is used in AllocateRow() where the perf matters - it likely saves a 
few cycles per row. I ended up removing write_page_bytes_remaining() and just 
inlining the calculations.


Line 382:   /// Number of rows returned to the caller from GetNext().
> ... since PrepareForRead() was called?
Done


Line 386:   int read_page_idx_;
> this looks dead
Done


Line 393:   int num_pinned_;
> this doesn't looked 

[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-02-27 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 5:

(39 comments)

Will finish buffered-tuple-stream-v2.cc after you reply to the pin count 
management question.

http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

PS5, Line 181: write_page_->is_pinned()
when do we have an unpinned write page? what does that mean?
oh, i guess we can get into that state if we return at 190? any way to simplify 
these possible states?


Line 189:   *got_page = 
buffer_pool_client_->reservation()->IncreaseReservationToFit(page_len);
shouldn't this come with IMPALA-3208? or do we need this already?


Line 198:   write_end_ptr_ = new_page.data() + page_len;
would be slightly clearer to compute this based on write_page_


Line 234: DCHECK(&*read_page_ != write_page_);
here and above: DCHECK_NE/EQ


Line 246:   bytes_in_mem_ -= read_page_->len();
UnpinPage()?


Line 256:   if (!read_page_->is_pinned()) {
is this check sufficient? couldn't the page by pinned due to it also being the 
write page. in which case shouldn't we increment the pin count? i.e. shouldn't 
this check by !is_pinned_?

in fact, is using page->is_pinned() ever the right thing given that we don't 
know whether it is pinned as the read or write page?

oh, i guess we're careful to handle the read_page==write_page case in all 
pinning/unpinning places? i.e the stream only takes one pin count per page even 
if used as both read and write page, is that right? is this explained anywhere? 

(wasn't the plan to just take a pin for each iterator? does that not work?)


Line 300: bytes_in_mem_ += pages_.front().len();
maybe factor this into PinPage() to match UnpinPage()


PS5, Line 324: is_pinned
i guess this use is probably okay


http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/buffered-tuple-stream-v2.h
File be/src/runtime/buffered-tuple-stream-v2.h:

Line 65: //
///


PS5, Line 66: In the pinned mode all pages are
: /// pinned and the data is fully in memory.
seems redundant with the next sentence


PS5, Line 74: the
to


PS5, Line 74: PrepareForWrite
shouldn't that be PrepareForRead()?


Line 75: /// pass over the stream.
this should go away soon after switching to BufferPool, right? i.e. buffers can 
be attached to the row-batches rather than using needs-deep-copy mechanism.
if so, how about a TODO (and reference IMPALA-4179)?


Line 121: ///  <---12b---> <---12b---> <-5b>
i guess these examples all assume non-nullable?


Line 127: ///   caller does not need to copy if it is streaming.
are these listed in order of precedence? i.e. if a stream is both delete on 
read and pinned, the page is deleted, right?


Line 139: ///   or free up other memory before retrying.
nit:indentation of some of these paragraphs is inconsistent


Line 152: /// layout described above.
maybe a TODO/JIRA for what we had discussed with unifying AddRow() and 
AllocateRow() by evaluating expression?


Line 199:   ///Returns false and sets 'status' to an error.
nit: indentations of these bullets are inconsistent


Line 262:   /// rows will remain valid until the stream is unpinned, destroyed, 
etc.
let's also put TODO IMPALA-4179 to make it easier to keep straight of where 
we're headed.


Line 267:   /// stream remains pinned.
does this interface require the stream to be pinned? if so, let's be explicit 
about that. if not, why not?


PS5, Line 370: offset
this isn't really the offset.


Line 374:   uint8_t* read_end_ptr_;
given we store number of rows in the page, what is this used for? oh, i see 
this is the physical end, but only used in dchecks. how about removing this and 
just use data() + len() in those dchecks?


Line 380:   uint8_t* write_end_ptr_;
same, and this one looks dead anyway.


Line 382:   /// Number of rows returned to the caller from GetNext().
... since PrepareForRead() was called?
i.e. this doesn't accumulate over passes, right?


Line 386:   int read_page_idx_;
this looks dead


Line 393:   int num_pinned_;
this doesn't looked used either, except for DebugString() which iterates the 
list anyway, so how about removing. or do the exec nodes really need to know 
that?


Line 407:   /// to AddRow() must occur before calling PrepareForRead() and 
subsequent calls to
or AllocateRow()


PS5, Line 414:   
nit extra space


PS5, Line 419: This changes the
 :   /// memory management of the stream
this sentence isn't very helpful


PS5, Line 454: vailable
available?
but does this try to increase reservation? etc


PS5, Line 463: blocks
is this talking about blocking on disk in order to repin a page? or something 
else?


Line 517:   int CalcNumPinned();
won't need this if we get rid of num_pinned_


http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/buffered-tuple-stream-v2.inlin

[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-02-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5811/5//COMMIT_MSG
Commit Message:

Line 25:   RowIdx and larger pages with offsets that don't fit in 16 bits.
> how much is the memory density reduced?  in the worse case, does this use 2
The impact is mixed. 

In general density is hurt if tuples are mostly non-null. I think the worst 
case is 8x, if the row has no materialised slots and therefore all memory is 
devoted to the bitstring.

My feeling is that adding an extra byte per row is acceptable, even if it's an 
8x regression.

In other cases, where tuples are mostly null, memory density can improve more 
than 2x. E.g. if you have a very wide nullable tuple, with many nulls, then the 
previous encoding is very inefficient because it runs out of space in the 
fixed-size bit vector before it runs out of space in the page.


-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-02-03 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5811/5//COMMIT_MSG
Commit Message:

Line 25:   RowIdx and larger pages with offsets that don't fit in 16 bits.
how much is the memory density reduced?  in the worse case, does this use 2x 
the memory (single bool slot)?


-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-02-02 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#5).

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..

IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.

The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
  page, etc instead of relying on BufferedBlockMgr to do it. A
  wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
  similar counters in the BufferPool client are more useful.
* Changed the tuple null indicators so that they are allocated
  before each tuple, rather than in a block at the start of the
  page. This slightly reduces the memory density, but greatly
  simplifies much logic. In particular, it avoids problems with
  RowIdx and larger pages with offsets that don't fit in 16 bits.

Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.

The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
  TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
  it requires global changes to operators that accumulate memory.

Testing:
All of the BufferedTupleStream unit tests are ported to the new
implementation, except for ones specifically testing the small
buffer functionality.

Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
---
M be/src/exec/hash-table-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
20 files changed, 2,831 insertions(+), 99 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/5
-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-02-02 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 4:

(37 comments)

http://gerrit.cloudera.org:8080/#/c/5811/4/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

Line 326:   LOG(INFO) << "Failed to get reservation for " << 
pages_.front().len();
> I need to remove this log message.
Done


http://gerrit.cloudera.org:8080/#/c/5811/4/be/src/runtime/buffered-tuple-stream-v2.h
File be/src/runtime/buffered-tuple-stream-v2.h:

PS4, Line 41:  or scratch disk storage
> what does this mean? isn't it always backed by BufferPool (which  may spill
Reworded to reflect that.


PS4, Line 45: the APIs are called from a single thread
> this is saying per BufferTupleStream object, right? maybe clarify that
Done


Line 46: ///
> it'd be nice to explain the point of this abstraction a bit more explicitly
Done


PS4, Line 49: pinned or unpinned
> I think these terms need to be defined relative to the stream as far as wha
Done


PS4, Line 59: page size
> wouldn't it also be a function of tuple size (i.e. tuples per page)?
I don't think this added much - the previous sentence better describes the 
number of null indicators.


PS4, Line 77: pages
> pages' buffers.
Done


PS4, Line 80: location in memory
> or you could say "buffer".
Done


PS4, Line 110: in pages_
> in the stream
Done


PS4, Line 111: in pages_
> same
Done


PS4, Line 114: (if read_write is
 : ///   true, then two pages need to be in memory).
> rather than saying this in parenthesis, i think we should pull it up to say
Reworded this part to say that we only need enough reservation for a single 
read/write buffer.


PS4, Line 117: caller's reservation is insufficient
> does it first try to increase the reservation at this point?
Done


Line 124: /// returned so far from the stream may be freed on the next call to 
GetNext().
> do we have a jira we can reference as a TODO here to simplify this?
Done


Line 126: /// Manual construction of rows with AllocateRow():
> not for now, but is it possible to unify things so we only have one way to 
Updated to make it clearer that it's called instead of AddRow().


Line 138: /// this array as new rows are inserted in the page. If we do so, 
then there will be
> is that still something we want to do?
No longer relevant with the changes I made to the null indicator layout.


PS4, Line 146: id
> what is 'id'? is this index?
Done


PS4, Line 147: With 8MB pages that is 512GB per stream.
> where does this limit come into play?  is this something that will cause tr
I believe we only use RowIdx currently if the right side of a join has multiple 
tuples per row.

I reworked this to drastically simplify it. Instead of RowIdx we just return a 
pointer to the start of the data. This avoids all the problems with limits here.


PS4, Line 186: page_len: the size of pages to use in the stream
> what does that mean for large rows?
We need something like a default and max page length. Added a TODO.


PS4, Line 198: of
> off
Done


PS4, Line 203: AddRow
> is it also used before AllocateRow()?
Done


Line 207:   /// if an error status is returned.
> broken line break
Done


Line 212:   /// sets 'status' to an error if an error occurred.
> hard to parse that sentence.  but, can we simplify this dance now that we h
Updated this comment to be clearer about the possible scenarios.

It's guaranteed to succeed for now. It's actually a bit more complicated for 
variable-length pages since that would rely on the client leaving enough slack 
in the reservation to read the largest-possible page.


Line 223:   /// tuples.
> same questions. let's think of this interface can be simplified now.
Updated to refer back to the AddRow() comment.


Line 228:   /// been allocated with the stream's row desc.
> is the row valid only as long as the stream remains pinned?
Done


PS4, Line 232: begin reading
> before the first GetNext()?
Done


Line 236:   /// false if the page could not be pinned and no error was 
encountered.
> can we explain that in terms of reservations? and like questions above, doe
Done


PS4, Line 240: enough memory
> what does that mean in terms of reservations?  and does it still make sense
Done


Line 253:   };
> this seems like a wrinkle to the class comment. can it be succinctly descri
I think it now covers it (it mentions that it only pins the pages being read 
and written in unpinned mode)


PS4, Line 259: must be copied out by the caller
> this seems to contradict the class comment about "needs_deep_copy"
Updated the comment.


PS4, Line 267: *got_rows is false if the stream could not be pinne
> does that still make sense with reservations?  if so, it should be expresse
Reworded


PS4, Line 271: date
> data?
Done


PS4, Line 271: buffers containing page
> are there buffers that don't contain page data?
Reworded, t

[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-02-01 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 4:

(32 comments)

First set of comments focusing on the public interface.

http://gerrit.cloudera.org:8080/#/c/5811/4/be/src/runtime/buffered-tuple-stream-v2.h
File be/src/runtime/buffered-tuple-stream-v2.h:

PS4, Line 41:  or scratch disk storage
what does this mean? isn't it always backed by BufferPool (which  may spill to 
disk internally)?


PS4, Line 45: the APIs are called from a single thread
this is saying per BufferTupleStream object, right? maybe clarify that


Line 46: ///
it'd be nice to explain the point of this abstraction a bit more explicitly. 
like when we talk about BTS, we talk in terms of iterators and the iteration 
patterns, but this only alludes to that. and the random access pattern. etc.


PS4, Line 49: pinned or unpinned
I think these terms need to be defined relative to the stream as far as what it 
means to a client and why they would change between states.


PS4, Line 59: page size
wouldn't it also be a function of tuple size (i.e. tuples per page)?


PS4, Line 77: pages
pages' buffers.


PS4, Line 80: location in memory
or you could say "buffer".


PS4, Line 110: in pages_
in the stream


PS4, Line 111: in pages_
same


PS4, Line 114: (if read_write is
 : ///   true, then two pages need to be in memory).
rather than saying this in parenthesis, i think we should pull it up to say 
that there can be both a read and write iterator simultaneously.


PS4, Line 117: caller's reservation is insufficient
does it first try to increase the reservation at this point?


Line 124: /// returned so far from the stream may be freed on the next call to 
GetNext().
do we have a jira we can reference as a TODO here to simplify this?


Line 126: /// Manual construction of rows with AllocateRow():
not for now, but is it possible to unify things so we only have one way to 
write into the stream?

In the mean time, how does this mode relate to the "write" mode above? is it a 
sub-mode for pinned?


Line 138: /// this array as new rows are inserted in the page. If we do so, 
then there will be
is that still something we want to do?


PS4, Line 146: id
what is 'id'? is this index?


PS4, Line 147: With 8MB pages that is 512GB per stream.
where does this limit come into play?  is this something that will cause 
trouble with going to 2MB pages?


PS4, Line 186: page_len: the size of pages to use in the stream
what does that mean for large rows?


PS4, Line 198: of
off


PS4, Line 203: AddRow
is it also used before AllocateRow()?


Line 207:   /// if an error status is returned.
broken line break


Line 212:   /// sets 'status' to an error if an error occurred.
hard to parse that sentence.  but, can we simplify this dance now that we have 
real reservations?
also, does this operation try to increase reservation before failing to append?
is this guaranteed to succeed (other than runtime errors) for unpinned streams 
(maybe after you add support for large rows)?


Line 223:   /// tuples.
same questions. let's think of this interface can be simplified now.


Line 228:   /// been allocated with the stream's row desc.
is the row valid only as long as the stream remains pinned?


PS4, Line 232: begin reading
before the first GetNext()?


Line 236:   /// false if the page could not be pinned and no error was 
encountered.
can we explain that in terms of reservations? and like questions above, does 
this got_buffer==false interface still make sense with reservations?


PS4, Line 240: enough memory
what does that mean in terms of reservations?  and does it still make sense 
with reservations?


Line 253:   };
this seems like a wrinkle to the class comment. can it be succinctly described 
as part of the iterator discussion?


PS4, Line 259: must be copied out by the caller
this seems to contradict the class comment about "needs_deep_copy"


PS4, Line 267: *got_rows is false if the stream could not be pinne
does that still make sense with reservations?  if so, it should be expressed in 
terms of reservation.


PS4, Line 271: date
data?


PS4, Line 271: buffers containing page
are there buffers that don't contain page data?


PS4, Line 286: Returns the byte size of the stream that is currently pinned in 
memory.
Is this trying to say:
Returns the number of bytes currently pinned in memory by the stream?


-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-01-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/5811/4/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

Line 326:   LOG(INFO) << "Failed to get reservation for " << 
pages_.front().len();
I need to remove this log message.


http://gerrit.cloudera.org:8080/#/c/5811/4/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

Line 100:   if (ExecEnv::GetInstance()->buffer_pool() != nullptr) {
Use exec-env directly.


http://gerrit.cloudera.org:8080/#/c/5811/4/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

Line 29: #include "runtime/tmp-file-mgr.h" // TODO: can we avoid this?
Remove TODO


Line 121:   Status CheckSpillingEnabled();
This is currently dead code - can remove it.


http://gerrit.cloudera.org:8080/#/c/5811/4/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

Line 442:   /// TODO: ownership semantics
Need to fix this TODO


-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-01-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#4).

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..

IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.

The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
  page, etc instead of relying on BufferedBlockMgr to do it. A
  wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
  similar counters in the BufferPool client are more useful.

The logic for serialising and deserialising rows has not changed,
since that is implemented in terms of read and write pointers into
the buffers.

Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.

The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
  TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
  it requires global changes to operators that accumulate memory.

Testing:
All of the BufferedTupleStream unit tests are ported to the new
implementation, except for ones specifically testing the small
buffer functionality.

Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
---
M be/src/exec/hash-table-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
20 files changed, 2,942 insertions(+), 99 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 


[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

2017-01-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#3).

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
..

IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool

Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.

The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
  page, etc instead of relying on BufferedBlockMgr to do it. A
  wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
  similar counters in the BufferPool client are more useful.

The logic for serialising and deserialising rows has not changed,
since that is implemented in terms of read and write pointers into
the buffers.

Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.

The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
  TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
  it requires global changes to operators that accumulate memory.

Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
---
M be/src/exec/hash-table-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
20 files changed, 2,942 insertions(+), 99 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong