[kudu-CR] client: allocate InFlightOp from a Batcher Arena

2020-07-14 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16189 )

Change subject: client: allocate InFlightOp from a Batcher Arena
..

client: allocate InFlightOp from a Batcher Arena

InFlightOps are always associated with a single Batcher, and Batchers
are always associated with a single "flush". Given that, we can allocate
them from a Batcher-local Arena instead of from the heap. This improves
CPU consumption and throughput of the client by about 10%.

Before:

INSERT report
rows total: 4000
time total: 10302.3 ms
  time per row: 0.000257559 ms
Dropping auto-created table 
'default.loadgen_auto_0334edaa55fe4fdba242b0b4fcc5d360'

 Performance counter stats for './build/latest/bin/kudu perf loadgen localhost 
-num_rows_per_thread=500 -num_threads=8':

 100857.18 msec task-clock#9.738 CPUs utilized
124584  context-switches  #0.001 M/sec
  6047  cpu-migrations#0.060 K/sec
 31199  page-faults   #0.309 K/sec
  412192743252  cycles#4.087 GHz
  (83.24%)
   95984615757  stalled-cycles-frontend   #   23.29% frontend cycles 
idle (83.35%)
   80498019077  stalled-cycles-backend#   19.53% backend cycles 
idle  (83.22%)
  225757843097  instructions  #0.55  insn per cycle
  #0.43  stalled cycles per 
insn  (83.38%)
   43262879676  branches  #  428.952 M/sec  
  (83.37%)
 276103070  branch-misses #0.64% of all branches
  (83.43%)

  10.356736781 seconds time elapsed

  98.263563000 seconds user
   2.646264000 seconds sys

After:

INSERT report
rows total: 4000
time total: 9730.09 ms
  time per row: 0.000243252 ms
Dropping auto-created table 
'default.loadgen_auto_ffc0ea72dd064db7b3fa12b037454eb2'

 Performance counter stats for './build/latest/bin/kudu.client-opt perf loadgen 
localhost -num_rows_per_thread=500 -num_threads=8':

  91317.10 msec task-clock#9.327 CPUs utilized
 57090  context-switches  #0.625 K/sec
  6661  cpu-migrations#0.073 K/sec
 34568  page-faults   #0.379 K/sec
  374170608484  cycles#4.097 GHz
  (83.31%)
   82165215635  stalled-cycles-frontend   #   21.96% frontend cycles 
idle (83.35%)
   57601295335  stalled-cycles-backend#   15.39% backend cycles 
idle  (83.25%)
  218776486065  instructions  #0.58  insn per cycle
  #0.38  stalled cycles per 
insn  (83.35%)
   41762336591  branches  #  457.333 M/sec  
  (83.34%)
 140528380  branch-misses #0.34% of all branches
  (83.40%)

   9.790174460 seconds time elapsed

  89.079083000 seconds user
   2.276474000 seconds sys

Change-Id: I446a8d21253b7a274872bff6d3e76705ac95d0d5
Reviewed-on: http://gerrit.cloudera.org:8080/16189
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong 
---
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
2 files changed, 10 insertions(+), 9 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Andrew Wong: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/16189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I446a8d21253b7a274872bff6d3e76705ac95d0d5
Gerrit-Change-Number: 16189
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] client: allocate InFlightOp from a Batcher Arena

2020-07-14 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16189 )

Change subject: client: allocate InFlightOp from a Batcher Arena
..


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16189/1/src/kudu/client/batcher.cc
File src/kudu/client/batcher.cc:

http://gerrit.cloudera.org:8080/#/c/16189/1/src/kudu/client/batcher.cc@379
PS1, Line 379: int op_prefetch = i + kPrefetchDistance * 2;
 : int row_prefetch = i + kPrefetchDistance;
 : if (ifo_prefetch >= 0 && ifo_prefetch < size) {
 :   __builtin_prefetch(ops_[ifo_prefetch], 0, 
PREFETCH_HINT_T0);
 : }
 : if (op_prefetch >= 0 && op_prefetch < size) {
 :   const auto* op = ops_[op_prefetch]->write_op.get();
 :   if (op) {
 : __builtin_prefetch(>row().isset_bitmap_, 0, 
PREFETCH_HINT_T0);
 :   }
 : }
 : if (row_prefetch >= 0 && row_prefetch < size) {
 :   const auto* op = ops_[row_prefetch]->write_op.get();
 :   if (op) {
 : __builtin_prefetch(op->row().isset_bitmap_, 0, 
PREFETCH_HINT_T0);
 :   }
 : }
> yea, and even within the same arena, they're allocated in the order the ops
Makes sense. Thanks for clarifying!



--
To view, visit http://gerrit.cloudera.org:8080/16189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I446a8d21253b7a274872bff6d3e76705ac95d0d5
Gerrit-Change-Number: 16189
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 14 Jul 2020 06:12:13 +
Gerrit-HasComments: Yes


[kudu-CR] client: allocate InFlightOp from a Batcher Arena

2020-07-13 Thread Todd Lipcon (Code Review)
Hello Alexey Serbin, Andrew Wong, Kudu Jenkins, Andrew Wong, Bankim Bhavsar,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16189

to look at the new patch set (#2).

Change subject: client: allocate InFlightOp from a Batcher Arena
..

client: allocate InFlightOp from a Batcher Arena

InFlightOps are always associated with a single Batcher, and Batchers
are always associated with a single "flush". Given that, we can allocate
them from a Batcher-local Arena instead of from the heap. This improves
CPU consumption and throughput of the client by about 10%.

Before:

INSERT report
rows total: 4000
time total: 10302.3 ms
  time per row: 0.000257559 ms
Dropping auto-created table 
'default.loadgen_auto_0334edaa55fe4fdba242b0b4fcc5d360'

 Performance counter stats for './build/latest/bin/kudu perf loadgen localhost 
-num_rows_per_thread=500 -num_threads=8':

 100857.18 msec task-clock#9.738 CPUs utilized
124584  context-switches  #0.001 M/sec
  6047  cpu-migrations#0.060 K/sec
 31199  page-faults   #0.309 K/sec
  412192743252  cycles#4.087 GHz
  (83.24%)
   95984615757  stalled-cycles-frontend   #   23.29% frontend cycles 
idle (83.35%)
   80498019077  stalled-cycles-backend#   19.53% backend cycles 
idle  (83.22%)
  225757843097  instructions  #0.55  insn per cycle
  #0.43  stalled cycles per 
insn  (83.38%)
   43262879676  branches  #  428.952 M/sec  
  (83.37%)
 276103070  branch-misses #0.64% of all branches
  (83.43%)

  10.356736781 seconds time elapsed

  98.263563000 seconds user
   2.646264000 seconds sys

After:

INSERT report
rows total: 4000
time total: 9730.09 ms
  time per row: 0.000243252 ms
Dropping auto-created table 
'default.loadgen_auto_ffc0ea72dd064db7b3fa12b037454eb2'

 Performance counter stats for './build/latest/bin/kudu.client-opt perf loadgen 
localhost -num_rows_per_thread=500 -num_threads=8':

  91317.10 msec task-clock#9.327 CPUs utilized
 57090  context-switches  #0.625 K/sec
  6661  cpu-migrations#0.073 K/sec
 34568  page-faults   #0.379 K/sec
  374170608484  cycles#4.097 GHz
  (83.31%)
   82165215635  stalled-cycles-frontend   #   21.96% frontend cycles 
idle (83.35%)
   57601295335  stalled-cycles-backend#   15.39% backend cycles 
idle  (83.25%)
  218776486065  instructions  #0.58  insn per cycle
  #0.38  stalled cycles per 
insn  (83.35%)
   41762336591  branches  #  457.333 M/sec  
  (83.34%)
 140528380  branch-misses #0.34% of all branches
  (83.40%)

   9.790174460 seconds time elapsed

  89.079083000 seconds user
   2.276474000 seconds sys

Change-Id: I446a8d21253b7a274872bff6d3e76705ac95d0d5
---
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
2 files changed, 10 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/89/16189/2
--
To view, visit http://gerrit.cloudera.org:8080/16189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I446a8d21253b7a274872bff6d3e76705ac95d0d5
Gerrit-Change-Number: 16189
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] client: allocate InFlightOp from a Batcher Arena

2020-07-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16189 )

Change subject: client: allocate InFlightOp from a Batcher Arena
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16189/1/src/kudu/client/batcher.cc
File src/kudu/client/batcher.cc:

http://gerrit.cloudera.org:8080/#/c/16189/1/src/kudu/client/batcher.cc@379
PS1, Line 379: int ifo_prefetch = i + kPrefetchDistance * 3;
 : int op_prefetch = i + kPrefetchDistance * 2;
 : int row_prefetch = i + kPrefetchDistance;
 : if (ifo_prefetch >= 0 && ifo_prefetch < size) {
 :   __builtin_prefetch(ops_[ifo_prefetch], 0, 
PREFETCH_HINT_T0);
 : }
 : if (op_prefetch >= 0 && op_prefetch < size) {
 :   const auto* op = ops_[op_prefetch]->write_op.get();
 :   if (op) {
 : __builtin_prefetch(>row().isset_bitmap_, 0, 
PREFETCH_HINT_T0);
 :   }
 : }
 : if (row_prefetch >= 0 && row_prefetch < size) {
 :   const auto* op = ops_[row_prefetch]->write_op.get();
 :   if (op) {
 : __builtin_prefetch(op->row().isset_bitmap_, 0, 
PREFETCH_HINT_T0);
 :   }
> Just making sure I understand this bit: while ops_ are all allocated in the
yea, and even within the same arena, they're allocated in the order the ops are 
submitted, but in the case that the ops are strewn about more than one tablet 
due to partitioning, the ops associated with a single WriteRpc will not 
necessarily be entirely contiguous/sequential in this loop, so the prefetching 
is probably still effective/useful.



--
To view, visit http://gerrit.cloudera.org:8080/16189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I446a8d21253b7a274872bff6d3e76705ac95d0d5
Gerrit-Change-Number: 16189
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 14 Jul 2020 03:10:09 +
Gerrit-HasComments: Yes


[kudu-CR] client: allocate InFlightOp from a Batcher Arena

2020-07-13 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16189 )

Change subject: client: allocate InFlightOp from a Batcher Arena
..


Patch Set 1: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16189/1/src/kudu/client/batcher.cc
File src/kudu/client/batcher.cc:

http://gerrit.cloudera.org:8080/#/c/16189/1/src/kudu/client/batcher.cc@379
PS1, Line 379: int ifo_prefetch = i + kPrefetchDistance * 3;
 : int op_prefetch = i + kPrefetchDistance * 2;
 : int row_prefetch = i + kPrefetchDistance;
 : if (ifo_prefetch >= 0 && ifo_prefetch < size) {
 :   __builtin_prefetch(ops_[ifo_prefetch], 0, 
PREFETCH_HINT_T0);
 : }
 : if (op_prefetch >= 0 && op_prefetch < size) {
 :   const auto* op = ops_[op_prefetch]->write_op.get();
 :   if (op) {
 : __builtin_prefetch(>row().isset_bitmap_, 0, 
PREFETCH_HINT_T0);
 :   }
 : }
 : if (row_prefetch >= 0 && row_prefetch < size) {
 :   const auto* op = ops_[row_prefetch]->write_op.get();
 :   if (op) {
 : __builtin_prefetch(op->row().isset_bitmap_, 0, 
PREFETCH_HINT_T0);
 :   }
Just making sure I understand this bit: while ops_ are all allocated in the 
arena, the underlying KuduWriteOps may still strewn strewn about since they're 
allocated by clients, which makes this optimization still valuable, right?



--
To view, visit http://gerrit.cloudera.org:8080/16189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I446a8d21253b7a274872bff6d3e76705ac95d0d5
Gerrit-Change-Number: 16189
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 14 Jul 2020 01:15:08 +
Gerrit-HasComments: Yes


[kudu-CR] client: allocate InFlightOp from a Batcher Arena

2020-07-13 Thread Bankim Bhavsar (Code Review)
Bankim Bhavsar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16189 )

Change subject: client: allocate InFlightOp from a Batcher Arena
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I446a8d21253b7a274872bff6d3e76705ac95d0d5
Gerrit-Change-Number: 16189
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 14 Jul 2020 00:28:58 +
Gerrit-HasComments: No


[kudu-CR] client: allocate InFlightOp from a Batcher Arena

2020-07-13 Thread Todd Lipcon (Code Review)
Hello Alexey Serbin, Andrew Wong,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/16189

to review the following change.


Change subject: client: allocate InFlightOp from a Batcher Arena
..

client: allocate InFlightOp from a Batcher Arena

InFlightOps are always associated with a single Batcher, and Batchers
are always associated with a single "flush". Given that, we can allocate
them from a Batcher-local Arena instead of from the heap. This improves
CPU consumption and throughput of the client by about 10%.

Before:

INSERT report
rows total: 4000
time total: 10302.3 ms
  time per row: 0.000257559 ms
Dropping auto-created table 
'default.loadgen_auto_0334edaa55fe4fdba242b0b4fcc5d360'

 Performance counter stats for './build/latest/bin/kudu perf loadgen localhost 
-num_rows_per_thread=500 -num_threads=8':

 100857.18 msec task-clock#9.738 CPUs utilized
124584  context-switches  #0.001 M/sec
  6047  cpu-migrations#0.060 K/sec
 31199  page-faults   #0.309 K/sec
  412192743252  cycles#4.087 GHz
  (83.24%)
   95984615757  stalled-cycles-frontend   #   23.29% frontend cycles 
idle (83.35%)
   80498019077  stalled-cycles-backend#   19.53% backend cycles 
idle  (83.22%)
  225757843097  instructions  #0.55  insn per cycle
  #0.43  stalled cycles per 
insn  (83.38%)
   43262879676  branches  #  428.952 M/sec  
  (83.37%)
 276103070  branch-misses #0.64% of all branches
  (83.43%)

  10.356736781 seconds time elapsed

  98.263563000 seconds user
   2.646264000 seconds sys

After:

INSERT report
rows total: 4000
time total: 9730.09 ms
  time per row: 0.000243252 ms
Dropping auto-created table 
'default.loadgen_auto_ffc0ea72dd064db7b3fa12b037454eb2'

 Performance counter stats for './build/latest/bin/kudu.client-opt perf loadgen 
localhost -num_rows_per_thread=500 -num_threads=8':

  91317.10 msec task-clock#9.327 CPUs utilized
 57090  context-switches  #0.625 K/sec
  6661  cpu-migrations#0.073 K/sec
 34568  page-faults   #0.379 K/sec
  374170608484  cycles#4.097 GHz
  (83.31%)
   82165215635  stalled-cycles-frontend   #   21.96% frontend cycles 
idle (83.35%)
   57601295335  stalled-cycles-backend#   15.39% backend cycles 
idle  (83.25%)
  218776486065  instructions  #0.58  insn per cycle
  #0.38  stalled cycles per 
insn  (83.35%)
   41762336591  branches  #  457.333 M/sec  
  (83.34%)
 140528380  branch-misses #0.34% of all branches
  (83.40%)

   9.790174460 seconds time elapsed

  89.079083000 seconds user
   2.276474000 seconds sys

Change-Id: I446a8d21253b7a274872bff6d3e76705ac95d0d5
---
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
2 files changed, 10 insertions(+), 8 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/89/16189/1
--
To view, visit http://gerrit.cloudera.org:8080/16189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I446a8d21253b7a274872bff6d3e76705ac95d0d5
Gerrit-Change-Number: 16189
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong