[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2020-05-26 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/5905 )

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..


Patch Set 8:

> Without positive (statistically significant) perf results I'm not
 > sure it's worth doing this patch (at least the std::function struct
 > stuff which is somewhat "weird" looking).
 >
 > I seem to recall that I did this work originally as part of the
 > integration of KRPC into Impala, and Impala was suffering majorly
 > from the "AllocLarge" bottleneck on older versions of tcmalloc
 > (fixed in the versions we use today). Impala did a lot of >1M
 > allocations which went throuhg the AllocLarge slow path while
 > holding the page heap lock, so any time that the RPC threads had to
 > go through the central free list, they were likely to get blocked
 > behind some other thread holding that lock. This made the
 > cross-thread allocations really bad.
 >
 > Now with more modern tcmalloc, it's still preferable to avoid the
 > central free lists, but I don't think it's worth being too crazy
 > without showing a real benefit.

Thank you for the information, Todd.  Yep, it seems this patch doesn't add a 
lot of improvement with current version of gperftools, etc.  I posted the 
latest version with benchmark results just for posterity.


--
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-Change-Number: 5905
Gerrit-PatchSet: 8
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 27 May 2020 04:25:51 +
Gerrit-HasComments: No


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2020-05-26 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded a new patch set (#8) to the change originally 
created by Todd Lipcon. ( http://gerrit.cloudera.org:8080/5905 )

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..

KUDU-1865 (part 1): reduce some cross-thread allocations

Per the analysis in the JIRA, each RPC caused two "cross-thread"
allocations of ReactorTasks (one in the client, one in the server).
These cross-thread allocations harm tcmalloc caching.

ReactorTasks don't actually need to be heap-allocated -- that was only
an easy mechanism to use a normal-looking "interface" paradigm. Instead,
if we use a struct with some std::functions in it, and std::move() it
to/from the pending tasks container, we avoid the heap allocation. More
importantly, we avoid the worst kind of heap allocation which is
allocated on one thread and freed on another.

Below are results from various RPC benchmark tests.  The improvement
is minuscule (less than 1%) for synthetic tests scenarios simulating
real-world concurrent workloads.


Using boost::function in ReactorTask


CentOS 7.5, RELEASE configuration built with gcc (GCC) 4.8.5

Without patch:
  GetTableLocations PRC:  29902.1 req/sec
  GetTableSchema RPC   : 207100.0 req/sec

With patch:
  GetTableLocations PRC:  30202.9 req/sec
  GetTableSchema RPC   : 207674.0 req/sec



Results from rpc-bench, 60 seconds runtime
  Mode:Sync
  Client threads:   16
  Worker threads:   1
  Server reactors:  4
  Encryption:   0

Without patch:
  Reqs/sec: 155254
  User CPU per req: 31.7217us
  Sys CPU per req:  51.3626us
  Ctx Sw. per req:  3.46345

With patch:
  Reqs/sec: 155015
  User CPU per req: 32.9851us
  Sys CPU per req:  51.1958us
  Ctx Sw. per req:  3.48085


Using std::function in ReactorTask


GetTableLocations RPC (RELEASE build with third-party CLANG, 60 seconds):
  Without patch: 15528.3 req/sec
  Withpatch: 15535.4 req/sec



GetTableSchema RPC (RELEASE build with third-party CLANG, 60 seconds):
  Without patch: 103892 req/sec
  Withpatch: 102711 req/sec



Results from rpc-bench, 60 seconds runtime
  Mode:Sync
  Client threads:   16
  Worker threads:   1
  Server reactors:  4
  Encryption:   0
(RELEASE build with third-party CLANG):

Without patch:
  Reqs/sec: 68191.4
  User CPU per req: 34.8228us
  Sys CPU per req:  172.426us
  Ctx Sw. per req:  3.58111
  Server reactor load histogram
  Count: 11980
  Mean: 36.4902
  Percentiles:
 0%  (min) = 16
25%= 26
50%  (med) = 28
75%= 31
95%= 78
99%= 84
99.9%  = 88
99.99% = 90
100% (max) = 91

  Server reactor latency histogram
  Count: 13517690
  Mean: 29.8539
  Percentiles:
 0%  (min) = 0
25%= 18
50%  (med) = 28
75%= 36
95%= 58
99%= 86
99.9%  = 135
99.99% = 191
100% (max) = 28239



With patch:
  Reqs/sec: 67295.2
  User CPU per req: 37.7613us
  Sys CPU per req:  171.141us
  Ctx Sw. per req:  3.59101
  Server reactor load histogram
  Count: 11979
  Mean: 36.1685
  Percentiles:
 0%  (min) = 16
25%= 26
50%  (med) = 28
75%= 32
95%= 80
99%= 87
99.9%  = 91
99.99% = 93
100% (max) = 93
  Server reactor latency histogram
  Count: 13141249
  Mean: 30.3945
  Percentiles:
 0%  (min) = 0
25%= 18
50%  (med) = 29
75%= 37
95%= 58
99%= 86
99.9%  = 141
99.99% = 204
100% (max) = 26220

Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
---
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/messenger.cc
M src/kudu/rpc/messenger.h
M src/kudu/rpc/negotiation.cc
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/reactor.h
7 files changed, 184 insertions(+), 283 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/05/5905/8
--
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-Change-Number: 5905
Gerrit-PatchSet: 8
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-R

[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2020-05-26 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/5905 )

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..


Patch Set 7:

Without positive (statistically significant) perf results I'm not sure it's 
worth doing this patch (at least the std::function struct stuff which is 
somewhat "weird" looking).

I seem to recall that I did this work originally as part of the integration of 
KRPC into Impala, and Impala was suffering majorly from the "AllocLarge" 
bottleneck on older versions of tcmalloc (fixed in the versions we use today). 
Impala did a lot of >1M allocations which went throuhg the AllocLarge slow path 
while holding the page heap lock, so any time that the RPC threads had to go 
through the central free list, they were likely to get blocked behind some 
other thread holding that lock. This made the cross-thread allocations really 
bad.

Now with more modern tcmalloc, it's still preferable to avoid the central free 
lists, but I don't think it's worth being too crazy without showing a real 
benefit.


--
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-Change-Number: 5905
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 27 May 2020 03:58:23 +
Gerrit-HasComments: No


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2020-05-26 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/5905 )

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5905/7//COMMIT_MSG
Commit Message:

PS7:
I also replaced std::function with boost::function in PS7 and ran synthetic 
tests ("master-test --gtest_filter='*ConcurrentGetT
ableSchemaTest.Rpc*'" and "table_locations-itest 
--gtest_filter='*.GetTableLocationsBenchmark") on CentOS 7.5 machine with gcc 
(GCC) 4.8.5 20150623 (Red Hat 4.8.5-28).  From the results below it's clear 
that the improvement is minuscule (under 1%).



KUDU-1865 with boost::function instead of std::function;
CentOS 7.5, RELEASE configuration built with gcc (GCC) 4.8.5
  
Without patch:  
  GetTableLocations PRC:  29902.1 req/sec   
  GetTableSchema RPC   : 207100.0 req/sec
  
With patch: 
  GetTableLocations PRC:  30202.9 req/sec
  GetTableSchema RPC   : 207674.0 req/sec



As for rpc-bench, the improvements are tiny as well:

Without patch:

Mode:Sync
Client threads:   16
Worker threads:   1
Server reactors:  4
Encryption:   0
--
Reqs/sec: 155254
User CPU per req: 31.7217us
Sys CPU per req:  51.3626us
Ctx Sw. per req:  3.46345



With patch:

Mode:Sync
Client threads:   16
Worker threads:   1
Server reactors:  4
Encryption:   0
--
Reqs/sec: 155015
User CPU per req: 32.9851us
Sys CPU per req:  51.1958us
Ctx Sw. per req:  3.48085



--
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-Change-Number: 5905
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 26 May 2020 23:01:41 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2020-05-26 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/5905 )

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..


Patch Set 7:

> I am curious, Did you see an improvement in some of the RPC
 > benchmarks as a result of this change?

The improvements are minuscule and sometimes there is improvement at all in 
synthetic test scenarios (such as GetTableLocations and GetTableSchema 
concurrent RPC scenarios).  I also tried to replace std::function with 
boost::function in ReactorTask, and that helps a bit with number of allocations 
per tcmalloc's allocation tracing, but not much as for the real-world 
benchmarks.

Below are some results:

GetTableLocations RPC (RELEASE build with third-party CLANG, 60 seconds):
  Without patch: 15528.3 req/sec
  Withpatch: 15535.4 req/sec



GetTableSchema RPC (RELEASE build with third-party CLANG, 60 seconds):
  Without patch: 103892 req/sec
  Withpatch: 102711 req/sec

Results from rpc-bench, 60 seconds runtime
  Mode:Sync
  Client threads:   16
  Worker threads:   1
  Server reactors:  4
  Encryption:   0
(RELEASE build with third-party CLANG):

Without patch:
  Reqs/sec: 68191.4
  User CPU per req: 34.8228us
  Sys CPU per req:  172.426us
  Ctx Sw. per req:  3.58111
  Server reactor load histogram
  Count: 11980
  Mean: 36.4902
  Percentiles:
 0%  (min) = 16
25%= 26
50%  (med) = 28
75%= 31
95%= 78
99%= 84
99.9%  = 88
99.99% = 90
100% (max) = 91
  Server reactor latency histogram
  Count: 13517690
  Mean: 29.8539
  Percentiles:
 0%  (min) = 0
25%= 18
50%  (med) = 28
75%= 36
95%= 58
99%= 86
99.9%  = 135
99.99% = 191
100% (max) = 28239

--

With patch:
  Reqs/sec: 67295.2
  User CPU per req: 37.7613us
  Sys CPU per req:  171.141us
  Ctx Sw. per req:  3.59101
  Server reactor load histogram
  Count: 11979
  Mean: 36.1685
  Percentiles:
 0%  (min) = 16
25%= 26
50%  (med) = 28
75%= 32
95%= 80
99%= 87
99.9%  = 91
99.99% = 93
100% (max) = 93
  Server reactor latency histogram
  Count: 13141249
  Mean: 30.3945
  Percentiles:
 0%  (min) = 0
25%= 18
50%  (med) = 29
75%= 37
95%= 58
99%= 86
99.9%  = 141
99.99% = 204
100% (max) = 26220


--
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-Change-Number: 5905
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 26 May 2020 22:31:39 +
Gerrit-HasComments: No


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2020-05-26 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/5905 )

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..


Patch Set 7:

I am curious, Did you see an improvement in some of the RPC benchmarks as a 
result of this change?


--
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-Change-Number: 5905
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 26 May 2020 21:36:28 +
Gerrit-HasComments: No


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2020-05-26 Thread Bankim Bhavsar (Code Review)
Bankim Bhavsar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/5905 )

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5905/7/src/kudu/rpc/connection.cc
File src/kudu/rpc/connection.cc:

http://gerrit.cloudera.org:8080/#/c/5905/7/src/kudu/rpc/connection.cc@597
PS7, Line 597: [=](ReactorThread* /*rt*/) {
 :   this->QueueOutbound(unique_ptr(t));
 : },
 : [=](const Status& s) {
 :   t->Abort(s);
 :   delete t;
 : },
Is there a possibility of a race wrt OutboundTransfer object ptr 't'?

Underlying t could be deleted by Abort while Run is active and under mgmt of 
unique_ptr?


http://gerrit.cloudera.org:8080/#/c/5905/7/src/kudu/rpc/connection.cc@596
PS7, Line 596:   ReactorTask task = {
 : [=](ReactorThread* /*rt*/) {
 :   this->QueueOutbound(unique_ptr(t));
 : },
 : [=](const Status& s) {
 :   t->Abort(s);
 :   delete t;
 : },
 :   };
Hmm, this is an interesting mechanism to create an anonymous class from a base 
class using lambdas.
Is there a way to specify the function supplied/overridden?

Add a comment about the function being supplied if no such mechanisms is 
available.


http://gerrit.cloudera.org:8080/#/c/5905/7/src/kudu/rpc/connection.cc@818
PS7, Line 818:  // The lambdas below need to hold a reference to the connection 
in case
 :   // it has been closed before the task gets scheduled/aborted.
Can you explain why that wouldn't apply above on queueing response?



--
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-Change-Number: 5905
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 26 May 2020 18:12:29 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2020-05-25 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded a new patch set (#7) to the change originally 
created by Todd Lipcon. ( http://gerrit.cloudera.org:8080/5905 )

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..

KUDU-1865 (part 1): reduce some cross-thread allocations

Per the analysis in the JIRA, each RPC caused two "cross-thread"
allocations of ReactorTasks (one in the client, one in the server).
These cross-thread allocations harm tcmalloc caching.

ReactorTasks don't actually need to be heap-allocated -- that was only
an easy mechanism to use a normal-looking "interface" paradigm. Instead,
if we use a struct with some std::functions in it, and std::move() it
to/from the pending tasks container, we avoid the heap allocation. More
importantly, we avoid the worst kind of heap allocation which is
allocated on one thread and freed on another.

I verified that this removed the cross-thread allocation traffic
using tcmalloc tracing and the script posted to the JIRA.

Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
---
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/messenger.cc
M src/kudu/rpc/messenger.h
M src/kudu/rpc/negotiation.cc
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/reactor.h
7 files changed, 179 insertions(+), 284 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/05/5905/7
--
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-Change-Number: 5905
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2020-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has restored this change. ( http://gerrit.cloudera.org:8080/5905 )

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..


Restored
--
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: restore
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-Change-Number: 5905
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2017-11-02 Thread Todd Lipcon (Code Review)
Todd Lipcon has abandoned this change. ( http://gerrit.cloudera.org:8080/5905 )

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..


Abandoned

Actually it looks like a lot of this code got change, and the patch is full of 
conflicts, so I'm gonna abandon this for now.
--
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-Change-Number: 5905
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2017-11-02 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/5905 )

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5905/4/src/kudu/rpc/connection.cc
File src/kudu/rpc/connection.cc:

http://gerrit.cloudera.org:8080/#/c/5905/4/src/kudu/rpc/connection.cc@408
PS4, Line 408: OutboundTransfer* t = 
OutboundTransfer::CreateForCallResponse(slices, cb);
> please leave some comment about the lifetime and ownership of 't'
done. also removed the comment above which should have been removed in an 
earlier commit (it no longer applies to this code here)



--
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-Change-Number: 5905
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 03 Nov 2017 00:14:08 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2017-10-10 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/5905 )

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5905/4/src/kudu/rpc/connection.cc
File src/kudu/rpc/connection.cc:

http://gerrit.cloudera.org:8080/#/c/5905/4/src/kudu/rpc/connection.cc@408
PS4, Line 408: OutboundTransfer* t = 
OutboundTransfer::CreateForCallResponse(slices, cb);
please leave some comment about the lifetime and ownership of 't'



--
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-Change-Number: 5905
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 10 Oct 2017 21:06:58 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2017-10-09 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/5905 )

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5905/4/src/kudu/rpc/connection.cc
File src/kudu/rpc/connection.cc:

http://gerrit.cloudera.org:8080/#/c/5905/4/src/kudu/rpc/connection.cc@a391
PS4, Line 391:
> This may be a naive question but is there a reason to not keep this class i
The issue with using a class is that it has to be heap-allocated rather than 
just being used as a value type. Using the struct with function members makes 
it moveable and avoids allocation.



--
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-Change-Number: 5905
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 09 Oct 2017 23:25:56 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2017-07-07 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5905/4/src/kudu/rpc/connection.cc
File src/kudu/rpc/connection.cc:

PS4, Line 391: 
This may be a naive question but is there a reason to not keep this class 
instead ?


-- 
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2017-04-05 Thread Todd Lipcon (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5905

to look at the new patch set (#3).

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..

KUDU-1865 (part 1): reduce some cross-thread allocations

Per the analysis in the JIRA, each RPC caused two "cross-thread"
allocations of ReactorTasks (one in the client, one in the server).
These cross-thread allocations harm tcmalloc caching.

ReactorTasks don't actually need to be heap-allocated -- that was only
an easy mechanism to use a normal-looking "interface" paradigm. Instead,
if we use a struct with some std::functions in it, and std::move() it
to/from the pending tasks container, we avoid the heap allocation. More
importantly, we avoid the worst kind of heap allocation which is
allocated on one thread and freed on another.

I verified that this removed the cross-thread allocation traffic using
tcmalloc tracing and the script posted to the JIRA.

Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
---
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/messenger.cc
M src/kudu/rpc/messenger.h
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/reactor.h
6 files changed, 154 insertions(+), 249 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/05/5905/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2017-02-09 Thread Sailesh Mukil (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5905

to look at the new patch set (#2).

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..

KUDU-1865 (part 1): reduce some cross-thread allocations

Per the analysis in the JIRA, each RPC caused two "cross-thread"
allocations of ReactorTasks (one in the client, one in the server).
These cross-thread allocations harm tcmalloc caching.

ReactorTasks don't actually need to be heap-allocated -- that was only
an easy mechanism to use a normal-looking "interface" paradigm. Instead,
if we use a struct with some std::functions in it, and std::move() it
to/from the pending tasks container, we avoid the heap allocation. More
importantly, we avoid the worst kind of heap allocation which is
allocated on one thread and freed on another.

I verified that this removed the cross-thread allocation traffic using
tcmalloc tracing and the script posted to the JIRA.

Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
---
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/messenger.cc
M src/kudu/rpc/messenger.h
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/reactor.h
6 files changed, 152 insertions(+), 249 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/05/5905/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2017-02-09 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..


Patch Set 1:

nope... doesn't ring a bell. I wouldn't let those block some perf testing, and 
then if we see a big perf win we can come back and make the patch test clean?

-- 
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2017-02-09 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..


Patch Set 1:

> (2 comments)
 > 
 > I found a couple of bugs, but there are still some memory leaks I
 > haven't been able to figure out. This now passes DEBUG testing.
 > I'll keep at it.

I looked at this a bit more and it seems that the leaks are coming from 
ThreadRestrictions::LoadTLS() and KernelStackWatchdog::GetTLS().

Any idea how there could be leaks there, since that code hasn't changed for 
this patch?

-- 
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2017-02-06 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..


Patch Set 1:

(2 comments)

I found a couple of bugs, but there are still some memory leaks I haven't been 
able to figure out. This now passes DEBUG testing. I'll keep at it.

http://gerrit.cloudera.org:8080/#/c/5905/1/src/kudu/rpc/connection.cc
File src/kudu/rpc/connection.cc:

Line 621: DCHECK(this->reactor_thread()->reactor()->closing());
This is a use-after-free if the Connection object is shutdown and cleared from 
the ReactorThread::server_conns_ list. Since this new way of passing function 
pointers doesn't increase the refcount of the 'Connection' object, the object 
can get freed before this abort_func() gets called.


http://gerrit.cloudera.org:8080/#/c/5905/1/src/kudu/rpc/reactor.cc
File src/kudu/rpc/reactor.cc:

Line 596:   };
This still needs a dummy 'abort_func()', else it results in a runtime error.


-- 
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1865 (part 1): reduce some cross-thread allocations

2017-02-03 Thread Todd Lipcon (Code Review)
Hello Sailesh Mukil,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/5905

to review the following change.

Change subject: KUDU-1865 (part 1): reduce some cross-thread allocations
..

KUDU-1865 (part 1): reduce some cross-thread allocations

Per the analysis in the JIRA, each RPC caused two "cross-thread"
allocations of ReactorTasks (one in the client, one in the server).
These cross-thread allocations harm tcmalloc caching.

ReactorTasks don't actually need to be heap-allocated -- that was only
an easy mechanism to use a normal-looking "interface" paradigm. Instead,
if we use a struct with some std::functions in it, and std::move() it
to/from the pending tasks container, we avoid the heap allocation. More
importantly, we avoid the worst kind of heap allocation which is
allocated on one thread and freed on another.

I verified that this removed the cross-thread allocation traffic using
tcmalloc tracing and the script posted to the JIRA.

Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
---
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/messenger.cc
M src/kudu/rpc/messenger.h
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/reactor.h
6 files changed, 148 insertions(+), 249 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/05/5905/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7d4d5f14fb302196b1797c712b21cfce81f157c1
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Sailesh Mukil