[kudu-CR] cfile set: reduce memory usage of reader map

2017-03-24 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: cfile_set: reduce memory usage of reader map
..


cfile_set: reduce memory usage of reader map

We previously used an unordered_map>. This patch
changes to using boost's flat_map container, which is based on sorted
entries in a vector. This means we get one contiguous memory location vs
a bunch of smaller per-item allocations in the separate-chained
hashtable. Additionally switches to unique_ptr which avoids an extra
16-byte control structure per CFileReader.

It's hard to measure the exact win here, but I wrote a little test
program which compared inserting 40 entries into the two types of maps,
leaking the maps, and looking at the LeakSanitizer reports. With the old
map and shared_ptr, it cost 2832 bytes, and with the flat_map and
unique_ptr, it only took 824 bytes. So, we saved about 50 bytes per
entry.

Extrapolating this out to a server with 600k cfiles open, we'll save
around 28MB. Not huge, but worth doing.

Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Reviewed-on: http://gerrit.cloudera.org:8080/6466
Reviewed-by: Adar Dembo 
Tested-by: Kudu Jenkins
---
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
2 files changed, 9 insertions(+), 6 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] cfile set: reduce memory usage of reader map

2017-03-24 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: cfile_set: reduce memory usage of reader map
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] cfile set: reduce memory usage of reader map

2017-03-24 Thread Todd Lipcon (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6466

to look at the new patch set (#3).

Change subject: cfile_set: reduce memory usage of reader map
..

cfile_set: reduce memory usage of reader map

We previously used an unordered_map>. This patch
changes to using boost's flat_map container, which is based on sorted
entries in a vector. This means we get one contiguous memory location vs
a bunch of smaller per-item allocations in the separate-chained
hashtable. Additionally switches to unique_ptr which avoids an extra
16-byte control structure per CFileReader.

It's hard to measure the exact win here, but I wrote a little test
program which compared inserting 40 entries into the two types of maps,
leaking the maps, and looking at the LeakSanitizer reports. With the old
map and shared_ptr, it cost 2832 bytes, and with the flat_map and
unique_ptr, it only took 824 bytes. So, we saved about 50 bytes per
entry.

Extrapolating this out to a server with 600k cfiles open, we'll save
around 28MB. Not huge, but worth doing.

Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
---
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
2 files changed, 9 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/6466/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] cfile set: reduce memory usage of reader map

2017-03-24 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: cfile_set: reduce memory usage of reader map
..


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6466/2/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

PS2, Line 122: flat_hash_map
Not "flat_map"?


-- 
To view, visit http://gerrit.cloudera.org:8080/6466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] cfile set: reduce memory usage of reader map

2017-03-23 Thread Todd Lipcon (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6466

to look at the new patch set (#2).

Change subject: cfile_set: reduce memory usage of reader map
..

cfile_set: reduce memory usage of reader map

We previously used an unordered_map>. This patch
changes to using boost's flat_map container, which is based on sorted
entries in a vector. This means we get one contiguous memory location vs
a bunch of smaller per-item allocations in the separate-chained
hashtable. Additionally switches to unique_ptr which avoids an extra
16-byte control structure per CFileReader.

It's hard to measure the exact win here, but I wrote a little test
program which compared inserting 40 entries into the two types of maps,
leaking the maps, and looking at the LeakSanitizer reports. With the old
map and shared_ptr, it cost 2832 bytes, and with the flat_map and
unique_ptr, it only took 824 bytes. So, we saved about 50 bytes per
entry.

Extrapolating this out to a server with 600k cfiles open, we'll save
around 28MB. Not huge, but worth doing.

Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
---
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
2 files changed, 9 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/6466/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] cfile set: reduce memory usage of reader map

2017-03-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: cfile_set: reduce memory usage of reader map
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6466/1/src/kudu/tablet/cfile_set.cc
File src/kudu/tablet/cfile_set.cc:

Line 103: gscoped_ptr reader;
> Can you convert to unique_ptr in the affected call sites? Or does that spil
probably the latter, but I'll give it a go in a second patch on top of this one 
to keep the patches focused


http://gerrit.cloudera.org:8080/#/c/6466/1/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

Line 121:   // Map of column ID to reader. These are lazily initialized as 
needed.
> May want to leave a comment justifying flat_map, since it's somewhat unorth
Done


PS1, Line 122:  >
> Nit: eliminate empty space.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] cfile set: reduce memory usage of reader map

2017-03-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: cfile_set: reduce memory usage of reader map
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6466/1/src/kudu/tablet/cfile_set.cc
File src/kudu/tablet/cfile_set.cc:

Line 103: gscoped_ptr reader;
Can you convert to unique_ptr in the affected call sites? Or does that spill 
out all over the place?


http://gerrit.cloudera.org:8080/#/c/6466/1/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

Line 121:   // Map of column ID to reader. These are lazily initialized as 
needed.
May want to leave a comment justifying flat_map, since it's somewhat unorthodox 
for us.


PS1, Line 122:  >
Nit: eliminate empty space.


-- 
To view, visit http://gerrit.cloudera.org:8080/6466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] cfile set: reduce memory usage of reader map

2017-03-23 Thread Todd Lipcon (Code Review)
Hello Adar Dembo,

I'd like you to do a code review.  Please visit

http://gerrit.cloudera.org:8080/6466

to review the following change.

Change subject: cfile_set: reduce memory usage of reader map
..

cfile_set: reduce memory usage of reader map

We previously used an unordered_map>. This patch
changes to using boost's flat_map container, which is based on sorted
entries in a vector. This means we get one contiguous memory location vs
a bunch of smaller per-item allocations in the separate-chained
hashtable. Additionally switches to unique_ptr which avoids an extra
16-byte control structure per CFileReader.

It's hard to measure the exact win here, but I wrote a little test
program which compared inserting 40 entries into the two types of maps,
leaking the maps, and looking at the LeakSanitizer reports. With the old
map and shared_ptr, it cost 2832 bytes, and with the flat_map and
unique_ptr, it only took 824 bytes. So, we saved about 50 bytes per
entry.

Extrapolating this out to a server with 600k cfiles open, we'll save
around 28MB. Not huge, but worth doing.

Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
---
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
2 files changed, 7 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/6466/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo