[kudu-CR] cfile set: reduce memory usage of reader map
Todd Lipcon has submitted this change and it was merged. Change subject: cfile_set: reduce memory usage of reader map .. cfile_set: reduce memory usage of reader map We previously used an unordered_map>. This patch changes to using boost's flat_map container, which is based on sorted entries in a vector. This means we get one contiguous memory location vs a bunch of smaller per-item allocations in the separate-chained hashtable. Additionally switches to unique_ptr which avoids an extra 16-byte control structure per CFileReader. It's hard to measure the exact win here, but I wrote a little test program which compared inserting 40 entries into the two types of maps, leaking the maps, and looking at the LeakSanitizer reports. With the old map and shared_ptr, it cost 2832 bytes, and with the flat_map and unique_ptr, it only took 824 bytes. So, we saved about 50 bytes per entry. Extrapolating this out to a server with 600k cfiles open, we'll save around 28MB. Not huge, but worth doing. Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4 Reviewed-on: http://gerrit.cloudera.org:8080/6466 Reviewed-by: Adar Dembo Tested-by: Kudu Jenkins --- M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h 2 files changed, 9 insertions(+), 6 deletions(-) Approvals: Adar Dembo: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/6466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] cfile set: reduce memory usage of reader map
Adar Dembo has posted comments on this change. Change subject: cfile_set: reduce memory usage of reader map .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] cfile set: reduce memory usage of reader map
Hello Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6466 to look at the new patch set (#3). Change subject: cfile_set: reduce memory usage of reader map .. cfile_set: reduce memory usage of reader map We previously used an unordered_map>. This patch changes to using boost's flat_map container, which is based on sorted entries in a vector. This means we get one contiguous memory location vs a bunch of smaller per-item allocations in the separate-chained hashtable. Additionally switches to unique_ptr which avoids an extra 16-byte control structure per CFileReader. It's hard to measure the exact win here, but I wrote a little test program which compared inserting 40 entries into the two types of maps, leaking the maps, and looking at the LeakSanitizer reports. With the old map and shared_ptr, it cost 2832 bytes, and with the flat_map and unique_ptr, it only took 824 bytes. So, we saved about 50 bytes per entry. Extrapolating this out to a server with 600k cfiles open, we'll save around 28MB. Not huge, but worth doing. Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4 --- M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h 2 files changed, 9 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/6466/3 -- To view, visit http://gerrit.cloudera.org:8080/6466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] cfile set: reduce memory usage of reader map
Adar Dembo has posted comments on this change. Change subject: cfile_set: reduce memory usage of reader map .. Patch Set 2: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/6466/2/src/kudu/tablet/cfile_set.h File src/kudu/tablet/cfile_set.h: PS2, Line 122: flat_hash_map Not "flat_map"? -- To view, visit http://gerrit.cloudera.org:8080/6466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] cfile set: reduce memory usage of reader map
Hello Adar Dembo, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6466 to look at the new patch set (#2). Change subject: cfile_set: reduce memory usage of reader map .. cfile_set: reduce memory usage of reader map We previously used an unordered_map>. This patch changes to using boost's flat_map container, which is based on sorted entries in a vector. This means we get one contiguous memory location vs a bunch of smaller per-item allocations in the separate-chained hashtable. Additionally switches to unique_ptr which avoids an extra 16-byte control structure per CFileReader. It's hard to measure the exact win here, but I wrote a little test program which compared inserting 40 entries into the two types of maps, leaking the maps, and looking at the LeakSanitizer reports. With the old map and shared_ptr, it cost 2832 bytes, and with the flat_map and unique_ptr, it only took 824 bytes. So, we saved about 50 bytes per entry. Extrapolating this out to a server with 600k cfiles open, we'll save around 28MB. Not huge, but worth doing. Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4 --- M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h 2 files changed, 9 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/6466/2 -- To view, visit http://gerrit.cloudera.org:8080/6466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] cfile set: reduce memory usage of reader map
Todd Lipcon has posted comments on this change. Change subject: cfile_set: reduce memory usage of reader map .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/6466/1/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: Line 103: gscoped_ptr reader; > Can you convert to unique_ptr in the affected call sites? Or does that spil probably the latter, but I'll give it a go in a second patch on top of this one to keep the patches focused http://gerrit.cloudera.org:8080/#/c/6466/1/src/kudu/tablet/cfile_set.h File src/kudu/tablet/cfile_set.h: Line 121: // Map of column ID to reader. These are lazily initialized as needed. > May want to leave a comment justifying flat_map, since it's somewhat unorth Done PS1, Line 122: > > Nit: eliminate empty space. Done -- To view, visit http://gerrit.cloudera.org:8080/6466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] cfile set: reduce memory usage of reader map
Adar Dembo has posted comments on this change. Change subject: cfile_set: reduce memory usage of reader map .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/6466/1/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: Line 103: gscoped_ptr reader; Can you convert to unique_ptr in the affected call sites? Or does that spill out all over the place? http://gerrit.cloudera.org:8080/#/c/6466/1/src/kudu/tablet/cfile_set.h File src/kudu/tablet/cfile_set.h: Line 121: // Map of column ID to reader. These are lazily initialized as needed. May want to leave a comment justifying flat_map, since it's somewhat unorthodox for us. PS1, Line 122: > Nit: eliminate empty space. -- To view, visit http://gerrit.cloudera.org:8080/6466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: Yes
[kudu-CR] cfile set: reduce memory usage of reader map
Hello Adar Dembo, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/6466 to review the following change. Change subject: cfile_set: reduce memory usage of reader map .. cfile_set: reduce memory usage of reader map We previously used an unordered_map>. This patch changes to using boost's flat_map container, which is based on sorted entries in a vector. This means we get one contiguous memory location vs a bunch of smaller per-item allocations in the separate-chained hashtable. Additionally switches to unique_ptr which avoids an extra 16-byte control structure per CFileReader. It's hard to measure the exact win here, but I wrote a little test program which compared inserting 40 entries into the two types of maps, leaking the maps, and looking at the LeakSanitizer reports. With the old map and shared_ptr, it cost 2832 bytes, and with the flat_map and unique_ptr, it only took 824 bytes. So, we saved about 50 bytes per entry. Extrapolating this out to a server with 600k cfiles open, we'll save around 28MB. Not huge, but worth doing. Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4 --- M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h 2 files changed, 7 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/6466/1 -- To view, visit http://gerrit.cloudera.org:8080/6466 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I3812861a11b00e149ca47b090447bc918c465ad4 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Adar Dembo