[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6687 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If84b46a46efed9aee6af41b5f10bf3f4b15889b8 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. IMPALA-5229: huge page-backed buffers with TCMalloc This commit enables transparent huge pages when we're allocating via malloc(), not just mmap(). This gives us the perf benefits of huge pages, without the challenge that the mmap() path presented - the overhead of mapping and unmapping memory and the difficulty in reasoning about peak virtual memory consumption. Also sneak in some cleanup - use Rvalue refs for BufferHandle methods where appropriate. Testing: Updated backend tests to ensure this combination is covered. Ran some end-to-end tests and stress tests on my buffer pool dev branch and all looks good. Perf: Compared to current master, this provides a pretty clear perf benefit: I ran benchmarks on a single daemon with a reasonably large TPC-H scale factor. Large aggregations are much faster and everything else is the same (within variance) or slightly faster. Report Generated on 2017-04-18 Run Description: "Base: 68f32e52bc42bef578330a4fe0edc5b292891eea vs Ref: f39d69bcd8bdc7d6d4fb42ef19966a26dea3a29d" Cluster Name: UNKNOWN Lab Run Info: UNKNOWN Impala Version: impalad version 2.9.0-SNAPSHOT RELEASE () Baseline Impala Version: impalad version 2.9.0-SNAPSHOT RELEASE (2017-04-06) ++---+-++++ | Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) | ++---+-++++ | TARGETED-PERF(_60) | parquet / none / none | 19.30 | -3.05% | 4.91 | -0.91% | ++---+-++++ +++---++-++++-+---+ | Workload | Query | File Format | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters | +++---++-++++-+---+ | TARGETED-PERF(_60) | PERF_LIMIT-Q1 | parquet / none / none | 0.01 | 0.01| R +22.95% | 6.12%| 2.30%| 1 | 5 | | TARGETED-PERF(_60) | primitive_topn_bigint | parquet / none / none | 5.14 | 4.66| +10.35% | 5.00%| * 13.39% * | 1 | 5 | | TARGETED-PERF(_60) | primitive_conjunct_ordering_4 | parquet / none / none | 0.24 | 0.23| +8.12% | * 12.81% * | 1.76%| 1 | 5 | | TARGETED-PERF(_60) | primitive_broadcast_join_3 | parquet / none / none | 7.86 | 7.39| +6.44% | 1.49%| 1.41%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q6 | parquet / none / none | 10.53 | 10.30 | +2.24% | 0.61%| 0.30%| 1 | 5 | | TARGETED-PERF(_60) | primitive_conjunct_ordering_5 | parquet / none / none | 17.23 | 16.90 | +1.90% | 1.61%| 1.05%| 1 | 5 | | TARGETED-PERF(_60) | primitive_conjunct_ordering_3 | parquet / none / none | 3.19 | 3.13| +1.81% | 1.47%| 0.45%| 1 | 5 | | TARGETED-PERF(_60) | primitive_filter_bigint_non_selective | parquet / none / none | 1.08 | 1.06| +1.60% | 0.26%| 2.07%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q3 | parquet / none / none | 3.75 | 3.71| +1.14% | 0.39%| 0.80%| 1 | 5 | | TARGETED-PERF(_60) | primitive_broadcast_join_2 | parquet / none / none | 5.15 | 5.09| +1.11% | 1.18%| 0.89%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q2 | parquet / none / none | 3.47 | 3.44| +1.03% | 1.27%| 0.61%| 1 | 5 | | TARGETED-PERF(_60) | PERF_AGG-Q1| parquet / none / none | 2.53 | 2.51| +1.01% | 1.75%| 1.91%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q7 | parquet / none / none | 8.37 | 8.31| +0.81% | 0.49%| 0.5
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. Patch Set 6: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/493/ -- To view, visit http://gerrit.cloudera.org:8080/6687 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If84b46a46efed9aee6af41b5f10bf3f4b15889b8 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. Patch Set 5: Code-Review+2 Rebase -- To view, visit http://gerrit.cloudera.org:8080/6687 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If84b46a46efed9aee6af41b5f10bf3f4b15889b8 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6687 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If84b46a46efed9aee6af41b5f10bf3f4b15889b8 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Dan Hecht has posted comments on this change. Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6687 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If84b46a46efed9aee6af41b5f10bf3f4b15889b8 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Tim Armstrong has uploaded a new patch set (#5). Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. IMPALA-5229: huge page-backed buffers with TCMalloc This commit enables transparent huge pages when we're allocating via malloc(), not just mmap(). This gives us the perf benefits of huge pages, without the challenge that the mmap() path presented - the overhead of mapping and unmapping memory and the difficulty in reasoning about peak virtual memory consumption. Also sneak in some cleanup - use Rvalue refs for BufferHandle methods where appropriate. Testing: Updated backend tests to ensure this combination is covered. Ran some end-to-end tests and stress tests on my buffer pool dev branch and all looks good. Perf: Compared to current master, this provides a pretty clear perf benefit: I ran benchmarks on a single daemon with a reasonably large TPC-H scale factor. Large aggregations are much faster and everything else is the same (within variance) or slightly faster. Report Generated on 2017-04-18 Run Description: "Base: 68f32e52bc42bef578330a4fe0edc5b292891eea vs Ref: f39d69bcd8bdc7d6d4fb42ef19966a26dea3a29d" Cluster Name: UNKNOWN Lab Run Info: UNKNOWN Impala Version: impalad version 2.9.0-SNAPSHOT RELEASE () Baseline Impala Version: impalad version 2.9.0-SNAPSHOT RELEASE (2017-04-06) ++---+-++++ | Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) | ++---+-++++ | TARGETED-PERF(_60) | parquet / none / none | 19.30 | -3.05% | 4.91 | -0.91% | ++---+-++++ +++---++-++++-+---+ | Workload | Query | File Format | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters | +++---++-++++-+---+ | TARGETED-PERF(_60) | PERF_LIMIT-Q1 | parquet / none / none | 0.01 | 0.01| R +22.95% | 6.12%| 2.30%| 1 | 5 | | TARGETED-PERF(_60) | primitive_topn_bigint | parquet / none / none | 5.14 | 4.66| +10.35% | 5.00%| * 13.39% * | 1 | 5 | | TARGETED-PERF(_60) | primitive_conjunct_ordering_4 | parquet / none / none | 0.24 | 0.23| +8.12% | * 12.81% * | 1.76%| 1 | 5 | | TARGETED-PERF(_60) | primitive_broadcast_join_3 | parquet / none / none | 7.86 | 7.39| +6.44% | 1.49%| 1.41%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q6 | parquet / none / none | 10.53 | 10.30 | +2.24% | 0.61%| 0.30%| 1 | 5 | | TARGETED-PERF(_60) | primitive_conjunct_ordering_5 | parquet / none / none | 17.23 | 16.90 | +1.90% | 1.61%| 1.05%| 1 | 5 | | TARGETED-PERF(_60) | primitive_conjunct_ordering_3 | parquet / none / none | 3.19 | 3.13| +1.81% | 1.47%| 0.45%| 1 | 5 | | TARGETED-PERF(_60) | primitive_filter_bigint_non_selective | parquet / none / none | 1.08 | 1.06| +1.60% | 0.26%| 2.07%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q3 | parquet / none / none | 3.75 | 3.71| +1.14% | 0.39%| 0.80%| 1 | 5 | | TARGETED-PERF(_60) | primitive_broadcast_join_2 | parquet / none / none | 5.15 | 5.09| +1.11% | 1.18%| 0.89%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q2 | parquet / none / none | 3.47 | 3.44| +1.03% | 1.27%| 0.61%| 1 | 5 | | TARGETED-PERF(_60) | PERF_AGG-Q1| parquet / none / none | 2.53 | 2.51| +1.01% | 1.75%| 1.91%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q7 | parquet / none / none | 8.37 | 8.31| +0.81% | 0.49%| 0.58%| 1
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/6687/4/be/src/runtime/bufferpool/system-allocator.cc File be/src/runtime/bufferpool/system-allocator.cc: PS4, Line 147: The memory region may still be backed by huge pages, but TCMalloc will decommit : // those for us with its "aggressive decommit" mode. > Given our discussion, how about clarifying with something like: Done -- To view, visit http://gerrit.cloudera.org:8080/6687 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If84b46a46efed9aee6af41b5f10bf3f4b15889b8 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Dan Hecht has posted comments on this change. Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/6687/4/be/src/runtime/bufferpool/system-allocator.cc File be/src/runtime/bufferpool/system-allocator.cc: PS4, Line 147: The memory region may still be backed by huge pages, but TCMalloc will decommit : // those for us with its "aggressive decommit" mode. Given our discussion, how about clarifying with something like: "This depends on tcmalloc "aggressive decommit". Otherwise, this large page region may be divvied up and subsequently decommitted in smaller chunks, which may not actually release the physical memory, causing impala physical memory usage to exceed the process limit." Probably could use some more refining, but the point is I think this should be spelled out more. -- To view, visit http://gerrit.cloudera.org:8080/6687 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If84b46a46efed9aee6af41b5f10bf3f4b15889b8 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/6687/2/be/src/runtime/bufferpool/system-allocator.cc File be/src/runtime/bufferpool/system-allocator.cc: Line 50: // Free() assumes that aggressive decommit is enabled for TCMalloc. > Hmm, seems strange the kernel doesn't break the huge page automatically whe It does seem a little strange but I'm not sure what all the heuristics are and how they interact. Yep that's what I'm saying. -- To view, visit http://gerrit.cloudera.org:8080/6687 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If84b46a46efed9aee6af41b5f10bf3f4b15889b8 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Dan Hecht has posted comments on this change. Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/6687/2/be/src/runtime/bufferpool/system-allocator.cc File be/src/runtime/bufferpool/system-allocator.cc: Line 50: // Free() assumes that aggressive decommit is enabled for TCMalloc. > I believe MADV_NOHUGEPAGE just prevents future merging into huge pages and Hmm, seems strange the kernel doesn't break the huge page automatically when needed. But I guess it's possible that it doesn't. So, is what you're saying that we need aggressive decomit so that the entire mapping is unmapped immediately before it's carved up by tc-malloc (and then potentially decommitted in smaller pieces)? -- To view, visit http://gerrit.cloudera.org:8080/6687 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If84b46a46efed9aee6af41b5f10bf3f4b15889b8 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/6687/2/be/src/runtime/bufferpool/system-allocator.cc File be/src/runtime/bufferpool/system-allocator.cc: Line 50: // Free() assumes that aggressive decommit is enabled for TCMalloc. > I saw the comment in Free() but don't follow how that means aggressive deco I believe MADV_NOHUGEPAGE just prevents future merging into huge pages and can leave the memory backed by huge pages. Some people at least claim that madvise(DONTNEED) doesn't work if you call it part of a transparent huge pages: https://www.percona.com/blog/2014/07/23/why-tokudb-hates-transparent-hugepages/ -- To view, visit http://gerrit.cloudera.org:8080/6687 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If84b46a46efed9aee6af41b5f10bf3f4b15889b8 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Dan Hecht has posted comments on this change. Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/6687/2/be/src/runtime/bufferpool/system-allocator.cc File be/src/runtime/bufferpool/system-allocator.cc: Line 50: // Free() assumes that aggressive decommit is enabled for TCMalloc. > The comment in free explains why aggressive decommit is important. Otherwis I saw the comment in Free() but don't follow how that means aggressive decommit is important. It seems to say that the MADV_NOHUGEPAGE is important. But once that happens, can't the (small) pages be returned (i.e. madvise(DONTNEED) individually? -- To view, visit http://gerrit.cloudera.org:8080/6687 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If84b46a46efed9aee6af41b5f10bf3f4b15889b8 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Tim Armstrong has uploaded a new patch set (#4). Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. IMPALA-5229: huge page-backed buffers with TCMalloc This commit enables transparent huge pages when we're allocating via malloc(), not just mmap(). This gives us the perf benefits of huge pages, without the challenge that the mmap() path presented - the overhead of mapping and unmapping memory and the difficulty in reasoning about peak virtual memory consumption. Also sneak in some cleanup - use Rvalue refs for BufferHandle methods where appropriate. Testing: Updated backend tests to ensure this combination is covered. Ran some end-to-end tests and stress tests on my buffer pool dev branch and all looks good. Perf: Compared to current master, this provides a pretty clear perf benefit: I ran benchmarks on a single daemon with a reasonably large TPC-H scale factor. Large aggregations are much faster and everything else is the same (within variance) or slightly faster. Report Generated on 2017-04-18 Run Description: "Base: 68f32e52bc42bef578330a4fe0edc5b292891eea vs Ref: f39d69bcd8bdc7d6d4fb42ef19966a26dea3a29d" Cluster Name: UNKNOWN Lab Run Info: UNKNOWN Impala Version: impalad version 2.9.0-SNAPSHOT RELEASE () Baseline Impala Version: impalad version 2.9.0-SNAPSHOT RELEASE (2017-04-06) ++---+-++++ | Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) | ++---+-++++ | TARGETED-PERF(_60) | parquet / none / none | 19.30 | -3.05% | 4.91 | -0.91% | ++---+-++++ +++---++-++++-+---+ | Workload | Query | File Format | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters | +++---++-++++-+---+ | TARGETED-PERF(_60) | PERF_LIMIT-Q1 | parquet / none / none | 0.01 | 0.01| R +22.95% | 6.12%| 2.30%| 1 | 5 | | TARGETED-PERF(_60) | primitive_topn_bigint | parquet / none / none | 5.14 | 4.66| +10.35% | 5.00%| * 13.39% * | 1 | 5 | | TARGETED-PERF(_60) | primitive_conjunct_ordering_4 | parquet / none / none | 0.24 | 0.23| +8.12% | * 12.81% * | 1.76%| 1 | 5 | | TARGETED-PERF(_60) | primitive_broadcast_join_3 | parquet / none / none | 7.86 | 7.39| +6.44% | 1.49%| 1.41%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q6 | parquet / none / none | 10.53 | 10.30 | +2.24% | 0.61%| 0.30%| 1 | 5 | | TARGETED-PERF(_60) | primitive_conjunct_ordering_5 | parquet / none / none | 17.23 | 16.90 | +1.90% | 1.61%| 1.05%| 1 | 5 | | TARGETED-PERF(_60) | primitive_conjunct_ordering_3 | parquet / none / none | 3.19 | 3.13| +1.81% | 1.47%| 0.45%| 1 | 5 | | TARGETED-PERF(_60) | primitive_filter_bigint_non_selective | parquet / none / none | 1.08 | 1.06| +1.60% | 0.26%| 2.07%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q3 | parquet / none / none | 3.75 | 3.71| +1.14% | 0.39%| 0.80%| 1 | 5 | | TARGETED-PERF(_60) | primitive_broadcast_join_2 | parquet / none / none | 5.15 | 5.09| +1.11% | 1.18%| 0.89%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q2 | parquet / none / none | 3.47 | 3.44| +1.03% | 1.27%| 0.61%| 1 | 5 | | TARGETED-PERF(_60) | PERF_AGG-Q1| parquet / none / none | 2.53 | 2.51| +1.01% | 1.75%| 1.91%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q7 | parquet / none / none | 8.37 | 8.31| +0.81% | 0.49%| 0.58%| 1
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/6687/2/be/src/runtime/bufferpool/system-allocator.cc File be/src/runtime/bufferpool/system-allocator.cc: Line 50: // Free() assumes that aggressive decommit is enabled for TCMalloc. > what about Free is assuming that? why do we care how tc-malloc manages it's The comment in free explains why aggressive decommit is important. Otherwise it's hard to reason about what happens if we return memory backed by huge pages to TCMalloc - TCMalloc definitely doesn't expect that. TCMalloc might break up the memory range into smaller allocations and returns them. It wouldn't be the end of the world, but is hard to reason about the consequences of something like that happening. PS2, Line 120: itcan > missing space Done PS2, Line 121: SMALL_PAGE_SIZE > do we enforce that buffer sizes are larger than 4k? Currently on my development branch, the minimum buffer size I think would be a command-line config, defaulting to 64k. It would make sense to impose a hard minimum on that (or make it non-configurable). I guess I don't have any enforcement that the buffers are > 4k but I think that would make sense to add as a DCHECK somewhere. -- To view, visit http://gerrit.cloudera.org:8080/6687 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If84b46a46efed9aee6af41b5f10bf3f4b15889b8 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Dan Hecht has posted comments on this change. Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/6687/2/be/src/runtime/bufferpool/system-allocator.cc File be/src/runtime/bufferpool/system-allocator.cc: Line 50: // Free() assumes that aggressive decommit is enabled for TCMalloc. what about Free is assuming that? why do we care how tc-malloc manages it's memory as long as it's available for new allocations after we free()? do you mean because we're removing that GC code in the other patch or something else? PS2, Line 120: itcan missing space PS2, Line 121: SMALL_PAGE_SIZE do we enforce that buffer sizes are larger than 4k? -- To view, visit http://gerrit.cloudera.org:8080/6687 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If84b46a46efed9aee6af41b5f10bf3f4b15889b8 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Tim Armstrong has uploaded a new patch set (#2). Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. IMPALA-5229: huge page-backed buffers with TCMalloc This commit enables transparent huge pages when we're allocating via malloc(), not just mmap(). This gives us the perf benefits of huge pages, without the challenge that the mmap() path presented - the overhead of mapping and unmapping memory and the difficulty in reasoning about peak virtual memory consumption. Also sneak in some cleanup - use Rvalue refs for BufferHandle methods where appropriate. Testing: Updated backend tests to ensure this combination is covered. Ran some end-to-end tests and stress tests on my buffer pool dev branch and all looks good. Perf: Compared to current master, this provides a pretty clear perf benefit: I ran benchmarks on a single daemon with a reasonably large TPC-H scale factor. Large aggregations are much faster and everything else is the same (within variance) or slightly faster. Report Generated on 2017-04-18 Run Description: "Base: 68f32e52bc42bef578330a4fe0edc5b292891eea vs Ref: f39d69bcd8bdc7d6d4fb42ef19966a26dea3a29d" Cluster Name: UNKNOWN Lab Run Info: UNKNOWN Impala Version: impalad version 2.9.0-SNAPSHOT RELEASE () Baseline Impala Version: impalad version 2.9.0-SNAPSHOT RELEASE (2017-04-06) ++---+-++++ | Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) | ++---+-++++ | TARGETED-PERF(_60) | parquet / none / none | 19.30 | -3.05% | 4.91 | -0.91% | ++---+-++++ +++---++-++++-+---+ | Workload | Query | File Format | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters | +++---++-++++-+---+ | TARGETED-PERF(_60) | PERF_LIMIT-Q1 | parquet / none / none | 0.01 | 0.01| R +22.95% | 6.12%| 2.30%| 1 | 5 | | TARGETED-PERF(_60) | primitive_topn_bigint | parquet / none / none | 5.14 | 4.66| +10.35% | 5.00%| * 13.39% * | 1 | 5 | | TARGETED-PERF(_60) | primitive_conjunct_ordering_4 | parquet / none / none | 0.24 | 0.23| +8.12% | * 12.81% * | 1.76%| 1 | 5 | | TARGETED-PERF(_60) | primitive_broadcast_join_3 | parquet / none / none | 7.86 | 7.39| +6.44% | 1.49%| 1.41%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q6 | parquet / none / none | 10.53 | 10.30 | +2.24% | 0.61%| 0.30%| 1 | 5 | | TARGETED-PERF(_60) | primitive_conjunct_ordering_5 | parquet / none / none | 17.23 | 16.90 | +1.90% | 1.61%| 1.05%| 1 | 5 | | TARGETED-PERF(_60) | primitive_conjunct_ordering_3 | parquet / none / none | 3.19 | 3.13| +1.81% | 1.47%| 0.45%| 1 | 5 | | TARGETED-PERF(_60) | primitive_filter_bigint_non_selective | parquet / none / none | 1.08 | 1.06| +1.60% | 0.26%| 2.07%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q3 | parquet / none / none | 3.75 | 3.71| +1.14% | 0.39%| 0.80%| 1 | 5 | | TARGETED-PERF(_60) | primitive_broadcast_join_2 | parquet / none / none | 5.15 | 5.09| +1.11% | 1.18%| 0.89%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q2 | parquet / none / none | 3.47 | 3.44| +1.03% | 1.27%| 0.61%| 1 | 5 | | TARGETED-PERF(_60) | PERF_AGG-Q1| parquet / none / none | 2.53 | 2.51| +1.01% | 1.75%| 1.91%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q7 | parquet / none / none | 8.37 | 8.31| +0.81% | 0.49%| 0.58%| 1
[Impala-ASF-CR] IMPALA-5229: huge page-backed buffers with TCMalloc
Tim Armstrong has uploaded a new change for review. http://gerrit.cloudera.org:8080/6687 Change subject: IMPALA-5229: huge page-backed buffers with TCMalloc .. IMPALA-5229: huge page-backed buffers with TCMalloc This commit enables transparent huge pages when we're allocating via malloc(), not just mmap(). This gives us the perf benefits of huge pages, without the challenge that the mmap() path presented - the overhead of mapping and unmapping memory and the difficulty in reasoning about peak virtual memory consumption. Also sneak in some cleanup - use Rvalue refs for BufferHandle methods where appropriate. Testing: Updated backend tests to ensure this combination is covered. Ran some end-to-end tests and stress tests on my buffer pool dev branch and all looks good. Perf: Compared to current master, this provides a pretty clear perf benefit: I ran benchmarks on a single daemon with a reasonably large TPC-H scale factor. Large aggregations are much faster and everything else is the same (within variance) or slightly faster. Report Generated on 2017-04-18 Run Description: "Base: 68f32e52bc42bef578330a4fe0edc5b292891eea vs Ref: f39d69bcd8bdc7d6d4fb42ef19966a26dea3a29d" Cluster Name: UNKNOWN Lab Run Info: UNKNOWN Impala Version: impalad version 2.9.0-SNAPSHOT RELEASE () Baseline Impala Version: impalad version 2.9.0-SNAPSHOT RELEASE (2017-04-06) ++---+-++++ | Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) | ++---+-++++ | TARGETED-PERF(_60) | parquet / none / none | 19.30 | -3.05% | 4.91 | -0.91% | ++---+-++++ +++---++-++++-+---+ | Workload | Query | File Format | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters | +++---++-++++-+---+ | TARGETED-PERF(_60) | PERF_LIMIT-Q1 | parquet / none / none | 0.01 | 0.01| R +22.95% | 6.12%| 2.30%| 1 | 5 | | TARGETED-PERF(_60) | primitive_topn_bigint | parquet / none / none | 5.14 | 4.66| +10.35% | 5.00%| * 13.39% * | 1 | 5 | | TARGETED-PERF(_60) | primitive_conjunct_ordering_4 | parquet / none / none | 0.24 | 0.23| +8.12% | * 12.81% * | 1.76%| 1 | 5 | | TARGETED-PERF(_60) | primitive_broadcast_join_3 | parquet / none / none | 7.86 | 7.39| +6.44% | 1.49%| 1.41%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q6 | parquet / none / none | 10.53 | 10.30 | +2.24% | 0.61%| 0.30%| 1 | 5 | | TARGETED-PERF(_60) | primitive_conjunct_ordering_5 | parquet / none / none | 17.23 | 16.90 | +1.90% | 1.61%| 1.05%| 1 | 5 | | TARGETED-PERF(_60) | primitive_conjunct_ordering_3 | parquet / none / none | 3.19 | 3.13| +1.81% | 1.47%| 0.45%| 1 | 5 | | TARGETED-PERF(_60) | primitive_filter_bigint_non_selective | parquet / none / none | 1.08 | 1.06| +1.60% | 0.26%| 2.07%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q3 | parquet / none / none | 3.75 | 3.71| +1.14% | 0.39%| 0.80%| 1 | 5 | | TARGETED-PERF(_60) | primitive_broadcast_join_2 | parquet / none / none | 5.15 | 5.09| +1.11% | 1.18%| 0.89%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q2 | parquet / none / none | 3.47 | 3.44| +1.03% | 1.27%| 0.61%| 1 | 5 | | TARGETED-PERF(_60) | PERF_AGG-Q1| parquet / none / none | 2.53 | 2.51| +1.01% | 1.75%| 1.91%| 1 | 5 | | TARGETED-PERF(_60) | PERF_STRING-Q7 | parquet / none / none | 8.37 | 8.31| +0.81%