[kudu-CR] KUDU-2038-p1: Introduce CRoaring
Hello Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12588 to look at the new patch set (#5). Change subject: KUDU-2038-p1: Introduce CRoaring .. KUDU-2038-p1: Introduce CRoaring Introduce CRoaring and do a simple memory and disk consumption tests. Memory consumption: records | discrete | range | serial -+--+-+- 1 | 6253750 | 4689375 |1250 1 | 18 | 10 |8208 records | discrete | range | serial -+---+---+- 10 | 625037500 | 468768750 | 12500 10 | 180 | 1551424 | 16416 records | discrete |range| serial -+-+-+- 100 | 62500375000 | 46875187500 | 125000 100 | 1800| 1800| 131328 records | discrete| range | serial --+---+---+- 1000 | 625000375 | 4687501875000 | 125 1000 | 18000 | 18000 | 1255824 Disk consumption: records | discrete | range | serial -+--+-+- 1 | 6253750 | 4689375 |1250 1 | 9 | 65000 | 16 records | discrete | range | serial -+---+---+- 10 | 625037500 | 468768750 | 12500 10 | 90 | 65 | 26 records | discrete |range| serial -+-+-+- 100 | 62500375000 | 46875187500 | 125000 100 | 900 | 650 | 231 records | discrete| range | serial --+---+---+- 1000 | 625000375 | 4687501875000 | 125 1000 | 9000 | 6500 |2167 Change-Id: I878da34e388fb402c52c51364a68e738f785673d --- M CMakeLists.txt A cmake_modules/FindCRoaring.cmake M src/kudu/util/CMakeLists.txt A src/kudu/util/bitmap-vs-croaring-test.cc M thirdparty/build-definitions.sh M thirdparty/build-thirdparty.sh M thirdparty/download-thirdparty.sh A thirdparty/patches/croaring-function-name-conflict.patch M thirdparty/vars.sh 9 files changed, 361 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/88/12588/5 -- To view, visit http://gerrit.cloudera.org:8080/12588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I878da34e388fb402c52c51364a68e738f785673d Gerrit-Change-Number: 12588 Gerrit-PatchSet: 5 Gerrit-Owner: helifu Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2038-p1: Introduce CRoaring
helifu has abandoned this change. ( http://gerrit.cloudera.org:8080/13064 ) Change subject: KUDU-2038-p1: Introduce CRoaring .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/13064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: Ibf9e30edb9c9871abd8536ad2fa15f802e7c030d Gerrit-Change-Number: 13064 Gerrit-PatchSet: 1 Gerrit-Owner: helifu Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] KUDU-2038-p1: Introduce CRoaring
Hello Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12588 to look at the new patch set (#4). Change subject: KUDU-2038-p1: Introduce CRoaring .. KUDU-2038-p1: Introduce CRoaring Introduce CRoaring and do a simple memory and disk consumption tests. Memory consumption: records | discrete | range | serial -+--+-+- 1 | 6253750 | 4689375 |1250 1 | 18 | 10 |8208 records | discrete | range | serial -+---+---+- 10 | 625037500 | 468768750 | 12500 10 | 180 | 1551424 | 16416 records | discrete |range| serial -+-+-+- 100 | 62500375000 | 46875187500 | 125000 100 | 1800| 1800| 131328 records | discrete| range | serial --+---+---+- 1000 | 625000375 | 4687501875000 | 125 1000 | 18000 | 18000 | 1255824 Disk consumption: records | discrete | range | serial -+--+-+- 1 | 6253750 | 4689375 |1250 1 | 9 | 65000 | 16 records | discrete | range | serial -+---+---+- 10 | 625037500 | 468768750 | 12500 10 | 90 | 65 | 26 records | discrete |range| serial -+-+-+- 100 | 62500375000 | 46875187500 | 125000 100 | 900 | 650 | 231 records | discrete| range | serial --+---+---+- 1000 | 625000375 | 4687501875000 | 125 1000 | 9000 | 6500 |2167 Change-Id: I878da34e388fb402c52c51364a68e738f785673d --- M CMakeLists.txt A cmake_modules/FindCRoaring.cmake M src/kudu/util/CMakeLists.txt A src/kudu/util/bitmap-vs-croaring-test.cc M thirdparty/build-definitions.sh M thirdparty/build-thirdparty.sh M thirdparty/download-thirdparty.sh A thirdparty/patches/croaring-function-name-conflict.patch M thirdparty/vars.sh 9 files changed, 361 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/88/12588/4 -- To view, visit http://gerrit.cloudera.org:8080/12588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I878da34e388fb402c52c51364a68e738f785673d Gerrit-Change-Number: 12588 Gerrit-PatchSet: 4 Gerrit-Owner: helifu Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2038-p1: Introduce CRoaring
helifu has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13064 Change subject: KUDU-2038-p1: Introduce CRoaring .. KUDU-2038-p1: Introduce CRoaring Introduce CRoaring and do a simple memory and disk consumption tests. Memory consumption: records | discrete | range | serial -+--+-+- 1 | 6253750 | 4689375 |1250 1 | 18 | 10 |8208 records | discrete | range | serial -+---+---+- 10 | 625037500 | 468768750 | 12500 10 | 180 | 1551424 | 16416 records | discrete |range| serial -+-+-+- 100 | 62500375000 | 46875187500 | 125000 100 | 1800| 1800| 131328 records | discrete| range | serial --+---+---+- 1000 | 625000375 | 4687501875000 | 125 1000 | 18000 | 18000 | 1255824 Disk consumption: records | discrete | range | serial -+--+-+- 1 | 6253750 | 4689375 |1250 1 | 9 | 65000 | 16 records | discrete | range | serial -+---+---+- 10 | 625037500 | 468768750 | 12500 10 | 90 | 65 | 26 records | discrete |range| serial -+-+-+- 100 | 62500375000 | 46875187500 | 125000 100 | 900 | 650 | 231 records | discrete| range | serial --+---+---+- 1000 | 625000375 | 4687501875000 | 125 1000 | 9000 | 6500 |2167 Change-Id: I878da34e388fb402c52c51364a68e738f785673d update Change-Id: Ibf9e30edb9c9871abd8536ad2fa15f802e7c030d --- M CMakeLists.txt A cmake_modules/FindCRoaring.cmake M src/kudu/util/CMakeLists.txt A src/kudu/util/bitmap-vs-croaring-test.cc M thirdparty/build-definitions.sh M thirdparty/build-thirdparty.sh M thirdparty/download-thirdparty.sh A thirdparty/patches/croaring-function-name-conflict.patch M thirdparty/vars.sh 9 files changed, 361 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/64/13064/1 -- To view, visit http://gerrit.cloudera.org:8080/13064 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ibf9e30edb9c9871abd8536ad2fa15f802e7c030d Gerrit-Change-Number: 13064 Gerrit-PatchSet: 1 Gerrit-Owner: helifu
[kudu-CR] KUDU-2038-p1: Introduce CRoaring
Hello Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12588 to look at the new patch set (#3). Change subject: KUDU-2038-p1: Introduce CRoaring .. KUDU-2038-p1: Introduce CRoaring Introduce CRoaring and do a simple memory and disk consumption tests. Memory consumption: records | discrete | range | serial -+--+-+- 1 | 6253750 | 4689375 |1250 1 | 18 | 10 |8208 records | discrete | range | serial -+---+---+- 10 | 625037500 | 468768750 | 12500 10 | 180 | 1551424 | 16416 records | discrete |range| serial -+-+-+- 100 | 62500375000 | 46875187500 | 125000 100 | 1800| 1800| 131328 records | discrete| range | serial --+---+---+- 1000 | 625000375 | 4687501875000 | 125 1000 | 18000 | 18000 | 1255824 Disk consumption: records | discrete | range | serial -+--+-+- 1 | 6253750 | 4689375 |1250 1 | 9 | 65000 | 16 records | discrete | range | serial -+---+---+- 10 | 625037500 | 468768750 | 12500 10 | 90 | 65 | 26 records | discrete |range| serial -+-+-+- 100 | 62500375000 | 46875187500 | 125000 100 | 900 | 650 | 231 records | discrete| range | serial --+---+---+- 1000 | 625000375 | 4687501875000 | 125 1000 | 9000 | 6500 |2167 Change-Id: I878da34e388fb402c52c51364a68e738f785673d --- M CMakeLists.txt A cmake_modules/FindCRoaring.cmake M src/kudu/util/CMakeLists.txt A src/kudu/util/bitmap-vs-croaring-test.cc M thirdparty/build-definitions.sh M thirdparty/build-thirdparty.sh M thirdparty/download-thirdparty.sh A thirdparty/patches/croaring-function-name-conflict.patch M thirdparty/vars.sh 9 files changed, 362 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/88/12588/3 -- To view, visit http://gerrit.cloudera.org:8080/12588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I878da34e388fb402c52c51364a68e738f785673d Gerrit-Change-Number: 12588 Gerrit-PatchSet: 3 Gerrit-Owner: helifu Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu
[kudu-CR] KUDU-2038-p1: Introduce CRoaring
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/12588 ) Change subject: KUDU-2038-p1: Introduce CRoaring .. Patch Set 2: (6 comments) Sorry for the late reply. http://gerrit.cloudera.org:8080/#/c/12588/2/cmake_modules/FindCRoaring.cmake File cmake_modules/FindCRoaring.cmake: http://gerrit.cloudera.org:8080/#/c/12588/2/cmake_modules/FindCRoaring.cmake@18 PS2, Line 18: # - Find CRoaring (roaring/roaring.hh, libroaring.a, and libroaring.so) > Are all of these necessary? E.g. could we get by with just the static lib? yep, the static lib is enough. http://gerrit.cloudera.org:8080/#/c/12588/2/src/kudu/tablet/index/croaring-test.cc File src/kudu/tablet/index/croaring-test.cc: PS2: > nit: I think this makes more sense put into the /util directory, given ther Done http://gerrit.cloudera.org:8080/#/c/12588/2/src/kudu/tablet/index/croaring-test.cc@37 PS2, Line 37: TEST_TYPE_SINGLE, : TEST_TYPE_PAIR, : TEST_TYPE_SKIP > nit: add comments documenting what these are. Done http://gerrit.cloudera.org:8080/#/c/12588/2/src/kudu/tablet/index/croaring-test.cc@66 PS2, Line 66: t.n_bytes_array_containers + :t.n_bytes_run_containers + :t.n_bytes_bitset_containers; > How is this different from getSizeInBytes? "t.n_bytes_xxx_containers" means number of allocated bytes, and "getSizeInBytes" means how many bytes are required to serialize the object(roaring). They are different. http://gerrit.cloudera.org:8080/#/c/12588/2/src/kudu/tablet/index/croaring-test.cc@119 PS2, Line 119: printf(" Single PairSkip\n"); > nit: I think there's a DataTable class you can use that helps print things Done http://gerrit.cloudera.org:8080/#/c/12588/2/src/kudu/tablet/index/croaring-test.cc@120 PS2, Line 120: *6*/ > Remove this. Done -- To view, visit http://gerrit.cloudera.org:8080/12588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878da34e388fb402c52c51364a68e738f785673d Gerrit-Change-Number: 12588 Gerrit-PatchSet: 2 Gerrit-Owner: helifu Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Wed, 17 Apr 2019 13:01:50 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2038-p1: Introduce CRoaring
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12588 ) Change subject: KUDU-2038-p1: Introduce CRoaring .. Patch Set 2: (9 comments) http://gerrit.cloudera.org:8080/#/c/12588/2//COMMIT_MSG Commit Message: PS2: Given that this patch contains some tests to benchmark CRoaring, I think it's worth adding some numbers to the commit message. http://gerrit.cloudera.org:8080/#/c/12588/2//COMMIT_MSG@7 PS2, Line 7: KUDU-2038-p1: Introduce CRoaring In your original patch, you had two abstractions: "data" (i.e. bitmaps and indexing), and "readers/writers" (i.e. iterators and writers) that used the bitmap data. What do you think about doing the separation as: - bitmaps: a class that provides a clean API to the indexes/readers/writers - indexes, iterators, and writers: the logical components that use bitmaps to evaluate predicates, write blocks, etc. I think this would be cleaner because it: 1) hides some of the CRoaring-specific bits and encapsulates them in one place, 2) would therefore allow us to more easily move to a different bitmapping library in the future if we wanted to, and 3) would allow us reuse these "Kudu" bitmaps in other places if we wanted to without digging through CRoaring APIs. Defining a good API, though, requires that we have a solid understanding of how we will use it, that is, the "indexes, iterators, and writers". Since you have that context, do you agree with this separation? Or do you prefer the "data" and "reader/writers" approach that you had? FWIW I also think it would make it easier to check in this change; adding CRoaring alone with nothing else seems a bit bare. Ideally, each patch has a clear trajectory towards usefulness. http://gerrit.cloudera.org:8080/#/c/12588/2//COMMIT_MSG@9 PS2, Line 9: Introduce CRoaring and do a simple memory and disk consumption tests. nit: It may be obvious with the context we have, but it's probably worth noting how this fits into a larger feature. E.g.: This library will be used as the underlying bitmap for the upcoming bitmap indexing feature. http://gerrit.cloudera.org:8080/#/c/12588/2/cmake_modules/FindCRoaring.cmake File cmake_modules/FindCRoaring.cmake: http://gerrit.cloudera.org:8080/#/c/12588/2/cmake_modules/FindCRoaring.cmake@18 PS2, Line 18: # - Find CRoaring (roaring/roaring.hh, libroaring.a, and libroaring.so) Are all of these necessary? E.g. could we get by with just the static lib? http://gerrit.cloudera.org:8080/#/c/12588/2/src/kudu/tablet/index/croaring-test.cc File src/kudu/tablet/index/croaring-test.cc: PS2: nit: I think this makes more sense put into the /util directory, given there isn't anything tablet-specific about CRoaring, unless you pull more scope into this patch. http://gerrit.cloudera.org:8080/#/c/12588/2/src/kudu/tablet/index/croaring-test.cc@37 PS2, Line 37: TEST_TYPE_SINGLE, : TEST_TYPE_PAIR, : TEST_TYPE_SKIP nit: add comments documenting what these are. http://gerrit.cloudera.org:8080/#/c/12588/2/src/kudu/tablet/index/croaring-test.cc@66 PS2, Line 66: t.n_bytes_array_containers + :t.n_bytes_run_containers + :t.n_bytes_bitset_containers; How is this different from getSizeInBytes? http://gerrit.cloudera.org:8080/#/c/12588/2/src/kudu/tablet/index/croaring-test.cc@119 PS2, Line 119: printf(" Single PairSkip\n"); nit: I think there's a DataTable class you can use that helps print things out nicely. Same above. http://gerrit.cloudera.org:8080/#/c/12588/2/src/kudu/tablet/index/croaring-test.cc@120 PS2, Line 120: *6*/ Remove this. -- To view, visit http://gerrit.cloudera.org:8080/12588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878da34e388fb402c52c51364a68e738f785673d Gerrit-Change-Number: 12588 Gerrit-PatchSet: 2 Gerrit-Owner: helifu Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Tue, 05 Mar 2019 00:27:27 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2038-p1: Introduce CRoaring
Andrew Wong has removed a vote on this change. Change subject: KUDU-2038-p1: Introduce CRoaring .. Removed -Verified by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/12588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I878da34e388fb402c52c51364a68e738f785673d Gerrit-Change-Number: 12588 Gerrit-PatchSet: 2 Gerrit-Owner: helifu Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] KUDU-2038-p1: Introduce CRoaring
Andrew Wong has removed a vote on this change. Change subject: KUDU-2038-p1: Introduce CRoaring .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/12588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I878da34e388fb402c52c51364a68e738f785673d Gerrit-Change-Number: 12588 Gerrit-PatchSet: 2 Gerrit-Owner: helifu Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] KUDU-2038-p1: Introduce CRoaring
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12588 ) Change subject: KUDU-2038-p1: Introduce CRoaring .. Patch Set 2: Verified+1 Failure was an unrelated python issue. -- To view, visit http://gerrit.cloudera.org:8080/12588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I878da34e388fb402c52c51364a68e738f785673d Gerrit-Change-Number: 12588 Gerrit-PatchSet: 2 Gerrit-Owner: helifu Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Wed, 27 Feb 2019 04:58:09 + Gerrit-HasComments: No
[kudu-CR] KUDU-2038-p1: Introduce CRoaring
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12588 to look at the new patch set (#2). Change subject: KUDU-2038-p1: Introduce CRoaring .. KUDU-2038-p1: Introduce CRoaring Introduce CRoaring and do a simple memory and disk consumption tests. Change-Id: I878da34e388fb402c52c51364a68e738f785673d --- M CMakeLists.txt A cmake_modules/FindCRoaring.cmake M src/kudu/tablet/CMakeLists.txt A src/kudu/tablet/index/croaring-test.cc M thirdparty/build-definitions.sh M thirdparty/build-thirdparty.sh M thirdparty/download-thirdparty.sh A thirdparty/patches/croaring-function-name-conflict.patch M thirdparty/vars.sh 9 files changed, 258 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/88/12588/2 -- To view, visit http://gerrit.cloudera.org:8080/12588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I878da34e388fb402c52c51364a68e738f785673d Gerrit-Change-Number: 12588 Gerrit-PatchSet: 2 Gerrit-Owner: helifu Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] KUDU-2038-p1: Introduce CRoaring
helifu has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12588 Change subject: KUDU-2038-p1: Introduce CRoaring .. KUDU-2038-p1: Introduce CRoaring Introduce CRoaring and do a simple memory and disk consumption tests. Change-Id: I878da34e388fb402c52c51364a68e738f785673d --- M CMakeLists.txt A cmake_modules/FindCRoaring.cmake M src/kudu/tablet/CMakeLists.txt A src/kudu/tablet/index/croaring-test.cc M thirdparty/build-definitions.sh M thirdparty/build-thirdparty.sh M thirdparty/download-thirdparty.sh A thirdparty/patches/croaring-function-name-conflict.patch M thirdparty/vars.sh 9 files changed, 258 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/88/12588/1 -- To view, visit http://gerrit.cloudera.org:8080/12588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I878da34e388fb402c52c51364a68e738f785673d Gerrit-Change-Number: 12588 Gerrit-PatchSet: 1 Gerrit-Owner: helifu