Dan Burkert has posted comments on this change.

Change subject: Add docs for non-covering range partitioning
......................................................................

Patch Set 1:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/3796/1/docs/kudu_impala_integration.adoc
File docs/kudu_impala_integration.adoc:

Line 799: Starting with Kudu 0.10, Kudu supports the use of non-covering range partitions, which
I don't think this feature (specifically the RANGE BOUND syntax) will be ready in Impala for the 0.10 release. I'm not sure we should document it in the Impala integration guide until it's done.

Line 830: range should be added. APIs `AlterTableRequestPB::ADD_RANGE_PARTITION` and
This keyword is the specific Protobuf identifier that Kudu uses internally; when add/drop range partition is added to Impala, it will hopefully look significantly different. Perhaps say more generally "an alter table [add|drop] range partition operation".

http://gerrit.cloudera.org:8080/#/c/3796/1/docs/schema_design.adoc
File docs/schema_design.adoc:

Line 49: across tablet servers. This is most impacted by the primary key design and the
This is only influenced by the partition schema; the primary key only influences how data is stored and accessed within an individual tablet.

Line 53: - Only important data would be transmitted to clients, by means of encoding, compression,
I think what this is trying to get at is that an efficient schema is designed so that scans read the minimum amount of data necessary to fulfill a query. The biggest tool here is primary key design, but partition design also plays into it via partition pruning.

Line 147: Kudu 0.10 introduces non-covering range partitions, to mitigate some limitations of
This doesn't explain what non-covering range partitions are.

Line 165: ==== Caveats of Non-Covering Range Partitions
These caveats look like they were taken right from the design doc. I think they are too low-level and don't really fit with the rest of this guide. They mention some features which will likely never be implemented, and some of them require expert-level knowledge of partitioning to understand.
I think the most important caveats when using non-covering range partitions are:

* Writes into non-covered ranges will fail with a tablet not found error.
* When adding a range partition to an existing table through an alter table operation, the new range partition may not become visible to other existing clients of the table until an entire 'table_locations_ttl' period, as configured on the master (default 1 hour), has elapsed.

Line 288: [[column-design]]
Are there changes in here, or just a straight move?


--
To view, visit http://gerrit.cloudera.org:8080/3796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3b0fd7500c5399db9dcad617ae67fea247307353
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Misty Stanley-Jones <mi...@apache.org>
Gerrit-Reviewer: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: Dinesh Bhat <din...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: Yes
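
For readers of the docs under review, the first caveat raised in the Line 165 comment (writes into non-covered ranges fail) might be illustrated with a toy model along these lines. This is only a sketch: it is not the Kudu client API, and `RangePartitionedTable`, `TabletNotFound`, `add_range_partition`, and `locate` are hypothetical names invented for the illustration. It assumes integer keys and half-open ranges `[lower, upper)`.

```python
class TabletNotFound(Exception):
    """Toy stand-in for the 'tablet not found' error (hypothetical name)."""


class RangePartitionedTable:
    """Toy model of a table whose range partitions need not cover the key space."""

    def __init__(self):
        # Each partition is a half-open (lower, upper) range of integer keys.
        self._ranges = []

    def add_range_partition(self, lower, upper):
        if lower >= upper:
            raise ValueError("lower bound must be below upper bound")
        for lo, hi in self._ranges:
            # Two half-open ranges overlap when each starts before the other ends.
            if lower < hi and lo < upper:
                raise ValueError("range overlaps an existing partition")
        self._ranges.append((lower, upper))

    def locate(self, key):
        """Return the range covering `key`, or raise if no partition covers it."""
        for lower, upper in self._ranges:
            if lower <= key < upper:
                return (lower, upper)
        raise TabletNotFound(f"no range partition covers key {key}")


# Usage: two partitions with a gap between them (a non-covering layout).
table = RangePartitionedTable()
table.add_range_partition(2014, 2015)
table.add_range_partition(2016, 2017)
table.locate(2014)  # covered by the first partition
# table.locate(2015) would raise TabletNotFound until a partition covering
# 2015 is added with add_range_partition(2015, 2016).
```

The model leaves out the second caveat entirely: it has no notion of clients caching partition locations, so a newly added range is visible immediately here.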