Adar Dembo has posted comments on this change.

Change subject: Non-covering Range Partitions design doc
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/2772/1/docs/design-docs/non-covering-range-partitions.md
File docs/design-docs/non-covering-range-partitions.md:

Line 69: but schema designers may find it useful to be able to use
       : both options
> What I'm concerned about is the cognitive overhead of having two ways to ac
If you're concerned about the cognitive overhead of having both range bounds 
and split rows, why not just deprecate split rows entirely? Range bounds are 
net more expressive, right?


Line 87:               RANGE BOUND (("North America"), ("North America\0")),
       :               RANGE BOUND (("Europe"), ("Europe\0")),
       :               RANGE BOUND (("Asia"), ("Asia\0"));
> Having Kudu automatically create partitions is beyond the scope of this des
Agreed that an inclusive upper bound would remove some of the pain (null 
terminators in string "point" ranges), leaving just the verbosity behind.
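
For anyone skimming: a minimal Python sketch (illustration only, not Kudu code) of why the "\0" terminator shows up when exclusive upper bounds are used for single-string "point" ranges:

```python
# With an exclusive upper bound, a range covering exactly the single
# string key "Europe" needs an upper bound of "Europe\0" -- the smallest
# string that sorts strictly after "Europe" in bytewise order.
lo, hi = "Europe", "Europe\0"

def in_range(key: str) -> bool:
    """Exclusive-upper-bound range membership test."""
    return lo <= key < hi

print(in_range("Europe"))    # True: the point key itself is covered
print(in_range("Europe\0"))  # False: the upper bound is exclusive
print(in_range("Europea"))   # False: sorts after "Europe\0"
```

An inclusive upper bound would let the schema say `("Europe"), ("Europe")` directly, with no terminator trick.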


Line 104: If
        : the client limits the scan to a non-existent range partition through either
        : predicates or primary key bounds no results will be returned at all.
> Right, no error in that case. But I'm more thinking about the case where th
I'm coming around to the always-no-error perspective. See Todd's argument 
below: tablets are an implementation detail, and so from the clients' 
perspective, a scan in-range without data is semantically equivalent to a scan 
out-of-range without data.


Line 107: 
> Currently, the meta cache for both clients is implemented as a sorted (tree
Right, you've answered my first question, but not the second. Or do you expect 
the number of tablets per table to remain roughly the same?
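
For context on the first question, a toy sketch (my assumptions, not the actual client code) of a meta cache kept as a structure sorted by partition start key: locating the tablet for a key is a binary search over start keys, so lookup cost grows only logarithmically with tablet count:

```python
# Toy meta cache: (start_key, end_key, tablet_id) entries sorted by
# start_key, with end_key exclusive. A lookup bisects on start keys.
import bisect

cache = [
    ("a", "f", "tablet-1"),
    ("f", "m", "tablet-2"),
    ("p", "z", "tablet-3"),  # note the gap [m, p): a non-covered range
]
starts = [entry[0] for entry in cache]

def lookup(key):
    """Return the cached tablet covering `key`, or None on a miss."""
    i = bisect.bisect_right(starts, key) - 1
    if i >= 0 and cache[i][0] <= key < cache[i][1]:
        return cache[i][2]
    return None  # miss (or key falls in a non-covered range)

print(lookup("g"))  # tablet-2
print(lookup("n"))  # None: inside the [m, p) gap
```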


Line 121: only
        : recontacting the master after a configurable timeout.
> @adar: when the application attempts to write into a range that the meta ca
I see, so the configurable timeout you wrote about is for "negative" lookup 
results. I didn't understand that in my first read through; could you clarify 
it in the doc?

In any case, I think a negative cache makes sense, provided it's reasonably 
smart. For example, could it track upper and lower bounds of negative space? Or 
merely certain points (i.e. individual rows) where no range existed? I think 
the answer largely depends on how much additional information the server 
provides with TabletNotFound: if you try to write row x, the server could say 
"TabletNotFound, the upper bound of the previous tablet is x-10 and the lower 
bound of the next tablet is x+3". Obviously tracking missing space as a set of 
ranges and not points is advantageous: it means an attempt to insert N rows 
with different keys all outside of a range won't result in N lookups.
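
To make the range-tracking idea concrete, here's a hedged sketch (hypothetical, not the proposed implementation) of a negative cache that records ranges of known-missing key space rather than individual points; if TabletNotFound carried the bounds of the surrounding gap, one response could satisfy many subsequent lookups locally:

```python
# Hypothetical negative cache storing disjoint half-open (lo, hi) ranges
# of key space known to have no covering tablet.
import bisect

class NegativeCache:
    def __init__(self):
        self._gaps = []  # disjoint (lo, hi) ranges, sorted by lo

    def add_gap(self, lo, hi):
        """Record a server-reported non-covered range [lo, hi)."""
        bisect.insort(self._gaps, (lo, hi))

    def known_missing(self, key):
        """True if `key` lies inside a cached non-covered range."""
        i = bisect.bisect_right(self._gaps, (key, chr(0x10FFFF))) - 1
        return i >= 0 and self._gaps[i][0] <= key < self._gaps[i][1]

cache = NegativeCache()
# Server replied: "TabletNotFound; previous tablet ends at 'm', next
# tablet starts at 'p'" -- cache the whole gap, not just the probed key.
cache.add_gap("m", "p")

print(cache.known_missing("n"))  # True: no master lookup needed
print(cache.known_missing("q"))  # False: must consult the meta cache
```

With point-only tracking, each of N distinct keys in the gap would have cost a lookup; with the range cached, all N fail locally after the first response.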

I suggested the following pathological case on Slack as motivation for a 
negative cache: one "bad" client repeatedly tries to insert rows into ranges 
that don't exist, and as a result places a lot of load on the master, which could 
affect other "good" clients. This won't be as much of an issue in the future 
when clients could go to any master for read operations, but it could be an 
issue now.


Line 131: Unlike the add range partition case in which a
        : client can not know whether a new range partition has been added since the last
        : master lookup, during a drop range partition the client will be able to
        : recognize a dropped tablet when trying to insert or scan the tablet
> 1. is correct
We discussed this on Slack and now I understand what you mean. To summarize:

1. Without a negative cache, ADD and DROP are symmetric, because all operations 
will end up going to a server, either the tserver (meta cache hit for a 
scan/write), or the master (meta cache miss).
2. With a negative cache, ADD becomes more problematic, because the meta cache 
can now "hit, but fail locally" an operation that would have succeeded had it 
been allowed to go to the server (i.e. an operation on a range that was just 
added). This is asymmetric with respect to DROP because the regular existence 
cache behavior is to allow the operation to proceed; a dropped range would 
yield a server response that could be used to invalidate the existence cache.
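
A toy illustration (hypothetical, not from the doc) of this asymmetry, contrasting negative-cache state on the client with reality on the server:

```python
# Client-side state vs. server-side reality, as half-open key ranges.
server_ranges = {("a", "m")}   # ranges that actually exist on the server
negative_cache = {("m", "p")}  # gap the client has cached as missing

def client_write(key):
    # With a negative cache, a write into a cached gap fails locally,
    # without ever reaching a server.
    for lo, hi in negative_cache:
        if lo <= key < hi:
            return "failed locally (negative cache hit)"
    # Otherwise the write reaches a server, whose answer is
    # authoritative and can be used to correct the client's caches.
    for lo, hi in server_ranges:
        if lo <= key < hi:
            return "ok"
    return "TabletNotFound from server (caches updated)"

# ADD: the server just added range [m, p). The stale negative cache
# makes the client fail an operation the server would have accepted:
server_ranges.add(("m", "p"))
print(client_write("n"))  # failed locally (negative cache hit)

# DROP: a dropped range never fails locally; the server's error response
# tells the client to invalidate its cached entry.
server_ranges.discard(("a", "m"))
print(client_write("b"))  # TabletNotFound from server (caches updated)
```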


-- 
To view, visit http://gerrit.cloudera.org:8080/2772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e530eda60c00faf066c41b6bdb2b37f6d96a5dc
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Binglin Chang <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes
