Yifan Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/12158 )

Change subject: KUDU-2348: Pick a random replica in RemoteTablet.java
......................................................................


Patch Set 9:

> Thinking about this a bit, I wonder what behavior we really want.
 >
 > Today, every client inserts the servers into a hashmap, and then
 > would return the last server in hashmap iteration order. In other
 > words, we end up ranking the servers by something like
 > hashcode%num_hashmap_buckets. Given that num_hashmap_buckets is
 > likely constant across all tablets (these hashtables almost always
 > have 3 elements so we likely have the default of 16 buckets), the
 > ranking function is more or less consistent across all clients and
 > all tablets.
 >
 > The major problem this causes is this: in a cluster without
 > locality (eg fully remote), whichever servers have high hashcode%16
 > are going to get significantly more read load than those with low
 > hashcode%16. I wrote a little simulation here:
 >
 > https://gist.github.com/b3de552784da4afa29a2f1f66673b187
 >
 > Running this script results in a load distribution like:
 >
 > ts_idx       % of load
 > -----------------
 > 0    10.1
 > 2    9.2
 > 27   8.0
 > 12   8.0
 > 10   7.8
 > 13   6.5
 > 11   6.2
 > 22   6.2
 > 24   6.0
 > 20   5.0
 > 23   4.0
 > 15   3.6
 > 18   2.9
 > 3    2.9
 > 21   2.9
 > 16   2.1
 > 4    1.8
 > 25   1.2
 > 6    1.1
 > 5    1.1
 > 9    1.0
 > 1    1.0
 > 7    0.4
 > 19   0.3
 > 26   0.3
 > 29   0.2
 > 28   0.0
 > 8    0.0
 > 14   0.0
 > 17   0.0
 >
 > It seems that this patch will change the behavior so that the
 > server preference is randomized and dependent on the client, which
 > solves the issue, but also means that, for a given tablet, load
 > will be spread evenly across the replicas if there are multiple
 > clients. Depending on the workload, that may be good or bad -- in
 > many cases you would prefer _not_ to spread the load, so that you
 > can make more efficient use of cache memory. The spreading of load
 > is then accomplished by partitioning rather than replication.
 >
 > Anyone have thoughts on how we might express this preference
 > through the API?
 >
 > A separate concern with the particular implementation is that pid
 > may have a lot of correlation across machines, particularly if the
 > client is running inside Docker containers or set to start at boot.
 > AFAIK pids are sequentially assigned, so within Docker containers
 > you would expect all clients to end up with identical pids. If we
 > need a randomized id for a process I think it's better to use
 > Java's random number generation to get one and assign it in a
 > static initializer.
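
For reference, the skew described in the quoted comment can be roughly
reproduced with a sketch like the one below. It is not the script from
the gist; the class name, the use of random UUID strings as server
keys, and the parameters are all assumptions made for illustration.
Each tablet puts its replicas into a default-sized HashMap and the
last entry in iteration order gets the read.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.UUID;

/** Hypothetical simulation; not the gist linked above and not Kudu code. */
final class LastInHashMapSim {
  /**
   * Returns the number of tablets whose reads land on each server when the
   * "last entry in HashMap iteration order" is picked.
   * Requires replication <= numServers.
   */
  static int[] simulate(int numServers, int numTablets, int replication, long rngSeed) {
    Random rng = new Random(rngSeed);
    // Assumption: servers are keyed by a UUID-like string.
    String[] servers = new String[numServers];
    for (int i = 0; i < numServers; i++) {
      servers[i] = new UUID(rng.nextLong(), rng.nextLong()).toString();
    }
    int[] load = new int[numServers];
    for (int t = 0; t < numTablets; t++) {
      // Default-capacity HashMap, filled with distinct random replicas.
      Map<String, Integer> replicas = new HashMap<>();
      while (replicas.size() < replication) {
        int idx = rng.nextInt(numServers);
        replicas.put(servers[idx], idx);
      }
      int chosen = -1;
      for (int idx : replicas.values()) {
        chosen = idx; // ends up as the last entry in iteration order
      }
      load[chosen]++;
    }
    return load;
  }

  public static void main(String[] args) {
    int[] load = simulate(30, 10000, 3, 42L);
    for (int i = 0; i < load.length; i++) {
      System.out.printf("%d\t%.1f%n", i, 100.0 * load[i] / 10000);
    }
  }
}
```

The exact percentages depend on the generated keys, so they will not
match the table in the quoted comment.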

To make efficient use of cache memory, different clients should choose
the same server for a given tablet. But that can also put heavy load
on one server if all clients scan a particular tablet. So we need to
make a trade-off between using cache memory efficiently and spreading
load across all servers in the cluster.
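
To make the two ends of that trade-off concrete, here is a minimal
sketch. ReplicaChoice, sticky() and perClient() are hypothetical names,
not part of the Kudu client API, and the hash mixing is deliberately
simplistic. sticky() makes every client agree on one replica per
tablet; perClient() mixes in a client seed so that different clients
tend to land on different replicas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper; not part of the Kudu client API. */
final class ReplicaChoice {
  /** All clients pick the same replica for a tablet (cache-friendly). */
  static String sticky(String tabletId, List<String> replicaUuids) {
    List<String> sorted = sortedCopy(replicaUuids);
    return sorted.get(Math.floorMod(tabletId.hashCode(), sorted.size()));
  }

  /** The pick also depends on a client seed (spreads load across replicas). */
  static String perClient(long clientSeed, String tabletId, List<String> replicaUuids) {
    List<String> sorted = sortedCopy(replicaUuids);
    long mixed = clientSeed * 31 + tabletId.hashCode();
    return sorted.get((int) Math.floorMod(mixed, (long) sorted.size()));
  }

  // Sort a copy so the result does not depend on the order in which
  // a client happened to learn about the replicas.
  private static List<String> sortedCopy(List<String> uuids) {
    List<String> copy = new ArrayList<>(uuids);
    Collections.sort(copy);
    return copy;
  }

  public static void main(String[] args) {
    List<String> replicas = Arrays.asList("ts-a", "ts-b", "ts-c");
    System.out.println(sticky("tablet-1", replicas));
    System.out.println(perClient(1L, "tablet-1", replicas));
    System.out.println(perClient(2L, "tablet-1", replicas));
  }
}
```

Which of the two a scan uses could then be a client-side option, but
how to expose that through the API is still an open question.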

Your concern about pid is right; it's better to generate a random seed
when a client is created.
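
A minimal sketch of what that could look like, assuming SecureRandom as
the source of randomness (the ClientSeed class and its members are
hypothetical, not what the patch currently does). It shows both the
static-initializer variant suggested in the quoted comment and a
per-client variant.

```java
import java.security.SecureRandom;

/** Hypothetical holder for a client's replica-selection seed. */
final class ClientSeed {
  // Option A: one seed per JVM, drawn once in a static initializer.
  static final long PROCESS_SEED;
  static {
    PROCESS_SEED = new SecureRandom().nextLong();
  }

  // Option B: one seed per client object, drawn at construction time.
  private final long seed = new SecureRandom().nextLong();

  /** Maps this client's seed onto one of numReplicas positions. */
  int replicaIndex(int numReplicas) {
    return (int) Math.floorMod(seed, (long) numReplicas);
  }

  public static void main(String[] args) {
    ClientSeed client = new ClientSeed();
    System.out.println("process seed: " + PROCESS_SEED);
    System.out.println("replica index: " + client.replicaIndex(3));
  }
}
```

Unlike the pid, neither seed should be correlated across machines or
containers that start the client in the same way.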


--
To view, visit http://gerrit.cloudera.org:8080/12158

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3d70e45d4c9532bb32223c1dddd0936b4ff8fd99
Gerrit-Change-Number: 12158
Gerrit-PatchSet: 9
Gerrit-Owner: Yifan Zhang <chinazhangyi...@163.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-Reviewer: Yifan Zhang <chinazhangyi...@163.com>
Gerrit-Comment-Date: Thu, 30 May 2019 08:43:55 +0000
Gerrit-HasComments: No
