The updates are fairly frequent (a few per minute) and have a tight freshness requirement. We really don’t want to show tutors who are not available. Luckily, it is a smallish collection, a few hundred thousand documents.
The traffic isn’t a problem and the cluster is working very well. This is about understanding where the internal traffic comes from.
A couple of notes:
TLOG replicas will have the same issue. When I said that leaders
forwarded to followers, what that's really about is that the follower
guarantees that the docs have been written to the TLOG. So if you
change your model to use TLOG replicas, don't expect a change.
PULL replicas are different: they just copy finished index segments from the leader (old-style replication), so they don’t see the per-document forwarding at all.
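For anyone who wants to experiment with the replica mix: replica types are chosen when the collection is created (the Collections API CREATE call takes nrtReplicas, tlogReplicas, and pullReplicas parameters). A minimal SolrJ sketch, assuming SolrJ 8.x and placeholder names for the node URL, collection, and configset:

    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;

    public class CreateTlogCollection {
      public static void main(String[] args) throws Exception {
        // Any node in the cluster can take admin requests; the URL is a placeholder.
        try (HttpSolrClient client =
                 new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
          // 1 shard, 0 NRT replicas, 3 TLOG replicas, 0 PULL replicas.
          CollectionAdminRequest.Create create =
              CollectionAdminRequest.createCollection(
                  "tutors", "tutors_config", 1, 0, 3, 0);
          create.process(client);
        }
      }
    }

An existing collection can also be grown with ADDREPLICA and its type parameter rather than being recreated.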
Thanks, that is exactly what I was curious about.
All our updates are single documents. We need to track the availability of online tutors, so we don’t batch them.
Right now, we have a replication factor of 36 (way too many), so each update turns into 3 x 35 internal communications.
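For concreteness, a single-document availability update in SolrJ might look like the sketch below; the field names, document ID, and commitWithin value are placeholders, not the actual schema:

    import java.util.Map;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class UpdateTutorAvailability {
      public static void main(String[] args) throws Exception {
        try (HttpSolrClient client =
                 new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
          // One document per update request; an atomic "set" changes only the
          // availability flag and leaves the rest of the document alone.
          SolrInputDocument doc = new SolrInputDocument();
          doc.addField("id", "tutor-12345");
          doc.addField("available_b", Map.of("set", true));
          // commitWithin (in ms) keeps the freshness window tight without an
          // explicit commit on every request.
          client.add("tutors", doc, 1000);
        }
      }
    }

Each request like this is what the leader then fans out to the replicas.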
Walter:
Each update is roughly:
- request goes to leader (may be forwarded)
- leader sends the update to _each_ replica; depending on how many docs
  you're sending per update request this may be more than one request.
IIRC there was some JIRA a while ago where the forwarding wasn't all
that efficient.
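(With a replication factor of 36, that is one leader plus 35 followers, so a single external update request becomes roughly 35 leader-to-replica requests, on top of the possible initial hop to the leader.)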
I’m comparing request counts from New Relic, which is reporting 16 krpm aggregate requests across the cluster, and the AWS load balancer is reporting 1 krpm. Or it might be 1k requests per 5 minutes, because CloudWatch is like that.
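(1k requests per 5 minutes would be about 200 external requests per minute, i.e. 0.2 krpm.)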
This is a 36-node cluster, not sharded. We are going to shrink it.
There is a single persistent HTTP connection open from the leader to each replica in the shard. All updates coming to the leader are expanded (for atomic updates) and streamed over that single connection. When using in-place docvalues updates, there is a possibility of the replica making a request back to the leader to fetch the full document if it has not yet seen the update the in-place update depends on.
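For reference, in-place updates only apply to single-valued numeric fields with docValues=true, indexed=false, stored=false, and only for atomic "set" and "inc" operations; anything else falls back to a regular atomic update. A minimal SolrJ sketch, with placeholder field and collection names and assuming such a field exists in the schema:

    import java.util.Map;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class InPlaceUpdateSketch {
      public static void main(String[] args) throws Exception {
        try (HttpSolrClient client =
                 new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
          // "popularity_dv" stands in for a single-valued numeric field defined
          // with docValues=true, indexed=false, stored=false; an "inc" on such
          // a field is eligible for the in-place update path.
          SolrInputDocument doc = new SolrInputDocument();
          doc.addField("id", "tutor-12345");
          doc.addField("popularity_dv", Map.of("inc", 1));
          client.add("tutors", doc, 1000);
        }
      }
    }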
How many messages are sent back and forth between a leader and replica with NRT?
We have a collection that gets frequent updates and we are seeing a ton of internal cluster traffic.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)