kocolosk commented on issue #1338: Configuring node "flavors" in a cluster
URL: https://github.com/apache/couchdb/issues/1338#issuecomment-392141033
 
 
   Good point on the capabilities thing in the `GET /` response, @wohali.
   
   The gist of my thinking is to make it possible to have the different 
capabilities inside a CouchDB cluster delivered by processes that have a 
slightly higher degree of isolation between them (say, separate containers). I 
think the best ROI comes from these different groupings:
   
   1. `replicator` - replication jobs can be fairly resource-intensive on the cluster node where they are mediated. Technically one could just spin up a separate cluster for running replications (and that's not a bad idea), but a dedicated `replicator` flavor would provide some isolation of the replication resources without any changes to the API.
   2. `coordinator` / `api` (I like @iilyak's suggestion on the name here) - 
these containers would have just enough ephemeral storage to hold the internal 
databases like `/_nodes` and `/_dbs` that are replicated to every node. They 
would handle client TCP connections, run the `chttpd` logic, and submit 
`fabric_rpc` requests to the `storage` nodes hosting the actual user data. One 
could auto-scale this group up and down, although we'd likely want to make the 
`/_up` endpoint smarter so a load balancer in front of this tier could 
automatically determine when a new `api` container is actually ready to start 
handling requests (i.e., it has replicated the full content of `/_nodes` and 
`/_dbs`).
   3. `storage` - these containers would not do any clustered coordination or HTTP traffic; they would simply receive `fabric_rpc` requests coming in on `rexi_server` and respond to them. (Of course we could leave the fabric and chttpd apps running here for debugging purposes, but the point is that the workload running on this container is simpler.)
   4. `compute` / `functions` - this would be a future item, but if we could iterate on the view server protocol and come up with something that made sense over the wire, I could see running an autoscaled pool of JS processes separated from the `storage` containers. This could give us some additional sandboxing tools and also allow for an easier and more efficient response to variable compute demands.
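
   To make the readiness idea in item 2 concrete, here's a minimal sketch of the decision an `api` container's health check could make, assuming a hypothetical extended `GET /_up` response that reports replication progress for the internal databases. None of the field names below (`internal_dbs`, `local_seq`, `cluster_seq`) exist in CouchDB today; they're purely illustrative:

```python
# Sketch of a readiness decision for an `api` container. Assumes a
# hypothetical extended GET /_up payload reporting how far the internal
# databases (`_nodes`, `_dbs`) have replicated on this node. The field
# names are illustrative, not part of CouchDB's actual API.

def api_node_ready(up_response: dict) -> bool:
    """Return True once this node can safely take client traffic."""
    if up_response.get("status") != "ok":
        return False
    # Hypothetical fields: each internal db reports its local update
    # sequence vs. the cluster-wide one; the node is ready only when it
    # has caught up on both.
    for db in ("_nodes", "_dbs"):
        progress = up_response.get("internal_dbs", {}).get(db, {})
        if progress.get("local_seq", 0) < progress.get("cluster_seq", 1):
            return False
    return True

# A load balancer's health check would poll /_up and gate on this:
ready = api_node_ready({
    "status": "ok",
    "internal_dbs": {
        "_nodes": {"local_seq": 12, "cluster_seq": 12},
        "_dbs": {"local_seq": 480, "cluster_seq": 480},
    },
})
```

   With something like this, the load balancer needs no CouchDB-specific knowledge: it just treats a 200 from `/_up` as "in rotation" and anything else as "draining".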
   
   In this design, flavors 1-3 don't change any of the protocols or interfaces used between the different parts of the stack inside CouchDB. Each of those containers is a full Erlang VM and a full member of the distributed Erlang cluster. In the future it may well make sense to investigate some alternatives there, but no changes would be required initially.
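
   As a purely illustrative sketch of what that deployment shape could look like, the first three flavors might map onto separate containers along these lines. Every name, image, and environment variable here is hypothetical (in particular, `COUCHDB_NODE_ROLE` is not a real knob); since each container is a full member of the distributed Erlang cluster, they would all share one Erlang cookie and differ only in role:

```yaml
# Hypothetical container topology for flavors 1-3 (illustration only).
services:
  api:
    image: couchdb                      # same image everywhere
    environment:
      - COUCHDB_NODE_ROLE=coordinator   # hypothetical role knob
    ports:
      - "5984:5984"                     # only this tier takes client HTTP
  storage:
    image: couchdb
    environment:
      - COUCHDB_NODE_ROLE=storage       # hypothetical role knob
    volumes:
      - ./data:/opt/couchdb/data        # durable user data lives here
  replicator:
    image: couchdb
    environment:
      - COUCHDB_NODE_ROLE=replicator    # hypothetical role knob
```

   The point of the sketch is just that the `api` tier is the only one exposed to clients and the only one worth auto-scaling aggressively, while `storage` is the only tier that needs durable volumes.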
   
   Does that help?
