Decided to kick the tires on the new RFC proposal issue type and created one for shard splitting:
https://github.com/apache/couchdb/issues/1920 Let's see how it goes. Being it's the first one let me know if I missed anything obvious. Also I'd like to thank everyone who contributed to the discussion. The API is looking more solid and is much improved from where it started. Cheers, -Nick On Wed, Feb 13, 2019 at 12:03 PM Nick Vatamaniuc <vatam...@gmail.com> wrote: > > > On Wed, Feb 13, 2019 at 11:52 AM Jan Lehnardt <j...@apache.org> wrote: > >> >> >> > On 13. Feb 2019, at 17:12, Nick Vatamaniuc <vatam...@gmail.com> wrote: >> > >> > Hi Jan, >> > >> > Thanks for taking a look! >> > >> > On Wed, Feb 13, 2019 at 6:28 AM Jan Lehnardt <j...@apache.org> wrote: >> > >> >> Nick, this is great, I have a few tiny nits left, apologies I only now >> got >> >> to it. >> >> >> >>> On 12. Feb 2019, at 18:08, Nick Vatamaniuc <vatam...@gmail.com> >> wrote: >> >>> >> >>> Shard Splitting API Proposal >> >>> >> >>> I'd like thank everyone who contributed to the API discussion. As a >> >> result >> >>> we have a much better and consistent API that what we started with. >> >>> >> >>> Before continuing I wanted to summarize to see what we ended up with. >> The >> >>> main changes since the initial proposal were switching to using >> /_reshard >> >>> as the main endpoint and having a detailed state transition history >> for >> >>> jobs. >> >>> >> >>> * GET /_reshard >> >>> >> >>> Top level summary. Besides the new _reshard endpoint, there `reason` >> and >> >>> the stats are more detailed. >> >>> >> >>> Returns >> >>> >> >>> { >> >>> "completed": 3, >> >>> "failed": 4, >> >>> "running": 0, >> >>> "state": "stopped", >> >>> "state_reason": "Manual rebalancing", >> >>> "stopped": 0, >> >>> "total": 7 >> >>> } >> >>> >> >>> * PUT /_reshard/state >> >>> >> >>> Start or stop global rebalacing. >> >>> >> >>> Body >> >>> >> >>> { >> >>> "state": "stopped", >> >>> "reason": "Manual rebalancing" >> >>> } >> >>> >> >>> Returns >> >>> >> >>> { >> >>> "ok": true >> >>> } >> >>> >> >>> * GET /_reshard/state >> >>> >> >>> Return global resharding state and reason. >> >>> >> >>> { >> >>> "reason": "Manual rebalancing", >> >>> "state": “stopped” >> >>> } >> >> >> >> More a note than a change request, but `state` is a very generic term >> that >> >> often confuses folks when they are new to something. If the set of >> possible >> >> states is `started` and `stopped`, how about making this endpoint a >> boolean? >> >> >> >> /_reshard/enabled >> >> >> >> { >> >> "enabled": true|false, >> >> "reason": "Manual rebalancing" >> >> } >> >> >> >> >> > I thought of that as well. However _reshard/state seemed consistent with >> > _reshard/jobs/$jobid/state. Setting "state":"stopped" _reshard/state >> will >> > lead to all individual running job state to become "stopped" as well. >> And >> > "running" will make jobs that are not individually stopped also become >> > "running". In other words since it directly toggle job's state (with a >> job >> > being to override stopped state) I like that it had the same arguments >> >> Got it, makes perfect sense. >> >> > and": true|false >> > >> > >> >> >> >>> * GET /_reshard/jobs >> >>> >> >>> Get the state of all the resharding jobs on the cluster. Now we have a >> >>> detailed >> >>> state transition history which looks similar what _scheduler/jobs >> have. >> >>> >> >>> { >> >>> "jobs": [ >> >>> { >> >>> "history": [ >> >>> { >> >>> "detail": null, >> >>> "timestamp": "2019-02-06T22:28:06Z", >> >>> "type": "new" >> >>> }, >> >>> ... >> >>> { >> >>> "detail": null, >> >>> "timestamp": "2019-02-06T22:28:10Z", >> >>> "type": "completed" >> >>> } >> >>> ], >> >>> "id": >> >>> >> "001-0a308ef9f7bd24bd4887d6e619682a6d3bb3d0fd94625866c5216ec1167b4e23", >> >>> "job_state": "completed", >> >>> "node": "node1@127.0.0.1", >> >>> "source": "shards/00000000-ffffffff/db1.1549492084", >> >>> "split_state": "completed", >> >>> "start_time": "2019-02-06T22:28:06Z", >> >>> "state_info": {}, >> >>> "targets": [ >> >>> "shards/00000000-7fffffff/db1.1549492084", >> >>> "shards/80000000-ffffffff/db1.1549492084" >> >>> ], >> >> >> >> Since we went from /_split to /_reshard to prepare for merging shards, >> we >> >> should reconsider source (singular) and targets (plural). Either a >> merge >> >> job (in the future) uses sources (plural) and target (singular) and >> the job >> >> schema is intentionally different, or we unify things to, maybe >> singular: >> >> source/target which would have the nice property of being analogous to >> our >> >> replication job schema. The type definition then is source:String and >> >> target:Array(2) for split jobs and source:Array(2) target:String for >> >> (future) merge jobs. >> >> >> >> >> > Joan suggested adding a "type" field to both job creation POST body and >> > also returning it when we inspect the job(s) state. So the >> "type":"split" >> > would toggle the schema. It could be "merge" in the future, or even >> > something like "rebalance" where it would merge some and split others >> > perhaps :-) and since we have a type it would be easier to differentiate >> > between the merge and split jobs. But if there is a consensus from >> others >> > about switching targets to target that's easily as well. >> >> Ah, I’m less concerned here about not being able to tell whether it’s a >> split or a merge, and more about that having an indiscriminate plural >> form (sourceS/targetS) depending on the type. It’s just an easy thing to >> get wrong. >> >> In addition, we already have source/target in CouchDB replication, >> which people already use successfully, so making a similar thing that >> behaves slightly differently doesn’t sit quite right with me. >> >> I understand that I’m arguing to remove an ’s’ for very nitpicky >> but these are the kind of nitpick discussions we’ve done a lot in >> the early days which resulted in a by and large decent API that >> has served as well, and it’s something I’d like to see taken forward. >> Apologies if this all sounds very strict ;) >> >> > Thanks for the longer explanation. I understand now and agree, let's make > it target. No worries about sounding nitpicky we should be nitpicky about > APIs! > > >> > >> >> >> >> And just another question, sorry if I missed this elsewhere, would we >> ever >> >> consider adding to split/merge ratio different from 1:2, say 1:4, or >> will >> >> folks have to run 1:2, 1:2, 1:2 to get to the same result? I’m fine >> with >> >> either and if 1:2 fixed makes things simpler, I’m all for it ;) >> >> >> >> >> > Good point. Actually it's already implemented that way already :-) Right >> > below the API surface it has a split=2 parameter and it just creates the >> > targets based on that. It could be 2, 3, 4, ... 10 etc. However I was >> > thinking of keeping it hard coded at 2 at first to keep the behavior >> > simpler at first and open that parameter to be user facing in a later >> > release based on user feedback. >> >> Ace, again, fully on board with shipping 1:2 first and maybe offering >> other >> options later. >> >> Best >> Jan >> — >> >> > >> > Cheers, >> > >> > -Nick >> >>