nickva commented on pull request #651:
URL: 
https://github.com/apache/couchdb-documentation/pull/651#issuecomment-879250910


   Agree with @rnewson. Even if we switch the index storage format to allow 
parallelizable updates, adding a static Q would be a step back, it seems.
   
   One issue is at the user/API level. We'd bring back Q, which we didn't want 
to have to deal with now that we're using FDB. And then in the code, we just 
removed the sharding code in fabric, and I am not too excited about bringing 
parts of it back unless it's a last resort and nothing else works. We could 
invent some auto-sharding, of course, but that would be even more complexity.
   
   It seems we'd also want to separate changes feed improvements from indexing 
improvements a bit better. Could we speed up indexing without a static Q 
sharding of the changes feed, with all the API changes involved, the 
hand-written resharding code (epochs) and hard values?
   
   I think we can, if we invent a new index structure that allows parallelizable 
updates. Like, say, an inverted JSON index for Mango queries based on 
https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20171020_inverted_indexes.md.
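
   To make the "parallelizable updates" point concrete, here is a minimal 
sketch of an inverted JSON index in Python. It is only an illustration, not 
the design from the RFC above or anything that exists in CouchDB: the names 
`index_entries` and `build_inverted_index` are made up, and the key layout 
`(path, value) -> doc ids` is an assumption. The point is that each doc only 
adds its own id under keys derived from its own body, so docs can be indexed 
in any order.

```python
from collections import defaultdict


def index_entries(value, path=()):
    """Yield one (path, scalar) key per leaf of a JSON-like value.

    Hypothetical key layout: array positions are collapsed into "[]".
    """
    if isinstance(value, dict):
        for key, child in value.items():
            yield from index_entries(child, path + (key,))
    elif isinstance(value, list):
        for child in value:
            yield from index_entries(child, path + ("[]",))
    else:
        yield (path, value)


def build_inverted_index(docs):
    """Map each (path, scalar) key to the set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, body in docs.items():
        # Each doc only adds its own id, so the order of docs does not matter.
        for key in index_entries(body):
            index[key].add(doc_id)
    return index


docs = {
    "a": {"type": "user", "tags": ["x", "y"]},
    "b": {"type": "user", "tags": ["y"]},
}
inverted = build_inverted_index(docs)
```

   A query like `{"tags": "y"}` would then be a lookup of the key 
`(("tags", "[]"), "y")`.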
 
   
   The idea I had was to use the locality API to split the _changes feed into 
sub-sequences, and start either separate couch_jobs jobs or just processes 
under a single couch_jobs indexer to fetch docs, process them and write to the 
index in parallel. So, if the _changes sequence looks like `[10, 20, 25, 30]`, 
the locality API might split it as `[10, 20]`, `[25, 30]`. Then two indexers 
would index those in parallel. In the meantime, the doc at sequence 20 could be 
updated and now be at sequence 35. Then we'd catch up from 35 up to the 
next db sequence and so on. The benefit there would be to avoid managing a 
static Q at all. The downside is that it would work only for a 
write-parallelizable index, and only if we "hide" the index being built in the 
background from queries (it would look quite odd otherwise, since it wouldn't 
be built in changes feed order). Then, once it's built, if we can update the 
index transactionally, we'd get consistent reads on it.
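
   A rough Python sketch of that flow, using the example sequences above. 
Everything here is hypothetical: `split_sequences` is only a stand-in for 
whatever boundaries the locality API would return, the "index" is a plain 
dict of doc id to sequence, and threads stand in for couch_jobs workers. It 
is meant to show the shape of the idea, not an implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def split_sequences(seqs, parts):
    """Split sorted sequences into contiguous chunks (stand-in for locality API)."""
    if not seqs:
        return []
    size = -(-len(seqs) // parts)  # ceiling division
    return [seqs[i:i + size] for i in range(0, len(seqs), size)]


def index_range(changes, seq_range):
    """Pretend to fetch and index the docs in one sub-sequence."""
    return {changes[seq]: seq for seq in seq_range}


def build_in_parallel(changes, parts=2):
    """Index all sub-sequences concurrently; return the index and last seq seen."""
    seqs = sorted(changes)
    ranges = split_sequences(seqs, parts)
    index = {}
    with ThreadPoolExecutor(max_workers=parts) as pool:
        for partial in pool.map(index_range, [changes] * len(ranges), ranges):
            index.update(partial)
    return index, (seqs[-1] if seqs else 0)


def catch_up(index, changes, since):
    """Apply changes newer than `since` in order; return the new last seq."""
    for seq in sorted(s for s in changes if s > since):
        index[changes[seq]] = seq
        since = seq
    return since


# The example from above: seq -> doc id (doc ids are made up).
changes = {10: "a", 20: "b", 25: "c", 30: "d"}
ranges = split_sequences(sorted(changes), 2)
index, last_seq = build_in_parallel(changes)

# Doc "b" was updated during the build, so it moved from seq 20 to seq 35.
changes_now = {10: "a", 25: "c", 30: "d", 35: "b"}
last_seq = catch_up(index, changes_now, last_seq)
```

   The index would only become visible to queries after the catch-up step, 
since the parallel phase does not write in changes feed order.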


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@couchdb.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

