All,

Just to be clear before starting: this is a proposal from Cloudant not just 
myself :D

## What?

Introduce a user-specified shard key per document and a way for the user to 
scope queries to a single shard using this key, thereby reducing query latency 
by allowing the database to consult only the shard required to answer the 
user's query. Documents can share a shard key and all documents with a given 
shard key are written to the same shard. The shard key therefore overrides 
using the document ID as the key to shard by.

## What problem does this proposal address?

To improve performance and throughput of queries generally, and specifically to 
significantly decrease the effects of data size on query latency and cluster 
load leading to better data scaling characteristics of the database.

CouchDB's current queries -- whether to views, search indexes or mango -- 
perform badly when the shard count (Q) for a database is high. This is because 
the coordinator node can't know in advance which shards hold results for the 
query, so must make a scatter-gather request to all shards, even those that in 
the end return no results. In turn this generates more load on all servers in 
the cluster because of many possibly redundant file reads.

We did some hacky performance testing of this idea by modifying the current 
view API to allow one to specify a specific view shard in the query. The 
results were very promising, as they showed that queries started scaling with 
similar characteristics to lookups, as you'd hope for, in that Q ceased having 
such a large effect and load on the cluster decreased.

## API

There are a few general considerations which act as constraints on the API. To 
get these out in the open:

• The combined <shardkey>:<dockey> must be a valid document id. This enforces 
uniqueness of shardkey:docid across the database and invariance between 
document updates.
• Aside from calculation of shard location, we treat the combined 
<shardkey>:<dockey> as we do any other document _id.
• All existing tooling should continue to work as-is (e.g., changes feed 
doesn't have to change to specify a shardkey:dockey pair and the replicator 
will "just work").
• For Cloudant, it's useful to be able to differentiate queries by path for our 
internal accounting.

For this we have a documentid becoming a composite thing: 
`<shardkey>:<documentkey>`

• shardkey can be any valid characters for a document ID, but must not start 
with an _ or contain a : or /.
• documentkey can be any valid characters for a document ID. Further : 
characters are treated as part of the key; meaningless.

From this, our proposed API is, which should also help clarify a bit how things 
work:

• Enable shard keys using database create: `PUT /db?use_shard_key=true`. 
Default false.
        • For databases where this is enabled, every document needs a shard key.
• Upload a document: `PUT /db/<shardkey>:<dockey>`
        • In these databases, documents need user-specified document IDs as the 
shard key needs to be set by the user.
        • Therefore reject POSTing documents to /db for simplicity (rather than 
having validation of an _id field).
• Other things that create documents will need the validation of document ID 
too:
        • Copy a document. Same as now but reject when destination document id 
does not have a shard key.
        • Calls like `POST /db/_bulk_docs`, `POST 
/db/_design/mydoc/_update/myupdatefunction` and others that accept documents 
via POST bodies will need to inspect the body for shard keys.
• Get a document: `GET /db/<shardkey>:<dockey>`.
• Special documents:
        • Design documents and local documents have no shardkey in their docID. 
This is enforced.
• Reject documents with IDs of the form `<shardkey>:_design/foo`.
• Query requests introduce a shardkey into the path, following an explicit 
`_shardkey` path part:
        • Mango: `POST /db/_shardkey/<shardkey>/_find`.
        • Views
                • `GET 
/db/_shardkey/<shardkey>/_design/mydoc/_view/myview?include_docs=true`
                • `POST 
/db/_shardkey/<shardkey>/_design/mydoc/_view/myview?include_docs=true`
        • Query results are restricted to documents with the shard key 
specified. Which makes things harder but leaves the door open for future things 
like shard-splitting without changing result sets. And it seems like what one 
would expect!
• The above is a white list of APIs, so a few notes on some cases I've left out 
with reasons:
        • _all_docs (can see uses, but would rather keep API small to start 
with). Note that Mango depends on the internal _all_docs API.
        • _changes (use-cases better served by explicit shard API).

## What are we NOT addressing here?

This proposal only provides indirect shard addressing -- it's specifically not 
covering things discussed previously [1] per-shard changes feeds where one 
needs to directly address each shard. The shard key only controls the bucketing 
of documents into shards and doesn't say anything about, for example, whether 
two shard keys end up on the same shard -- that's an internal detail.

I'd really appreciate thoughts and comments before Cloudant get started on 
implementation of this.

Mike.

[1]: https://issues.apache.org/jira/browse/COUCHDB-2791

Reply via email to