wohali commented on a change in pull request #385: Documentation for partitioned dbs URL: https://github.com/apache/couchdb-documentation/pull/385#discussion_r253232537
########## File path: src/partitioned-dbs/index.rst ########## @@ -0,0 +1,384 @@
.. Licensed under the Apache License, Version 2.0 (the "License"); you may not
.. use this file except in compliance with the License. You may obtain a copy of
.. the License at
..
..     http://www.apache.org/licenses/LICENSE-2.0
..
.. Unless required by applicable law or agreed to in writing, software
.. distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
.. WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
.. License for the specific language governing permissions and limitations under
.. the License.

.. _partitioned-dbs:

=====================
Partitioned Databases
=====================

To introduce partitioned databases, we'll consider a motivating use case
that illustrates the benefits of this feature. For this example, we'll
consider a database that stores readings from a large network of soil
moisture sensors.

.. note::
    Before reading this document you should be familiar with the
    :ref:`theory <cluster/theory>` of :ref:`sharding <cluster/sharding>`
    in CouchDB.

Traditionally, a document in this database might have something like the
following structure:

.. code-block:: javascript

    {
        "_id": "sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf",
        "_rev": "1-14e8f3262b42498dbd5c672c9d461ff0",
        "sensor_id": "sensor-260",
        "location": [41.6171031, -93.7705674],
        "field_name": "Bob's Corn Field #5",
        "readings": [
            ["2019-01-21T00:00:00", 0.15],
            ["2019-01-21T06:00:00", 0.14],
            ["2019-01-21T12:00:00", 0.16],
            ["2019-01-21T18:00:00", 0.11]
        ]
    }

.. note::
    While this example uses IoT sensors, the main thing to consider is that
    there is a logical grouping of documents. Similar use cases might be
    documents grouped by user or scientific data grouped by experiment.
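For concreteness, a client might assemble such a reading document like this. This is a hypothetical helper of our own, not a CouchDB API; in practice you might also let CouchDB assign the document id for you.

```javascript
// Hypothetical helper (our own sketch, not part of CouchDB) that assembles
// a day's readings into the document shape shown above. The caller supplies
// a UUID for the id suffix.
function makeReadingDoc(uuid, sensorId, location, fieldName, readings) {
    return {
        _id: "sensor-reading-" + uuid,
        sensor_id: sensorId,
        location: location,          // [latitude, longitude]
        field_name: fieldName,
        readings: readings           // array of [ISO-8601 timestamp, moisture] pairs
    };
}

var doc = makeReadingDoc(
    "ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf",
    "sensor-260",
    [41.6171031, -93.7705674],
    "Bob's Corn Field #5",
    [
        ["2019-01-21T00:00:00", 0.15],
        ["2019-01-21T06:00:00", 0.14]
    ]
);
console.log(doc._id); // prints "sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf"
```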
So we've got a bunch of sensors, all grouped by the field they monitor,
along with their readings for a given day (or other appropriate time
period).

Along with our documents, we might expect to have two secondary indexes
for querying our database, which might look something like:

.. code-block:: javascript

    function(doc) {
        if (doc._id.indexOf("sensor-reading-") != 0) {
            return;
        }
        doc.readings.forEach(function(r) {
            emit([doc.sensor_id, r[0]], r[1]);
        });
    }

and:

.. code-block:: javascript

    function(doc) {
        if (doc._id.indexOf("sensor-reading-") != 0) {
            return;
        }
        emit(doc.field_name, doc.sensor_id);
    }

With these two indexes defined, we can easily find all readings for a given
sensor, or list all sensors in a given field.

Unfortunately, in CouchDB, reading from either of these indexes requires
consulting a copy of every shard and asking for any documents related to
the particular sensor or field. This means that as our database scales up
the number of shards, every index request must perform more work.
Fortunately for you, dear reader, partitioned databases were created to
solve this precise problem.


What is a partition?
====================

In the previous section, we introduced a hypothetical database that contains
sensor readings from an IoT field-monitoring service. In this particular
use case, it's quite logical to group all documents by their ``sensor_id``
field. Here, we would call the ``sensor_id`` the partition.

A good partition has two basic properties. First, it should have high
cardinality; that is, there should be a large number of distinct values for
the partition. A database that has a single partition would be an
anti-pattern for this feature. Second, the amount of data per partition
should be "small". The general recommendation is to limit individual
partitions to less than ten gigabytes of data, which, for our example
sensor documents, equates to roughly 60,000 years of data.
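To see why grouping by partition helps, here is a toy sketch of the idea that the shard is chosen by hashing only the partition portion of the document id. The hash function, shard count, and second UUID below are our own illustrative assumptions; CouchDB uses its own internal hashing scheme, not this one.

```javascript
// Toy illustration (NOT CouchDB's actual hashing): if the shard is chosen
// by hashing only the partition name, every document in a partition lands
// on the same shard, while a high-cardinality partition key still spreads
// partitions across all shards.
var Q = 8; // number of shard ranges, as in a default cluster

// Hypothetical hash function for illustration only.
function toyHash(s) {
    var h = 0;
    for (var i = 0; i < s.length; i++) {
        h = (h * 31 + s.charCodeAt(i)) >>> 0; // keep to unsigned 32 bits
    }
    return h;
}

function shardFor(docId) {
    var partition = docId.split(":")[0]; // only the partition is hashed
    return toyHash(partition) % Q;
}

// Two readings from the same sensor always map to the same shard
// (the second UUID here is made up for the example):
var a = shardFor("sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf");
var b = shardFor("sensor-260:sensor-reading-a9b7d431-1c58-4236-9177-b51041c9dffe");
console.log(a === b); // prints "true"
```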
Why use partitions?
===================

The primary benefit of partitioned databases is the performance of
partitioned queries. Large databases with many documents often follow a
pattern where groups of related documents are frequently queried together.

By using partitions, we can execute queries against these individual groups
of documents more efficiently by placing the entire group within a specific
shard on disk. Thus, the view engine only has to consult one copy of the
given shard range when executing a query, instead of executing the query
across all ``Q`` shards in the database.


Partitions By Example
=====================

To create a partitioned database, we simply need to pass the
``partitioned=true`` query string parameter:

.. code-block:: bash

    shell> curl -X PUT http://127.0.0.1:5984/my_new_db?partitioned=true
    {"ok":true}

To see that our database is partitioned, we can look at the database
information:

.. code-block:: bash

    shell> curl http://127.0.0.1:5984/my_new_db
    {
        "cluster": {
            "n": 3,
            "q": 8,
            "r": 2,
            "w": 2
        },
        "compact_running": false,
        "data_size": 0,
        "db_name": "my_new_db",
        "disk_format_version": 7,
        "disk_size": 66784,
        "doc_count": 0,
        "doc_del_count": 0,
        "instance_start_time": "0",
        "other": {
            "data_size": 0
        },
        "props": {
            "partitioned": true
        },
        "purge_seq": "0-g1AAAAFDeJzLYWBg4M...",
        "sizes": {
            "active": 0,
            "external": 0,
            "file": 66784
        },
        "update_seq": "0-g1AAAAFDeJzLYWBg4M..."
    }

You'll now see that the ``"props"`` member contains ``"partitioned": true``.

.. note::

    The format for document ids in a partitioned database is
    ``partition:docid``. Every regular document (i.e., everything
    except design and local documents) in a partitioned database
    must follow this format.

.. note::

    System databases are *not* allowed to be partitioned.
This is because system databases already impose their own incompatible
    requirements on document ids.

Now that we've created a partitioned database, it's time to add some
documents. Using our earlier example, we could do this as follows:

.. code-block:: bash

    shell> cat doc.json
    {
        "_id": "sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf",
        "sensor_id": "sensor-260",
        "location": [41.6171031, -93.7705674],
        "field_name": "Bob's Corn Field #5",
        "readings": [
            ["2019-01-21T00:00:00", 0.15],
            ["2019-01-21T06:00:00", 0.14],
            ["2019-01-21T12:00:00", 0.16],
            ["2019-01-21T18:00:00", 0.11]
        ]
    }
    shell> curl -X POST -H "Content-Type: application/json" \
                http://127.0.0.1:5984/my_new_db -d @doc.json
    {
        "ok": true,
        "id": "sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf",
        "rev": "1-05ed6f7abf84250e213fcb847387f6f5"
    }

The only change required to the first example document is that we now
include the partition name in the document id, prepended to the old id and
separated by a colon.

.. note::

    The partition name in the document id is not magical. Internally,
    the database simply uses only the partition, instead of the entire
    document id, when hashing the document to a given shard.

Working with documents in a partitioned database is no different from
working with a non-partitioned database. All APIs are available, and
existing client code will work seamlessly.

Now that we have created a document, we can get some info about the
partition containing the document:

.. code-block:: bash

    shell> curl http://127.0.0.1:5984/my_new_db/_partition/sensor-260
    {
        "db_name": "my_new_db",
        "doc_count": 1,
        "doc_del_count": 0,
        "partition": "sensor-260",
        "sizes": {
            "active": 244,
            "external": 347
        }
    }

And we can also list all documents in a partition:

.. code-block:: bash

    shell> curl http://127.0.0.1:5984/my_new_db/_partition/sensor-260/_all_docs
    {"total_rows": 1, "offset": 0, "rows":[
        {
            "id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf",
            "key":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf",
            "value": {"rev": "1-05ed6f7abf84250e213fcb847387f6f5"}
        }
    ]}

Note that we can use all of the normal bells and whistles available to
``_all_docs`` requests. Accessing ``_all_docs`` through the
``/dbname/_partition/name/_all_docs`` endpoint is mostly a convenience
so that requests are guaranteed to be scoped to a given partition. Users
are free to use the normal ``/dbname/_all_docs`` to read documents from
multiple partitions.

Next, we'll create a design document containing our index for getting all
readings from a given sensor. The map function is similar to our earlier
example, except we've accounted for the change in the document id:

.. code-block:: javascript

    function(doc) {
        if (doc._id.indexOf(":sensor-reading-") < 0) {
            return;
        }
        doc.readings.forEach(function(r) {
            emit([doc.sensor_id, r[0]], r[1]);
        });
    }

We can upload our design document and try out a partitioned query:

.. code-block:: bash

    shell> cat ddoc.json
    {
        "_id": "_design/sensor-readings",
        "views": {
            "by_sensor": {
                "map": "function(doc) { ... }"
            }
        }
    }
    shell> curl -X PUT -H "Content-Type: application/json" \
                http://127.0.0.1:5984/my_new_db/_design/sensor-readings -d @ddoc.json
    {"ok":true,"id":"_design/sensor-readings","rev":"1-..."}
    shell> curl http://127.0.0.1:5984/my_new_db/_partition/sensor-260/_design/sensor-readings/_view/by_sensor
    {"total_rows":4,"offset":0,"rows":[
        {"id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf","key":["sensor-260","2019-01-21T00:00:00"],"value":0.15},
        {"id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf","key":["sensor-260","2019-01-21T06:00:00"],"value":0.14},
        {"id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf","key":["sensor-260","2019-01-21T12:00:00"],"value":0.16},
        {"id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf","key":["sensor-260","2019-01-21T18:00:00"],"value":0.11}
    ]}

Hooray!
Our first partitioned query. For experienced users, that may not be the
most exciting development, given that the only things that have changed
are a slight tweak to the document id and a slightly different path for
accessing views. However, for anyone who likes performance improvements,
it's actually a big deal. By knowing that the view results

Review comment: "it's" because "it is" actually a big deal.

----------------------------------------------------------------

This is an automated message from the Apache Git Service. To respond to
the message, please log on to GitHub and use the URL above to go to the
specific comment. For queries about this service, please contact
Infrastructure at: us...@infra.apache.org

With regards, Apache Git Services