This is an automated email from the ASF dual-hosted git repository.

wohali pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/couchdb-documentation.git

commit f3e1ce497a6e7a92957b7141c6be5291966201e0
Author: Joan Touzet <jo...@atypical.net>
AuthorDate: Mon Dec 17 17:50:08 2018 -0500

    Migrate stats aggregation howto from MoinMoin
---
 src/best-practices/documents.rst | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/src/best-practices/documents.rst b/src/best-practices/documents.rst
index 7375cff..991731c 100644
--- a/src/best-practices/documents.rst
+++ b/src/best-practices/documents.rst
@@ -48,3 +48,24 @@ to ensure unique identifiers for each row in a database table. CouchDB generates
 unique ids on its own and you can specify your own as well, so you don't really
 need a sequence here. If you use a sequence for something else, you will be
 better off finding another way to express it in CouchDB in another way.
+
+Pre-aggregating your data
+-------------------------
+
+If your intent for CouchDB is as a collect-and-report model, not a real-time view,
+you may not need to store a single document for every event you're recording.
+In this case, pre-aggregating your data may be a good idea. You probably don't
+need 1000 documents per second if all you are trying to do is to track
+summary statistics about those documents. This reduces the computational pressure
+on CouchDB's MapReduce engine(s), as well as its storage requirements.
+
+In this case, use an in-memory store to summarize your statistical information,
+then write it out to CouchDB every 10 seconds / 1 minute / whatever level of
+granularity you need. This will greatly reduce the number of documents you'll
+put in your database.
+
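+A minimal sketch of this approach in Python, assuming the ``requests`` library,
+a local CouchDB instance and an existing ``stats`` database (the URL, credentials
+and field names below are illustrative assumptions), might look like this:
+
+.. code-block:: python
+
+    import time
+    import requests
+
+    # Illustrative endpoint; substitute your own server, credentials and database.
+    DB_URL = "http://admin:password@127.0.0.1:5984/stats"
+
+    summary = {"count": 0, "total": 0}
+
+    def record(value):
+        """Update the in-memory summary instead of writing one doc per event."""
+        summary["count"] += 1
+        summary["total"] += value
+
+    def flush():
+        """Write a single rollup document for the interval, then reset."""
+        doc = dict(summary, type="rollup_10s", timestamp=time.time())
+        requests.post(DB_URL, json=doc)  # POST /{db} stores one new document
+        summary["count"] = summary["total"] = 0
+
+    # Call record() from your event handlers, and flush() from a timer
+    # (for example, every 10 seconds) to store one document per interval.
+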
+Later, you can further `decimate
+<https://en.wikipedia.org/wiki/Downsampling_(signal_processing)>`_ your data by
+walking the entire database and generating documents to be stored in a new
+database with a lower level of granularity (say, 1 document a day). You can
+then delete the older, more fine-grained database when you're done with it.
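+
+Continuing the sketch above, a decimation pass might look like the following
+(again, all names are illustrative assumptions; a real pass over a large
+database would page through ``_all_docs`` in batches rather than fetching
+every row at once):
+
+.. code-block:: python
+
+    import time
+    import requests
+
+    SRC = "http://admin:password@127.0.0.1:5984/stats"        # fine-grained rollups
+    DST = "http://admin:password@127.0.0.1:5984/stats_daily"  # 1 document per day
+
+    # Walk the entire fine-grained database.
+    rows = requests.get(SRC + "/_all_docs",
+                        params={"include_docs": "true"}).json()["rows"]
+
+    # Re-aggregate into one bucket per calendar day.
+    daily = {}
+    for row in rows:
+        if row["id"].startswith("_design/"):
+            continue  # skip design documents
+        doc = row["doc"]
+        day = time.strftime("%Y-%m-%d", time.gmtime(doc["timestamp"]))
+        bucket = daily.setdefault(day, {"count": 0, "total": 0})
+        bucket["count"] += doc["count"]
+        bucket["total"] += doc["total"]
+
+    # Store the coarser documents in a new database, one per day.
+    requests.put(DST)
+    for day, totals in daily.items():
+        requests.put(DST + "/" + day, json=dict(totals, type="rollup_1d"))
+
+    # Once the daily documents are verified, the fine-grained database
+    # can be deleted with: requests.delete(SRC)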
