I was wondering whether the Cloudant recommendation is based on the Cloudant superstructure or on the underlying CouchDB architecture, and in particular how much continuous replication on each of those weighs in the assessment. Here is our use case:

Each user has their own user database, which is mirrored on the local PouchDB client (in the browser, Electron offline, APK offline). We also have an "online" mode, in which data objects are read directly from a shared database in CouchDB. We do not use per-user databases for access control, but for improved performance over poor networks. Any document read in online mode is saved in a local db cache for working.

In our use case, which is a business process management reporting tool, there are always a number of documents in the packet to be processed. It is important that either all the documents save correctly or none at all. Therefore, when the user does the final submit, all the documents are written to the user's local copy of the user database, not to the shared one. From there the list of documents is packaged into a transaction object (which can be quite large) and replicated to the user's copy of the database on the server. The transaction manager then picks up the new document, processes it, and saves it back into the shared database as part of the transaction process.

Because we use a one-way, packet-driven replication that is triggered by a save event, rather than a continuous replication, we believe the performance impact is limited as long as the transaction manager can process all the incoming documents effectively. And that can be scaled up without too much difficulty.
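
To make the flow above concrete, here is a minimal sketch of the client side. It is illustrative only and not our production code: the function names, the `txn:` id scheme, and the `type`/`status` fields are assumptions made for the example. `localDb` is expected to behave like a PouchDB instance, and `remote` is the URL of the user's database on the server. The all-or-nothing behaviour relies on the whole packet travelling as a single document.

```javascript
// Hypothetical helper: bundle a packet of documents into one transaction
// document so that the packet is stored and replicated as a single unit.
function packageTransaction(userId, docs, now = new Date()) {
  if (!Array.isArray(docs) || docs.length === 0) {
    throw new Error("a transaction needs at least one document");
  }
  return {
    // Assumed id scheme; any unique id would do.
    _id: `txn:${userId}:${now.toISOString()}`,
    type: "transaction",
    status: "pending", // the server-side transaction manager would update this
    docs: docs.map((d) => ({ ...d })), // shallow copies of the packet
  };
}

// Save the transaction locally, then push only that document, once.
async function submitPacket(localDb, remote, userId, docs) {
  const txn = packageTransaction(userId, docs);
  await localDb.put(txn);
  // No `live: true` option, so this is a one-shot replication that
  // finishes as soon as the listed document has been pushed.
  await localDb.replicate.to(remote, { doc_ids: [txn._id] });
  return txn._id;
}
```

It would be called from the submit handler, for example `await submitPacket(localDb, remoteUrl, "willem", packetDocs)`, where `remoteUrl` and `packetDocs` are placeholders.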

I would be interested to hear whether there is any reason we should be concerned.

Willem

On 2020/02/09 18:02, Marcus wrote:
How many databases can be used without causing issues with replication and 
server performance?

I found two very different opinions. The pouchdb blog quotes 100K (based on a 
discussion about Cloudant in 2014). However a Cloudant blog series from March 
2019 recommends a maximum of 500.

Can anyone explain the huge difference? I understand it's going to depend on 
use cases, but a difference of 99,500 databases is significant.

500 are too few when databases are needed for read access control using roles. 
One for each user's personal document locker, one for public data (web), and 
one for a private group. That leaves about 160 users.

Here are two excerpts from that Cloudant blog series of March 2019.

"Rule 4: Fewer databases are better than many

If you can, limit the number of databases per Cloudant account to 500 or fewer. 
While there is nothing magical about this particular number (Cloudant can safely 
handle more), there are several use cases that are adversely affected by large 
numbers of databases in an account."

"Rule 5: Avoid the “database per user” anti-pattern like the plague
If you’re building out a multi-user service on top of Cloudant, it is tempting to 
let each user store their data in a separate database under the application account. 
That works well, mostly, if the number of users is small."

Source: https://www.ibm.com/cloud/blog/cloudant-best-and-worst-practices-part-1

What are your personal experiences with large numbers of databases?

Marcus

