After all, we've decided not to rely on filtered replication for our use case.
The issue is that we will support not only an offline-first mode, where a filtered copy of the data is retrieved, but also an online-only mode (e.g., when accessing the app from an untrusted device, where users might prefer not to store anything locally). In online-only mode, users would need to access the database directly, but that access would also need to be filtered, and I'm not sure there's a safe way to do that.

What we've chosen to do for now is to keep the information colocated in _users and to go through an API to retrieve the subset of information that is required (e.g., n properties of all members of database X). This works fine in the online-only scenario, and also in the offline-first one, since we can persist the information after having retrieved it once. We also keep better control over what happens with the data (to some extent) and can wipe it if/when necessary.

This issue is rather hairy from a privacy protection point of view, but such use cases are critical for multi-user offline-first systems.

Thanks again for the useful feedback!

kr,
Sébastien

On Sun, Oct 13, 2019 at 10:34 AM Stefan Klein <[email protected]> wrote:

> Hi Sebastien,
>
> On Sat, Oct 12, 2019 at 3:55 PM Sebastien <[email protected]> wrote:
>
> > Taking that as starting point, one option could indeed be as you
> > propose to copy a subset of that "persons" database into each other
> > database (of course again only a subset of the info, ideally
> > controllable by the end users). One problem that I imagine with that
> > is mainly the amount of incurred data duplication.
>
> With the duplication it needs to be absolutely clear which of the
> copies is the authoritative version of the document and which are just
> copies, then it's manageable.
>
> > For instance, imagine that persons contains [A, B, C, D, E, F], then:
> > - If [A, B, C] have access to database X, then those users should
> >   have a copy of [A, B, C] locally
> > - If [A, D, E] have access to database Y, then those users should
> >   have a copy of [A, D, E] locally
> > Consequently, A should have A, B, C, D, E in his local "persons"
> > database copy.
> > If at some point E is removed from database Y, then user A should
> > not have E in his local database anymore.
> >
> > Does that sound like something that can be handled through filtered
> > replication?
>
> I am not aware of any way to delete documents in the target that still
> exist in the source.
> But if you have a copy of E in Y and delete E from Y at a later point,
> this delete will be replicated to the local DB too (if you don't
> filter out deleted documents).
> Since you probably have some kind of management system to remove E
> from Y's _security, you could either delete E's profile from Y in the
> same step or have a cron job or similar to remove the redundant
> profiles from the databases.
>
> One possible issue here though:
> If E gains access to Y again while E's profile wasn't changed, the
> former _deleted revision is still the "current" revision and E's
> profile stays _deleted in database Y.
> You would have to modify E's person document in the persons database,
> so it gets a new revision.
>
> > I hope that my system will be able to handle hundreds/thousands of
> > databases with 1-100 users in each database; each user having access
> > to ~1-10 databases, thus potentially having access to ~1K user
> > documents locally (this is really just an early guesstimate).
>
> Can't comment on pouchdb.
> From my experience CouchDB doesn't care about how many databases
> exist, as long as there is no current access to a database it is just
> a file in the file system.
>
> > The system currently doesn't allow users to manage their own profile
> > but it's indeed a requirement.
> > I'll probably only allow users to modify their own information while
> > online through a dedicated API endpoint checking the user's identity
> > instead of letting them directly write to the "persons" database.
>
> With this you do have a clear dataflow:
> Users modify their profile via API, this changes the persons database.
> Documents from the persons database are distributed to the destination
> databases.
> So there should be no issue with data duplication.
>
> regards,
> Stefan
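
P.S. To make the membership example quoted above concrete, here is a minimal Python sketch of the rule "a user should see every person who shares at least one database with them". It is only an illustration of the set logic: `visible_persons` and the `memberships` mapping are hypothetical names, and nothing here calls CouchDB or PouchDB.

```python
from typing import Dict, Set


def visible_persons(memberships: Dict[str, Set[str]], user: str) -> Set[str]:
    """Return every person who shares at least one database with `user`.

    `memberships` maps a database name to the set of users with access.
    Both the name and the shape of this mapping are assumptions.
    """
    visible: Set[str] = set()
    for members in memberships.values():
        if user in members:
            visible |= members
    return visible


# The example from the thread: persons = [A, B, C, D, E, F].
memberships = {"X": {"A", "B", "C"}, "Y": {"A", "D", "E"}}

before = visible_persons(memberships, "A")  # A, B, C, D and E

# E is removed from database Y.
memberships["Y"].discard("E")
after = visible_persons(memberships, "A")   # A, B, C and D

# Profiles that A should no longer hold locally.
stale = before - after                      # only E
```

The `stale` set is what a cleanup step (the API wipe on our side, or the cron job Stefan suggests) would have to remove from the local copy.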
