Lloyd asked me to send along some first thoughts and a demo. Unfortunately, due to Firefox OS commitments I haven't had a whole lot of time to do a demo. However, yesterday freed up, and if I can get Persona behaving nicely inside the Firefox OS browser then I should have one before the meeting, or at the latest tomorrow. Apologies, since I did promise that a while ago.

First I'll introduce myself to get any bias out of the way. Prior to Mozilla I worked at Couch.io -> CouchOne, the CouchDB companies with all the founders of the project. This was then bought by Couchbase (an entirely different database), which I worked on for a few years before moving to Mozilla. While at Mozilla I started a side project reimplementing the full CouchDB API in JavaScript (http://pouchdb.com/), which uses various storage engines (IndexedDB / LevelDB / WebSQL / HTTP). I am a (not very active) CouchDB contributor. At Mozilla I work on Firefox OS, am on the browser team, and mostly work somewhere between front end and Gecko stuff.

There were 3 main issues mentioned here that I wanted to address.

For the Couch server, I don't really believe having attack mitigation, auth etc. sitting in front of CouchDB servers is an issue; this is very much standard practice. Pretty much the only time CouchDB is exposed directly as a web-facing server (in production) is when it is in a secure mode and a token server has authorised the client beforehand. These are just problems that will have to be solved whether or not Couch is involved.

As for not being able to control local space usage, I am not entirely sure I understand what you mean here, but there are a variety of ways to control what gets saved: as built-in Couch functionality (filtered replication), as external processes (purgers / compactors), and as trivial modifications to Couch (or a client).

Now for the data models and protocol, which I would / do have concerns around. Couch has various reasonable solutions for representing trees, but it is obviously a data model not designed for them. I have previously seen references to a treeSync 'on top' of Couch, however I can only really see these working as collections implemented as a batch edit / materialised views while still using the existing per-object / per-document sync.

The replication protocol currently shared by Couch + clients definitely has some major shortcomings and I would very much not expect it to be used as is. It's terribly inefficient when replicating an existing data set to a new profile, as well as when resuming long-disconnected peers. There is a difference between implementation and protocol definition though, and I have yet to see whether there are irreconcilable flaws in the protocol or implementation issues that can be fixed.

My approach for building PouchDB has been to copy the core storage + protocol of CouchDB, flaws and all, with the aim of getting to a known good point and starting optimisations from there. Obviously sync is a terribly hard problem, and despite CouchDB having a fairly simple protocol I have still seen 4 years of edge cases found in everything from the protocol to the assumptions that the disk storage has to uphold (as well as 4 years of failed attempts at building a decent custom sync solution).

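To make the filtered replication point a bit more concrete, here is a minimal sketch in TypeScript of the general idea: the target pulls the source's changes since a checkpoint and only stores the documents a filter accepts. Everything here (MemoryStore, replicate, the document shapes) is a hypothetical in-memory model for illustration, not the CouchDB or PouchDB API, and it ignores revisions and conflicts entirely.

```typescript
// Hypothetical in-memory model; none of these names are CouchDB or PouchDB APIs.
interface Doc {
  _id: string;
  type: string;
  [key: string]: unknown;
}

interface Change {
  seq: number; // position of this change in the store's changes feed
  doc: Doc;
}

type Filter = (doc: Doc) => boolean;

class MemoryStore {
  private changes: Change[] = [];
  private seq = 0;

  // Store a document and record it at a new, higher sequence number.
  put(doc: Doc): void {
    this.seq += 1;
    // Each document appears once in the feed, at its latest sequence.
    this.changes = this.changes.filter((c) => c.doc._id !== doc._id);
    this.changes.push({ seq: this.seq, doc });
  }

  // Changes recorded after the given checkpoint, in sequence order.
  changesSince(since: number): Change[] {
    return this.changes.filter((c) => c.seq > since);
  }

  ids(): string[] {
    return this.changes.map((c) => c.doc._id).sort();
  }
}

// Copy documents accepted by the filter from source to target and
// return the new checkpoint (the last source sequence that was seen).
function replicate(
  source: MemoryStore,
  target: MemoryStore,
  filter: Filter,
  since = 0
): number {
  let checkpoint = since;
  for (const change of source.changesSince(since)) {
    if (filter(change.doc)) target.put(change.doc);
    checkpoint = change.seq;
  }
  return checkpoint;
}

// Usage: a space-constrained client only pulls bookmarks.
const server = new MemoryStore();
server.put({ _id: "bookmark-1", type: "bookmark" });
server.put({ _id: "history-1", type: "history" });
server.put({ _id: "bookmark-2", type: "bookmark" });

const phone = new MemoryStore();
const onlyBookmarks: Filter = (doc) => doc.type === "bookmark";
const checkpoint = replicate(server, phone, onlyBookmarks);
```

In this sketch a brand new target has no checkpoint, so it starts at 0 and walks the source's whole feed, while a later call that passes the returned checkpoint only looks at newer changes.
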
I am not particularly good at 'inventing' protocols / technologies, so my preference has always gone towards reusing / retooling what works. PouchDB as a server replacement for Couch works and is light enough to be heavily modified without too much concern; this would be one way to start at a 'known good' and make explicit changes when required. Having it reach anywhere near the robustness / scalability of CouchDB (on the server) will be a large job though (the current implementation is LevelDB with no transactions, and it's definitely not thread safe). CouchDB has been battle tested enough that I would always look to fit my problems on top of what it currently does; improvements to replication performance etc. are reasonably trivial. As a client, PouchDB is young enough that there are plenty of fairly trivial bugs, but I am confident that it is now sound at its base and will quickly stabilise. For coming up with a new custom sync protocol I don't really have much experience or advice, just a warning that it will be very hard, which I don't doubt you know.

Some other points of note. From a Firefox OS perspective I don't speak authoritatively, but we only really have 2 options: a solution that works in web content, or creating a new WebAPI. We have a few very minor holes through which the Firefox OS system can talk to chrome, but they are exceptional, not very flexible (mostly system messages) and are to be gotten rid of. There are a few places where we could make use of a 'one off web API' (such as a download manager, hooking into places), but those do seem to get the lowest priority. I think something that can't work in web content would take Firefox OS off the target for a long time.

I can't find them now, but I have seen numbers relating to the data stored by sync users and worries about IndexedDB performance. I think the opportunity to share a code base between 3 projects (desktop / android / fxos) that uses a WebAPI we already want to be as fast as possible could be a big win. As they were, the numbers seemed manageable though.

And last of all, I wanted to mention that all the above comes from years of experience working with Couch and very little experience of, or context on, Firefox Sync / Weave and its requirements / constraints. So I'm not making a judgement call or anything, just sharing what I have learnt, and I'm pretty excited to learn more about what's in store (having sync on Firefox OS is probably my #1 awaited feature, possibly #2 behind Spotify).

Sorry for the essay

Cheers
Dale

On 25 July 2013 06:43, Richard Newman <[email protected]> wrote:

>
> Monitoring, attack mitigation, and security auditing don't seem unique
> to couchdb.
>
> Many of the concerns one might have about Couch are concerns one would
> have about most COTS software, or at least OTS document stores. I don't
> think that means they aren't concerns we have about applying Couch to this
> problem.
>
> "Protocol Limitations" and perhaps "Scalability" and maybe "Data
> Representation" (thinking that protocol limitations can pretty deeply
> affects representation design, not everyone agrees).
>
> Yes; if a hypothetical protocol+server supported transactions, for
> example, we could use different representations.
>
> Ok, I'm admitting we're going to have a pretty protracted conversation
> on this tomorrow, and just wanted to ensure concerns were represented.
>
> I *think* I've done a fair job. We shall see! :}
>
> Yeah, only one way to find out :) It's an enormous topic.
>
> Will be there to help!
>
> _______________________________________________
> Sync-dev mailing list
> [email protected]
> https://mail.mozilla.org/listinfo/sync-dev
>

