I have a dream… (pardon the plagiarism)
I want to live in a world where people are empowered to understand, and are able to decide, where their data lives. I want to live in a world where developers build apps that support that, not because they went out of their way to implement it, but because it is a feature of the software platform they are using.

I want to be able to help people improve their lives in regions of the world where network access isn’t ubiquitous. Sometimes that is just a major western capital’s subway, but more likely it is a less developed location, or a rural area that will never see mobile broadband, let alone wired broadband, because there is no financial incentive.

I want to live in a world where technology solves more problems than it creates. One way to get there is to allow people to use software wherever they are, in whatever context they need it. More often than not, that means far away from fast network access (despite what @dhh is trying to tell you).

My primary motivation for working on Apache CouchDB is to help build the world I want to live in.

The same motivation drives my work on Hoodie (http://hood.ie), which builds on top of CouchDB and wouldn’t be possible without it.

* * *

In the past year I have interviewed a fair number of people, let’s say 50, from those who have heard about CouchDB to users to core devs.

The ONE feature that makes CouchDB relevant is multi-master replication. There is no exception, this is the ONE thing that makes CouchDB exceptional. NOBODY else has that, and even the decent proprietary solutions that are just coming to market suck where we KICK ASS.

There are many other things that people like about CouchDB: reliability, no schema, HTTP interface, the view system, etc. But NONE of these people would care if CouchDB didn’t have multi-master replication.

* * *

The number one thing that people did NOT like about CouchDB is that it is confused. CouchDB has a torn identity, half database, half application server.
It wasn’t clear (and I am partly responsible for this) what CouchDB is and wants to be. In everybody’s defence, I think, it just took a while to figure it out. Now is a good time to put our findings in writing and fix this.

The number one request from people was to clear up CouchDB’s story, to have a clear, bold vision that captures people and that they can easily understand and share and support and move forward.

* * *

Here is a narrative about what CouchDB is that has formed in my head over the past year. I have shared this with some people privately for some feedback and they all liked it, so it has that going for it. I have also tried bringing up some of these issues in presentations I have given, again to great feedback. E.g.: http://www.youtube.com/watch?v=7mdG-iAizVc or http://www.youtube.com/watch?v=edbi9jJZkpg

Before I lay it out, I understand that I will be ruffling some feathers. I think that is both necessary and healthy. I think the picture I am going to paint will make a lot of people in the CouchDB community happy, some with concessions, but I utterly and strongly believe that this vision of what CouchDB is has the power to set the course for the next five years of the project and attract a whole lot of new people both as users and contributors.

* * *

CouchDB is a database that replicates.

Think of it as git for your data-layer. Not in the sense that you manage text files and diff and merge, but in the sense that you have a local version of your data and one or multiple remote ones and you can seamlessly move your data between them, back and forth and crossover.

Imagine a local checkout of your data that you can work on and then share with Lucie across the table. She finds some issues, fixes up the data, and shares it with Tim across the room. Tim fixes two more issues and you pull both their changes into your copy.
We conclude the whole thing is golden and we push it to staging, where our continuous integration runs and decides that the data is good to go into production, so it pushes it to production. There the data is picked up by various clients, some mobile over there, some web over here, a backup system in the Tokyo office…

Or you have hospitals in remote regions in Africa that collect local health data, like how many malaria infections a region has. They all share their results over unreliable mobile connections, and the data still makes it eventually, maybe with a few hours’ delay. The malaria expert in the capital city sees an increased outbreak of some illness and is able to send out medicine in time to help the patients. Compare that to today, where the expert takes months to travel between the hospitals to collect that data manually, only to find out that there was a lethal outbreak two months ago and everybody died. (Somebody built this, CouchDB does save lives, I get teary every time I tell this story (like now). Our work doesn’t get more noble than this.)

Or imagine millions of mobile users with access to terabytes of data in the cloud, replicating the bits they need to their phones and tablets, allowing super-fast low-latency access for a stellar user experience, while giving access to huge amounts of data and allowing full write access on the mobile device, to be replicated back to the cloud when connections exist. (Our friends at Cloudant have a couple of those customers.)

That is the power of CouchDB.

* * *

Replication is the PRIMARY feature of CouchDB. “is a database” means “stores your data, safely and securely”, “that replicates” highlights the primary feature.

There are many more very cool features of CouchDB, even the details on how we achieve reliability and data safety or how replication works are mindblowingly cool.
The simple HTTP interface, the JSON store, the app-server features, map reduce views are all very excellent things that make CouchDB unique, but it is very important to understand that they are SECONDARY features.

* * *

I want to learn from understanding what the PRIMARY and SECONDARY features for CouchDB are. I already feel a bit bad that there are two PRIMARY ones (“a database” *and* “that replicates”), but I think that is as little as it gets.

I want CouchDB’s new identity to be a database that replicates.

I want to provide a slide deck for a “CouchDB in 25 minutes” presentation* that everybody can take and give and customise, but I want one of the first things you say to be “CouchDB is a database that replicates”.

I want anyone inside the CouchDB developer community (you!), when asked what CouchDB is, to answer “CouchDB is a database that replicates” and then follow up explaining what we mean, and *then* add a few more of the SECONDARY features that you particularly like.

* https://dl.dropboxusercontent.com/u/82149/CouchDB-in-25-Minutes.pdf
  Full talk at: http://vimeo.com/62599420 (sorry, this one is in German, still trying to find an English version of this)

I want people who have barely looked at CouchDB to comment on an unrelated Hacker News thread with “…CouchDB is a database that replicates, maybe that is a better fit for your problem”.

I want the CTO of the newly funded startup to think “I seem to have a replication problem to solve, maybe CouchDB can help.”

I want to move CouchDB’s development forward, and when we ask ourselves whether to add a feature, we run it by our PRIMARY feature set and ask “does it support ‘CouchDB is a database that replicates’?” and if it does we go ahead and build it, and if it doesn’t we may consider it as a SECONDARY feature, or we discard it altogether.
(I don’t actually care what the final slogan will be, and please bike-shed this to no end, but it should capture what I mean with “CouchDB is a database that replicates”, a phrase that we can burn into everybody’s head that captures CouchDB’s PRIMARY feature, its PRIMARY value proposition, the ONE thing that explains WHY we are excited about CouchDB.)

* * *

Now, you might be miffed that your pet feature didn’t make the PRIMARY list. Do not worry, I believe I have a solution for that.

I have brought this up before, but I really do think the holy grail to all this is a very well done plugin system that allows us to follow the “small core, massive plugin repository” paradigm that others have ever so successfully pioneered.

This allows us to focus on what CouchDB is for internal and external communication, for roadmap discussions and for attracting developer talent.

More importantly, it allows us to keep all the fringe things that make CouchDB so very appealing to a lot of different people. It also allows us to open up development to people who feel intimidated working on core CouchDB, but can easily write a little plugin or three (this is basically me, I have like 20 branches on GitHub that are useful to maybe 5% of our users and they don’t get used at all).

A wise person once said “Core is where features go to rot.”, and if you look at a number of CouchDB features, you can see that we suffer from that.

We need a kick-ass plugin system that allows us to easily create, publish, maintain and update little pieces of code that allow our users to make their CouchDB their own. (I am signing up to build that, but I will need your help, there is a shit ton of work to do :)

* * *

ALERT: OPINION (your opinion may differ and we need to hear it)

There is a discussion we need to have about what the “small core” means for CouchDB.
There is a discrepancy between the absolute minimum to fulfil the “CouchDB is a database that replicates” mantra and what would be a useful-out-of-the-box product that our users could set up and be productive with.

My minimum set looks roughly like this:

- core database management (crud dbs & json/mime-docs, clustering)
- remote & local replication
- MR-views & GeoCouch enabled by default (ideally abstracted away with nice “query dsl”)
- HTTP interface
- Futon/Fauxton
- configuration
- stats
- docs
- plugin system with Erlang (and in the future JavaScript support via Node.js)

This makes for a useful CouchDB default setup.

Everything else should be a plugin. A piece of code that can be installed with a quick search and a click of a button in Futon (or a `curl`-call on the HTTP interface). Not far away, definitely not “siberia” (if you get the PHP reference), but close to the core and encouraged to be used.

And yes, this explicitly includes things like shows and lists and update functions and rewrites and vhosts. We should make it super simple to add these, but for a default experience, they are very, very confusing. We should have a single plugin “CouchApp Engine” which includes Benoit’s vision of CouchApps done right that is just a click away to install.

In terms of highlighting the strengths of the core CouchDB “product”, this is what I’d put on the website:

- Apache CouchDB implements the CouchDB vision: It is a database that replicates.
- Document Database:
  - Data records are standard JSON.
  - Unlimited Binary data storage with attachments.
  - (alternatively arbitrary mime docs with special rules for JSON docs)
- Fault-tolerant:
  - Data is always safe. Tail-append storage ensures no messing with already committed data.
  - Errors are isolated, recovery is local and doesn’t affect other parallel requests.
  - Recovery of fatal errors is immediate. There is no “fixup phase” after a restart.
  - Software updates and bugfix deployment without downtime.
- Highly Concurrent:
  - Erlang makes good use of massively parallel network server installations.
  - Garbage collection happens roughly on a per-request basis. GC in one request doesn’t affect other requests.
- Cluster / BigCouch / Big Data:
  - Includes a Dynamo-style clustering and cluster-management feature that allows spreading data and load over multiple physical machines.
  - Scales up to Petabytes of data.
- Secondary 2D and 3D indexing
  - Using incremental and asynchronous index updates for high-performance queries.
- Makes good use of hardware:
  - Tail-append storage allows for serial write access to storage media, which is a best-case-scenario for spinning disks and SSDs.
- Small Core & Flexible Plugin System:
  - Some features are only useful for a small group of people, these can be installed with a super simple plugin management system that is built into the admin interface.
  - Get new features with a click or tap.
  - Plugins can be written in Erlang (and in JavaScript in the future).
- Cross Platform Support
  - Runs on any POSIX UNIX as well as Windows.
  - Support for some embedded devices like Android and RaspberryPi.

I think this would make for a compelling list of technical features.

(I’d probably also add a blurb about the ASF and the Apache 2.0 License for good measure)

ALERT END

* * *

And then, CouchDB is one more thing.

CouchDB isn’t just the Erlang implementation of this whole replicating database idea. CouchDB is also the wire protocol, the specification that makes all the magic work.

Apache CouchDB is the focal point for The Replicating Society*.

(* cue your Blade Runner jokes)

Apache CouchDB is THE standard for data freedom and exchange and is the clearing house, the centre for an ecosystem that includes fantastic projects like PouchDB and the TouchDBs, Max Ogden’s `dat` and whatever else follows these.
Not saying we merge those projects in, they can stand on their own, but we should embrace everything that makes the interoperable replication world a reality.

http://couchdb.apache.org is going to be the centre of the data replication universe.

* * *

Now all of this is my vision and I am bringing it to the table now. I have to admit that I am very nervous about this. A lot of things aren’t very well thought out and at the same time, I care very deeply about this project and its community and their future, so there is a little anxiety doing this little emotional striptease in front of all of you.

What we will end up with is not whatever I dream up and that’s that, but I hope I can inform and set the direction of where we are going, and then we can all together figure out the hard parts, and question my assumptions and change little things or lots.

I don’t want to make this mine, but ours. To keep and to be proud of.

The last thing I want is to stifle diversity, in thought and code, and I am very sure that some of you will find a lot to disagree with in what I am saying, and that’s great, because this should, again, be ours, not mine.

But the one thing I am convinced of is that the little pivot this project hinges on*, between relative obscurity and blasting success, is finding our version of a simplified, streamlined and aligned way of defining, building and communicating what Apache CouchDB is.

(* I suck at metaphors)

And yes, that means that some things that *YOU* think are important are getting a second row seat instead of the front row. Heck, even some of my pet features get a second row seat, but that is fine because they aren’t gone, there is still room for all the crazy and not-so-crazy-but-not-essential stuff that people love in the plugin system, one click away.

All this so we can benefit from being able to focus on building a modern, compelling, fun, humble and clever database that we can build the future, our future, on.
* * *

I want to live in a world where people are empowered to understand, and are able to decide, where their data lives.

I want to live in a world where technology solves more problems than it creates.

My primary motivation for working on Apache CouchDB is to help build the world I want to live in.

The ONE feature that makes CouchDB relevant is multi-master replication.

I want to learn from understanding what the PRIMARY and SECONDARY features for CouchDB are.

Apache CouchDB is the focal point for The Replicating Society.

I don’t want to make this mine, but ours. To keep and to be proud of.

* * *

CouchDB is a database that replicates.

I’m excited about your feedback! <3

Sincerely,
Jan
--

Thanks to Noah for kicking off this way overdue discussion.

On Jul 24, 2013, at 15:28 , Noah Slater <[email protected]> wrote:

> Okay, here are some rough thoughts.
>
> Why?
>
> - We believe that distributed data should be easy
>
> How?
>
> - Painless multi-master replication
> - Effortless clustering and sharding
> - Co-location of data, queries, and views
> - Deep browser and platform integration
> - Built of the Web
>
> What?
>
> - Erlang
> - HTTP
> - JSON
> - JavaScript
> - MapReduce
>
> (That last list could go on, and on, and on...)
>
> Anyway. This is just a rough sketch of the sort of hierarchy I am thinking
> about.
>
> Whatever this ends up looking like, I think this is how we should talk
> about CouchDB. This structure could be a template for anything. A talk, a
> sales pitch, the homepage itself. The important thing is that we start from
> "why?" and we build up from foundations.
>
>
> On 24 July 2013 13:15, Noah Slater <[email protected]> wrote:
>
>> I'm trying to imagine what our "I have a dream" speech would be like for
>> CouchDB. If we were the Wright brothers, we might stand up and say "I have
>> a dream that one day man will fly." We might say, "I have a dream that
>> distributed data will be easy." (I mean, that about covers it, right?
>> Doesn't have to be complex. The hard part is making sure we actually focus
>> in on the root dream we all have.)
>>
>> Jan mentioned a few months ago that CouchDB almost wants to be the Git,
>> for databases. What is Git? What would Git's "dream" be? I can imagine
>> Linus saying "I have a dream that distributed version control will be
>> easy." Same sorta thing, right?
>>
>>
>> On 24 July 2013 13:06, Noah Slater <[email protected]> wrote:
>>
>>> Benoit,
>>>
>>> You should defo watch that video and see what you think. Note that it
>>> does not matter if we are a company. This insight applies to companies,
>>> products, loose groups of people working towards one thing (like the Wright
>>> brothers) and even individuals. (i.e. What is your personal "why" and how
>>> are the things you are doing working towards that.)
>>>
>>> I also want to put you at ease by saying that having a single shared
>>> "why" doesn't mean that anybody's vision, or personal goals have to be left
>>> by the wayside. People can still come to the project with their own goals,
>>> and their own perspective. But the project itself should have a clear sense
>>> of what we are trying to accomplish.
>>>
>>> I think the "why" we come up with can easily be something that inspires
>>> and is important to the Hoodie peeps, the Kanso peeps, the CouchApp peeps,
>>> the "big data" peeps, the mobile platform peeps. Think about a why that
>>> might evolve out of "your data, everywhere". Who (in our existing
>>> communities) wouldn't love that and want to rally behind that? (But this is
>>> just one idea.)
>>>
>>> Asking "what are the core features" misses the point. Why are these core
>>> features? Why did we add them in the first place? What are we working
>>> towards? See, you hit on it in your final sentence: "relax we take care
>>> about your data and the way you exchange and render them wherever they
>>> are". This!
>>> This is the kind of thing that I think we should hone, and
>>> figure out, and document.
>>>
>>> Once we have that, it can inform our "how". When we're talking about
>>> features, about product direction (i.e. what we add, what we subtract) we
>>> can say "well, how is this related to what we're trying to do here?" Do you
>>> see what I mean? :)
>>>
>>> "Painless distributed systems" is also a step in the right direction for
>>> answering the question "why?"
>>>
>>> So far we have:
>>>
>>> * Relax
>>> * Decentralised web
>>> * Peer-to-peer replication of apps and datasets
>>> * Your data, everywhere
>>> * Put the data where you need it
>>> * We handle your data / you handle display
>>> * Painless distributed systems
>>>
>>> Somewhere in here ^ (and perhaps in a follow up reply) is a single shared
>>> value system. Something we all hold dear.
>>>
>>>
>>>
>>>
>>> On 24 July 2013 12:48, Benoit Chesneau <[email protected]> wrote:
>>>
>>>> Anyway, CouchDB is not like apple or dell. This isn't a company. And we
>>>> don't have to share all the same vision, but only common values, a core.
>>>> I'm not sure it enter in the what you describe. What kind of vision are you
>>>> speaking about?
>>>>
>>>> Also I would remove any pro-tip from your mail if we want to start from a
>>>> neutral base.
>>>>
>>>> Couchdb is known for the replication but not only. Couchapps and the way
>>>> people hack around is another (hoodie, kanso, erica/ couchapp all
>>>> differents visions of what is a couchapp but all are using couchdb the
>>>> same_.. Message hub is another (nodejistsu, hoodie are using couchdb as a
>>>> message hub somehow, not only but a lot of their arch is based on
>>>> changes).
>>>> And now we we can add some kind of big data handling. Not forgetting people
>>>> that are using apache couchdb on their mobile, they exists and the patches
>>>> will be release.
>>>>
>>>> All have different visions. But they share some common features.
>>>> I don't want to forget someone because of a vision of some. I only know that
>>>> couchdb has some strong features that could be improved.
>>>>
>>>> All that to say that rather than thinking to a vision, maybe we could
>>>> collect all the usages around and see what emerges from it. What are the
>>>> core features, What couchdb should focus on and itterrate depending on the
>>>> new usage. I guess it's some kind of philosophy: "relax we take care about
>>>> your data and the way you exchange and render them wherever they are".
>>>>
>>>> - benoit
>>>>
>>>>
>>>> On Wed, Jul 24, 2013 at 1:24 PM, Noah Slater <[email protected]> wrote:
>>>>
>>>>> Hi devs,
>>>>>
>>>>> I came across this video recently:
>>>>>
>>>>> Simon Sinek: How great leaders inspire action
>>>>>
>>>>> http://www.ted.com/talks/simon_sinek_how_great_leaders_inspire_action.html
>>>>>
>>>>> In it he sets out what he calls the Golden Circle:
>>>>>
>>>>> Why
>>>>>
>>>>> - What's your purpose?
>>>>> - What's your cause?
>>>>> - What's your belief?
>>>>>
>>>>> How
>>>>>
>>>>> - How do we do it?
>>>>> - How does our product differentiate?
>>>>> - How are we different?
>>>>> - How are we better?
>>>>>
>>>>> What
>>>>>
>>>>> - What do we do?
>>>>> - What do we make?
>>>>>
>>>>> He points out that the difference between companies like Apple and
>>>>> companies like Dell.
>>>>>
>>>>> Dell tells you what they do, and how. "We make great computers. They're
>>>>> well designed and work well. Wanna buy a computer?" Most companies do it
>>>>> like this. But they often miss out the "why".
>>>>>
>>>>> But then you look at Apple, and they do it the other way around. Apple tell
>>>>> you what their purpose is. The rest is almost an afterthought. "We believe
>>>>> in challenging the status quo. We believe in thinking different. We do that
>>>>> with great design and a focus on the user experience. We just happen to
>>>>> make computers." He then joking quips: "Ready to buy one yet?"
>>>>>
>>>>> (His talk gives several other examples, with his thesis being that telling
>>>>> your story from the outside in is what separates all the great companies
>>>>> and leaders. One of his main examples is the Wright brothers.)
>>>>>
>>>>> He comments that if you talk about what you believe, you will attract those
>>>>> that believe what you believe. That when you talk about what you believe,
>>>>> people will join you for their own reasons, for their own purpose. And that
>>>>> what you do simply serves as proof of what you believe. Or as he quips:
>>>>> "Martin Luther King gave his 'I have a dream' speech, not his 'i have a
>>>>> plan' speech."
>>>>>
>>>>> Why am I bringing this to the dev list?
>>>>>
>>>>> Because our message stinks. "Apache CouchDB™ is a database that uses JSON
>>>>> for documents, JavaScript for MapReduce queries, and regular HTTP for an
>>>>> API" is a terrible way to introduce who we are, what we stand for, and why
>>>>> we build this thing. (And I'm allowed to say all that, because I'm the one
>>>>> who wrote it, with lots of help from Jan.)
>>>>>
>>>>> So what am I proposing? I'm proposing that we figure out our why. That we
>>>>> figure out what we stand for, what we believe in. And then we figure out
>>>>> how we're gonna do that (pro tip: replication is more important than the
>>>>> data format we use). Not only will this define a consistent internal vision
>>>>> for the project (what *are* we working towards anyway?) but it will help us
>>>>> to attract people who believe in what we believe.
>>>>>
>>>>> So, if you have any thoughts about this, speak up!
>>>>>
>>>>> Thanks,
>>>>>
>>>>> --
>>>>> NS
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> NS
>>>
>>
>>
>>
>> --
>> NS
>>
>
>
>
> --
> NS
