On 11/07/19 2:29 PM, Jan Lehnardt wrote:
Hi Chintan,

On 10. Jul 2019, at 18:25, Chintan Mishra <chin...@rebhu.com> wrote:

On 09/07/19 9:33 PM, Joan Touzet wrote:

Hi Chintan,

Reading through your proposal, I have one main point to make.

At the Apache Software Foundation, the people who lead the projects are
the people who do the work on them. We use the wrong word "meritocracy"
to explain this principle; a better word would be "do-ocracy."

   http://www.apache.org/foundation/how-it-works.html#decision-making
   https://incubator.apache.org/guides/participation.html#as_a_developer
   https://communitywiki.org/wiki/DoOcracy

That means that your project can completely proceed on its own if it
wants to; the only thing over which you're not in control is whether
that project gets to call itself CouchDB or not. That decision is
reached by the people who have built CouchDB into what it is today.
I appreciate that you shared these links. I now understand what I have to do 
next.
-----

On that last point, there's a lot that would need to be done for you to
convince the PMC that your vision is the one, true future of CouchDB.

What you propose is both a significant rewrite, as well as requiring an
entirely new set of skills from the developer base (Rust, MQTT, Kotlin,
Swift).
From Slack conversations, it appears the community has some inclination
towards building a Rust-based CouchDB some day. As for the other technologies,
those changes are not happening today. I do not propose to start with all the
changes at once. The storage engine is a good place to start.
Since I brought it up in Slack, let me clarify: I do not suggest that
we should move CouchDB to Rust today or any time later.

What I am suggesting is that we should look at the things required to
support your idea of an IoT-capable CouchDB-like thing. My suggestion
is to not change CouchDB, but to make a new CouchDB compatible project.
I am building upon this idea. CouchDB has the strength to give its user base the ability to create an ecosystem of developer products around it. Bringing people behind these gems will start conversations which will create more opportunities.

Devices are only getting smaller, so a lower level language is needed
to ensure performance and good battery use. That leaves C, C++, Go and
Rust.

When I’m looking at which people I could likely excite to contribute to
such a thing, in my filter bubble that is folks getting into Rust
and the rest of the Rust community.

If it ends up being Go, C or C++, because someone who runs with this
prefers those, I don’t really mind.
I also don't mind. I had some inclination towards Rust, and after noticing interest, I went all-in. I am also fine with C/C++. I haven't used Go enough to comment on it.

* * *

In particular, we should look at a more detailed IoT use-case and how
CouchDB can help.

Correct me if I’m wrong, but this is mostly about devices with sensors
generating measurements over time that should be aggregated into a
cloud service for analysis.

In that world, a hypothetical API for an IoT app using our new RustyCouch
could look like this:

db = RustyCouch.open('file.foo')

db.save(measurement)

db.push('https://cloud.measurements.com')

repeat

This is a very small subset of the CouchDB API, but it would cover the
majority of your billion IoT use-cases.

There are a few things to be considered about data persistence and
concurrency control, but in another email, you already mentioned
SQLite, which solves most of those for you already.

db.save() would generate a JSON document with a uuid as _id and
corresponding _rev and an entry in an index that allows us to query,
at a later point: in what order were these docs written, which we are
going to need for db.push()

db.push() then opens that index, checks with the cloud which docs
it already has (as per the standard CouchDB replication protocol)
and then sends all local docs that aren’t on the cloud yet in a
couple of _bulk_docs batches.

Voilà, a low-level, embeddable library that allows you to sync
stuff to CouchDB.
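
A minimal sketch of the save/push flow described above, written in Python rather than Rust only to keep it short and runnable. Everything here is an assumption made for illustration: the class names `RustyCouchSketch` and `FakeCloud`, the SQLite table layout, and the way `_rev` is derived. `FakeCloud` is an in-memory stand-in for the remote database; a real client would call CouchDB's `_revs_diff` and `_bulk_docs` HTTP endpoints instead.

```python
import hashlib
import json
import sqlite3
import uuid


class FakeCloud:
    """Hypothetical in-memory stand-in for a remote CouchDB database."""

    def __init__(self):
        self.docs = {}  # (_id, _rev) -> document

    def revs_diff(self, id_revs):
        # Same idea as CouchDB's _revs_diff: report which revisions are missing.
        return [pair for pair in id_revs if pair not in self.docs]

    def bulk_docs(self, docs):
        # Same idea as CouchDB's _bulk_docs: accept a batch of documents.
        for doc in docs:
            self.docs[(doc["_id"], doc["_rev"])] = doc


class RustyCouchSketch:
    """Hypothetical local store: SQLite plus a write-order index."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        # seq records the order in which docs were written; push() walks it.
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS docs ("
            "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
            "id TEXT NOT NULL, rev TEXT NOT NULL, body TEXT NOT NULL)"
        )

    def save(self, measurement):
        body = json.dumps(measurement, sort_keys=True)
        doc_id = uuid.uuid4().hex
        # Illustrative revision only: generation 1 plus a digest of the body.
        rev = "1-" + hashlib.sha256(body.encode("utf-8")).hexdigest()[:32]
        with self.db:  # commits the insert as one transaction
            self.db.execute(
                "INSERT INTO docs (id, rev, body) VALUES (?, ?, ?)",
                (doc_id, rev, body),
            )
        return doc_id, rev

    def push(self, remote, batch_size=100):
        rows = self.db.execute(
            "SELECT id, rev, body FROM docs ORDER BY seq"
        ).fetchall()
        # Ask the remote which revisions it lacks, then send only those.
        missing = set(remote.revs_diff([(i, r) for i, r, _ in rows]))
        todo = [
            dict(json.loads(body), _id=i, _rev=r)
            for i, r, body in rows
            if (i, r) in missing
        ]
        for start in range(0, len(todo), batch_size):
            remote.bulk_docs(todo[start:start + batch_size])
        return len(todo)
```

Calling `push()` twice in a row sends nothing the second time, because the remote already reports every revision as present. A real implementation would also keep a checkpoint so it does not rescan the whole index on every push.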

This is a scope that a single developer could make a prototype
of, even in a language that you are just starting out with.

With this in hand then, the next step is to talk to the folks
who build IoT platforms and applications to see if they want to
use something like that.

And once we have this, we can talk about changes to the replication
protocol.
A great MVP. This would work if IoT devices used HTTP, but they rely on MQTT. Replication will need re-imagining. I've started working on an alternative: a plugin written in Elixir for the VerneMQ server that replicates messages received on certain topics to a CouchDB server. It is not the perfect solution, but it works.
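
To illustrate the idea of replicating messages from selected topics, here is a rough Python sketch. It is not the Elixir plugin mentioned above and does not use any VerneMQ API. The function names, the document fields (`type`, `topic`, `value`, `raw_hex`) and the example topics are all assumptions. The topic matching follows the standard MQTT `+` and `#` wildcard rules in simplified form (it ignores the special handling of topics that start with `$`).

```python
import json


def topic_matches(topic_filter, topic):
    """Return True if an MQTT topic matches a filter with + and # wildcards."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for index, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches the parent level and all below it.
            return True
        if index >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[index]:
            return False
    return len(filter_levels) == len(topic_levels)


def message_to_doc(topic, payload):
    """Wrap one MQTT message (bytes payload) in a JSON-friendly document."""
    doc = {"type": "mqtt_message", "topic": topic}
    try:
        doc["value"] = json.loads(payload.decode("utf-8"))
    except ValueError:
        # Not valid UTF-8 JSON: keep the raw bytes as a hex string instead.
        doc["raw_hex"] = payload.hex()
    return doc


def docs_for_replication(messages, topic_filters):
    """Select messages on the configured topics and turn them into documents."""
    return [
        message_to_doc(topic, payload)
        for topic, payload in messages
        if any(topic_matches(f, topic) for f in topic_filters)
    ]
```

The resulting list could then be sent to a CouchDB server in one batch, for example through its `_bulk_docs` endpoint.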

* * *

If you want to take this further, and make a library that also
supports interactive querying, for say native applications on
phones and watches and whatnot, you already have a decent
foundation, but you’ll have a little more work to do.

* * *

But none of this requires changing CouchDB itself, or a 10 year
effort of porting something, while solving all the needs you
have.

* * *

Finally, I’d like to caution against being flippant about the
current project direction with FoundationDB. This is something
the team that has been doing this for over 10 years looked at
“in-depth” and decided it is the right thing to do.

The alternative would be to build a FoundationDB-like thing
ourselves, which is a multi-million dollar investment that
I haven’t seen anyone commit to at the moment.

In particular, I’m one of the champions for smaller CouchDB
installations in this project, and moving forward is always
a give-and-take. We are not in a position yet to gauge what
the problems are with an FDB-Couch for a single-node instance
but I’m sure going to work hard on making it easy for our
downstream users.

I’m the maintainer of the Mac binaries which are extremely
popular. Any database that can’t be set up with a download,
unzip and double click to start to get a dev environment is
going to have trouble attracting new developers. So I’ll make
sure we can retain this experience as much as possible.
I didn't mean to sound condescending. Can you privately share the part which made you feel this way?

* * *

Let’s be pragmatic and consider incremental change or small
scope side projects to move this forward. Grand visions,
in my experience, almost never work out. The only reason I have
trust in an FDB transition is because someone with authority
and budget said “the team that ostensibly built CouchDB 2.x
is going to do this”. That’s the only way it could possibly work.
We need to get a rough idea of where we want to go and then start treading the path. As the path becomes clearer, so will our vision.

And don’t mistake my RustyCouch suggestion for being
dismissive or sidelining. I’ve wanted something like this since
about 2008, and many people have tried with various attempts,
so my suggestion above is very serious *and* fed by the
experience of all these failed attempts.
I don't take it as such. I was duly warned.

CouchDB’s strength is its replication protocol. We didn’t
rewrite CouchDB in JavaScript because we suddenly realised
there are a billion browsers, but PouchDB came along with
a compatible data model and replication engine so that the
two projects complement each other perfectly and anyone on
the CouchDB team will tell you that PouchDB is one of the biggest
drivers of CouchDB adoption.
I see.

How about we just re-run this strategy for IoT: build a
small thing that is useful for one use-case and make it work,
then make it more complicated to be useful for more use-cases.
At each point, make sure replication with CouchDB works.
That’s a winning strategy. We already know it.

Best
Jan
—


  It is in direct competition with the proposal being worked on
this list for the FoundationDB backend swap. With the addition of MQTT,
it sounds like the entire replication protocol and methodology would
need to be revisited, as the semantic changes you're proposing would
break existing client replication.
The HTTP replication protocol will remain more or less the same for the
foreseeable future. A new MQTT replication strategy will be built upon the
existing method. The two will not run in parallel; each database will use one
or the other.
Finally, the proposal to push into
the mobile space would directly compete with our sister project PouchDB,
who have put in tens of thousands of development hours as well.
The community will evolve at some point, and bringing people from the sister
project onto CouchDB will allow faster development. The diagram in the proposal
missed a part for a web-browser-based CouchDB. The missing part is an interface
between JavaScript and CouchDB in the web browser. So we will need some
JavaScript developers too, and they can help improve Fauxton.
  This all
adds up to a much bigger scoped project than CouchDB is today, and I
daresay may be bigger than I think even you realize.
I do realize that. I want CouchDB to be in a billion mobile and embedded devices
by 2025. I understand this is a challenging scale. I brought this here because I
see how much we need a DB for a "Cluster Of Unreliable Commodity Hardware". I
assume the proposed path will take somewhere between 18 and 21 months to come to
fruition for a team of 15 people working 40 hours/week.
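
For a sense of scale, the estimate above can be converted into person-hours. This is only a rough sketch: it assumes 52 working weeks per year with no holidays, and the function name is made up for illustration.

```python
def person_hours(people, hours_per_week, months):
    """Total person-hours, assuming 52 working weeks per 12 months."""
    return people * hours_per_week * months * 52 // 12


low = person_hours(people=15, hours_per_week=40, months=18)
high = person_hours(people=15, hours_per_week=40, months=21)
print(low, high)
```

Under those assumptions the range works out to 46,800 to 54,600 person-hours.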

With my PMC hat on, I have to ask:

* Do you already have developers versed in these skills you can bring to
   the project (beyond yourself)? Are they ready to commit the 40+ hours
   a week each to making it a reality?
No, I do not have a team in place for this.
* Do you have experience in building a distributed system of this scale,
   using the specific technologies you propose?
I have been reading about distributed systems. I want to take up an open source
project that solves the replication problem for devices built on emerging
technologies. CouchDB is the best fit, as it already solves the problem of
replication across remote devices.
* How do you plan to convince other developers of your approach
   specifically?
What got us (you) here won't get us (you) there! -- Marshall Goldsmith

CouchDB led the way by being years ahead. This is just the same thing happening
again in a newer market. CouchDB is already great at replication. What I am
proposing is taking this simple-but-powerful methodology a step further and
building it for planetary-scale use-cases (an idea derived from Lasp-Lang). Here
are some ways in which we can attract more developers, users, and eyes.

* Helping users realize that CouchDB lets them relax while building
   applications for devices with any form factor.
* Reaching out to the developers who have built their own solution for
   replicating stuff from their device of any form factor to CouchDB
* On-boarding developers who will become early adopters and test it out
   on their IoT devices, thus proving an unmet market need.
* Promoting an offline-first strategy among mobile and embedded
   developers, which will draw contributors from these communities.
* Documenting comparisons with existing mobile and embedded
   solutions that provide replication, such as Realm and Couchbase.

* How do you intend to train up our existing developers on the new
   languages and technologies involved?
If people are excited about the future they are building, then this is a smaller
problem to tackle. If and when the people in this community come to a
consensus about the proposal, this can be tackled with the 'Each one, teach one'
practice followed by Yamaha Motors. This is a buddy system where people get new
partners to tackle a problem/PR. They share issues, their understanding of the
codebase and language, etc. with each other. As buddies rotate, everyone gets on
the same page after a few cycles. I have found that a 3-pair buddy system works
best in software, but this may differ based on culture, language, timezone, and
availability.
* How do you perceive the advantages and disadvantages of your approach
   *specifically* vs. the FDB approach already outlined?
Pros and cons of each proposal:

FoundationDB
  Pros:
  * Improving what works for majority of existing users
  * Iterates CouchDB to a better form
  * Prospect of immediate consistency for ACID transactions
  Cons:
  * Losing some small and mid-sized developers
  * Fragments community

Polyglot-unification
  Pros:
  * Growth by tapping newer prospects
  * Reduces fragmentation of user community and codebase
  * Reimagines CouchDB as if it was built in 2019
  Cons:
  * Tons of work
  * Uses RocksDB, overlooks FoundationDB migration


Email with subject "CouchDb Rewrite/Fork" by 'Reddy B. <redd...@live.fr>' has 
mentioned some other concerns. This proposal introduces a new story for CouchDB. This proposal 
would require using RocksDB instead of FoundationDB.

-Joan

On 2019-07-09 10:28, Chintan Mishra wrote:
Hello team!!

Years of time and effort help move a product to the heights that CouchDB
has reached. And as a non-contributor, or rather a very new CouchDB
user (1.5 years) who failed to find some relevant emails, I came up with
a version of the future for CouchDB that I thought would help us grow.
But Jan and Robert helped me realize that it takes a village to raise a
child (CouchDB). So this is a proposal to find a middle ground between where
we are headed and where the market is going next. The proposal I wrote
was solely driven by what I have read over the years about the growth of
the product and the community. I have attached the file or if you prefer
reading in a browser, then click here<https://gitlab.com/snippets/1873543>.

It will roughly take 4-5 minutes of your time. A proposed direction is
to start an entirely new project. That is not what I desire. I want to
join the community behind CouchDB, not build a new one using it. My goal
from this proposal is to generate leverage by creating early mover
advantage and help grow the community.

Thanking you.

--
Chintan Mishra
Rebhu Computing
Founder and CEO
