Yeah, let's do both! :) I always had trepidations about leaving things as-is
with ZooKeeper there. Can we have this new internal system be what replaces
it, while still keeping it somewhat modular?

The problem with any new system is that everyone already trusts and relies
on the existing one, with scars we know will heal. That is why we are all
still using ZooKeeper (I bet at least 3 clusters are still on 3.3.4, and one
maybe on 3.3.1 or something nutty).

etcd
Consul
C*
Riak
Akka

All have viable solutions, and I have no idea which will be best or worst or
even work, but lots of folks are working on this now, trying to get things to
be different and work right for them.

I think a native version should be there in the project, and I am 100% on
board with that native version NOT being ZooKeeper but something homegrown.

I also think the native default should use the KIP-30 interface, so other
servers can also plug in the system they are solving this with (that way
deployments that have already adopted XYZ for consensus can use it).
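
Just to make that concrete, the kind of shape I have in mind is below. This
is purely my guess at what a KIP-30-style interface could look like, not the
actual API on the wiki; all names here are illustrative.

    // Hypothetical plug-in point for consensus + metadata storage.
    // Method names and signatures are illustrative only.
    public interface ClusterMetadataPlugin {
        void connect(java.util.Map<String, String> config); // broker config passthrough
        byte[] get(String path);                             // read a metadata node
        void set(String path, byte[] value);                 // create/overwrite a node
        void watch(String path, Runnable onChange);          // change notification
        void close();
    }

The homegrown/native implementation would just be one more class implementing
this, same as a ZooKeeper-backed or etcd-backed one, so "native by default,
pluggable if you need it" stays on the table.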

~ Joe Stein
- - - - - - - - - - - - - - - - - - -
  http://www.elodina.net
    http://www.stealth.ly
- - - - - - - - - - - - - - - - - - -

On Tue, Dec 1, 2015 at 2:58 PM, Jay Kreps <j...@confluent.io> wrote:

> Hey Joe,
>
> Thanks for raising this. People really want to get rid of the ZK
> dependency; I agree it is among the most asked-for things. Let me give a
> quick critique and a more radical plan.
>
> I don't think making ZK pluggable is the right thing to do. I have a lot of
> experience with this dynamic of introducing plugins for core functionality
> because I previously worked on a key-value store called Voldemort in which
> we made both the protocol and storage engine totally pluggable. I
> originally felt this was a good thing both philosophically and practically,
> but in retrospect came to believe it was a huge mistake--what people really
> wanted was one really excellent implementation with the kind of insane
> levels of in-production usage and test coverage that infrastructure
> demands. Pluggability is really at odds with this, and the ability to
> abstract over some really meaty dependency like a storage engine never
> quite works.
>
> People dislike the ZK dependency because it effectively doubles the
> operational load of Kafka--it doubles the amount of configuration,
> monitoring, and understanding needed. Replacing ZK with a similar system
> won't fix this problem though--all the other consensus services are equally
> complex (and often less mature)--and it will cause two new problems. First,
> there will be a layer of indirection that will make reasoning about and
> improving the ZK implementation harder. For example, note that your plug-in
> API doesn't seem to cover multi-get and multi-write; when we added those we
> would end up breaking all plugins. Each new thing will be like that. Ops
> tools, config, documentation, etc. will no longer be able to include any
> coverage of ZK because we can't assume ZK, so all of that becomes much
> harder. The second problem is that this introduces a combinatorial testing
> problem. People say they want to swap out ZK, but they are assuming that
> whatever they swap in will work equally well. How will we know that is true?
> The only way to know is to explode out the testing to run with every
> possible plugin.
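>
> To make the API-evolution point concrete: suppose the plugin interface
> started out something like the sketch below (names purely illustrative, not
> the actual KIP-30 API). Adding the multi-key operations later is a breaking
> change for every plugin already out there.
>
>     // Hypothetical v1 of a pluggable metadata interface.
>     public interface MetadataStore {
>         byte[] get(String path);
>         void set(String path, byte[] value);
>         // v2 needs atomic multi-key operations, e.g.:
>         //   java.util.List<byte[]> multiGet(java.util.List<String> paths);
>         //   void multiWrite(java.util.Map<String, byte[]> updates);
>         // Adding these as new interface methods breaks every existing
>         // implementation, and a default method can't fake the atomicity
>         // the broker would depend on.
>     }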
>
> If you want to see this in action take a look at ActiveMQ. ActiveMQ is less
> a system than a family of co-operating plugins and a configuration language
> for assembling them. Software engineers and open source communities are
> really prone to this kind of thing because "we can just make it pluggable"
> ends any argument. But the actual implementation is a mess, and later
> improvements in their threading, I/O, and other core models simply couldn't
> be made across all the plugins.
>
> This blog post on configurability in UI is a really good summary of a
> similar dynamic:
> http://ometer.com/free-software-ui.html
>
> Anyhow, not to go too far off on a rant. Clearly I have plugin PTSD :-)
>
> I think instead we should explore the idea of getting rid of the ZooKeeper
> dependency and replacing it with an internal facility. Let me explain what I
> mean. In terms of API, what Kafka and ZK do is super different, but
> internally it is actually quite similar--they are both trying to maintain a
> CP log.
>
> What would actually make the system significantly simpler would be to
> reimplement the facilities you describe on top of Kafka's existing
> infrastructure--using the same log implementation, network stack, config,
> monitoring, etc. If done correctly this would dramatically lower the
> operational load of the system versus the current Kafka+ZK or proposed
> Kafka+X.
>
> I don't have a proposal for how this would work, and it's some effort to
> scope it out. The obvious thing to do would just be to keep the existing
> ISR/controller setup and rebuild the controller etc. on a Raft/Paxos impl
> using the Kafka network/log/etc., and have a replicated config database
> (maybe RocksDB) that was fed off the log and shared by all nodes.
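>
> Purely as a sketch of what I mean (every name below is made up), each node
> would tail that internal metadata log and apply committed entries to a
> local store:
>
>     import java.util.Map;
>     import java.util.concurrent.ConcurrentHashMap;
>
>     // Each broker materializes the replicated metadata log into a local
>     // key/value view. The map stands in for whatever the real backing
>     // store would be (RocksDB was one idea above).
>     class MetadataStateMachine {
>         private final Map<String, byte[]> state = new ConcurrentHashMap<>();
>
>         // Apply one committed log entry, in log order; a null value is a
>         // tombstone, i.e. a delete, mirroring compacted topics.
>         void apply(String key, byte[] value) {
>             if (value == null) {
>                 state.remove(key);
>             } else {
>                 state.put(key, value);
>             }
>         }
>
>         // Reads are served from the local materialized view.
>         byte[] read(String key) {
>             return state.get(key);
>         }
>     }
>
> Since every node replays the same committed log, every node converges on the
> same metadata view and can answer reads locally.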
>
> If done well this could have the advantage of potentially allowing us to
> scale the number of partitions quite significantly (the k/v store would not
> need to be all in memory), though you would likely still have limits on the
> number of partitions per machine. This would make the minimum Kafka cluster
> size be just your replication factor.
>
> People tend to feel that implementing things like Raft or Paxos is too hard
> for mere mortals. But I actually think it is within our capabilities, and
> our testing capabilities as well as experience with this type of thing have
> improved to the point where we should not be scared off if it is the right
> path.
>
> This approach is likely more work than plugins (though maybe not, once you
> factor in all the docs, testing, etc) but if done correctly it would be an
> unambiguous step forward--a simpler, more scalable implementation with no
> operational dependencies.
>
> Thoughts?
>
> -Jay
>
>
>
>
>
> On Tue, Dec 1, 2015 at 11:12 AM, Joe Stein <joe.st...@stealth.ly> wrote:
>
> > I would like to start a discussion around the work that has started in
> > regards to KIP-30
> >
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-30+-+Allow+for+brokers+to+have+plug-able+consensus+and+meta+data+storage+sub+systems
> >
> > The impetus for working on this came a lot from the community. For the
> > last year or more it has been the most asked question at any talk I have
> > given (personally speaking). It has come up a bit on the mailing list
> > too, in the discussion of zkclient vs. Curator. A lot of folks want to
> > use Kafka, but introducing dependencies is hard for the enterprise, so
> > the goal behind this is to make running Kafka as easy as possible for
> > operations teams. If they are already supporting ZooKeeper they can keep
> > doing that, but if not, they (users) want to use something else they are
> > already supporting that can plug in to do the same things.
> >
> > For the core project I think we should leave what we have in upstream.
> > This gives a great baseline regression for folks and makes "making what
> > we have pluggable" a well-defined task (carve out, layer in the API impl,
> > push back with tests passing). From there, when folks want their
> > implementation to be something besides ZooKeeper, they can develop, test,
> > and support that if they choose.
> >
> > We would like to suggest that the plugin interface be Java-based, to
> > minimize dependencies for JVM implementations. This could live in another
> > directory, something TBD, /<name>.
> >
> > If you have a server you want to try to get working but you aren't on
> > the JVM, don't be afraid: think about a REST impl, and if you can work
> > inside of that you have some light RPC layers (this was the first-pass
> > prototype we did to flesh out the public API presented on the KIP).
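> >
> > As a rough sketch of what such a light RPC layer could look like (these
> > endpoints and class names are made up for illustration, not the actual
> > prototype), a JVM-side plugin could simply forward each call over HTTP to
> > whatever server you run:
> >
> >     import java.net.URI;
> >     import java.net.http.HttpClient;
> >     import java.net.http.HttpRequest;
> >     import java.net.http.HttpResponse;
> >
> >     // Hypothetical REST-backed metadata client: the broker side stays
> >     // Java, while the actual consensus/metadata service can be anything
> >     // that speaks HTTP.
> >     public class RestMetadataClient {
> >         private final HttpClient http = HttpClient.newHttpClient();
> >         private final String base; // e.g. "http://localhost:8080" (illustrative)
> >
> >         public RestMetadataClient(String base) {
> >             this.base = base;
> >         }
> >
> >         public byte[] get(String path) throws Exception {
> >             HttpRequest req = HttpRequest.newBuilder(
> >                     URI.create(base + "/metadata" + path)).GET().build();
> >             return http.send(req, HttpResponse.BodyHandlers.ofByteArray()).body();
> >         }
> >
> >         public void set(String path, byte[] value) throws Exception {
> >             HttpRequest req = HttpRequest.newBuilder(
> >                     URI.create(base + "/metadata" + path))
> >                     .PUT(HttpRequest.BodyPublishers.ofByteArray(value))
> >                     .build();
> >             http.send(req, HttpResponse.BodyHandlers.discarding());
> >         }
> >     }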
> >
> > There are a lot of parts to working on this, and the more implementations
> > we have, the better we can flesh out the public interface. I will leave
> > the technical details and design to the JIRA tickets linked from the
> > Confluence page as these decisions come about and code starts going up
> > for review. We can target the specific modules there; having the context
> > separate is helpful, especially if multiple folks are working on it.
> > https://issues.apache.org/jira/browse/KAFKA-2916
> >
> > Do other folks want to build implementations? Maybe we should start a
> > Confluence page for those, or use an existing one and add to it, so we
> > can coordinate some there too.
> >
> > Thanks!
> >
> > ~ Joe Stein
> > - - - - - - - - - - - - - - - - - - -
> >   http://www.elodina.net
> >     http://www.stealth.ly
> > - - - - - - - - - - - - - - - - - - -
> >
>
