+ some people explicitly

On 10 June 2016 at 12:42, Sravya Tirukkovalur <sra...@apache.org> wrote:
> Excited to see DistributedLog come to ASF!
>
> I see that you already have a good list of nominated mentors. As a member of
> a recently graduated project, I can offer mentorship (informal) as well if
> needed. I am not an IPMC member, so I guess I cannot be a formal mentor.
>
> Regards,
>
> On Wed, Jun 8, 2016 at 9:34 PM, Sijie Guo <si...@apache.org> wrote:
>
>> Hi,
>>
>> I would like to propose DistributedLog to be an Apache Incubator project.
>>
>> DistributedLog is a high-performance replicated log service.
>> It offers durability, replication and strong consistency, which provides
>> a fundamental building block for building reliable distributed systems,
>> e.g. replicated state machines, general pub/sub systems, distributed
>> databases and distributed queues.
>>
>> Here's a link to the proposal in the Incubator wiki:
>>
>> https://wiki.apache.org/incubator/DistributedLogProposal
>>
>> I've also pasted the initial contents below.
>>
>> Thanks,
>>
>> Sijie
>>
>> = Abstract =
>> DistributedLog is a high-performance replicated log service. It offers
>> durability, replication and strong consistency, which provides a
>> fundamental building block for building reliable distributed systems,
>> e.g. replicated state machines, general pub/sub systems, distributed
>> databases and distributed queues.
>>
>> See “Building Distributedlog - Twitter’s high performance replicated
>> log service” for details:
>>
>> https://blog.twitter.com/2015/building-distributedlog-twitter-s-high-performance-replicated-log-service
>>
>> = Proposal =
>> We propose to contribute the DistributedLog codebase and associated
>> artifacts (e.g. documentation, web-site content etc.) to the Apache
>> Software Foundation with the intent of forming a productive,
>> meritocratic and open community around DistributedLog’s continued
>> development, according to the ‘Apache Way’.
>>
>> = Background =
>> Engineers at Twitter began developing DistributedLog in early 2013.
>> DistributedLog is described in a Twitter engineering blog post and was
>> presented at the Messaging Meetup in Sep 2015. It was released as
>> an Apache-licensed open-source project on GitHub in May 2016.
>>
>> DistributedLog is a high-performance replicated log service, which
>> provides simple stream-oriented abstractions over log segments and
>> offers durability, replication and strong consistency for building
>> reliable distributed systems. The features offered by DistributedLog
>> include:
>> * Simple high-level, stream-oriented interface
>> * Naming and metadata scheme for managing streams and other entities
>> * Log data management policies, including data segmentation and data
>> retention
>> * Fast write pipeline leveraging batching and compression
>> * Fast read mechanism leveraging long-poll and read-ahead caching
>> * Service tiers supporting writer fan-in and reader fan-out
>> * Geo-replicated logs
>>
>> DistributedLog’s most important benefit is high performance with a
>> strong durability guarantee, making it well suited to
>> running different workloads, from distributed database journaling to
>> real-time stream computing. Its modern, layered architecture makes it
>> easy to run the service tiers in multi-tenant datacenter environments
>> such as Apache Mesos or cloud environments such as EC2.
>>
>> = Rationale =
>> DistributedLog is designed to provide core features like
>> high performance, durability and strong consistency to anyone who is
>> building reliable distributed systems, in a simple and efficient way.
>>
>> We believe that the ASF is the right venue to foster an open-source
>> community around DistributedLog’s development.
>> We expect that DistributedLog will benefit from collaboration with related
>> Apache projects, and under the auspices of the ASF will attract talented
>> contributors who will push DistributedLog’s development forward at a
>> faster pace.
>>
>> We believe that the timing is right for DistributedLog’s development
>> to move to the ASF: DistributedLog has already run in production at
>> Twitter for 3 years and served various workloads including a
>> distributed database journal, reliable cross-datacenter replication,
>> search ingestion, and general pub/sub messaging. The project is stable.
>> We are excited to see where an ASF-based community can take
>> DistributedLog.
>>
>> = Current Status =
>> DistributedLog is a stable project that has been used in production at
>> Twitter for 3 years. The source code is public at github.com/twitter,
>> which will seed the Apache git repository.
>>
>> = Meritocracy =
>> We understand the central importance of meritocracy to the Apache Way.
>> We will work to establish a welcoming, fair and meritocratic
>> community. Several companies have already expressed interest in this
>> project, and we intend to invite additional developers to participate.
>> We look forward to growing a rich user and developer community.
>>
>> = Community =
>> There is a large need for a performant replicated log service for
>> applications such as distributed databases, distributed transactional
>> systems, replicated state machines and pub/sub messaging/queuing. We
>> want to attract more developers to the project, and we believe that
>> the ASF’s open and meritocratic philosophy will help us with this. We
>> note the success of other similar projects already part of the ASF,
>> like Kafka.
>>
>> = Core Developers =
>> DistributedLog is actively developed within Twitter. Most of the
>> developers are from Twitter. Many of them are committers or PMC
>> members of Apache BookKeeper.
>> Others aren’t currently affiliated with the ASF, so they will require
>> new ICLAs.
>>
>> = Alignment =
>> DistributedLog is related to several other Apache projects:
>> * DistributedLog stores log segments as Ledgers in Apache BookKeeper.
>> * DistributedLog uses Apache ZooKeeper for naming and metadata
>> management and tracking the ownership of logs.
>> * DistributedLog uses Apache Thrift as its RPC and serialization
>> framework.
>> * In the long term, DistributedLog’s data will be stored in Apache
>> Hadoop clusters powered by the HDFS filesystem for archives and backup.
>>
>> = Known Risks =
>>
>> == Orphaned Products ==
>> DistributedLog is used as the fundamental messaging infrastructure at
>> Twitter. It has been serving production traffic for online database
>> systems, search ingestion and a general pub/sub system. Twitter
>> remains committed to developing and supporting the project. Twitter
>> has a strong track record in standing behind projects that were
>> contributed to the ASF by its employees, including Apache Mesos,
>> Apache Aurora, Apache BookKeeper and Apache Hadoop. Many
>> companies are interested in using it in production.
>>
>> == Inexperience with Open Source ==
>> The core developers of DistributedLog are committers of Apache
>> BookKeeper. Although other committers on the initial list are not
>> committers or have less experience with the ASF, they are already
>> active in the Apache BookKeeper community. We are confident that the
>> project can be run in accordance with Apache principles on an ongoing
>> basis.
>>
>> == Homogeneous Developers ==
>> The initial committers are from Twitter. We hope to encourage
>> contributions from other developers and grow them into committers
>> after they have had time to continue their contributions.
>>
>> == Reliance on Salaried Developers ==
>> Many of DistributedLog’s initial set of committers work full-time on
>> DistributedLog, and are paid to do so.
>> However, as mentioned elsewhere, we anticipate growth in the developer
>> community which we hope will include people from industry, hobbyists,
>> and academics who have an interest in distributed messaging systems.
>>
>> == Relationships with Other Apache Products ==
>> DistributedLog uses Apache BookKeeper to store log segments and Apache
>> ZooKeeper to store log metadata and manage log namespaces. It provides
>> an end-to-end solution for replicated logs, to make building reliable
>> distributed systems much easier. Unlike Kafka or ActiveMQ,
>> DistributedLog is not a full-fledged pub/sub, queuing or messaging
>> system. Instead, it aims to provide a fundamental building
>> block for other distributed systems, offering durability, replication
>> and consistency. It could therefore be used by other distributed systems,
>> for example as a transaction log for replicated state machines (e.g. HDFS
>> NameNode), a WAL for distributed databases (e.g. HBase), a journal for
>> in-memory services (e.g. Kestrel) and even a storage backend for a
>> full-fledged messaging system.
>>
>> == An Excessive Fascination with the Apache Brand ==
>> DistributedLog builds on two existing top-level projects, Apache
>> BookKeeper and Apache ZooKeeper. Some of the core developers actively
>> participate in both projects and understand well the implications of
>> being hosted by Apache. We would like this project to build on the
>> same core values of the ASF and to grow a community based on meritocracy.
>> Also, there are several other projects already hosted by the ASF in this
>> space of reliable messaging that overlap with DistributedLog in
>> interests and scope. Consequently, the combination of all these
>> observations makes us believe that DistributedLog should be hosted by
>> the ASF.
>>
>> = Documentation =
>> Building DistributedLog: Twitter’s high performance replicated log
>> service (
>> https://blog.twitter.com/2015/building-distributedlog-twitter-s-high-performance-replicated-log-service
>> )
>>
>> Documentation is located at http://distributedlog.io.
>>
>> = Initial Source =
>> DistributedLog’s initial source contribution will come from
>> http://github.com/twitter/distributedlog/.
>>
>> = External Dependencies =
>> DistributedLog depends upon a number of third-party libraries, which
>> we list below.
>> * Apache BookKeeper (Apache Software License v2.0)
>> * Apache Commons (Apache Software License v2.0)
>> * Apache Maven (Apache Software License v2.0)
>> * Apache Thrift (Apache Software License v2.0)
>> * Apache ZooKeeper (Apache Software License v2.0)
>> * Google Guava (Apache Software License v2.0)
>> * Mockito (MIT License)
>> * JUnit (Eclipse Public License 1.0)
>> * LZ4-java (Apache Software License v2.0)
>> * SLF4J (MIT License)
>> * Twitter Finagle (Apache Software License v2.0)
>> * Twitter Scrooge (Apache Software License v2.0)
>> * Twitter Util (Apache Software License v2.0)
>>
>> = Required Resources =
>> We request that the following resources be created for the project to use:
>>
>> == Mailing lists ==
>> * priv...@distributedlog.incubator.apache.org (moderated subscriptions)
>> * comm...@distributedlog.incubator.apache.org
>> * d...@distributedlog.incubator.apache.org
>> * u...@distributedlog.incubator.apache.org
>>
>> == Git repository ==
>> https://git.apache.org/distributedlog.git
>>
>> == JIRA instance ==
>> JIRA project DLOG (DLOG or DL)
>>
>> = Initial Committers =
>> * Sijie Guo (Apache BookKeeper Committer, Twitter)
>> * Robin Dhamankar (Apache BookKeeper Committer)
>> * Leigh Stewart (Twitter)
>> * Dave Rusek (Twitter)
>> * Honggang Zhang (Twitter)
>> * Jordan Bull (Twitter)
>> * Satish Kotha (Twitter)
>> * Aniruddha Laud
>> * Franck Cuny (Twitter)
>> * Eitan Adler (Twitter)
>>
>> == Affiliations ==
>>
>> Most of the
>> initial committers are employees of Twitter, except Robin
>> Dhamankar and Aniruddha Laud.
>>
>> = Sponsors =
>>
>> == Champion ==
>>
>> Flavio Junqueira
>>
>> == Nominated Mentors ==
>>
>> * Flavio Junqueira
>> * Chris Nauroth
>> * Henry Saputra
>>
>> = Sponsoring Entity =
>>
>> We ask the Apache Incubator PMC to sponsor this proposal.
>>
--
Eitan Adler