Re: [DISCUSS] DistributedLog Incubation Proposal

2016-06-16 Thread Sijie Guo
Any feedback here, folks? If not, I'd like to start a voting thread soon.

- Sijie


Re: [DISCUSS] DistributedLog Incubation Proposal

2016-06-11 Thread Henry Saputra
Sravya,

Thank you for the interest and willingness to help.
We are definitely looking for more contributors as the project comes to the
ASF.

Being a mentor is not a requirement for helping a podling, as you probably
already know.

Looking forward to seeing you in the community :)

- Henry

On Saturday, June 11, 2016, Sravya Tirukkovalur  wrote:

> @Sijie: As I am not an IPMC member, I am not eligible to be a mentor. I am
> figuring out if I can still contribute in some way informally. Will keep
> you posted. So no, I do not think you should add me to the proposal.
>
> Thanks for your interest though!

Re: [DISCUSS] DistributedLog Incubation Proposal

2016-06-11 Thread Sravya Tirukkovalur
@Sijie: As I am not an IPMC member, I am not eligible to be a mentor. I am
figuring out if I can still contribute in some way informally. Will keep
you posted. So no, I do not think you should add me to the proposal.

Thanks for your interest though!

On Sat, Jun 11, 2016 at 12:38 PM, Sijie Guo  wrote:

>
>
> Thanks Eitan for adding me.
>
> Sravya, cool! I am glad that you are interested in mentoring this project.
> Shall I add you to the proposal?
>
> Sijie

Re: [DISCUSS] DistributedLog Incubation Proposal

2016-06-11 Thread Sijie Guo
Thanks Eitan for adding me.

Sravya, cool! I am glad that you are interested in mentoring this project.
Shall I add you to the proposal?

Sijie

On Saturday, June 11, 2016, Eitan Adler  wrote:

> + some people explicitly
>
> On 10 June 2016 at 12:42, Sravya Tirukkovalur wrote:
> > Excited to see DistributedLog come to ASF!
> >
> > I see that you already have a good list of nominated mentors. As a
> > member of a recently graduated project, I can offer (informal)
> > mentorship as well if needed. I am not an IPMC member, so I guess I
> > cannot be a formal mentor.
> >
> > Regards,

Re: [DISCUSS] DistributedLog Incubation Proposal

2016-06-11 Thread Eitan Adler
+ some people explicitly

On 10 June 2016 at 12:42, Sravya Tirukkovalur  wrote:
> Excited to see DistributedLog come to ASF!
>
> I see that you already have a good list of nominated mentors. As a member
> of a recently graduated project, I can offer (informal) mentorship as well
> if needed. I am not an IPMC member, so I guess I cannot be a formal mentor.
>
> Regards,

Re: [DISCUSS] DistributedLog Incubation Proposal

2016-06-10 Thread Sravya Tirukkovalur
Excited to see DistributedLog come to ASF!

I see that you already have a good list of nominated mentors. As a member
of a recently graduated project, I can offer (informal) mentorship as well
if needed. I am not an IPMC member, so I guess I cannot be a formal mentor.

Regards,


[DISCUSS] DistributedLog Incubation Proposal

2016-06-08 Thread Sijie Guo
Hi,

I would like to propose DistributedLog to be an Apache Incubator project.

DistributedLog is a high-performance replicated log service.
It offers durability, replication and strong consistency, providing
a fundamental building block for reliable distributed systems,
e.g. replicated state machines, general pub/sub systems, distributed
databases and distributed queues.

Here's a link to the proposal in the Incubator wiki:

https://wiki.apache.org/incubator/DistributedLogProposal

I've also pasted the initial contents below.

Thanks,

Sijie

= Abstract =
DistributedLog is a high-performance replicated log service. It offers
durability, replication and strong consistency, providing a
fundamental building block for reliable distributed systems,
e.g. replicated state machines, general pub/sub systems, distributed
databases and distributed queues.

See “Building Distributedlog - Twitter’s high performance replicated
log service” for details:
https://blog.twitter.com/2015/building-distributedlog-twitter-s-high-performance-replicated-log-service

= Proposal =
We propose to contribute the DistributedLog codebase and associated
artifacts (e.g. documentation, website content, etc.) to the Apache
Software Foundation with the intent of forming a productive,
meritocratic and open community around DistributedLog’s continued
development, according to the ‘Apache Way’.

= Background =
Engineers at Twitter began developing DistributedLog in early 2013.
DistributedLog is described in a Twitter engineering blog post and was
presented at the Messaging Meetup in Sep 2015. It was released as
an Apache-licensed open-source project on GitHub in May 2016.

DistributedLog is a high-performance replicated log service that
provides simple stream-oriented abstractions over log segments and
offers durability, replication and strong consistency for building
reliable distributed systems. The features offered by DistributedLog
include:
 * Simple, high-level, stream-oriented interface
 * Naming and metadata scheme for managing streams and other entities
 * Log data management policies, including data segmentation and data retention
 * Fast write pipeline leveraging batching and compression
 * Fast read mechanism leveraging long-poll and read-ahead caching
 * Service tiers supporting writer fan-in and reader fan-out
 * Geo-replicated logs

DistributedLog’s most important benefit is high performance with a
strong durability guarantee, making it well suited to workloads
ranging from distributed database journaling to real-time stream
computing. Its modern, layered architecture makes it easy to run the
service tiers in multi-tenant datacenter environments such as Apache
Mesos or cloud environments such as EC2.

= Rationale =
DistributedLog is designed to provide fundamental features such as
high performance, durability and strong consistency to anyone who is
building reliable distributed systems, in a simple and efficient way.

We believe that the ASF is the right venue to foster an open-source
community around DistributedLog’s development. We expect that
DistributedLog will benefit from collaboration with related Apache
projects, and under the auspices of the ASF will attract talented
contributors who will push DistributedLog’s development forward at a
faster pace.

We believe that the timing is right for DistributedLog’s development
to move to the ASF: DistributedLog has already run in production at
Twitter for 3 years and served various workloads including a
distributed database journal, reliable cross-datacenter replication,
search ingestion, and general pub/sub messaging. The project is stable.
We are excited to see where an ASF-based community can take
DistributedLog.

= Current Status =
DistributedLog is a stable project that has been used in production at
Twitter for 3 years. The source code is public at github.com/twitter,
which will seed the Apache git repository.

= Meritocracy =
We understand the central importance of meritocracy to the Apache Way.
We will work to establish a welcoming, fair and meritocratic
community. Several companies have already expressed interest in this
project, and we intend to invite additional developers to participate.
We look forward to growing a rich user and developer community.

= Community =
There is a strong need for a performant replicated log service for
applications such as distributed databases, distributed transactional
systems, replicated state machines and pub/sub messaging/queuing. We
want to attract more developers to the project, and we believe that
the ASF’s open and meritocratic philosophy will help us with this. We
note the success of other similar projects that are already part of the
ASF, such as Kafka.

= Core Developers =
DistributedLog is actively developed within Twitter. Most of the
developers are from Twitter. Many of them are committers or PMC
members of Apache BookKeeper. Others aren’t currently affiliated with
ASF so they will requi