Changing moderation settings

2013-12-12 Thread Nathan Marz
How can I change the moderation settings for the Storm user and dev lists?
I'm getting enormous amounts of moderation emails (including lots triggered
by JIRA). Is there a way to whitelist accounts, turn off moderation, and/or
approve in bulk (like via a web interface)?


Re: [PROPOSAL] Storm for Apache Incubator

2013-09-08 Thread Nathan Marz
(Sorry for slow response, been traveling in Asia)

Thanks for volunteering, guys! Added you both as mentors on the proposal.

Since there are no more pending concerns, is the next step to do a vote?


On Sep 5, 2013, at 5:42 AM, Benjamin Hindman benjamin.hind...@gmail.com wrote:

 Proposal looks great Nathan, I'd like to volunteer as a mentor if you're
 looking for more help!
 
 

Write access to wiki

2013-09-04 Thread Nathan Marz
May I have write access to the incubator wiki (username NathanMarz) so that
I can add a proposal for Storm?

Thanks,
Nathan


[PROPOSAL] Storm for Apache Incubator

2013-09-04 Thread Nathan Marz
Hi everyone,

I'd like to propose Storm to be an Apache Incubator project. After much
thought I believe this is the right next step for the project, and I look
forward to hearing everyone's thoughts and feedback!

Here's a link to the proposal:
https://wiki.apache.org/incubator/StormProposal

The proposal is also pasted below.

-Nathan


= Storm Proposal =

== Abstract ==

Storm is a distributed, fault-tolerant, and high-performance realtime
computation system that provides strong guarantees on the processing of
data.

== Proposal ==

Storm is a distributed real-time computation system. Similar to how Hadoop
provides a set of general primitives for doing batch processing, Storm
provides a set of general primitives for doing real-time computation. Its
use cases span stream processing, distributed RPC, continuous computation,
and more. Storm has become a preferred technology for near-realtime
big-data processing by many organizations worldwide (see a partial list at
https://github.com/nathanmarz/storm/wiki/Powered-By). As an open source
project, Storm’s developer community has grown rapidly to 46 members.

== Background ==

The past decade has seen a revolution in data processing. MapReduce,
Hadoop, and related technologies have made it possible to store and process
data at scales previously unthinkable. Unfortunately, these data processing
technologies are not realtime systems, nor are they meant to be. The lack
of a "Hadoop of realtime" has become the biggest hole in the data
processing ecosystem. Storm fills that hole.

Storm was initially developed and deployed at BackType in 2011. After 7
months of development, BackType was acquired by Twitter in July 2011. Storm
was open sourced in September 2011.

Storm has been under continuous development on its GitHub repository since
being open-sourced. It has undergone four major releases (0.5, 0.6, 0.7,
0.8) and many minor ones.

== Rationale ==

Storm is a general platform for low-latency big-data processing. It is
complementary to existing Apache projects, such as Hadoop. Many
applications are exploring using both Hadoop and Storm for
big-data processing. Bringing Storm into Apache is very beneficial to both
the Apache community and the Storm community.

The rapid growth of the Storm community is empowered by open source. We believe
the Apache foundation is a great fit as the long-term home for Storm, as it
provides an established process for community-driven development and
decision making by consensus. This is exactly the model we want for future
Storm development.

== Initial Goals ==

  * Move the existing codebase to Apache
  * Integrate with the Apache development process
  * Ensure all dependencies are compliant with Apache License version 2.0
  * Incremental development and releases per Apache guidelines

== Current Status ==

Storm has undergone four major releases (0.5, 0.6, 0.7, 0.8) and many minor
ones. Storm 0.9 is about to be released. Storm is being used in production
by over 50 organizations. The Storm codebase is currently hosted at github.com,
which will seed the Apache git repository.

=== Meritocracy ===

We plan to invest in supporting a meritocracy. We will discuss the
requirements in an open forum. Several companies have already expressed
interest in this project, and we intend to invite additional developers to
participate. We will encourage and monitor community participation so that
privileges can be extended to those that contribute.

=== Community ===

The need for a low-latency big-data processing platform in the open source
world is tremendous. Storm is currently being used by at least 50 organizations
worldwide (see https://github.com/nathanmarz/storm/wiki/Powered-By), and is
the most starred Java project on GitHub. By bringing Storm into Apache, we
believe that the community will grow even bigger.

=== Core Developers ===

Storm was started by Nathan Marz at BackType, and now has developers from
Yahoo!, Microsoft, Alibaba, Infochimps, and many other companies.

=== Alignment ===

In the big-data processing ecosystem, Storm is a very popular low-latency
platform, while Hadoop is the primary platform for batch processing. We
believe that having Hadoop and Storm aligned within the Apache foundation
will help the further growth of the big-data community. The alignment is
also beneficial to other Apache communities (such as ZooKeeper, Thrift, and
Mesos). We could include additional sub-projects, Storm-on-YARN and
Storm-on-Mesos, in the near future.

== Known Risks ==

=== Orphaned Products ===

The risk of the Storm project being abandoned is minimal. At
least 50 organizations (Twitter, Yahoo!, Microsoft, Groupon, Baidu,
Alibaba, Alipay, Taobao, PARC, RocketFuel, etc.) are highly incentivized to
continue development. Many of these organizations have built critical
business applications upon Storm, and have made significant internal
infrastructure investments in Storm.

=== Inexperience with Open Source ===

Storm has existed as a healthy

Re: [PROPOSAL] Storm for Apache Incubator

2013-09-04 Thread Nathan Marz
That's how many people have contributed code (even just one small patch). The 
people on the committer list have all made significant and high quality 
contributions.

On Sep 4, 2013, at 3:28 AM, Reto Bachmann-Gmür r...@wymiwyg.com wrote:

 +1 (non-binding)
 Looking very good. Just wondering why there are only 7 initial committers
 when you say that the Storm developer community has 46 members.
 
 Cheers.
 Reto
 
 
 On Wed, Sep 4, 2013 at 11:44 AM, Srinath Perera srin...@wso2.com wrote:
 
 +1, looks good.
 
 --Srinath
 
 

Re: [PROPOSAL] Storm for Apache Incubator

2013-09-04 Thread Nathan Marz
We definitely need a storm-user list as the existing google groups mailing
list for Storm is quite active. So we'll need to transition that over. I
agree on adding a storm-commits list and added it to the proposal.


On Wed, Sep 4, 2013 at 11:50 AM, Henry Saputra henry.sapu...@gmail.com wrote:

 Excited about Storm coming to Apache. Small comment about the mailing lists;
 you may want to propose having:
 * storm-dev
 * storm-commits
 * storm-private (with moderated subscriptions)

 instead when starting in the incubator.

 However, Storm has been a well-known open source project, so maybe it is
 valid to have storm-user from the beginning. But I think you may need a
 storm-commits list to separate commit logs from dev discussions.
 Mentors can chime in about this.

 Thanks,

 Henry




Re: [PROPOSAL] Storm for Apache Incubator

2013-09-04 Thread Nathan Marz
I think that storm-kafka would make sense as a contrib module since it's widely 
used. I'm not sure what to do with the other storm-contrib modules. I figure 
the less code that's part of the initial repo the better, because there will be
fewer contribution/legal issues to sort out. How about this - we plan to include
storm-kafka under a contrib folder of the Apache Storm project (just because a 
lot of people depend on it), and we can pull other storm-contrib modules in if 
community members show initiative in working on and maintaining them?

If that all sounds good I'll update the proposal accordingly.


On Sep 4, 2013, at 6:41 PM, Joe Stein crypt...@gmail.com wrote:

 What does this mean for storm-contrib (
 https://github.com/nathanmarz/storm-contrib)? (spouts & bolts) E.g., for the
 Apache Kafka spout it is already hard to know which to use and which is
 best for 0.7.X and 0.8.X-betaX...  Is the Apache Storm project going to
 help corral that, or is it only for Storm core, as the proposal implies with
 only the Storm code base https://github.com/nathanmarz/storm being part of
 the project?
 
 A lot of traffic on the existing user list is about spouts (e.g. the Kafka
 Spout), and I was not sure if that would still be talked about or funneled
 somewhere else, or what the thoughts/plans were for the parts built within
 Storm that exist now?
 
 /***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
 /
 
 