Re: Heron Grant Status & Monthly Reporting

2017-09-14 Thread Bill Graham
I was under the impression that an SGA was required, hence we've pursued
that route. If we could instead just do individual ICLAs, that seems like a
much easier route. +1 for that.

Bill

On Thu, Sep 14, 2017 at 7:18 PM P. Taylor Goetz  wrote:

> +1 for ICLAs. That would make things easier, and I imagine (hope) many
> have already been filed.
>
> -Taylor
>
> > On Sep 14, 2017, at 9:09 PM, John D. Ament wrote:
> >
> > All,
> >
> > Changing the subject to avoid any confusion and also looping in the Heron
> > team to this question/concern.
> >
> > Personally, I'd like to understand why Heron is inclined to do an SGA
> > rather than ICLAs.  Typically, an SGA would be used if the source code
> > isn't already Apache Licensed and if we don't receive ICLAs from all
> > contributors.
> >
> > Based on [1] there have been 69 contributors to Heron (plus some
> > ambiguity since source code is being dropped in from other sources).  An
> > SGA would only work if Twitter has full custody of the source code.  We
> > have to be very careful that such an SGA doesn't list too much within it,
> > or carry a broad disclaimer on it (based on lessons learned from the
> > NetBeans SGA).
> >
> > With this in mind, I'd actually formally recommend against an SGA for
> > Heron, and instead import the code as is and sort through the relevant
> > licensing issues here at Apache.
> >
> > Further, in the below thread, there's a recommendation that the podling
> > be left at monthly until repositories are brought over.  I have no
> > opinion on the matter, and defer to those recommending the action.  Does
> > the podling have any concerns staying monthly?
> >
> > John
> >
> > [1]: https://github.com/twitter/heron/graphs/contributors
> >
> >> On Wed, Sep 13, 2017 at 5:09 PM Fu Maosong  wrote:
> >>
> >> For Heron, Twitter Legal is working on the SGA. We emailed them last
> >> week to ask about progress, but they haven't replied yet.
> >>
> >> 2017-09-12 17:24 GMT-07:00 John D. Ament :
> >>
> >>> On Tue, Sep 12, 2017 at 6:02 PM Dave Fisher wrote:
> >>>
> >>>> Pulsar is now signed off and also shepherd notes done.
> >>>>
> >>>> Heron is a little slow moving over. Should we keep them monthly until
> >>>> the repos are moved?
> >>>
> >>> Any thoughts on why Heron may be slow moving?  I'd like to understand
> >>> the problem(s) first before condemning them to monthly reporting.
> >>>
> >>>> Regards,
> >>>> Dave
> >>>>
> >>>>> On Sep 12, 2017, at 2:40 PM, Dave Fisher wrote:
> >
> > I'll be signing Pulsar shortly.
> >
> > Sent from my iPhone
> >
> >> On Sep 12, 2017, at 2:38 PM, John D. Ament wrote:
> >>
> >> All,
> >>
> >> Below please find the revised report.  I'm extremely happy to see two of
> >> the missing podlings reported.  Hopefully we can get the remaining sign
> >> offs in place (Griffin & Pulsar).  Hopefully Spot can file a report as
> >> well.
> >>
> >> Incubator PMC report for September 2017
> >>
> >> The Apache Incubator is the entry path into the ASF for projects and
> >> codebases wishing to become part of the Foundation's efforts.
> >>
> >> There are currently 54 podlings incubating.  We executed nine podling
> >> releases and have one podling planning to graduate this month.  No
> >> changes to the PMC structure this month.
> >>
> >> * Community
> >>
> >> New IPMC members:
> >>
> >> - None
> >>
> >> People who left the IPMC:
> >>
> >> - None
> >>
> >> * New Podlings
> >>
> >> - Amaterasu
> >> - Daffodil
> >>
> >> * Podlings that Retired
> >>
> >> - MRQL
> >>
> >> * Podlings that failed to report, expected next month
> >>
> >> - Myriad
> >> - Spot
> >>
> >> * Reports Missing Sign off
> >>
> >> - Griffin
> >> - Pulsar
> >>
> >> * Graduations
> >>
> >> The board has motions for the following:
> >>
> >> - RocketMQ
> >> - Your podling here?
> >>
> >> * Releases
> >>
> >> The following releases entered distribution during the month of August:
> >>
> >> - 2017-08-01 Apache Juneau 6.3.1
> >> - 2017-08-01 Apache Tamaya 0.3.0
> >> - 2017-08-05 Apache Tamaya Extensions 0.3.0
> >> - 2017-08-08 Apache Pulsar 1.19.0
> >> - 2017-08-16 Apache HTrace 4.3.0
> >> - 2017-08-16 Apache Spot 1.0
> >> - 2017-08-17 Apache Fluo 1.0.0
> >> - 2017-08-23 Apache S2Graph 0.2.0
> >> - 2017-08-29 Apache Livy 0.4.0
> >>
> >> * IP Clearance
> >>
> >>
> >>
> >> * Legal / Trademarks
> >>
> >>
> >>
> >> * Infrastructure
> >>
> >>
> >>
> >> * Miscellaneous
> >>
> >>
> >>
> >> * Credits
> >>
> >> 
> >>> --
> >> Table of Cont

Re: Does Heron really need a users@ mailing list?

2017-07-31 Thread Bill Graham
Heron is an existing open source project with an active users list hosted
on Google Groups. Once we move the code to Apache, we plan to move user
discussions to Apache lists.

On Mon, Jul 31, 2017 at 4:57 PM, sebb  wrote:

> Podlings normally don't need a users@ list as the focus should be on
> building developer community.
>
> Is it really needed for Heron?
>
> -
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
>
>


[RESULT][VOTE] Heron to enter Apache Incubator

2017-06-23 Thread Bill Graham
Dear Incubator members,

This vote has passed with the following result:

24 [+1] votes (8 binding, 16 non-binding)

Binding Votes:


Julien Le Dem
Raphael Bircher
Jake Farrell
Jacques Nadeau
Julian Hyde
Chris Douglas
John Ament
P. Taylor Goetz

Non-binding Votes:


Debo Dutta
William Markito Oliveira
Roy Lenferink
Sijie Guo
Sanjeev Kulkarni
Chris Aniszczyk
Supun Kamburugamuve
Jia Zhai
Karthik Ramasamy
Ashvin A
Ashish
Nabarun Nag
Pierre Smits
Byung-Gon Chun
Bill Graham
Von Gosling

Vote thread:
https://lists.apache.org/thread.html/767b502a7cb698cc509c5f79835db61effc89cb80baec46a52359554@%3Cgeneral.incubator.apache.org%3E

I will start the process for getting the SGA submitted.

thanks,
Bill


Re: [VOTE] Heron to enter Apache Incubator

2017-06-23 Thread Bill Graham
This vote is now closed and it has passed. Thanks to all who
participated in the proposal review and vote.

The vote tally is as follows:

24 [+1] votes (8 binding, 16 non-binding)

Binding Votes:


Julien Le Dem
Raphael Bircher
Jake Farrell
Jacques Nadeau
Julian Hyde
Chris Douglas
John Ament
P. Taylor Goetz

Non-binding Votes:


Debo Dutta
William Markito Oliveira
Roy Lenferink
Sijie Guo
Sanjeev Kulkarni
Chris Aniszczyk
Supun Kamburugamuve
Jia Zhai
Karthik Ramasamy
Ashvin A
Ashish
Nabarun Nag
Pierre Smits
Byung-Gon Chun
Bill Graham
Von Gosling


On Fri, Jun 23, 2017 at 10:25 AM, Debo Dutta (dedutta) wrote:

> +1 to Ted’s comment.
>
> As a user, I would love to pick one system and reuse the storm topologies.
> Ideally pick one converged solution.
>
> +1 to the incubation since it will eventually lead to better options
> within Apache.
>
> debo
>
> On 6/23/17, 10:08 AM, "Ted Dunning" wrote:
>
> Anybody who worries about you serving as mentor needs a dose of reality.
> They can't get anybody better.
>
> On Jun 22, 2017 12:21 PM, "P. Taylor Goetz" wrote:
>
> if there are ongoing concerns from either the Storm PMC or the Heron PPMC
> about me acting as a mentor, I would be willing to step down.
>
> +1 (binding)
>
> -Taylor
>
>
>
>


Re: [VOTE] Heron to enter Apache Incubator

2017-06-23 Thread Bill Graham
Thanks John. We'll keep the originally posted vote close time then.

To answer your previous question about protocol: basically yes at the user
spec/API level used to author topologies, but not at the level of the
internal APIs and communications protocol; those are different. It's roughly
analogous to two different implementations of the JMS or Servlet specs,
where both implement the same spec, but have their own architecture,
internal protocols and additional features.
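
The analogy can be sketched in a few lines of Python. This is an illustration only: `TopologyApi`, `EngineA` and `EngineB` are hypothetical names, not Storm or Heron classes, and the strings returned say nothing about either project's real internals. The point is just that user code written against a shared spec runs unchanged on either implementation.

```python
from abc import ABC, abstractmethod
from typing import List


class TopologyApi(ABC):
    """Hypothetical user-facing spec: the only surface topology authors see."""

    @abstractmethod
    def submit(self, name: str, components: List[str]) -> str:
        """Submit a topology and return a description of how it will run."""


class EngineA(TopologyApi):
    # One implementation of the spec, with its own (made-up) execution model.
    def submit(self, name: str, components: List[str]) -> str:
        return f"{name}: {len(components)} components as threads in one worker"


class EngineB(TopologyApi):
    # A second, independent implementation of the same spec.
    def submit(self, name: str, components: List[str]) -> str:
        return f"{name}: {len(components)} components as separate processes"


def deploy(engine: TopologyApi) -> str:
    # User code depends only on the shared spec, never on engine internals.
    return engine.submit("word-count", ["spout", "split-bolt", "count-bolt"])


if __name__ == "__main__":
    print(deploy(EngineA()))
    print(deploy(EngineB()))
```

Anything below the `submit` boundary (scheduling, wire protocol, extra features) is free to differ between the two engines.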


On Fri, Jun 23, 2017 at 9:59 AM, John D. Ament wrote:

> Based on the additional comments, I'm OK with this continuing graduation.
> I would like the proposed podling to undertake a specific task to ensure
> it's clear what is different between Storm and Heron, to avoid any
> unexpected competition or user confusion.
>
> John
>
> On Fri, Jun 23, 2017 at 12:32 PM Julien Le Dem wrote:
>
> > Hi Edward,
> >
> > A better comparison is SQL. Heron provides an implementation of the Storm
> > topology API just like a query engine would implement SQL.
> > It is a testament to the Storm API that it became a reference for
> > streaming. This is the shared component and I agree that both projects
> > should collaborate around it.
> >
> > The proposal already has a statement of cooperation: *"We believe that
> > having Heron at Apache will help further the growth of the streaming
> > compute community, as well as encourage cooperation and developer cross
> > pollination with other Apache projects."*
> > Although Heron started at Twitter, it now has contributors from more
> > companies, in particular Microsoft, which has been presenting this work
> > at conferences.
> > Joining the incubator is also about growing the community. Diversity is a
> > goal but not a requirement to enter the incubator. Many successful
> > projects have started with little diversity and grown.
> > Heron is its own project, different from Storm both in the programming
> > language used and the deployment approach.
> >
> > This is not a new situation: the Apache foundation has Thrift and Avro,
> > Parquet and ORC, to name a few competing projects that address similar
> > needs with a different approach, each with their own strengths and
> > weaknesses.
> >
> > Your concerns are valid and should be addressed during incubation
> > (ensuring cross project collaboration, building more diversity, ...)
> >
> > On Fri, Jun 23, 2017 at 6:54 AM, Edward Capriolo wrote:
> >
> > > "The only overlap is that Heron supports the Storm user API for ease of
> > > migration."
> > >
> > > It sounds possible that Storm could be one user-facing API with two
> > > back ends inside one project.
> > >
> > > "Accumulo vs HBase": I do not think Accumulo and HBase is a valid
> > > comparison; one did not start out to emulate or be compatible with the
> > > other.
> > >
> > > In any case the largest issue I see is community. The proposed Heron
> > > committer list is mostly a single company. Storm has already established
> > > a community with diverse committers. Also in terms of adoption, suppose
> > > you are a Storm user: do you run Heron side by side? Suppose you're a
> > > vendor that packages Hadoop and friends: do you ship both? Suppose you
> > > provide a NoSQL database: do you manage (test, document) a connector for
> > > Heron and Storm? In my experience it is not trivial to keep something
> > > working, for example AbcBolt, across Storm versions; now that matrix
> > > would double.
> > >
> > > I wish there was a stronger statement of cooperation in the proposal,
> > > for example, "We wish to establish a middle ground repo with shared
> > > components etc". If nothing is shared other than a mentor or PMC you
> > > could run into "software X is the fastest way to run your storm bolts
> > > and spouts because of our special sauce software Y does not have" and
> > > "software X is 2.2 years behind the API of software Y; they only
> > > implement and test 10% of the spouts we support".
> > >
> > >
> > > On Fri, Jun 23, 2017 at 8:08 AM, John D. Ament wrote:
> > >
> > > > Bill,
> > > >
> > > > Would I be correct in understanding that Heron implements the same
> > > > protocol as Storm, but the actual implementation is different?
> > > >

Re: [VOTE] Heron to enter Apache Incubator

2017-06-22 Thread Bill Graham
It's grossly inaccurate to refer to Heron as a Storm fork. There are about
132k lines of code in the Heron codebase (plus 166k of codegen), of which
about 7k are to implement the Apache Storm API bindings to the Heron API.
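
As a rough check on those figures, the Storm bindings' share can be worked out directly. This sketch assumes the 7k binding lines are counted within the 132k hand-written lines and that the 166k of generated code is counted separately; the constant and function names are illustrative only.

```python
# Approximate figures quoted in this message, in thousands of lines of code.
HANDWRITTEN_KLOC = 132
CODEGEN_KLOC = 166
STORM_BINDINGS_KLOC = 7


def share(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of the hand-written code, and of the codebase including codegen.
of_handwritten = share(STORM_BINDINGS_KLOC, HANDWRITTEN_KLOC)
of_total = share(STORM_BINDINGS_KLOC, HANDWRITTEN_KLOC + CODEGEN_KLOC)

if __name__ == "__main__":
    print(f"Storm bindings: {of_handwritten:.1f}% of hand-written code")
    print(f"Storm bindings: {of_total:.1f}% of code including codegen")
```

Under those assumptions the bindings come to roughly 5% of the hand-written code and a little over 2% of the total.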

The Rationale section of the proposal discusses the Heron architecture,
which is a complete rewrite with little in common with Storm. The only
overlap is that Heron supports the Storm user API for ease of migration.

The value of having multiple projects to solve a common need is that each
can foster innovation, collaboration and exchange of ideas in different
ways. This is not a new concept to Apache. You can look at the incubator
discussions around Accumulo vs HBase (two implementations of the BigTable
paper) for example, to see how two different approaches to a shared problem
can be a good thing.

thanks,
Bill

On Thu, Jun 22, 2017 at 6:45 PM, Von Gosling  wrote:

> Hi,
>
> I will give +1 (non-binding), but
>
> I have a similar question: with so many streaming frameworks at Apache,
> how will each develop its own community?
>
>
>
>
> Best Regards,
> Von Gosling
>
>
>
> On June 23, 2017, at 08:51, Edward Capriolo wrote:
>
> I believe heron and storm should be merged back together. I do not see the
> value of storm and a storm fork in the asf.
>
> On Thursday, June 22, 2017, Bill Graham  wrote:
>
> Thanks Taylor for relaying these sentiments, especially the part about the
> Heron website which is indeed poorly worded (I suspect this could have been
> the result of internal docs being open-sourced). I've opened this pull
> request to update the language regarding Storm:
>
> https://github.com/twitter/heron/pull/1979
>
> On Thu, Jun 22, 2017 at 12:21 PM, P. Taylor Goetz wrote:
>
> The Apache Storm PMC had a discussion regarding the Heron proposal. In the
> spirit of openness I wanted to bring some of the sentiments expressed in
> that discussion back to this list. Please note that I am paraphrasing from
> that discussion and attempting to relay opinions of the collective PMC, not
> necessarily that of any individual.
>
> * There is a general disappointment that the Heron community chose not to
> engage with the Storm community and instead chose a separate path.
> * A majority of the PMC supports Heron’s incubation, though some felt it
> would result in unnecessary duplication of effort.
> * A majority of the PMC supports the two projects working closely
> together. A number of PMC members suggested the two projects merge in some
> way.
> * Many PMC members took issue with some of the marketing language on the
> Heron website, particularly Heron being billed as “the direct successor to
> Apache Storm” and the prominent “Upgrade from Storm” links.  The main
> concern here was such phrasing has somewhat of a hostile tone and
> undermines the desire for better collaboration, as well as confusing users.
>
> One of my goals as a proposed mentor for Heron and a Storm PMC member is
> to address some of these concerns and encourage collaboration. As I
> mentioned to the Storm PMC on that thread, if there are ongoing concerns
> from either the Storm PMC or the Heron PPMC about me acting as a mentor, I
> would be willing to step down.
>
> +1 (binding)
>
> -Taylor
>
> On Jun 16, 2017, at 4:41 PM, Bill Graham wrote:
>
> Hi,
>
> Based on the discussion on the incubator mailing list[1] I would like to
> call a vote to add Heron to the Apache Incubator.
>
> The full proposal is available below, and is also available on the Apache
> Incubator wiki at:
>   https://wiki.apache.org/incubator/HeronProposal
>
> Please vote:
> [ ] +1, bring Heron into Incubator
> [ ] -1, do not bring Heron into Incubator, because...
>
> The vote will be open for 7 days until Friday June 23 at 14:00 PT.
>
> Thank you
>
> 1 -
> https://lists.apache.org/thread.html/fb91f527ef479bb5df45bf2c9d93b7786c3fa6cdbfeba3128599df79@%3Cgeneral.incubator.apache.org%3E
>
>
>
>
> = Heron Proposal =
>
> = Abstract =
> Heron is a real-time, distributed, fault-tolerant stream processing engine
> initially developed by Twitter.
>
> = Proposal =
>
> Heron is a real-time stream processing engine built for high performance,
> ease of manageability, performance predictability and developer
> productivity[1]. We wish to develop a community around Heron to increase
> contributions and see Heron thrive in an open forum.
>
> = Background =
>
> Heron provides the ability for developers to compose directed acyclic
> graphs (DAGs

Re: [VOTE] Heron to enter Apache Incubator

2017-06-22 Thread Bill Graham
Thanks Taylor for relaying these sentiments, especially the part about the
Heron website which is indeed poorly worded (I suspect this could have been
the result of internal docs being open-sourced). I've opened this pull
request to update the language regarding Storm:

https://github.com/twitter/heron/pull/1979

On Thu, Jun 22, 2017 at 12:21 PM, P. Taylor Goetz  wrote:

> The Apache Storm PMC had a discussion regarding the Heron proposal. In the
> spirit of openness I wanted to bring some of the sentiments expressed in
> that discussion back to this list. Please note that I am paraphrasing from
> that discussion and attempting to relay opinions of the collective PMC, not
> necessarily that of any individual.
>
> * There is a general disappointment that the Heron community chose not to
> engage with the Storm community and instead chose a separate path.
> * A majority of the PMC supports Heron’s incubation, though some felt it
> would result in unnecessary duplication of effort.
> * A majority of the PMC supports the two projects working closely
> together. A number of PMC members suggested the two projects merge in some
> way.
> * Many PMC members took issue with some of the marketing language on the Heron
> website, particularly Heron being billed as “the direct successor to Apache
> Storm” and the prominent “Upgrade from Storm” links.  The main concern here
> was such phrasing has somewhat of a hostile tone and undermines the desire
> for better collaboration, as well as confusing users.
>
> One of my goals as a proposed mentor for Heron and a Storm PMC member is
> to address some of these concerns and encourage collaboration. As I
> mentioned to the Storm PMC on that thread, if there are ongoing concerns
> from either the Storm PMC or the Heron PPMC about me acting as a mentor, I
> would be willing to step down.
>
> +1 (binding)
>
> -Taylor
>
> > On Jun 16, 2017, at 4:41 PM, Bill Graham  wrote:
> >
> > Hi,
> >
> > Based on the discussion on the incubator mailing list[1] I would like to
> > call a vote to add Heron to the Apache Incubator.
> >
> > The full proposal is available below, and is also available on the Apache
> > Incubator wiki at:
> >    https://wiki.apache.org/incubator/HeronProposal
> >
> > Please vote:
> >  [ ] +1, bring Heron into Incubator
> >  [ ] -1, do not bring Heron into Incubator, because...
> >
> > The vote will be open for 7 days until Friday June 23 at 14:00 PT.
> >
> > Thank you
> >
> > 1 -
> > https://lists.apache.org/thread.html/fb91f527ef479bb5df45bf2c9d93b7786c3fa6cdbfeba3128599df79@%3Cgeneral.incubator.apache.org%3E
> >
> >
> >
> > = Heron Proposal =
> >
> > = Abstract =
> > Heron is a real-time, distributed, fault-tolerant stream processing
> > engine initially developed by Twitter.
> >
> > = Proposal =
> >
> > Heron is a real-time stream processing engine built for high performance,
> > ease of manageability, performance predictability and developer
> > productivity[1]. We wish to develop a community around Heron to increase
> > contributions and see Heron thrive in an open forum.
> >
> > = Background =
> >
> > Heron provides the ability for developers to compose directed acyclic
> > graphs (DAGs) of real-time query execution logic (i.e. a topology) and
> > submit the topology to execute on a pluggable job scheduling system
> > (e.g., Apache Aurora, YARN, Marathon, etc). Users can employ either the
> > native Heron API or the Apache Storm API to develop the topology. Heron
> > supports the Storm API for ease of migration, but beyond that Heron’s
> > architecture differs considerably from Storm’s.
> >
> > Users submit a topology to the scheduler using the Heron client, which
> > uses the Heron binary libraries to deploy all daemons required to run and
> > manage the topology. The topology therefore has no reliance on centrally
> > managed Heron services, only on a generic job scheduling system, which
> > lends itself well to be run on top of Apache Aurora/Mesos or Apache
> > Hadoop/YARN (among others).
> >
> > The scheduler runs each topology as a job consisting of multiple
> > containers. One of the containers runs the topology master, responsible
> > for managing the topology. The remaining containers each runs a stream
> > manager responsible for data routing, a metrics manager that collects and
> > reports various metrics and a number of processes called Heron instances
> > which run the user-defined logic on the stream of tuples. Parallelism is
> > achieved via process-base

Re: [VOTE] Heron to enter Apache Incubator

2017-06-22 Thread Bill Graham
+1 (non-binding)

On Mon, Jun 19, 2017 at 6:15 AM, Jake Farrell  wrote:

> Thanks John
> Comments inline, will ensure that your points are addressed before the
> first release candidate.
>
> -Jake
>
>
> On Sun, Jun 18, 2017 at 6:35 AM, John D. Ament wrote:
>
>> +1, however a few things to note about the proposal (and follow up will be
>> required when bringing Heron in):
>>
>> - There is no ASF 2.0 license (missed when putting together the proposal)
>>
>
> Will ensure that all licensing checkboxes are addressed before the first
> release candidate goes up for a vote.
>
>
>> - The IP section doesn't mention anything about an SGA being sent, is your
>> intention to not send an SGA?
>>
>
> An SGA is not required to be filed prior to an incubator acceptance vote, but
> it is 100% required before the codebase can be imported by infra, which the
> mentors will ensure does occur. (I've already asked the project to get this
> rolling.)
>
>
>
>> - The NOTICE for the repo indicates there is some source code from Yahoo!.
>> - The contents of
>> https://github.com/twitter/heron/tree/master/third_party seems
>> to be mostly binary files, and you'll need to clean that up for your first
>> release.
>> - Your 3rd party section mentions everything is ASF 2.0, however this
>> includes glog and similar tools that include an odd buildchain license
>> that is actually GPL; we'll need to get clearance on whether this is
>> actually compliant or not.  Some of the contents in third_party are
>> missing license headers.
>>
>>
> This is similar to other projects using a local third_party cache
> directory that have come to the Apache Incubator; Cassandra, Mesos and
> Aurora are a few that come to mind. We will ensure that this is
> addressed and that no source release contains any of these files.
>
>
>
>> John
>>
>> On Fri, Jun 16, 2017 at 4:41 PM Bill Graham  wrote:
>>
>> > Hi,
>> >
>> > Based on the discussion on the incubator mailing list[1] I would like to
>> > call a vote to add Heron to the Apache Incubator.
>> >
>> > The full proposal is available below, and is also available on the Apache
>> > Incubator wiki at:
>> > https://wiki.apache.org/incubator/HeronProposal
>> >
>> > Please vote:
>> >   [ ] +1, bring Heron into Incubator
>> >   [ ] -1, do not bring Heron into Incubator, because...
>> >
>> > The vote will be open for 7 days until Friday June 23 at 14:00 PT.
>> >
>> > Thank you
>> >
>> > 1 -
>> > https://lists.apache.org/thread.html/fb91f527ef479bb5df45bf2c9d93b7786c3fa6cdbfeba3128599df79@%3Cgeneral.incubator.apache.org%3E
>> >
>> >
>> >
>> > = Heron Proposal =
>> >
>> > = Abstract =
>> > Heron is a real-time, distributed, fault-tolerant stream processing
>> > engine initially developed by Twitter.
>> >
>> > = Proposal =
>> >
>> > Heron is a real-time stream processing engine built for high
>> > performance, ease of manageability, performance predictability and
>> > developer productivity[1]. We wish to develop a community around Heron to
>> > increase contributions and see Heron thrive in an open forum.
>> >
>> > = Background =
>> >
>> > Heron provides the ability for developers to compose directed acyclic
>> > graphs (DAGs) of real-time query execution logic (i.e. a topology) and
>> > submit the topology to execute on a pluggable job scheduling system
>> > (e.g., Apache Aurora, YARN, Marathon, etc). Users can employ either the
>> > native Heron API or the Apache Storm API to develop the topology. Heron
>> > supports the Storm API for ease of migration, but beyond that Heron’s
>> > architecture differs considerably from Storm’s.
>> >
>> > Users submit a topology to the scheduler using the Heron client, which
>> > uses the Heron binary libraries to deploy all daemons required to run and
>> > manage the topology. The topology therefore has no reliance on centrally
>> > managed Heron services, only on a generic job scheduling system, which
>> > lends itself well to be run on top of Apache Aurora/Mesos or Apache
>> > Hadoop/YARN (among others).
>> >
>> > The scheduler runs each topology as a job consisting of multiple
>> > containers. One of the containers runs the topology master, re

[VOTE] Heron to enter Apache Incubator

2017-06-16 Thread Bill Graham
stems to build the project's contributor base.

== Core Developers ==

Current core developers are engineers from Twitter, Google, Microsoft and
Streamlio.

== Alignment ==

Heron utilizes a number of Apache technologies. Heron leverages Apache
ZooKeeper for coordination and has scheduler implementations to integrate
with Apache Mesos, Apache Aurora and Apache Hadoop's YARN (via Apache REEF)
as well as spout implementations to integrate with Apache Kafka and metrics
implementations to integrate with Scribe. Heron also implements the Apache
Storm user-level API, which allows topologies written against Storm to run
in Heron. We believe that having Heron at Apache will help further the
growth of the streaming compute community, as well as encourage cooperation
and developer cross pollination with other Apache projects.

= Known Risks =

== Orphaned Products ==

The risk of the Heron project being abandoned is minimal. It is used in
production at Twitter and Google and other companies are evaluating or
adopting it for production use.

== Inexperience with Open Source ==

All of the core contributors to the project have considerable experience
with open source software development. Bill Graham[2], Ashvin Agrawal[3]
and Supun Kamburugamuve[4], committers on the project, are PMC members of other
Apache projects, and Bill and Ashvin have gone through the Apache incubator
process. Twitter has already donated numerous projects to the ASF (e.g.,
Apache Mesos, Apache Aurora, Apache Parquet). We also plan to be mentored
by experienced ASF members that can help with any roadblocks.

== Homogenous Developers ==

Initial committers come from 5 separate organizations. Our intention is to
increase the diversity of contributing developers and their affiliations.
To date GitHub contributions have come from approximately 50 contributors
from outside the Twitter team.

== Reliance on Salaried Developers ==

It is expected that Heron development will occur on both salaried time and
on volunteer time. The majority of initial committers are paid by their
employers to contribute to this project. We are committed to recruiting
additional committers from other organizations as well as non-salaried
committers to join the project.

== Relationships with Other Apache Products ==

As mentioned in the Alignment section, Heron implements the Apache Storm
API and integrates with multiple Apache schedulers (Apache Mesos, Apache
Aurora and Apache Hadoop's YARN) as well as Apache ZooKeeper and Apache
Thrift.

== An Excessive Fascination with the Apache Brand ==

Heron's popularity is growing in the streaming compute space and we are
long time supporters of the Apache brand. This proposal is not for the
purpose of generating publicity. Rather, the primary benefits to
joining Apache are those of community building and open decision making
outlined in the Rationale section.

== Documentation ==

This proposal exists online as
http://wiki.apache.org/incubator/HeronProposal. Extensive documentation can
be found on GitHub at https://twitter.github.io/heron and the source code
is well documented.

== Source and Intellectual Property Submission Plan ==

The Heron codebase is currently hosted on GitHub:
https://github.com/twitter/heron. During incubation, the codebase will be
migrated to Apache infrastructure. The source code is already ASF 2.0
licensed.

== External Dependencies ==

All external libraries have ASF 2.0 compatible licenses except for pylint.
The pylint library is GPL licensed, but is only used for pre-build Python
style checks and is neither bundled with, nor relied upon by, the Heron
source or binary release artifacts.

== Cryptography ==

Heron does not use any cryptography libraries.

= Required Resources =

== Mailing lists ==

 * priv...@heron.incubator.apache.org (with moderated subscriptions)
 * d...@heron.incubator.apache.org
 * comm...@heron.incubator.apache.org
 * u...@heron.incubator.apache.org

== Subversion Directory ==

Git is the preferred source control system: git://git.apache.org/heron

== Issue Tracking ==

JIRA: Heron (HERON)

== Initial Committers ==

 * Andrew Jorgensen (andrew at andrewjorgensen dot com)
 * Ashvin Agrawal (ashvin at apache dot org)*
 * Avrilia Floratou (avrilia dot floratou at gmail dot com)
 * Bill Graham (billgraham at apache dot org)*
 * Brian Hatfield (bmhatfield at gmail dot com)
 * Chris Kellogg (cckellogg at gmail dot com)
 * Huijun Wu (huijun dot wu dot 2010 at gmail dot com)
 * Karthik Ramasamy (karthik at gmail dot com)
 * Maosong Fu (maosongfu at gmail dot com)
 * Neng Lu (freeneng at gmail dot com)
 * Runhang Li (obj dot runhang at gmail dot com)
 * Sanjeev Kulkarni (sanjeevrk at gmail dot com)
 * Supun Kamburugamuve (supun at apache dot org)*
 * Thomas Sun (tom dot ssf at gmail dot com)
 * Yaliang Wang (yaliang dot w dot wang at ieee dot org)

== Affiliations ==

 * Andrew Jorgensen (Google)
 * Ashvin Agrawal (Microsoft)
 * Avrilia Floratou (Microsoft)
 * Bill Graham (T

Edit access request for incubator wiki

2017-06-16 Thread Bill Graham
Hi,

Can I please have edit permission to
https://wiki.apache.org/incubator/ProjectProposals? My wiki login is BillGraham.

thanks,
Bill
-- 
Sent from Gmail Mobile


Re: [PROPOSAL] Heron

2017-06-15 Thread Bill Graham
Hi Taylor,

The Heron team engaged with members of the Apache Storm community through
private channels before the project was made available as open source. We
recognize this is not the ideal approach and going forward we will use more
collaborative methods as we progress and grow the Heron community.

One of our goals during incubation will be to use open forums of
communication, like the Apache mailing lists, and work to foster a truly
collaborative environment for both Apache Storm and Heron community members
to work within together.

The Fabric team at Google uses Heron extensively.

thanks,
Bill

On Thu, Jun 15, 2017 at 10:42 AM, Debo Dutta (dedutta) wrote:

> Am happy to help too!
>
> Thx
> Debo
>
> Sent from my iPhone
>
> > On Jun 14, 2017, at 8:05 PM, William Markito Oliveira <william.mark...@gmail.com> wrote:
> >
> > Howdy!
> >
> > If Heron is looking for some help around incubation process, I'd love to
> > help while Geode experience is still fresh in my mind and given that it's
> > a project/space that I do have interest. Since I'm not an ASF member, I
> > don't think I can offer to be a mentor, but can probably still help and
> > participate on the process.
> >
> > Thanks!
> >
> >> On Wed, Jun 14, 2017 at 7:54 PM, P. Taylor Goetz wrote:
> >>
> >> Hi Bill/Supun,
> >>
> >> Sorry for not being a little more clear. I was asking more about how the
> >> Heron community would seek to engage with Storm community at the
> >> *community* level as opposed to the technical level (i.e. “Community
> >> over Code”).
> >>
> >> I’ve been asked by many why this has never happened, and have always
> >> struggled to answer. Maybe you could help answer that question as well
> >> as if and how that might change if Heron were to incubate.
> >>
> >> Another quick question: The proposal mentions Heron being used in
> >> production at Google, but some Google employees I recently spoke to
> >> seemed to contradict that. Could you explain? Note that’s nothing that
> >> would preclude the project from incubating, I’m just curious.
> >>
> >> -Taylor
> >>
> >>> On Jun 14, 2017, at 7:35 AM, Supun Kamburugamuve 
> >> wrote:
> >>>
> >>> Hi Taylor,
> >>>
> >>> For me, one of the interesting differences between Heron and Storm is the
> >>> execution model. Storm uses a shared memory model while Heron uses a
> >>> process based model. It will be interesting to see how these two evolve.
> >>>
> >>> Thanks,
> >>> Supun..
> >>>
> >>> On Mon, Jun 12, 2017 at 4:15 PM, Bill Graham 
> >> wrote:
> >>>
> >>>> Hi Taylor,
> >>>>
> >>>> Thanks for the mentor offer, we'd be glad to have your help.
> >>>>
> >>>> I think the best place for collaboration would be around the evolution of
> >>>> the API. In addition we plan to look more into DSL solutions which we could
> >>>> potentially collaborate on. This could be Trident, or Beam or something
> >>>> else, but there could be synergies for future development here.
> >>>>
> >>>> thanks,
> >>>> Bill
> >>>>
> >>>> On Fri, Jun 9, 2017 at 8:53 PM, P. Taylor Goetz 
> >> wrote:
> >>>>
> >>>>> Hi Bill,
> >>>>>
> >>>>> Could you comment on how/if the Heron community would be willing to work
> >>>>> with the Storm community? I've seen a number of new features in Storm being
> >>>>> ported to Heron, but I have yet to see any attempt by the Heron community
> >>>>> to engage with the Apache Storm community.
> >>>>>
> >>>>> I don't think it would be too far off to say that the relationship between
> >>>>> Heron and Apache Storm has been somewhat adversarial. The pre- and
> >>>>> post-open sourcing marketing around Heron seemed, at least to me, somewhat
> >>>>> aggressively negative toward Storm.
> >>>>>
> >>>>> As a peer to Apache Storm, how would the proposed "Apache Heron" community
> >>>>> work to collaborate with the Storm community? If Heron is adopting
>

Re: [PROPOSAL] Heron

2017-06-12 Thread Bill Graham
Hi Taylor,

Thanks for the mentor offer, we'd be glad to have your help.

I think the best place for collaboration would be around the evolution of
the API. In addition we plan to look more into DSL solutions which we could
potentially collaborate on. This could be Trident, or Beam or something
else, but there could be synergies for future development here.

thanks,
Bill

On Fri, Jun 9, 2017 at 8:53 PM, P. Taylor Goetz  wrote:

> Hi Bill,
>
> Could you comment on how/if the Heron community would be willing to work
> with the Storm community? I've seen a number of new features in Storm being
> ported to Heron, but I have yet to see any attempt by the Heron community
> to engage with the Apache Storm community.
>
> I don't think it would be too far off to say that the relationship between
> Heron and Apache Storm has been somewhat adversarial. The pre- and
> post-open sourcing marketing around Heron seemed, at least to me, somewhat
> aggressively negative toward Storm.
>
> As a peer to Apache Storm, how would the proposed "Apache Heron" community
> work to collaborate with the Storm community? If Heron is adopting API
> changes in Storm, then it seems there is an opportunity for collaboration.
>
> Don't take any of this as an objection to incubating the project. I would
> support it. I would also be willing to be a mentor, if you would consider
> taking on another.
>
> -Taylor
>
> > On Jun 8, 2017, at 1:23 PM, Bill Graham  wrote:
> >
> > Dear Apache Incubator Community,
> >
> > We are excited to share our proposal for discussion and feedback
> > for entering Apache Incubation. Heron is a real-time, distributed,
> > fault-tolerant stream processing engine.
> >
> > Our proposal can be found at https://wiki.apache.org/incubator/HeronProposal
> > and is included below.
> >
> >
> > Thank you,
> >
> > Bill Graham on behalf of the Heron developers
> >
> >
> > # Heron Proposal
> >
> > ## Abstract
> > Heron is a real-time, distributed, fault-tolerant stream processing engine
> > initially developed by Twitter.
> >
> > ## Proposal
> >
> > Heron is a real-time stream processing engine built for high performance,
> > ease of manageability, performance predictability and developer
> > productivity[1]. We wish to develop a community around Heron to increase
> > contributions and see Heron thrive in an open forum.
> >
> > ## Background
> >
> > Heron provides the ability for developers to compose directed acyclic
> > graphs (DAGs) of real-time query execution logic (i.e. a topology) and
> > submit the topology to execute on a pluggable job scheduling system (e.g.,
> > Apache Aurora, YARN, Marathon, etc). Users can employ either the native
> > Heron API or the Apache Storm API to develop the topology. Heron supports
> > the Storm API for ease of migration, but beyond that Heron’s architecture
> > differs considerably from Storm’s.
> >
> > Users submit a topology to the scheduler using the Heron client, which uses
> > the Heron binary libraries to deploy all daemons required to run and manage
> > the topology. The topology therefore has no reliance on centrally managed
> > Heron services, only on a generic job scheduling system, which lends itself
> > well to running on top of Apache Aurora/Mesos or Apache Hadoop/YARN (among
> > others).
> >
> > The scheduler runs each topology as a job consisting of multiple
> > containers. One of the containers runs the topology master, responsible for
> > managing the topology. The remaining containers each run a stream manager
> > responsible for data routing, a metrics manager that collects and reports
> > various metrics and a number of processes called Heron instances which run
> > the user-defined logic on the stream of tuples. Parallelism is achieved via
> > process-based isolation of Heron instances, which provides predictable
> > performance while simplifying debugging. The containers are allocated and
> > managed by the scheduler framework based on resource availability of nodes
> > in the cluster. The metadata for the topology, such as the physical plan
> > and execution details, are stored in the pluggable Heron State Manager
> > (e.g. Apache ZooKeeper).
> >
> > ## Rationale
> >
> > Heron is a general-purpose, modular and extensible platform that can be
> > leveraged to support common, real-time analytics use cases. There is an
> > increasing demand for open-source, scalable real-time analytics systems. We
> > believe that Heron c

Re: [PROPOSAL] Heron

2017-06-09 Thread Bill Graham
Thanks Supun. The main commonality with Storm is that Heron implements the
Storm user-facing API for developing topologies. Beyond that, the
architecture is completely different (no central services, full isolation
at the process, task and topology levels, pluggable scheduler, etc). The
key unique elements of the Heron architecture are described in paragraphs 2
and 3 of the Background section.

On Fri, Jun 9, 2017 at 12:17 PM, Supun Kamburugamuva 
wrote:

> Thanks Bill for bringing up the proposal. Should there be a sentence or
> two describing how Heron compares with Storm?
>
> Regards,
> Supun..
>
> On Thu, Jun 8, 2017 at 1:23 PM, Bill Graham  wrote:
>
>> Dear Apache Incubator Community,
>>
>> We are excited to share our proposal for discussion and feedback
>> for entering Apache Incubation. Heron is a real-time, distributed,
>> fault-tolerant stream processing engine.
>>
>> Our proposal can be found at https://wiki.apache.org/incubator/HeronProposal
>> and is included below.
>>
>>
>> Thank you,
>>
>> Bill Graham on behalf of the Heron developers
>>
>>
>> # Heron Proposal
>>
>> ## Abstract
>> Heron is a real-time, distributed, fault-tolerant stream processing engine
>> initially developed by Twitter.
>>
>> ## Proposal
>>
>> Heron is a real-time stream processing engine built for high performance,
>> ease of manageability, performance predictability and developer
>> productivity[1]. We wish to develop a community around Heron to increase
>> contributions and see Heron thrive in an open forum.
>>
>> ## Background
>>
>> Heron provides the ability for developers to compose directed acyclic
>> graphs (DAGs) of real-time query execution logic (i.e. a topology) and
>> submit the topology to execute on a pluggable job scheduling system (e.g.,
>> Apache Aurora, YARN, Marathon, etc). Users can employ either the native
>> Heron API or the Apache Storm API to develop the topology. Heron supports
>> the Storm API for ease of migration, but beyond that Heron’s architecture
>> differs considerably from Storm’s.
>>
>> Users submit a topology to the scheduler using the Heron client, which uses
>> the Heron binary libraries to deploy all daemons required to run and manage
>> the topology. The topology therefore has no reliance on centrally managed
>> Heron services, only on a generic job scheduling system, which lends itself
>> well to running on top of Apache Aurora/Mesos or Apache Hadoop/YARN (among
>> others).
>>
>> The scheduler runs each topology as a job consisting of multiple
>> containers. One of the containers runs the topology master, responsible for
>> managing the topology. The remaining containers each run a stream manager
>> responsible for data routing, a metrics manager that collects and reports
>> various metrics and a number of processes called Heron instances which run
>> the user-defined logic on the stream of tuples. Parallelism is achieved via
>> process-based isolation of Heron instances, which provides predictable
>> performance while simplifying debugging. The containers are allocated and
>> managed by the scheduler framework based on resource availability of nodes
>> in the cluster. The metadata for the topology, such as the physical plan
>> and execution details, are stored in the pluggable Heron State Manager
>> (e.g. Apache ZooKeeper).
>>
>> ## Rationale
>>
>> Heron is a general-purpose, modular and extensible platform that can be
>> leveraged to support common, real-time analytics use cases. There is an
>> increasing demand for open-source, scalable real-time analytics systems. We
>> believe that Heron can be leveraged by other organizations to build
>> streaming applications that can benefit from its robustness, high
>> performance, adaptability to cloud environments and ease of use. Moreover,
>> we hope that open-sourcing Heron will help to further evolve the technology
>> as the project attracts contributors with diverse backgrounds and areas of
>> expertise.
>>
>> We believe the Apache foundation is a great fit as the long-term home for
>> Heron, as it provides an established process for community-driven
>> development and decision making by consensus. This is exactly the model we
>> want for future Heron development.
>>
>> ## Initial Goals
>>
>> * Move the existing codebase, website, documentation, and mailing lists to
>> Apache-hosted infrastructure.
>> * Integrate with the Apache development process.
>> * Ensure al

[PROPOSAL] Heron

2017-06-08 Thread Bill Graham
Dear Apache Incubator Community,

We are excited to share our proposal for discussion and feedback
for entering Apache Incubation. Heron is a real-time, distributed,
fault-tolerant stream processing engine.

Our proposal can be found at https://wiki.apache.org/incubator/HeronProposal
 and is included below.


Thank you,

Bill Graham on behalf of the Heron developers


# Heron Proposal

## Abstract
Heron is a real-time, distributed, fault-tolerant stream processing engine
initially developed by Twitter.

## Proposal

Heron is a real-time stream processing engine built for high performance,
ease of manageability, performance predictability and developer
productivity[1]. We wish to develop a community around Heron to increase
contributions and see Heron thrive in an open forum.

## Background

Heron provides the ability for developers to compose directed acyclic
graphs (DAGs) of real-time query execution logic (i.e. a topology) and
submit the topology to execute on a pluggable job scheduling system (e.g.,
Apache Aurora, YARN, Marathon, etc). Users can employ either the native
Heron API or the Apache Storm API to develop the topology. Heron supports
the Storm API for ease of migration, but beyond that Heron’s architecture
differs considerably from Storm’s.

Users submit a topology to the scheduler using the Heron client, which uses
the Heron binary libraries to deploy all daemons required to run and manage
the topology. The topology therefore has no reliance on centrally managed
Heron services, only on a generic job scheduling system, which lends itself
well to running on top of Apache Aurora/Mesos or Apache Hadoop/YARN (among
others).

The scheduler runs each topology as a job consisting of multiple
containers. One of the containers runs the topology master, responsible for
managing the topology. The remaining containers each run a stream manager
responsible for data routing, a metrics manager that collects and reports
various metrics and a number of processes called Heron instances which run
the user-defined logic on the stream of tuples. Parallelism is achieved via
process-based isolation of Heron instances, which provides predictable
performance while simplifying debugging. The containers are allocated and
managed by the scheduler framework based on resource availability of nodes
in the cluster. The metadata for the topology, such as the physical plan
and execution details, are stored in the pluggable Heron State Manager
(e.g. Apache ZooKeeper).
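
As a rough illustration of the layout described above (not Heron's actual API
or packing algorithm), the sketch below models a topology as a DAG and assigns
its processes to containers. The names `Container` and `plan_containers`, the
process labels, and the round-robin placement are all assumptions made for this
example; it needs Python 3.9+ for `graphlib`.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Container:
    """One scheduler-allocated container (hypothetical model, not Heron's API)."""
    index: int
    processes: list = field(default_factory=list)


def plan_containers(upstreams, parallelism, num_workers):
    """Place a topology's processes into one master and num_workers worker containers.

    upstreams maps each component to the components it reads from;
    parallelism maps each component to its number of instances.
    """
    # static_order() raises graphlib.CycleError if the topology is not a DAG.
    order = list(TopologicalSorter(upstreams).static_order())

    # Container 0 runs only the topology master.
    master = Container(0, ["topology-master"])

    # Every remaining container runs a stream manager and a metrics manager.
    workers = [
        Container(i, ["stream-manager", "metrics-manager"])
        for i in range(1, num_workers + 1)
    ]

    # Each instance is its own process; round-robin placement is an assumption.
    instances = [f"{name}-{task}" for name in order for task in range(parallelism[name])]
    for n, instance in enumerate(instances):
        workers[n % num_workers].processes.append(instance)

    return [master] + workers


if __name__ == "__main__":
    topology = {"spout": [], "split": ["spout"], "count": ["split"]}
    for c in plan_containers(topology, {"spout": 2, "split": 2, "count": 1}, 2):
        print(c.index, c.processes)
```

In a real deployment the scheduler framework decides placement based on
resource availability, so the round-robin step here only stands in for that
decision.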

## Rationale

Heron is a general-purpose, modular and extensible platform that can be
leveraged to support common, real-time analytics use cases. There is an
increasing demand for open-source, scalable real-time analytics systems. We
believe that Heron can be leveraged by other organizations to build
streaming applications that can benefit from its robustness, high
performance, adaptability to cloud environments and ease of use. Moreover,
we hope that open-sourcing Heron will help to further evolve the technology
as the project attracts contributors with diverse backgrounds and areas of
expertise.

We believe the Apache Software Foundation is a great fit as the long-term home
for Heron, as it provides an established process for community-driven
development and decision making by consensus. This is exactly the model we
want for future Heron development.

## Initial Goals

* Move the existing codebase, website, documentation, and mailing lists to
Apache-hosted infrastructure.
* Integrate with the Apache development process.
* Ensure all dependencies are compliant with Apache License version 2.0.
* Incrementally develop and release per Apache guidelines.

## Current Status

Heron is a stable project that has been used in production at Twitter since
2014 and was open sourced under the Apache License, Version 2.0 in 2016. The
Heron source code is currently hosted on GitHub
(https://github.com/twitter/heron), which will seed the Apache git repository.

### Meritocracy

By submitting this incubator proposal, we’re expressing our intent to build
a diverse developer community around Heron that will conduct itself
according to The Apache Way and use a meritocratic means of building its
committer base. Several companies and universities have already expressed
interest in and contributed to Heron. Our goal is to grow the Heron
community by encouraging open communication, contribution and participation
of all types, and ensuring that contributors are recognized appropriately.

### Community

Heron is currently being used by Twitter, Google, Machine Zone and
ndustrial.io and has received significant contributions from Microsoft and
Streamlio. By bringing Heron into the Apache ecosystem, we believe we can
attract even more developers who are interested in creating real-time
systems to build the project's contributor base.

### Core Developers

Current core developers are engineers from Twitter, Google, Microsoft and
Streamlio.

### Alignment

Heron utilizes a number of Apache technologies. Heron levera

Re: [VOTE] Accept Blur into the Apache Incubator

2012-07-20 Thread Bill Graham
+1 (non-binding)

On Fri, Jul 20, 2012 at 9:48 AM, Dave Fisher  wrote:

> +1! - Binding.
>
> On Jul 20, 2012, at 9:42 AM, Aaron McCurry wrote:
>
> > I would like to call a vote for accepting Blur for incubation in the
> > Apache Incubator. The full proposal is available below.
> >
> > Please cast your vote:
> >
> > [ ] +1, bring Blur into Incubator
> > [ ] +0, I don't care either way,
> > [ ] -1, do not bring Blur into Incubator, because...
> >
> > This vote will be open for 72 hours and only votes from the Incubator
> > PMC are binding.
> >
> > Thank you for your consideration!
> >
> > Aaron
> >
> > http://wiki.apache.org/incubator/BlurProposal
> >
> > = Blur Proposal =
> >
> > == Abstract ==
> > Blur is a search platform capable of searching massive amounts of data
> > in a cloud computing environment. Blur leverages several existing
> > Apache projects, including Apache Lucene, Apache Hadoop, Apache
> > !ZooKeeper and Apache Thrift.  Both bulk and near real time (NRT)
> > updates are possible with Blur.  Bulk updates are accomplished using
> > Hadoop Map/Reduce and NRT are performed through direct Thrift calls.
> >
> > == Proposal ==
> > Blur is an open source search platform capable of querying massive
> > amounts of data at incredible speeds. Rather than using the flat,
> > document-like data model used by most search solutions, Blur allows
> > you to build rich data models and search them in a semi-relational
> > manner similar to joins while querying a relational database. Using
> > Blur, you can get precise search results against terabytes of data at
> > Google-like speeds.  Blur leverages multiple open source projects
> > including Hadoop, Lucene, Thrift and !ZooKeeper to create an
> > environment where structured data can be transformed into an index
> > that runs on a Hadoop cluster.  Blur uses the power of Map/Reduce for
> > bulk indexing into Blur.  Server failures are handled automatically by
> > using !ZooKeeper for cluster state and HDFS for index storage.
> >
> > == Background ==
> > Blur was created by Aaron !McCurry in 2010. Blur was developed to
> > solve the challenges in dealing with searching huge quantities of data
> > that the traditional RDBMS solutions could not cope with while still
> > providing JOIN-like capabilities to query the data.  Several other
> > open source projects have implemented aspects of this design including
> > elasticsearch, Katta and Apache Solr.
> >
> > == Rationale ==
> > There is a need for a distributed search capability within the Hadoop
> > ecosystem. Currently, there are no other search solutions that
> > natively leverage HDFS and the failover features of Hadoop in the same
> > manner as the Blur project. The communities we expect to be most
> > interested in such a project are government, health care, and other
> > industries where scalability is a concern. We have made much progress
> > in developing this project over the past 2 years and believe both the
> > project and the interested communities would benefit from this work
> > being openly available and having open development.  In future
> > versions of Blur the API will more closely follow the APIs provided
> > in Lucene so that systems that already use Lucene can more easily
> > scale with Blur. Blur can be viewed as a query execution engine that
> > Lucene based solutions can utilize when scale becomes an issue.
> >
> > == Initial Goals ==
> > The initial goals of the project are:
> > * To migrate the Blur codebase, issue tracking and wiki from
> > github.com and integrate the project with the ASF infrastructure.
> > * Add new committers to the project and grow the community in "The
> > Apache Way".
> >
> > == Current Status ==
> >
> > === Meritocracy ===
> > Blur was initially developed by Aaron !McCurry in June 2010.  Since
> > then Blur has continued to evolve with the support of a small
> > development team at Near Infinity.  As a part of the Apache Software
> > Foundation, the Apache Blur team intends to strongly encourage the
> > community to help with and contribute to the project.  Apache Blur
> > will actively seek potential committers and help them become familiar
> > with the codebase.
> >
> > === Community ===
> > A small community has developed around Blur and several project teams
> > are currently using Blur for their big data search capability. The
> > source code is currently available on GitHub and there is a dedicated
> > website (blur.io) that provides an overview of the project. Blur has
> > been shared with several members of the Apache community and has been
> > presented at the Bay Area HUG (see
> > http://www.meetup.com/hadoop/events/20109471/).
> >
> > === Core Developers ===
> > The current developers are employed by Near Infinity Corporation, but
> > we anticipate interest developing among other companies.
> >
> > === Alignment ===
> > Blur is built on top of a number of Apache projects; Hadoop, Lucene,
> > !ZooKeeper, and Thrift. It builds with Maven.  During the 

Re: [VOTE] Giraph to join the incubator

2011-07-24 Thread Bill Graham
+1

On Sun, Jul 24, 2011 at 8:27 AM, Mohammad Nour El-Din <
nour.moham...@gmail.com> wrote:

> +1 (Binding)
>
> On Sat, Jul 23, 2011 at 4:57 AM, Ashish  wrote:
> > +1
> >
> > On Sat, Jul 23, 2011 at 2:00 AM, Avery Ching 
> wrote:
> >
> >> Hi and good friday to you all,
> >>
> >> It's been a week since we submitted our proposal for Giraph's inclusion
> >> into the Apache incubator and the discussion around the proposal seems to
> >> have settled.  Thank you for all the comments/questions/general interest and
> >> for those who volunteered to be committers.  At this time, I'd like to ask
> >> for a vote.
> >>
> >> The latest proposal can be found at the end of this email and in the
> >> following wiki:
> >>
> >> http://wiki.apache.org/incubator/GiraphProposal
> >>
> >> The discussion regarding the proposal can be found below:
> >>
> >> http://www.mail-archive.com/general@incubator.apache.org/msg29957.html
> >>
> >> Please cast your votes:
> >>
> >> [  ] +1 Accept Giraph for incubation
> >> [  ] +0 Indifferent to Giraph incubation
> >> [  ] -1 Reject Giraph for incubation
> >>
> >> This vote will close 72 hours from now.
> >>
> >> Thanks!
> >>
> >> Avery
> >>
> >>
> >> = Giraph : Large-scale graph processing on Hadoop =
> >>
> >> == Abstract ==
> >>
> >> Giraph is a large-scale, fault-tolerant, Bulk Synchronous Parallel
> >> (BSP)-based graph processing framework.
> >>
> >> == Proposal ==
> >>
> >> Graph processing platforms to run large-scale algorithms (such as page
> >> rank, shared connections, personalization-based popularity, etc.) have
> >> become quite popular.  Some recent examples include Pregel and HaLoop.  For
> >> general-purpose big data computation, the MapReduce computation model is
> >> widely adopted and the most deployed MapReduce infrastructure is Apache
> >> Hadoop.  We have implemented a graph-processing framework that is launched
> >> as a typical Hadoop MapReduce job to leverage existing Hadoop
> >> infrastructure, such as Amazon’s EC2.  Giraph builds upon the graph-oriented
> >> nature of Pregel but additionally adds fault-tolerance to the coordinator
> >> process with the use of ZooKeeper as its centralized coordination service.
> >> Additionally, Giraph will include a library of generic graph algorithms.
> >>
> >> == Background ==
> >>
> >> Giraph initially began development as a side project at Yahoo! at the
> >> end of 2010.  It was made functional in a month, and various features
> >> were then added.  Development has been focused on internal customers'
> >> needs until this point.
> >>
> >> == Rationale ==
> >>
> >> Web and online social graphs have been rapidly growing in size and scale
> >> during the past decade.  In 2008, Google estimated that the number of web
> >> pages reached over a trillion.  Online social networking and email sites,
> >> including Yahoo!, Google, Microsoft, Facebook, LinkedIn, and Twitter, have
> >> hundreds of millions of users and are expected to grow much more in the
> >> future.  Processing these graphs plays a big role in relevant and
> >> personalized information for users, such as results from a search engine or
> >> news in an online social networking site.
> >>
> >> == Initial Goals ==
> >>
> >> At this point, most of the functionality has been implemented and we are
> >> looking to get more adoption and contributions from users outside Yahoo!.
> >> We want to ensure that performance scales and that the code is robust and
> >> fault tolerant.
> >>
> >> == Current Status ==
> >>
> >> === Meritocracy ===
> >>
> >> Giraph was initially developed by Avery Ching and Christian Kunz beginning
> >> in December 2010 at Yahoo!.  There are other developers using Giraph at
> >> Yahoo! that are making suggestions and adding code.  We are reaching out to
> >> other folks at social networking companies for additional usage and
> >> development.
> >>
> >> === Community ===
> >>
> >> Several groups who are interested in either joining our project or using
> >> our code have contacted us.  We certainly believe that there is a lot of
> >> interest and are actively looking to improve and expand the community.
> >>
> >> === Core Developers ===
> >>
> >>  * Avery Ching: Wrote a majority of the code
> >>  * Christian Kunz: Wrote most of the communication code and security
> >> integration with Hadoop
> >>
> >> === Alignment ===
> >>
> >> Giraph uses several Apache projects as its underlying infrastructure
> >> (Hadoop and ZooKeeper).   It also builds on Apache Maven.
> >>
> >> == Known Risks ==
> >>
> >> === Orphaned products ===
> >>
> >> There are many social networking companies that would be interested in
> >> using this graph-processing framework and we have already received interest
> >> from some of them.  Yahoo! is already using this code in production and will
> >> certainly

Re: how to edit Incubator documents

2010-09-09 Thread Bill Graham
Ahh right, of course. Thanks for the heads up.

On Thu, Sep 9, 2010 at 5:53 PM, David Crossley  wrote:
>> Author: billgraham
>> Date: Thu Sep  9 21:33:13 2010
>> New Revision: 995581
>>
>> URL: http://svn.apache.org/viewvc?rev=995581&view=rev
>> Log:
>> fixing links to chukwa site
>>
>> Modified:
>>     incubator/public/trunk/site-publish/projects/chukwa.html
>>
>> Modified: incubator/public/trunk/site-publish/projects/chukwa.html
>
> Bill, thanks for trying. However, you need to
> edit the source documents and re-build, rather
> than editing the generated HTML files.
> Those changes will need to be re-done.
>
> http://incubator.apache.org/guides/website.html#Edit+your+project+status+report
>
> Anyway, thanks again. It is great to see project
> devs updating the docs.
>
> -David
>

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



[jira] Commented: (INCUBATOR-111) Mail archives not getting @incubator.apache.org emails

2010-08-30 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/INCUBATOR-111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904320#action_12904320
 ] 

Bill Graham commented on INCUBATOR-111:
---

Oops, that's exactly what I did and I don't have permissions now to move it out
of incubator. Would you please help me out with that? :)

> Mail archives not getting @incubator.apache.org emails
> --
>
> Key: INCUBATOR-111
> URL: https://issues.apache.org/jira/browse/INCUBATOR-111
> Project: Incubator
>  Issue Type: Bug
>Reporter: Bill Graham
>
> The 2 mail archives links on this page go to archives that stopped updating 
> around 8/3/2010, which is probably around when we switched to incubator email 
> addresses. We should ideally have an archive that includes emails before and 
> after the switch.
> http://hadoop.apache.org/chukwa/mailing_lists.html#Users
> http://mail-archives.apache.org/mod_mbox/hadoop-chukwa-user/
> http://mail-archives.apache.org/mod_mbox/hadoop-chukwa-dev/
> If this isn't possible, please provide two links per list, one to the 
> pre-cutoff archives on apache and one to the post-cutoff archives on 
> incubator.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

