from:"Doug Cutting"

Re: [RESULT] [VOTE] Accept Spot into the Apache Incubator

2016-09-24 Thread Doug Cutting

On Sat, Sep 24, 2016 at 12:03 AM, Gangumalla, Uma wrote:
> BTW, there were 5 binding votes.

Oops.  Sorry for the miscount!  I mistakenly searched for "Gangumalla"
rather than "umamahesh" in
http://people.apache.org/committers-by-project.html#incubator-pmc, but
I should have known better regardless.  My sincere apologies.

Doug

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org

[RESULT] [VOTE] Accept Spot into the Apache Incubator

2016-09-23 Thread Doug Cutting

The vote passes, with 7 +1 votes (4 binding) and no -1 votes.

+1 Jarek Jarcec Cecho (binding)
+1 Gangumalla, Uma
+1 Todd Lipcon (binding)
+1 Tom White (binding)
+1 Zheng, Kai
+1 Stack (binding)
+1 Debo Dutta

Thanks all for voting.

Spot has been accepted for Incubation at Apache.  Welcome Spot!

Doug


[VOTE] Accept Spot into the Apache Incubator

2016-09-20 Thread Doug Cutting

ense version 2
 * Apache Hadoop: Apache License 2.0
 * Apache Spark: Apache License 2.0
 * JQuery: MIT
 * ReactJS: BSD
 * Bootstrap: MIT

Issues related to GPL dependencies will be resolved during incubation.

== Cryptography ==

Spot does not currently include any cryptography-related code.

== Required Resources ==

=== Developer and user mailing lists ===

 * priv...@spot.incubator.apache.org (PMC)
 * comm...@spot.incubator.apache.org (git push emails)
 * iss...@spot.incubator.apache.org (JIRA issue feed)
 * d...@spot.incubator.apache.org (code reviews plus dev discussion)
 * u...@spot.incubator.apache.org (user questions)

=== Repository ===

 * git://git.apache.org/spot

=== Issue Tracker ===

We would like to import our current JIRA project into the ASF JIRA,
such that our historical commit messages and code comments continue to
reference the appropriate bug numbers.

== Initial Committers ==

 * Grant Babb
 * Ricardo Barona
 * Cesar Berho
 * Jarek Jarcec Cecho
 * Michael Czerny
 * Nick Gamb
 * Sai Ganji
 * Gabriela Lima Garza
 * Victor Gonzalez
 * Mark Grover
 * Morris Hicks
 * Ritu Kama
 * Austin Leahy
 * Ashrith Mekala
 * Diego Ortiz
 * Sudharshan Rao PakalaSai
 * Srinivasa Reddy
 * Alan Ross
 * Everardo Lopez Sandoval
 * Nathan Segerlind
 * Vartika Singh
 * Nathanael Smith
 * Carlos Villavicencio

== Affiliations ==

 * Grant Babb: Jask
 * Ricardo Barona: Intel
 * Cesar Berho: Intel
 * Jarek Jarcec Cecho: StreamSets
 * Michael Czerny: Cybraics
 * Nick Gamb: Centrify
 * Sai Ganji: Cloudwick
 * Gabriela Lima Garza: Intel
 * Victor Gonzalez: Intel
 * Mark Grover: Cloudera
 * Morris Hicks: Cloudera
 * Ritu Kama: Intel
 * Austin Leahy: eBay
 * Ashrith Mekala: Cloudwick
 * Diego Ortiz: Intel
 * Sudharshan Rao PakalaSai: Cloudwick
 * Srinivasa Reddy: Cloudera
 * Alan Ross: Intel
 * Everardo Lopez Sandoval: Intel
 * Nathan Segerlind: Intel
 * Vartika Singh: Cloudera
 * Nathanael Smith: Intel
 * Carlos Villavicencio: Intel

== Sponsors ==

=== Champion ===

 * Doug Cutting - Cloudera

=== Nominated Mentors ===

 * Brock Noland - ASF Member, phData
 * Jarek Jarcec Cecho - ASF Member, StreamSets
 * Andrei Savu - Cloudera
 * Uma Maheswara Rao G - Intel

=== Sponsoring Entity ===

The Apache Incubator.


Re: [DISCUSS] Spot Incubation Proposal

2016-09-14 Thread Doug Cutting

Lars,

You are correct, there is overlap between this and Metron.  They were
independently started by different teams and with different
architectures.  At the present time both projects are well aware of
one another and choose to continue to pursue separate efforts.  Spot
has been an Intel-led project at Github.  Now, as a growing number of
other institutions and individuals wish to collaborate on this code,
it would be better to relocate the project to Apache.

Thanks,

Doug

On Tue, Sep 13, 2016 at 6:48 PM, Lars Francke <lars.fran...@gmail.com> wrote:
> Thank you Doug.
>
> On a cursory look this seems related to the currently incubating Metron
> project. The documentation on both projects is relatively scarce. Do you
> happen to have any insight on overlap between those two or am I completely
> off here by comparing those two?
>
> On Tue, Sep 13, 2016 at 7:50 PM, Doug Cutting <cutt...@apache.org> wrote:
>
>> Please find attached a proposal for a new podling, Apache Spot, a
>> platform for network telemetry (packet, flow, and proxy at the moment)
>> built on an open data model and Apache Hadoop.
>>
>> The draft proposal is on the wiki at:
>>
>> https://wiki.apache.org/incubator/SpotProposal
>>
>> I have also included the current text of that page below.
>>
>> Thanks,
>>
>> Doug
>>
>> = SpotProposal =
>>
>> == Abstract ==
>>
>> Spot is an open source platform for network telemetry (packet, flow,
>> and proxy at the moment) built on an open data model and Apache
>> Hadoop.
>>
>> == Proposal ==
>>
>> Spot (formerly Open Network Insight, or ONI) is an open source
>> solution for network telemetry (packet, flow, and proxy at the moment)
>> built on an open data model and Apache Hadoop. It provides ingestion
>> and transformation of binary data, scalable machine learning, and
>> interactive visualization for identifying threats in network flows and
>> DNS packets.
>>
>> Spot has a pluggable architecture that can accommodate multiple open
>> data models. Although cybersecurity/network-intrusion analysis is the
>> initial use case for Spot, we are actively encouraging the
>> contribution of new models that will enable other adjacent
>> applications, such as fraud detection or IT-operational analytics such
>> as performance and health monitoring. Because these models are open,
>> users maintain control of their own data.
>>
>> More information on Spot can be found at the existing project website
>> at http://open-network-insight.org/.
>>
>> == Background ==
>>
>> It almost goes without saying that cybersecurity is an acute and
>> paramount concern globally, for organizations of all types and
>> sizes. Fortunately, thanks to the availability of massively scalable
>> (in the PBs) data infrastructure, security professionals can now make
>> authentically data-driven decisions about how they protect their
>> assets. For example, records of network traffic, captured as network
>> flows, are often stored and analyzed for use in network management,
>> and this same information can provide valuable insights into network
>> vulnerabilities.
>>
>> Cybersecurity is just one example, however: There are other examples
>> of adjacent use cases, such as user fraud detection or IT-operations
>> analytics, that would benefit from the combination of Spot
>> functionality and PB-scale data sets for analysis.
>>
>> == Rationale ==
>>
>> Although cybersecurity is its initial use case/data model, Spot is
>> intended to more generally tackle the dual challenges of facilitating
>> the development of big data-driven analytic solutions, while helping
>> vendors avoid having to create one-off infrastructure for each use
>> case. Spot will eliminate issues related to vendor data models that
>> create silos between solutions, and that make it difficult for users
>> to consume these innovations from multiple vendors. In summary, Spot
>> will accelerate the development of new massively scalable analytic
>> applications that give users more flexibility, and more choices.
>>
>> As an initial effort, we are now seeking to build an ecosystem of
>> developers, data scientists, and security professionals to make Spot
>> the open, community-driven, cybersecurity platform standard it needs
>> to become. By bringing Spot to Apache, we hope to galvanize these
>> groups to cooperate in this highly matrixed effort, and to build a
>> global, and diverse, Spot community.
>>
>> == Initial Goals ==
>>
>> Move the existing codebas

[DISCUSS] Spot Incubation Proposal

2016-09-13 Thread Doug Cutting

park: Apache License 2.0
 * JQuery: MIT
 * ReactJS: BSD
 * Bootstrap: MIT

Issues related to GPL dependencies will be resolved during incubation.

== Cryptography ==

Spot does not currently include any cryptography-related code.

== Required Resources ==

=== Developer and user mailing lists ===

 * priv...@spot.incubator.apache.org (PMC)
 * comm...@spot.incubator.apache.org (git push emails)
 * iss...@spot.incubator.apache.org (JIRA issue feed)
 * d...@spot.incubator.apache.org (code reviews plus dev discussion)
 * u...@spot.incubator.apache.org (user questions)

=== Repository ===

 * git://git.apache.org/spot

=== Issue Tracker ===

We would like to import our current JIRA project into the ASF JIRA,
such that our historical commit messages and code comments continue to
reference the appropriate bug numbers.

== Initial Committers ==

 * Grant Babb
 * Ricardo Barona
 * Cesar Berho
 * Jarek Jarcec Cecho
 * Michael Czerny
 * Sai Ganji
 * Gabriela Lima Garza
 * Victor Gonzalez
 * Mark Grover
 * Morris Hicks
 * Ritu Kama
 * Austin Leahy
 * Ashrith Mekala
 * Diego Ortiz
 * Sudharshan Rao PakalaSai
 * Srinivasa Reddy
 * Alan Ross
 * Everardo Lopez Sandoval
 * Nathan Segerlind
 * Vartika Singh
 * Nathanael Smith
 * Carlos Villavicencio

== Affiliations ==

 * Grant Babb: Jask
 * Ricardo Barona: Intel
 * Cesar Berho: Intel
 * Jarek Jarcec Cecho: StreamSets
 * Michael Czerny: Cybraics
 * Sai Ganji: Cloudwick
 * Gabriela Lima Garza: Intel
 * Victor Gonzalez: Intel
 * Mark Grover: Cloudera
 * Morris Hicks: Cloudera
 * Ritu Kama: Intel
 * Austin Leahy: eBay
 * Ashrith Mekala: Cloudwick
 * Diego Ortiz: Intel
 * Sudharshan Rao PakalaSai: Cloudwick
 * Srinivasa Reddy: Cloudera
 * Alan Ross: Intel
 * Everardo Lopez Sandoval: Intel
 * Nathan Segerlind: Intel
 * Vartika Singh: Cloudera
 * Nathanael Smith: Intel
 * Carlos Villavicencio: Intel

== Sponsors ==

=== Champion ===

 * Doug Cutting - Cloudera

=== Nominated Mentors ===

 * Brock Noland - ASF Member, phData
 * Jarek Jarcec Cecho - ASF Member, StreamSets
 * Andrei Savu - Cloudera
 * Uma Maheswara Rao G - Intel

=== Sponsoring Entity ===

The Apache Incubator.


Re: [DISCUSS] Apache Dataflow Incubator Proposal

2016-01-28 Thread Doug Cutting

On Thu, Jan 28, 2016 at 3:11 PM, Greg Stein wrote:
> As a regular English word, "beam" cannot be trademarked, by others/us.

Like Windows® or Apple®?

Doug


Re: [VOTE] Accept Impala into the Apache Incubator

2015-11-25 Thread Doug Cutting

+1 (binding)

Doug

On Tue, Nov 24, 2015 at 1:03 PM, Henry Robinson wrote:

> Hi -
>
> The [DISCUSS] thread has been quiet for a few days, so I think there's been
> sufficient opportunity for discussion around our proposal to bring Impala
> to the ASF Incubator.
>
> I'd like to call a VOTE on that proposal, which is on the wiki at
> https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> below.
>
> During the discussion period, the proposal has been amended to add Brock
> Noland as a new mentor, to add one missed committer from the list and to
> correct some issues with the dependency list.
>
> Please cast your votes as follows:
>
> [] +1, accept Impala into the Incubator
> [] +/-0, non-counted vote to express a disposition
> [] -1, do not accept Impala into the Incubator (please give your reason(s))
>
> As with the concurrent Kudu vote, I propose leaving the vote open for a
> full seven days (to close at Tuesday, December 1st at noon PST), due to the
> upcoming US holiday.
>
> Thanks,
> Henry
>
> 
>
> = Abstract =
> Impala is a high-performance C++ and Java SQL query engine for data stored
> in Apache Hadoop-based clusters.
>
> = Proposal =
>
> We propose to contribute the Impala codebase and associated artifacts (e.g.
> documentation, web-site content etc.) to the Apache Software Foundation
> with the intent of forming a productive, meritocratic and open community
> around Impala’s continued development, according to the ‘Apache Way’.
>
> Cloudera owns several trademarks regarding Impala, and proposes to transfer
> ownership of those trademarks in full to the ASF.
>
> = Background =
> Engineers at Cloudera developed Impala and released it as an
> Apache-licensed open-source project in Fall 2012. Impala was written as a
> brand-new, modern C++ SQL engine targeted from the start for data stored in
> Apache Hadoop clusters.
>
> Impala’s most important benefit to users is high-performance, making it
> extremely appropriate for common enterprise analytic and business
> intelligence workloads. This is achieved by a number of software
> techniques, including: native support for data stored in HDFS and related
> filesystems, just-in-time compilation and optimization of individual query
> plans, high-performance C++ codebase and massively-parallel distributed
> architecture. In benchmarks, Impala is routinely amongst the very highest
> performing SQL query engines.
>
> = Rationale =
>
> Despite the exciting innovation in the so-called ‘big-data’ space, SQL
> remains by far the most common interface for interacting with data in both
> traditional warehouses and modern ‘big-data’ clusters. There is clearly a
> need, as evidenced by the eager adoption of Impala and other SQL engines in
> enterprise contexts, for a query engine that offers the familiar SQL
> interface, but that has been specifically designed to operate in massive,
> distributed clusters rather than in traditional, fixed-hardware,
> warehouse-specific deployments. Impala is one such query engine.
>
> We believe that the ASF is the right venue to foster an open-source
> community around Impala’s development. We expect that Impala will benefit
> from more productive collaboration with related Apache projects, and under
> the auspices of the ASF will attract talented contributors who will push
> Impala’s development forward at pace.
>
> We believe that the timing is right for Impala’s development to move
> wholesale to the ASF: Impala is well-established, has been Apache-licensed
> open-source for more than three years, and the core project is relatively
> stable. We are excited to see where an ASF-based community can take Impala
> from this strong starting point.
>
> = Initial Goals =
> Our initial goals are as follows:
>
>  * Establish ASF-compatible engineering practices and workflows
>  * Refactor and publish existing internal build scripts and test
> infrastructure, in order to make them usable by any community member.
>  * Transfer source code, documentation and associated artifacts to the ASF.
>  * Grow the user and developer communities
>
> = Current Status =
>
> Impala is developed as an Apache-licensed open-source project. The source
> code is available at http://github.com/cloudera/Impala, and developer
> documentation is at https://github.com/cloudera/Impala/wiki. The majority
> of commits to the project have come from Cloudera-employed developers, but
> we have accepted some contributions from individuals from other
> organizations.
>
> All code reviews are done via a public instance of the Gerrit review tool
> at http://gerrit.cloudera.org:8080/, and discussed on a public mailing
> list. All patches must be reviewed before they are accepted into the
> codebase, via a voting mechanism that is similar to that used on Apache
> projects such as Hadoop and HBase.
>
> Before a patch is committed, it must pass a suite of pre-commit tests.
> These tests are currently run on Cloudera’s internal

Re: [VOTE] Accept Kudu into the Apache Incubator

2015-11-25 Thread Doug Cutting

+1 (binding)

Doug

On Wed, Nov 25, 2015 at 8:45 AM, Chris Douglas wrote:

> +1 (binding) -C
>
> On Tue, Nov 24, 2015 at 11:32 AM, Todd Lipcon  wrote:
> > Hi all,
> >
> > Discussion on the [DISCUSS] thread seems to have wound down, so I'd like to
> > call a VOTE on acceptance of Kudu into the ASF Incubator. The proposal is
> > pasted below and also available on the wiki at:
> > https://wiki.apache.org/incubator/KuduProposal
> >
> > The proposal is unchanged since the original version, except for the
> > addition of Carl Steinbach as a Mentor.
> >
> > Please cast your votes:
> >
> > [] +1, accept Kudu into the Incubator
> > [] +/-0, positive/negative non-counted expression of feelings
> > [] -1, do not accept Kudu into the incubator (please state reasoning)
> >
> > Given the US holiday this week, I imagine many folks are traveling or
> > otherwise offline. So, let's run the vote for a full week rather than the
> > traditional 72 hours. Unless the IPMC objects to the extended voting
> > period, the vote will close on Tues, Dec 1st at noon PST.
> >
> > Thanks
> > -Todd
> > -
> >
> > = Kudu Proposal =
> >
> > == Abstract ==
> >
> > Kudu is a distributed columnar storage engine built for the Apache Hadoop
> > ecosystem.
> >
> > == Proposal ==
> >
> > Kudu is an open source storage engine for structured data which supports
> > low-latency random access together with efficient analytical access
> > patterns. Kudu distributes data using horizontal partitioning and
> > replicates each partition using Raft consensus, providing low
> > mean-time-to-recovery and low tail latencies. Kudu is designed within the
> > context of the Apache Hadoop ecosystem and supports many integrations with
> > other data analytics projects both inside and outside of the Apache
> > Software Foundation.
> >
> >
> >
> > We propose to incubate Kudu as a project of the Apache Software Foundation.
> >
> > == Background ==
> >
> > In recent years, explosive growth in the amount of data being generated and
> > captured by enterprises has resulted in the rapid adoption of open source
> > technology which is able to store massive data sets at scale and at low
> > cost. In particular, the Apache Hadoop ecosystem has become a focal point
> > for such “big data” workloads, because many traditional open source
> > database systems have lagged in offering a scalable alternative.
> >
> >
> >
> > Structured storage in the Hadoop ecosystem has typically been achieved in
> > two ways: for static data sets, data is typically stored on Apache HDFS
> > using binary data formats such as Apache Avro or Apache Parquet. However,
> > neither HDFS nor these formats has any provision for updating individual
> > records, or for efficient random access. Mutable data sets are typically
> > stored in semi-structured stores such as Apache HBase or Apache Cassandra.
> > These systems allow for low-latency record-level reads and writes, but lag
> > far behind the static file formats in terms of sequential read throughput
> > for applications such as SQL-based analytics or machine learning.
> >
> >
> >
> > Kudu is a new storage system designed and implemented from the ground up to
> > fill this gap between high-throughput sequential-access storage systems
> > such as HDFS and low-latency random-access systems such as HBase or
> > Cassandra. While these existing systems continue to hold advantages in some
> > situations, Kudu offers a “happy medium” alternative that can dramatically
> > simplify the architecture of many common workloads. In particular, Kudu
> > offers a simple API for row-level inserts, updates, and deletes, while
> > providing table scans at throughputs similar to Parquet, a commonly-used
> > columnar format for static data.
> >
> >
> >
> > More information on Kudu can be found at the existing open source project
> > website: http://getkudu.io and in particular in the Kudu white-paper PDF:
> > http://getkudu.io/kudu.pdf from which the above was excerpted.
> >
> > == Rationale ==
> >
> > As described above, Kudu fills an important gap in the open source storage
> > ecosystem. After our initial open source project release in September 2015,
> > we have seen a great amount of interest across a diverse set of users and
> > companies. We believe that, as a storage system, it is critical to build an
> > equally diverse set of contributors in the development community. Our
> > experiences as committers and PMC members on other Apache projects have
> > taught us the value of diverse communities in ensuring both longevity and
> > high quality for such foundational systems.
> >
> > == Initial Goals ==
> >
> >  * Move the existing codebase, website, documentation, and mailing lists to
> > Apache-hosted infrastructure
> >  * Work with the infrastructure team to implement and approve our code
> > review, build, and testing workflows in the context of the ASF
> >  * Incremental development and

Re: Soliciting feedback for a detailed pTLP policy document

2015-03-04 Thread Doug Cutting

On Mon, Mar 2, 2015 at 5:31 PM, Roman Shaposhnik r...@apache.org wrote:
> At this point, I would like to open this document for soliciting as
> wide a feedback as possible. I would like to especially request
> attention of the ASF board members who asked for this type of
> a document to be available.

As a director, I still don't think the board needs to be involved in a
pTLP's graduation.  As far as I'm concerned, any provisional
status is self-imposed by the PMC and can be removed at its pleasure.
From the board's perspective it's either an ASF project or it's not,
there's not a useful middle ground.  As a project it needs to provide
reports, release according to accepted standards, operate openly, etc.
It may be a young project, with a PMC dominated by old-timers who
aren't responsible for much of the contribution, but I don't see why
that requires a new formal status any more than we need a formal
status for old, slow-moving projects that rarely release.

Put directly, what does a pTLP's graduation change from the board's
perspective?  How should it change the way we review the project's
reports, etc.?  In short, why should we care about this label?  If a
PMC wishes to call itself blue that's fine too, but we don't need a
resolution when it decides to call itself purple.

Doug


Re: What is The Apache Way?

2015-01-12 Thread Doug Cutting

On Sun, Jan 11, 2015 at 9:49 PM, Roman Shaposhnik ro...@shaposhnik.org wrote:
> I think a better analogy would be US Culture. Yes it is as nebulous
> as it gets, but the fact that US Constitution exists as a written document
> makes a LOT of things WAY easier.

Apache's constitution is the corporate bylaws:
http://www.apache.org/foundation/bylaws.html

US Culture is stuff like Starbucks, Elvis, Manifest Destiny, etc.
Most of that is not coded as law, thankfully.

Doug


Re: What is The Apache Way?

2015-01-09 Thread Doug Cutting

On Fri, Jan 9, 2015 at 9:41 AM, David Nalley da...@gnsa.us wrote:
> Can a project use an external bug tracker?
> Can a project use a third party's CI system?
> Can a project host their website outside of the ASF?
> Can a project avoid a users mailing list and move to StackOverflow?
> Can projects use github?

It depends on the details.  Many are not recommended practices.  A
project is likely to get more flak if it takes such paths rather than
more standard paths, e.g., folks declaring that it's absolutely not
allowed.  Some of these may someday be recommended practices if
projects persevere and show how they can be done without violating the
spirit of Apache-style software development.  The board may ask for
more details when a project takes uncommon paths in order to gain
comfort that Apache needs are met.

Doug


Re: What is The Apache Way?

2015-01-09 Thread Doug Cutting

On Fri, Jan 9, 2015 at 8:12 AM, Benson Margulies bimargul...@gmail.com wrote:
> So, either a lot of us are really stupid, or the Foundation as a whole has
> a gap between the general principles and their application. No, we can't
> have a rule book that details every particle of how to run an Apache
> project, but apparently we could have more concrete guidance.

The gap definitely exists.  What often leads to confusion is when
folks think there's no gap, that everything is clear-cut and certain,
when it's not.  Different Apache projects are permitted to operate
differently, and the ill-defined line of what's acceptable moves over
time.  This is not entirely bad.  Fixed practices are hard to change,
but the open-source software world changes rapidly.  So maintaining
some flexibility is important.

What we should try to do is document acceptable practices, those ways
of operating that are common in many projects and have worked well.
There may be multiple acceptable practices in a given area (e.g., CTR
and RTC).  Projects that diverge from these might still be acceptable,
but they might also run into problems and should proceed with caution.
Some might tell them that they don't get the Apache Way, which is
distressing, but, at the end of the day, so long as the board doesn't
vote to evict them from the foundation, they're part of the Apache
Way.  The board doesn't generally act without good notice.  Generally
things escalate from folks griping, to the board agreeing to monitor
and advise a project, to the board giving an ultimatum for a specific
practice to stop, to the board finally taking some action.

Doug


Re: Podlings should be in charge of their mentors (was: Incubator report sign-off)

2015-01-06 Thread Doug Cutting

On Mon, Jan 5, 2015 at 12:57 PM, Upayavira u...@odoko.co.uk wrote:
> I'd much rather we be clear with projects right up front, saying
> something like:
>
> To join the Incubator, you need one or more mentors. To get to
> graduation, you will need the support of those mentors. If mentors
> become unavailable, you will need to seek replacements. Unless you have
> already learned the ways of the ASF and are ready to graduate, you will
> need to keep engaged with your mentors. If possible, engage in the wider
> ASF, and develop connections with others who might be in a position to
> assist with mentorship should one or all of your current mentors become
> unable to fulfill the role.

+1

Doug


Re: Incubator report sign-off

2014-12-19 Thread Doug Cutting

On Fri, Dec 19, 2014 at 12:45 PM, Ross Gardler (MS OPEN TECH)
ross.gard...@microsoft.com wrote:
> I do question the need to dissolve the IPMC

Indeed.  Chris' proposal is not exclusive with keeping the Incubator
as it is.  Folks could currently submit a resolution to the board to
start a TLP and see what happens.

Doug


Re: [VOTE] Graduation of Apache Spark from the Incubator

2014-02-12 Thread Doug Cutting

+1

Doug

On Mon, Feb 10, 2014 at 8:27 PM, Chris Mattmann mattm...@apache.org wrote:
> Hi Everyone,
>
> This is a new VOTE to decide if Apache Spark should graduate
> from the Incubator. Please VOTE on the resolution pasted below
> the ballot. I'll leave this VOTE open for at least 72 hours.
>
> Thanks!
>
> [ ] +1 Graduate Apache Spark from the Incubator.
> [ ] +0 Don't care.
> [ ] -1 Don't graduate Apache Spark from the Incubator because..
>
> Here is my +1 binding for graduation.
>
> Cheers,
> Chris
>
> snip
>
> WHEREAS, the Board of Directors deems it to be in the best
> interests of the Foundation and consistent with the
> Foundation's purpose to establish a Project Management
> Committee charged with the creation and maintenance of
> open-source software, for distribution at no charge to the
> public, related to fast and flexible large-scale data analysis
> on clusters.
>
> NOW, THEREFORE, BE IT RESOLVED, that a Project Management
> Committee (PMC), to be known as the Apache Spark Project, be
> and hereby is established pursuant to Bylaws of the Foundation;
> and be it further
>
> RESOLVED, that the Apache Spark Project be and hereby is
> responsible for the creation and maintenance of software
> related to fast and flexible large-scale data analysis
> on clusters; and be it further RESOLVED, that the office
> of Vice President, Apache Spark be and hereby is created,
> the person holding such office to serve at the direction of
> the Board of Directors as the chair of the Apache Spark
> Project, and to have primary responsibility for management
> of the projects within the scope of responsibility
> of the Apache Spark Project; and be it further
> RESOLVED, that the persons listed immediately below be and
> hereby are appointed to serve as the initial members of the
> Apache Spark Project:
>
> * Mosharaf Chowdhury mosha...@apache.org
> * Jason Dai jason...@apache.org
> * Tathagata Das t...@apache.org
> * Ankur Dave ankurd...@apache.org
> * Aaron Davidson a...@apache.org
> * Thomas Dudziak to...@apache.org
> * Robert Evans bo...@apache.org
> * Thomas Graves tgra...@apache.org
> * Andy Konwinski and...@apache.org
> * Stephen Haberman steph...@apache.org
> * Mark Hamstra markhams...@apache.org
> * Shane Huang shane_hu...@apache.org
> * Ryan LeCompte ryanlecom...@apache.org
> * Haoyuan Li haoy...@apache.org
> * Sean McNamara mcnam...@apache.org
> * Mridul Muralidharam mridul...@apache.org
> * Kay Ousterhout kayousterh...@apache.org
> * Nick Pentreath mln...@apache.org
> * Imran Rashid iras...@apache.org
> * Charles Reiss wog...@apache.org
> * Josh Rosen joshro...@apache.org
> * Prashant Sharma prash...@apache.org
> * Ram Sriharsha har...@apache.org
> * Shivaram Venkataraman shiva...@apache.org
> * Patrick Wendell pwend...@apache.org
> * Andrew Xia xiajunl...@apache.org
> * Reynold Xin r...@apache.org
> * Matei Zaharia ma...@apache.org
>
> NOW, THEREFORE, BE IT FURTHER RESOLVED, that Matei Zaharia be
> appointed to the office of Vice President, Apache Spark, to
> serve in accordance with and subject to the direction of the
> Board of Directors and the Bylaws of the Foundation until
> death, resignation, retirement, removal or disqualification, or
> until a successor is appointed; and be it further
>
> RESOLVED, that the Apache Spark Project be and hereby is
> tasked with the migration and rationalization of the Apache
> Incubator Spark podling; and be it further
>
> RESOLVED, that all responsibilities pertaining to the Apache
> Incubator Spark podling encumbered upon the Apache Incubator
> Project are hereafter discharged.

 







Re: IP Clearance before releasing

2013-12-12 Thread Doug Cutting

On Thu, Dec 12, 2013 at 7:38 AM, Alex Harui aha...@adobe.com wrote:

> There's no whistleblower provision
> for someone who thinks they see something that puts the foundation at risk
> from stopping those who don't see it.


If there's a clear legal problem with a release candidate I'd expect others
to acknowledge it and cancel the vote.  If they don't then that could be
escalated to the Incubator PMC.

Doug

Re: Changing moderation settings

2013-12-12 Thread Doug Cutting

You shouldn't need to subscribe jira to your list.  Rather just 'allow' a
message by using reply-all to a moderation request so that all future posts
from that sender are accepted.

Doug


On Thu, Dec 12, 2013 at 12:38 PM, Chen, Pei
pei.c...@childrens.harvard.edu wrote:

 dev-subscribe-jira=apache@storm.incubator.apache.org

  -Original Message-
  From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu]
  Sent: Thursday, December 12, 2013 3:37 PM
  To: general@incubator.apache.org
  Subject: RE: Changing moderation settings
 
  I believe a moderator can add jira to the subscription...
  Something like:
  Send request email and confirm sent to
  storm-dev-subscribe-jira=apache@incubator.apache.org
 
   -Original Message-
   From: nathan.m...@gmail.com [mailto:nathan.m...@gmail.com] On Behalf Of Nathan Marz
   Sent: Thursday, December 12, 2013 3:29 PM
   To: general@incubator.apache.org
   Subject: Changing moderation settings
  
   How can I change the moderation settings for the Storm user and dev lists?
   I'm getting enormous amounts of moderation emails (including lots
   triggered by JIRA). Is there a way to whitelist accounts, turn off
   moderation, and/or approve in bulk (like via a web interface)?
 

Re: [VOTE] Accept Twill for Incubation

2013-11-07 Thread Doug Cutting

+1

Doug

On Thu, Nov 7, 2013 at 1:04 PM, Andreas Neumann a...@apache.org wrote:
 The discussion about the Weave proposal has calmed. As the outcome of the
 discussion, we have chosen a new name for the project, Twill. I would like
 to call a vote for Twill to become an incubated project.

 The proposal is pasted below, and also available at:
 https://wiki.apache.org/incubator/TwillProposal

 Let's keep this vote open for three business days, closing the voting on
 Tuesday 11/12.

 [ ] +1 Accept Twill into the Incubator
 [ ] +0 Don't care.
 [ ] -1 Don't accept Twill because...

 -Andreas.

 = Abstract =

 Twill is an abstraction over Apache Hadoop® YARN that reduces the
 complexity of developing distributed applications, allowing developers to
 focus more on their business logic.

 = Proposal =

 Twill is a set of libraries that reduces the complexity of developing
 distributed applications. It exposes the distributed capabilities of Apache
 Hadoop® YARN via a simple and intuitive programming model similar to Java
 threads. Twill also has built-in capabilities required by many distributed
 applications, such as real-time application logs and metrics collection,
 application lifecycle management, and network service discovery.

 = Background =

 Hadoop YARN is a generic cluster resource manager that supports any type of
 distributed application. However, YARN’s interfaces are too low-level for
 rapid application development. It requires a great deal of boilerplate code
 even for a simple application, creating a high ramp-up cost that can turn
 developers away.

 Twill is designed to improve this situation with a programming model that
 makes running distributed applications as easy as running Java threads.
 With the abstraction provided by Twill, applications can be executed in
 process threads during development and unit testing and then be deployed to
 a YARN cluster without any modifications.

 Twill also has built-in support for real-time application logs and metrics
 collection, delegation token renewal, application lifecycle management, and
 network service discovery. This greatly reduces the pain that developers
 face when developing, debugging, deploying and monitoring distributed
 applications.

 Twill is not a replacement for YARN; it’s a framework that operates on top
 of YARN.

 = Rationale =

 Developers who write YARN applications typically find themselves
 implementing the same (or similar) boilerplate code over and over again
 for every application. It makes sense to distill this common code into a
 reusable set of libraries that is perpetually maintained and improved by a
 diverse community of developers.

 Twill’s simple thread-like programming model will enable many Java
 programmers to develop distributed applications. We believe that this
 simplicity will attract developers who would otherwise be discouraged by
 complexity, and many new use cases will emerge for the usage of YARN.

 Incubating Twill as an Apache project makes sense because Twill is a
 framework built on top of YARN, and Twill uses Apache Zookeeper, HDFS,
 Kafka, and other Apache software (see the External Dependencies section).

 = Current Status =

 Twill was initially developed at Continuuity under the name of Weave. The
 Weave codebase is currently hosted in a public repository at github.com,
 which will seed the Apache git repository after renaming to Twill.

 == Meritocracy ==

 Our intent with this incubator proposal is to start building a diverse
 developer community around Twill following the Apache meritocracy model.
 Since Twill was initially developed in early 2013, we have had fast
 adoption and contributions within Continuuity. We are looking forward to
 new contributors. We wish to build a community based on Apache's
 meritocracy principles, working with those who contribute significantly to
 the project and welcoming them to be committers both during the incubation
 process and beyond.

 == Community ==

 Twill is currently being used internally at Continuuity and is at the core
 of our products. We hope to extend our contributor base significantly and
 we will invite all who are interested in simplifying the development of
 distributed applications to participate.

 == Core Developers ==

 Twill is currently being developed by five engineers at Continuuity:
 Terence Yim, Andreas Neumann, Gary Helmling, Poorna Chandra and Albert
 Shau.
 Terence Yim is an Apache committer for Helix, Andreas is an Apache
 committer and PMC member for Oozie, and Gary Helmling is an Apache
 committer and PMC member for HBase. Poorna Chandra and Albert Shau have
 made many contributions to Twill.

 == Alignment ==

 The ASF is the natural choice to host the Twill project as its goal of
 encouraging community-driven open source projects fits with our vision for
 Twill.

 Additionally, many other projects with which we are familiar and expect
 Twill to integrate with, such as ZooKeeper, YARN, HDFS, log4j, and others
 mentioned

Re: Apache project bylaws

2013-10-02 Thread Doug Cutting

On Tue, Oct 1, 2013 at 11:13 PM, Alex Harui aha...@adobe.com wrote:
 The thread on members@ was titled Committer Qualifications.  I asked a
 question about the -1 vote on 9/7/13 and the reply I got was that
 committer voting does not have vetoes, and the document at [1] also seems
 to say that.

I followed up on that thread on members@, to get some clarity.

This issue has come up before.  I don't have time to search the
archives now, but I recall that folks agreed then that the norm at
Apache is consensus for committer additions.  The mention of
procedural votes on the voting page has been a source of confusion.
I suspect it is meant to allude to release plans and the like.  We
should clarify that it isn't meant to refer to committer or PMC member
votes, that those are generally subject to consensus votes.

Doug


Re: Apache project bylaws

2013-10-02 Thread Doug Cutting

On Wed, Oct 2, 2013 at 9:49 AM, Alex Harui aha...@adobe.com wrote:
 To me, agreeing on the norm is not the same as policy, especially policy
 that does not allow for exceptions.

I agree.  Establishing whether there is a norm is a useful first step.
That's what I'm trying to take.  Thus far I've seen no one disagree
that consensus is most common for committer additions at Apache.  I've
also seen folks suggest that they prefer having norms to having
explicit bylaws for their projects.  I don't anticipate any policy
being established as a result of this discussion, except perhaps
better documenting what the assumed default is for projects that don't
choose to have explicit bylaws.

 And again, to me, consensus != unanimity.

This might be another case where better documentation would help.  In
my experience at Apache, consensus is equated with unanimity.

Doug


Re: Apache project bylaws

2013-10-02 Thread Doug Cutting

On Wed, Oct 2, 2013 at 10:20 AM, Alex Harui aha...@adobe.com wrote:
 I'm not sure I understand the
 difference between consensus and unanimous consensus.  Your thoughts?

The difference seems to be the quorum requirement of 3 +1 votes in the
case of consensus but not in unanimous consensus.

They use unanimous consensus in that document only for removals.
Removals at Apache are instead typically consensus-but-one (the person
being removed) although some projects specify a 2/3 or 3/4
super-majority instead.

Another discrepancy with standard Apache policy and that document is
that they don't require 3 +1 votes for a release.
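
As a rough illustration of the removal thresholds mentioned above, a super-majority check can be sketched in Python. The function name, the "at least the threshold" reading, and the choice to ignore abstentions are assumptions made for this sketch; they are not taken from any project's bylaws.

```python
from fractions import Fraction


def supermajority_passes(plus: int, minus: int, threshold: Fraction) -> bool:
    """Return True if the share of +1 votes among votes cast meets the threshold.

    Assumption: abstentions (+0/-0) are not counted, and "meets" means
    greater than or equal to the threshold.
    """
    cast = plus + minus
    if cast == 0:
        return False  # nothing was cast, so nothing can pass
    return Fraction(plus, cast) >= threshold


# Hypothetical tallies, compared against the two thresholds named in the thread.
two_thirds = Fraction(2, 3)
three_quarters = Fraction(3, 4)
print(supermajority_passes(2, 1, two_thirds))      # 2 of 3 cast
print(supermajority_passes(2, 1, three_quarters))  # same tally, stricter bar
```

Using exact fractions rather than floating point avoids rounding surprises when a tally sits exactly on the threshold.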

Doug


Re: Apache project bylaws

2013-10-01 Thread Doug Cutting

On Tue, Oct 1, 2013 at 3:34 PM, Justin Mclean justinmcl...@gmail.com wrote:
 The whole reason this came about is that it's unclear what voting rules
 are the default when voting someone in as a committer. See [1] (consensus) and
 [2] (majority). Whether -1 is a veto or not is sort of an important thing to know,
 and which voting system is used actually changes how people vote.

The default at Apache is that committers and PMC members are added by
consensus.  In nearly every project code changes are also by consensus
while releases require 3 +1 votes from PMC members and more +1 votes
than -1 votes.  Projects that diverge from these should perhaps
document that somewhere, but projects that conform to these probably
don't need to.

I see no discrepancy between the documents you cite.  The first says
that committer votes are by consensus, the second says that
procedural votes are by majority, but doesn't define procedural and
there's no implication that it includes committer votes.
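
The two defaults described above can be sketched as a pair of small Python checks. This is only an illustration: the `Tally` type and function names are hypothetical, only binding (PMC member) votes are counted, and applying a quorum of three +1 votes to consensus votes is an assumption borrowed from elsewhere in this thread rather than a stated rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tally:
    """Binding votes only; non-binding votes are ignored in this sketch."""
    plus: int   # number of binding +1 votes
    minus: int  # number of binding -1 votes


def consensus_passes(t: Tally, quorum: int = 3) -> bool:
    """Consensus vote: any -1 acts as a veto.

    The quorum of three +1 votes is an assumption for illustration.
    """
    return t.plus >= quorum and t.minus == 0


def release_passes(t: Tally, quorum: int = 3) -> bool:
    """Release vote: at least three +1 votes and more +1 than -1 votes."""
    return t.plus >= quorum and t.plus > t.minus


# The same hypothetical tally gives different outcomes under the two rules.
tally = Tally(plus=5, minus=1)
print(consensus_passes(tally))  # the single -1 is a veto
print(release_passes(tally))    # a release cannot be vetoed
```

The contrast shows why it matters which rule applies: a single -1 blocks a committer addition under consensus but does not block a release.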

Doug


Re: Apache project bylaws

2013-10-01 Thread Doug Cutting

I don't find the discussion on members@ that comes to this conclusion. If
you cannot see members@ how do you know this?

Doug
On Oct 1, 2013 6:06 PM, Justin Mclean justinmcl...@gmail.com wrote:

 Hi,

  I see no discrepancy between the documents you cite.  The first says
  that committer votes are by consensus, the second says that
  procedural votes are by majority, but doesn't define procedural and
  there's no implication that it includes committer votes.

 There was conversation on members@ in the last couple of days (which I'm
 unable to view) that came to the opposite conclusion, so there's some
 confusion/differing opinion on the matter.

 Context is that I was under the assumption that consensus was required to
 vote a committer in, other PMC members thought otherwise or were unsure. On
 looking into it, I found it does vary from project to project and that it
 didn't seem to be defined clearly anywhere if your project doesn't have
 bylaws/guidelines.

 Thanks,
 Justin

Re: Apache project bylaws

2013-10-01 Thread Doug Cutting

Lots of people on this list are also on members@, and, so far, none have
objected to my statements. If this continues, it would indicate a lack of
controversy.

Doug
On Oct 1, 2013 7:36 PM, Justin Mclean justinmcl...@gmail.com wrote:

 Hi,

  I don't find the discussion on members@ that comes to this conclusion. If
  you cannot see members@ how do you know this?

 I was informed by a member on Flex private and here [1] which you already
 responded to.

 Thanks,
 Justin

 1. http://markmail.org/thread/chfagblj72cv7zrt




Re: Voting in Committers

2013-09-30 Thread Doug Cutting

On Sun, Sep 29, 2013 at 7:39 AM, Alex Harui aha...@adobe.com wrote:
 The answer I got on members@ is that [1] is not a policy document and
 therefore a vote as to whether to make someone a committer defaults to
 majority rules unless the TLP has voted otherwise, and a -1 vote is not a
 veto unless the TLP has voted otherwise.

On the contrary, I believe the default at Apache is that committer and
PMC votes are by consensus, that a -1 is a veto.  The rationale is
that committers and PMCs must regularly reach consensus on code
changes, so adding folks without consensus creates projects that
cannot function.

Doug


[RESULT] [VOTE] Accept Storm into the Incubator

2013-09-18 Thread Doug Cutting

On Thu, Sep 12, 2013 at 12:19 PM, Doug Cutting cutt...@apache.org wrote:
 I'd like to call a vote to accept Storm as a new Incubator podling.

This passes, with lots of +1 votes (plenty by PMC members) and no -1 votes.

Thanks for voting.

Doug


[VOTE] Accept Storm into the Incubator

2013-09-12 Thread Doug Cutting

 xumingmingv at gmail dot com
   * Jason Jackson jason at cvk dot ca
   * Andy Feng afeng at yahoo-inc dot com
   * Flip Kromer  flip at infochimps dot com
   * David Lao davidlao at microsoft dot com
   * P. Taylor Goetz ptgoetz at gmail dot com

== Affiliations ==

   * Nathan Marz - Nathan’s Startup
   * James Xu - Alibaba
   * Jason Jackson - Twitter
   * Andy Feng - Yahoo!
   * Flip Kromer - Infochimps
   * David Lao - Microsoft
   * P. Taylor Goetz - Health Market Science

== Sponsors ==


=== Champion ===

   * Doug Cutting  cutting at apache dot org

=== Nominated Mentors ===

  * Ted Dunning tdunning at maprtech dot com
  * Arvind Prabhakar arvind at apache dot org
  * Devaraj Das ddas at hortonworks dot com
  * Matt Franklin m.ben.franklin at gmail dot com
  * Benjamin Hindman benjamin.hindman at gmail dot com

=== Sponsoring Entity ===

 The Apache Incubator


Re: Moderation of report reminders

2013-07-16 Thread Doug Cutting

On Tue, Jul 16, 2013 at 10:13 AM, sebb seb...@gmail.com wrote:
 So why not pre-allow all automated senders when creating the podling list?

Why not pre-allow *@apache.org?

Doug


Re: Vote on personal matters: majority vote vs consensus

2013-03-28 Thread Doug Cutting

On Thu, Mar 28, 2013 at 9:29 AM, ant elder ant.el...@gmail.com wrote:
 Alternatively, you could say enough is enough and to end the debate
 you're going to call a vote to demonstrate i've the PMCs support - a
 vote on letting ant stay on. That sounds like you're being nice, but
 in fact you're being clever, because now you only need 25% of voters
 to vote -1 and i'm gone.

This sounds like a vote to support the status quo, which isn't
something we normally do.  Votes are typically phrased as changes to
the status quo, where a +1 indicates a vote for the change and a -1
indicates a vote to keep things as they are.  So there's a natural
valence to voting and the phrasing of the ballot should not alter
that.

Doug


Re: Vote on personal matters: majority vote vs consensus

2013-03-27 Thread Doug Cutting

On Wed, Mar 27, 2013 at 3:11 PM, Niall Pemberton
niall.pember...@gmail.com wrote:
 I think it should be 3/4 majority.

I agree that supermajority would be better than simple majority here.
Moving to simple majority seems too radical.  Over time it's more
prone to building a PMC that cannot easily agree on things.  If
consensus has proven too difficult to reach for a group this large,
then softening it a bit to supermajority seems like a better first
step than moving all the way to simple majority.

Doug


Re: No more existing-TLP graduations (was: [PROPOSAL] Curator for the Apache Incubator)

2013-02-27 Thread Doug Cutting

On Tue, Feb 26, 2013 at 2:17 PM, Benson Margulies bimargul...@gmail.com wrote:
 Guys, this was my point a few weeks ago, and the question I posed to
 the board. Did the board discuss it at the meeting, or is that part of
 the board meeting happening here?

Here is the comment I made in response to your question.  Several
other board members noted their assent with this statement and none
dissented.

   My understanding is that when a podling wishes to merge
   into an existing PMC, the decision is up to the
   receiving PMC, much like any other large contribution.
   The board provides oversight for this, as with all
   activity at the ASF.  That said, the IPMC provides
   valuable, relevant and desired input to the board's
   oversight.  Like a podling graduation, the IPMC can
   recommend action or caution to the board.

To my thinking, whether the IPMC wishes to get involved in projects
that might join existing PMCs is largely up to the IPMC, not the
board.  That said, it's sometimes hard to know at the outset whether
something will develop into a sufficiently large, independent
community or whether its developers might instead end up joining an
existing, closely related community.  I don't see why the IPMC would
want to forbid prospective podlings from mentioning this
possibility.

Doug


Re: [VOTE] Release Apache Crunch 0.5.0 (incubating) RC0

2013-02-19 Thread Doug Cutting

+1 Checksums & signatures match, tests pass, licensing looks to be in order.

Doug

On Fri, Feb 15, 2013 at 5:08 PM, Josh Wills jwi...@apache.org wrote:
 Hello,

 This is a call for a vote on releasing the following candidate as Apache
 Crunch 0.5.0 (incubating). This is our third release at Apache, and it
 fixes the following issues:

 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313526&version=12323476

 The vote will be open for at least 72 hours. We received 1 IPMC member vote
 from Patrick Hunt on the vote thread on crunch-dev, and will need two more
 IPMC votes in order to make the release.

 Release artifacts:
 http://people.apache.org/~jwills/crunch-0.5.0-incubating-RC0/

 Maven staging repo:
 https://repository.apache.org/content/repositories/orgapachecrunch-228/

 The tag to be voted upon:
 https://git-wip-us.apache.org/repos/asf?p=incubator-crunch.git;a=tag;h=e60ace8424109dc941b13262d43dab659ffaca8a

 Crunch's KEYS file:
 http://www.apache.org/dist/incubator/crunch/KEYS

 Thanks,
 Josh



Re: [VOTE] Graduate Apache Crunch Podling from the Incubator

2013-02-05 Thread Doug Cutting

+1

Doug

On Mon, Feb 4, 2013 at 11:41 AM, Josh Wills jwi...@apache.org wrote:
 This is a call to graduate the Apache Crunch podling from Apache Incubator.

 Apache Crunch entered the Incubator in May of 2012. We have made
 significant progress with the project since moving over to Apache. We have
 ten committers listed on our status page at [1] including three accepted
 after the podling was formed, and we have verified that Apache Crunch is a
 suitable name. [2]

 We completed two releases (Apache Crunch 0.3.0-incubating and Apache Crunch
 0.4.0-incubating) and are currently preparing for a third.

 The community of Apache Crunch is active, healthy, and growing and has
 demonstrated the ability to self-govern using accepted Apache practices.

 The Apache Crunch community voted overwhelmingly to graduate [3],
 collecting three binding votes from our mentors and IPMC members Arun
 Murthy, Tom White, and Patrick Hunt. You can view the discussion at [4] and
 [5].

 Our charter is below and on our wiki. [6]

 Please cast your votes:

 [ ] +1 Graduate Apache Crunch from Apache Incubator
 [ ] +0 Indifferent to graduation status of Apache Crunch
 [ ] -1 Reject graduation of Apache Crunch from Apache Incubator because...

 We'll run the vote for at least 72 hours (closing at the earliest at 8PM
 GMT on February 7th.)

 [1] http://incubator.apache.org/projects/crunch.html
 [2] https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-18
 [3] http://markmail.org/message/7mplf2wyzqhs2gts
 [4] http://markmail.org/message/3zu5wszwpaqegxic
 [5] http://markmail.org/message/wbz43fpnta7r2w4e
 [6] https://cwiki.apache.org/confluence/display/CRUNCH/Graduation+Resolution

 X. Establish the Apache Crunch Project

 WHEREAS, the Board of Directors deems it to be in the best interests of
 the Foundation and consistent with the Foundation's purpose to establish
 a Project Management Committee charged with the creation and maintenance
 of open-source software, for distribution at no charge to the public,
 related to the development of Java libraries for writing, testing, and
 running MapReduce pipelines.

 NOW, THEREFORE, BE IT RESOLVED, that a Project Management Committee
 (PMC), to be known as the Apache Crunch Project, be and hereby is
 established pursuant to Bylaws of the Foundation; and be it further

 RESOLVED, that the Apache Crunch Project be and hereby is responsible
 for the creation and maintenance of software related to development of
 Java libraries for writing, testing, and running MapReduce pipelines; and
 be it further

 RESOLVED, that the office of Vice President, Apache Crunch be and
 hereby is created, the person holding such office to serve at the direction
 of the Board of Directors as the chair of the Apache Crunch Project, and to
 have primary responsibility for management of the projects within the scope
 of responsibility of the Apache Crunch Project; and be it further

 RESOLVED, that the persons listed immediately below be and hereby
 are appointed to serve as the initial members of the Apache Crunch Project:

 * Brock Noland br...@apache.org
 * Christian Tzolov tzo...@apache.org
 * Gabriel Reid gr...@apache.org
 * Josh Wills jwi...@apache.org
 * Kiyan Ahmadizadeh ki...@apache.org
 * Matthias Friedrich m...@apache.org
 * Rahul Sharma rsha...@apache.org
 * Robert Chu robert...@apache.org
 * Tom White tomwh...@apache.org
 * Vinod Kumar Vavilapalli vino...@apache.org

 NOW, THEREFORE, BE IT FURTHER RESOLVED, that Josh Wills be appointed to the
 office of Vice President, Apache Crunch, to serve in accordance with and
 subject to the direction of the Board of Directors and the Bylaws of the
 Foundation until death, resignation, retirement, removal
 or disqualification, or until a successor is appointed; and be it further

 RESOLVED, that the initial Apache Crunch PMC be and hereby is tasked
 with the creation of a set of bylaws intended to encourage open development
 and increased participation in the Apache Crunch Project; and be it further

 RESOLVED, that the Apache Crunch Project be and hereby is tasked with
 the migration and rationalization of the Apache Incubator Crunch podling;
 and be it further

 RESOLVED, that all responsibilities pertaining to the Apache
 Incubator Crunch podling encumbered upon the Apache Incubator Project are
 hereafter discharged.


Re: [VOTE] Graduate Apache Etch podling from Apache Incubator

2012-12-18 Thread Doug Cutting

+1

Doug

On Tue, Dec 18, 2012 at 1:10 AM, Martin Veith martin.ve...@bmw-carit.de wrote:
 This is a call for vote to graduate the Apache Etch podling from Apache 
 Incubator.

 The Apache Etch project entered the Incubator in September 2008.
 Since then it has had ups and downs but we feel ready for graduation now.
 Our community, though small, is active and healthy.

 In the past four years we have grown the community in committers and users 
 (e.g. in the automotive domain).
 We made significant improvements to the project codebase and completed 
 several releases according to the ASF guidelines.

 The community (including our mentors and shepherds) has voted to proceed with 
 graduation [1], the result can be found at [2].
 Discussion and voting for the proposed resolution is also available at [3] 
 and [4].
 Please find the proposed board resolution below.

 Please cast your votes:
 [ ] +1 Graduate Apache Etch podling from Apache Incubator
 [ ] +0 Indifferent to the graduation status of Apache Etch podling
 [ ] -1 Reject graduation of Apache Etch podling from Apache Incubator because 
 ...

 Thanks,
 Martin

 [1] http://s.apache.org/etch-graduation-vote
 [2] http://s.apache.org/etch-graduation-vote-result
 [3] http://s.apache.org/etch-resolution-proposal
 [4] http://s.apache.org/etch-proposal-vote-result

 -
 Resolution:
 X. Establish the Apache Etch Project

WHEREAS, the Board of Directors deems it to be in the best
interests of the Foundation and consistent with the
Foundation's purpose to establish a Project Management
Committee charged with the creation and maintenance of
open-source software, for distribution at no charge to
the public, related to a cross-platform, language- and
transport-independent RPC-like messaging framework for
building and consuming network services.

NOW, THEREFORE, BE IT RESOLVED, that a Project Management
Committee (PMC), to be known as the Apache Etch Project,
be and hereby is established pursuant to Bylaws of the
Foundation; and be it further

RESOLVED, that the Apache Etch Project be and hereby is
responsible for the creation and maintenance of software
related to a cross-platform, language- and
transport-independent RPC-like messaging framework for
building and consuming network services;
and be it further

RESOLVED, that the office of Vice President, Apache Etch be
and hereby is created, the person holding such office to
serve at the direction of the Board of Directors as the chair
of the Apache Etch Project, and to have primary responsibility
for management of the projects within the scope of
responsibility of the Apache Etch Project; and be it further

RESOLVED, that the persons listed immediately below be and
hereby are appointed to serve as the initial members of the
Apache Etch Project:

 * Scott Comer scco...@apache.org
 * Martijn Dashorst dasho...@apache.org
 * Michael Fitzner fitz...@apache.org
 * Youngjin Park yp...@apache.org
 * Martin Veith vei...@apache.org

NOW, THEREFORE, BE IT FURTHER RESOLVED, that Martin Veith
be appointed to the office of Vice President, Apache Etch, to
serve in accordance with and subject to the direction of the
Board of Directors and the Bylaws of the Foundation until
death, resignation, retirement, removal or disqualification,
or until a successor is appointed; and be it further

RESOLVED, that the initial Apache Etch PMC be and hereby is
tasked with the creation of a set of bylaws intended to
encourage open development and increased participation in the
Apache Etch Project; and be it further

RESOLVED, that the Apache Etch Project be and hereby
is tasked with the migration and rationalization of the Apache
Incubator Etch podling; and be it further

RESOLVED, that all responsibilities pertaining to the Apache
Incubator Etch podling encumbered upon the Apache Incubator
Project are hereafter discharged.



Re: [VOTE] Release Apache Crunch 0.4.0 (incubating) RC1

2012-11-16 Thread Doug Cutting

+1 RAT tests pass (as do others), checksums & sigs match.

On Tue, Nov 13, 2012 at 9:54 AM, Matthias Friedrich m...@mafr.de wrote:
 Hi,

 this is a call for a vote on releasing the following candidate as
 Apache Crunch 0.4.0 (incubating). This is the second release candidate
 of our second release at Apache, and it fixes the following issues:

 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313526&version=12323244

 Our vote thread on crunch-dev:
 http://mail-archives.apache.org/mod_mbox/incubator-crunch-dev/201211.mbox/%3C20121113173454.GA3000%40mafr.de%3E

 We already collected one IPMC vote from Roman Shaposhnik (thanks!),
 so we still need two more votes.

 Please download, test, and vote by November 16th at 18:00 UTC.

 Release artifacts:
 http://people.apache.org/~mafr/apache-crunch-0.4.0-incubating-rc1/

 Maven staging repo:
 https://repository.apache.org/content/repositories/orgapachecrunch-034/

 The tag to be voted upon:
 https://git-wip-us.apache.org/repos/asf?p=incubator-crunch.git;a=commit;h=91e6c96899f85245255476f0a5e7d5feb48ddac0

 PGP keys for the Crunch team:
 https://people.apache.org/keys/group/crunch.asc

 Some basic release validation checks:
 https://cwiki.apache.org/confluence/display/CRUNCH/Validating+a+Release

 The vote will be open for 72 hours.

 Regards,
   Matthias


Re: [VOTE] Apache Crunch (incubating) 0.3.0 Release Candidate 1

2012-09-11 Thread Doug Cutting

+1  Downloaded sources, ran RAT, validated checksums.
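
The checksum step of a review like this can be sketched in Python. This is a minimal illustration, not the procedure actually used here: the function names are hypothetical, SHA-512 is an assumed default, and the expected value is assumed to be a bare hex digest rather than any particular checksum file format.

```python
import hashlib


def file_digest_hex(path: str, algorithm: str = "sha512",
                    chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large release artifacts need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path: str, expected_hex: str,
                     algorithm: str = "sha512") -> bool:
    """Compare a file's digest with a published one.

    Assumption: `expected_hex` holds only the hex digest; whitespace and
    letter case are normalized before comparing.
    """
    normalized = "".join(expected_hex.split()).lower()
    return file_digest_hex(path, algorithm) == normalized
```

A matching checksum only shows the download is intact; verifying the signature against the project's KEYS file is a separate step.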

Doug

On Tue, Sep 11, 2012 at 5:58 AM, Josh Wills jwi...@apache.org wrote:
 Hello everyone,

 This is a call for a vote on releasing the following candidate as Apache
 Crunch 0.3.0 (incubating). This will be our first release. A vote was held
 on the developer mailing list and passed with 4 +1s:

 http://markmail.org/thread/yvtvog5lrj3a7gep

 +1s:
 phunt (IPMC)
 jwills (binding)
 greid (binding)
 mafr (binding)

 We need two additional IPMC votes.

 The release fixes the issues listed here:

 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313526&version=12322446

 Please download, test, and vote by September 14th at 6AM Pacific Time.

 Source files:
 http://people.apache.org/~jwills/crunch-0.3.0-incubating-RC1/

 Maven staging repo:
 https://repository.apache.org/content/repositories/orgapachecrunch-040/

 The tag to be voted upon:
 https://git-wip-us.apache.org/repos/asf?p=incubator-crunch.git;a=tag;h=4666bd889f9b641d7c0157bc4401a1b985fedc89

 Crunch's KEYS file:
 http://www.apache.org/dist/incubator/crunch/KEYS

 The vote will be open for 72 hours.

 [ ] +1  approve
 [ ] +0  no opinion
 [ ] -1  disapprove (and reason why)

 Thank you,
 Josh


Re: [VOTE] Accept Drill into the Apache Incubator

2012-08-11 Thread Doug Cutting

Otis said his vote was 'blinding', not 'binding'.

Doug
On Aug 11, 2012 12:28 AM, Ted Dunning ted.dunn...@gmail.com wrote:

 This vote is now closed.

 In the responses to this thread, I count 15 binding positive votes and
 4 non-binding votes.  The number of positive votes increases to 17 if
 you count myself (the champion) and Isabel (a mentor) but neither of
 us actually sent the key email to record a vote (oops).

 One of the non-binding votes was by Otis Gospadnetic who said that his
 vote was binding, but I didn't find his name on the list of incubator
 PMC members, so I counted it as non-binding.  The list I used is at
 http://people.apache.org/committers-by-project.html#incubator-pmc

 By any count, this vote to admit Drill to incubator therefore passes.
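
 The counting rule described above (a +1 is binding only when the voter
 appears on the Incubator PMC list) can be sketched as follows. This is a
 minimal illustration, not the tooling anyone on this list used: the Vote
 type, the tally function, and the voter IDs are hypothetical, and matching
 on committer ID rather than display name is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter_id: str  # hypothetical committer ID, not a display name
    value: int     # +1, 0, or -1


def tally(votes, pmc_members):
    """Count +1 votes as binding or non-binding by PMC membership.

    -1 votes are counted together regardless of who cast them;
    0 votes are ignored.
    """
    result = {"binding": 0, "non_binding": 0, "minus_one": 0}
    for vote in votes:
        if vote.value == 1:
            key = "binding" if vote.voter_id in pmc_members else "non_binding"
            result[key] += 1
        elif vote.value == -1:
            result["minus_one"] += 1
    return result


if __name__ == "__main__":
    # Illustrative data only; these are not real voters.
    pmc = {"alice", "bob"}
    cast = [Vote("alice", 1), Vote("bob", 1), Vote("carol", 1)]
    print(tally(cast, pmc))
```

 Looking voters up by a stable ID avoids miscounts when a voter's display
 name differs from the name on the membership list.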

 This proposal includes mentors so this vote also constitutes
 acceptance of the mentors by the Incubator PMC.  All three of the
 mentors (Grant, myself, and Isabel) are Apache members.

 This proposal as approved also includes an initial list of committers,
 all of whom have ICLA's on file.

 I will coordinate with the other mentors and the committers to commit
 the status file and perform other establishment activities necessary
 to establish Drill as a project under incubation.  I expect that this
 will take several days.  I will announce progress on this mailing list
 to allow people to subscribe to the mailing lists.


 On Thu, Aug 9, 2012 at 11:27 AM, Andrew Purtell apurt...@apache.org wrote:
  +1 (non-binding)
 
  On Wed, Aug 8, 2012 at 8:11 AM, Ted Dunning ted.dunn...@gmail.com wrote:
  I would like to call a vote for accepting Drill for incubation in the
  Apache Incubator. The full proposal is available below.  Discussion
  over the last few days has been quite positive.
 
  Please cast your vote:
 
  [ ] +1, bring Drill into Incubator
  [ ] +0, I don't care either way,
  [ ] -1, do not bring Drill into Incubator, because...
 
  This vote will be open for 72 hours and only votes from the Incubator
  PMC are binding.  The start of the vote is just before 3AM UTC on 8
  August so the closing time will be 3AM UTC on 11 August.
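
  The closing time is simply the start time plus 72 hours. As a small
  sketch (the vote_close helper is hypothetical, and 03:00 UTC is used as
  an approximation of "just before 3AM UTC"):

```python
from datetime import datetime, timedelta, timezone


def vote_close(start_utc, hours=72):
    """Return the earliest closing time for a vote opened at start_utc."""
    return start_utc + timedelta(hours=hours)


# Approximate start of the vote: 8 August 2012, 03:00 UTC.
start = datetime(2012, 8, 8, 3, 0, tzinfo=timezone.utc)
print(vote_close(start).isoformat())  # 2012-08-11T03:00:00+00:00
```

  Keeping the timestamps timezone-aware avoids ambiguity when voters are
  spread across time zones.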
 
  Thank you for your consideration!
 
  Ted
 
  http://wiki.apache.org/incubator/DrillProposal
 
  = Drill =
 
  == Abstract ==
  Drill is a distributed system for interactive analysis of large-scale
  datasets, inspired by
  [[http://research.google.com/pubs/pub36632.html|Google's Dremel]].
 
  == Proposal ==
  Drill is a distributed system for interactive analysis of large-scale
  datasets. Drill is similar to Google's Dremel, with the additional
  flexibility needed to support a broader range of query languages, data
  formats and data sources. It is designed to efficiently process nested
  data. It is a design goal to scale to 10,000 servers or more and to be
  able to process petabytes of data and trillions of records in seconds.
 
  == Background ==
  Many organizations have the need to run data-intensive applications,
  including batch processing, stream processing and interactive
  analysis. In recent years open source systems have emerged to address
  the need for scalable batch processing (Apache Hadoop) and stream
  processing (Storm, Apache S4). In 2010 Google published a paper called
  Dremel: Interactive Analysis of Web-Scale Datasets, describing a
  scalable system used internally for interactive analysis of nested
  data. No open source project has successfully replicated the
  capabilities of Dremel.
 
  == Rationale ==
  There is a strong need in the market for low-latency interactive
  analysis of large-scale datasets, including nested data (e.g., JSON,
  Avro, Protocol Buffers). This need was identified by Google and
  addressed internally with a system called Dremel.
 
  In recent years open source systems have emerged to address the need
  for scalable batch processing (Apache Hadoop) and stream processing
  (Storm, Apache S4). Apache Hadoop, originally inspired by Google's
  internal MapReduce system, is used by thousands of organizations
  processing large-scale datasets. Apache Hadoop is designed to achieve
  very high throughput, but is not designed to achieve the sub-second
  latency needed for interactive data analysis and exploration. Drill,
  inspired by Google's internal Dremel system, is intended to address
  this need.
 
  It is worth noting that, as explained by Google in the original paper,
  Dremel complements MapReduce-based computing. Dremel is not intended
  as a replacement for MapReduce and is often used in conjunction with
  it to analyze outputs of MapReduce pipelines or rapidly prototype
  larger computations. Indeed, Dremel and MapReduce are both used by
  thousands of Google employees.
 
  Like Dremel, Drill supports a nested data model with data encoded in a
  number of formats such as JSON, Avro or Protocol Buffers. In many
  organizations nested data is the standard, so supporting a nested data
  model eliminates the need to normalize the data. With that said, flat

Re: [VOTE] Accept Blur into the Apache Incubator

2012-07-22 Thread Doug Cutting

 that
 currently use Blur are committed to improving the codebase of the
 project due to its fulfilling needs not addressed by any other
 software. In addition, one customer is providing financial support to
 further develop Blur given its importance on mission-critical
 projects.

 === Inexperience with Open Source ===
 The codebase has been treated internally as an open source project
 since its beginning, and Near Infinity has extensive experience
 developing and releasing open source projects
 (http://www.nearinfinity.com/products/open_source). We do not
 anticipate difficulty in operating under the Apache Way.

 === Homogeneous Developers ===
 Current developers are all employed by Near Infinity but we are
 actively seeking contributors from different companies and would
 welcome their participation.

 === Reliance on Salaried Developers ===
 Blur was originally created by Aaron !McCurry as a personal project
 and he remains the primary contributor.  Currently, Aaron’s employer
 (Near Infinity) fully supports his continued participation with paid,
 dedicated time to work on Blur. All other current developers are paid
 by Near Infinity to work on Blur as well.

 === Relationships with Other Apache Products ===
 Blur dependencies:

  * Apache Hadoop
  * Apache Lucene
  * Apache !ZooKeeper
  * Apache Thrift
  * Apache log4j

 === Apache Brand ===
 Our interest in releasing this code as an Apache project is due to its
 strong relationship with other Apache projects (Blur depends on
 Hadoop, Lucene, !ZooKeeper, and Thrift) and its uniqueness within the
 Hadoop ecosystem.

 == Documentation ==
 Current documentation can be found at http://blur.io and
 https://github.com/nearinfinity/blur.

 == Initial Source ==
 Blur has been in development since summer 2010. The core codebase
 consists of about 29,000 lines of code (about 10,000 if the generated
 RPC code is not included), mainly Java.

 == Source and Intellectual Property Submission Plan ==
 Blur core code, examples, documentation, and training materials will
 be submitted by Near Infinity Corporation.

 == External Dependencies ==
  * concurrentlinkedhashmap - Apache 2.0 License -
 http://code.google.com/p/concurrentlinkedhashmap/

 == Cryptography ==
 none

 == Required Resources ==
  * Mailing Lists
    * blur-private
    * blur-dev
    * blur-commits
    * blur-user
  * Subversion Directory
    * https://git-wip-us.apache.org/repos/asf/blur.git
  * Issue Tracking
    * JIRA
  * Continuous Integration
    * Jenkins
  * Web
    * http://incubator.apache.org/blur/wiki at http://wiki.apache.org
      or http://cwiki.apache.org

 == Initial Committers ==
  * Aaron !McCurry (aaron.mccurry at nearinfinity dot com)
  * Scott Leberknight (scott.leberknight at nearinfinity dot com)
  * Ryan Gimmy (ryan.gimmy at nearinfinity dot com)
  * Tim Williams (twilliams at apache dot org)
  * Patrick Hunt (phunt at apache dot org)
  * Doug Cutting (cutting at apache dot org)

 == Affiliations ==
  * Aaron !McCurry, Near Infinity
  * Scott Leberknight, Near Infinity
  * Ryan Gimmy, Near Infinity
  * Patrick Hunt, Cloudera
  * Doug Cutting, Cloudera

 == Sponsors ==
  * Champion: Patrick Hunt

 == Nominated Mentors ==
  * Tim Williams  (twilliams at apache dot org)
  * Doug Cutting (cutting at apache dot org)
  * Patrick Hunt (phunt at apache dot org)

 == Sponsoring Entity ==
  * Apache Incubator


Re: [VOTE] Accept Crunch into the Apache Incubator

2012-05-24 Thread Doug Cutting


+1

Doug

On 05/23/2012 11:45 AM, Josh Wills wrote:

I would like to call a vote for accepting Apache Crunch for
incubation in the Apache Incubator. The full proposal is available
below.  We ask the Incubator PMC to sponsor it, with phunt as
Champion, and phunt, tomwhite, and acmurthy volunteering to be
Mentors.

Please cast your vote:

[ ] +1, bring Crunch into Incubator
[ ] +0, I don't care either way,
[ ] -1, do not bring Crunch into Incubator, because...

This vote will be open for 72 hours and only votes from the Incubator
PMC are binding.

http://wiki.apache.org/incubator/CrunchProposal

Proposal text from the wiki:
--
= Crunch - Easy, Efficient MapReduce Pipelines in Java and Scala =

== Abstract ==

Crunch is a Java library for writing, testing, and running pipelines
of !MapReduce jobs on Apache Hadoop.

== Proposal ==

Crunch is a Java library for writing, testing, and running pipelines
of !MapReduce jobs on Apache Hadoop. Its main goal is to provide a
high-level API for writing and testing complex !MapReduce jobs that
require multiple processing stages.  It has a simple, flexible, and
extensible data model that makes it ideal for processing data that
does not naturally fit into a relational structure, such as time
series and serialized object formats like JSON and Avro. It supports
running pipelines either as a series of !MapReduce jobs on an Apache
Hadoop cluster or in memory on a single machine for fast testing and
debugging.

== Background ==

Crunch was initially developed by Cloudera to simplify the process of
creating sequences of dependent !MapReduce jobs, especially jobs that
processed non-relational data like time series. Its design was based
on a paper Google published about a Java library they developed called
!FlumeJava that was created in order to solve a similar class of
problems. Crunch was open-sourced by Cloudera on !GitHub as an Apache
2.0 licensed project in October 2011. During this time Crunch has been
formally released twice, as versions 0.1.0 (October 2010) and 0.2.0
(February 2012), with an incremental update to version 0.2.1 (March
2012).  These releases are also distributed by Cloudera as source and
binaries from Cloudera's Maven repository.

== Rationale ==

Most of the interesting analytical and data processing tasks that are
run on an Apache Hadoop cluster require a series of !MapReduce jobs to
be executed in sequence. Developers who are creating these pipelines
today need to manually assign the sequence of tasks to perform in a
dependent chain of !MapReduce jobs, even though there are a number of
well-known patterns for fusing dependent computations together into a
single !MapReduce stage and for performing common types of joins and
aggregations. This results in !MapReduce pipelines that are more
difficult to test, maintain, and extend to support new functionality.

Furthermore, the type of data that is being stored and processed using
Apache Hadoop is evolving. Although Hadoop was originally used for
storing large volumes of structured text in the form of webpages and
log files, it is now common for Hadoop to store complex, structured
data formats such as JSON, Apache Avro, and Apache Thrift. These
formats allow developers to work with serialized objects in
programming languages like Java, C++, and Python, and allow for new
types of analysis to be performed on complex data types. Hadoop has
also been adopted by the scientific research community, who are using
Hadoop to process time series data, structured binary files in the
HDF5 format, and large medical and satellite images.

Crunch addresses these challenges by providing a lightweight and
extensible Java API for defining the stages of a data processing
pipeline, which can then be run on an Apache Hadoop cluster as a
sequence of dependent !MapReduce jobs, or in-memory on a single
machine to facilitate fast testing and debugging. Crunch relies on a
small set of primitive abstractions that represent immutable,
distributed collections of objects. Developers define functions that
are applied to those objects in order to generate new immutable,
distributed collections of objects. Crunch also provides a library of
common !MapReduce patterns for performing efficient joins and
aggregation operations over these distributed collections that
developers may integrate into their own pipelines. Crunch also
provides native support for processing structured binary data formats
like JSON, Apache Avro, and Apache Thrift, and is designed to be
extensible to support working with any kind of data format that Java
supports in its native form.

== Initial Goals ==

Crunch is currently in its first major release with a considerable
number of enhancement requests, tasks, and issues recorded towards its
future development. The initial goal of this project will be to
continue to build community in the spirit of the Apache

Re: [VOTE] CloudStack for Apache Incubator

2012-04-10 Thread Doug Cutting

+1

Doug
 On Apr 9, 2012 6:32 PM, Kevin Kluge kevin.kl...@citrix.com wrote:

 Hi All.  I'd like to call for a VOTE for CloudStack to enter the
 Incubator.  The proposal is available at [1] and I have also included it
 below.   Please vote with:
 +1: accept CloudStack into Incubator
 +0: don't care
 -1: do not accept CloudStack into Incubator (please explain the objection)

 The vote is open for at least 72 hours from now (until at least 19:00
 US-PST on April 12, 2012).

 Thanks for the consideration.

 -kevin

 [1] http://wiki.apache.org/incubator/CloudStackProposal




 Abstract

 CloudStack is an IaaS (Infrastructure as a Service) cloud orchestration
 platform.

 Proposal

 CloudStack provides control plane software that can be used to create an
 IaaS cloud. It includes an HTTP-based API for user and administrator
 functions and a web UI for user and administrator access. Administrators
 can provision physical infrastructure (e.g., servers, network elements,
 storage) into an instance of CloudStack, while end users can use the
 CloudStack self-service API and UI for the provisioning and management of
 virtual machines, virtual disks, and virtual networks.

 Citrix Systems, Inc. submits this proposal to donate the CloudStack source
 code, documentation, websites, and trademarks to the Apache Software
 Foundation (ASF).

 Background

 Amazon and other cloud pioneers invented IaaS clouds. Typically these
 clouds provide virtual machines to end users. CloudStack additionally
 provides baremetal OS installation to end users via a self-service
 interface. The management of physical resources to provide the larger goal
 of cloud service delivery is known as orchestration. IaaS clouds are
 usually described as elastic -- an elastic service is one that allows its
 user to rapidly scale up or down their need for resources.

 A number of open source projects and companies have been created to
 implement IaaS clouds. Cloud.com started CloudStack in 2008 and released
 the source under GNU General Public License version 3 (GPL v3) in 2010.
 Citrix acquired Cloud.com, including CloudStack, in 2011. Citrix
 re-licensed the CloudStack source under Apache License v2 in April, 2012.

 Rationale

 IaaS clouds provide the ability to implement datacenter operations in a
 programmable fashion. This functionality is tremendously powerful and
 benefits the community by providing:

 - More efficient use of datacenter personnel
 - More efficient use of datacenter hardware
 - Better responsiveness to user requests
 - Better uptime/availability through automation

 While there are several open source IaaS efforts today, none are governed
 by an independent foundation such as ASF. Vendor influence and/or
 proprietary implementations may limit the community's ability to choose the
 hardware and software for use in the datacenter. The community at large
 will benefit from the ability to enhance the orchestration layer as needed
 for particular hardware or software support, and to implement algorithms
 and features that may reduce cost or increase user satisfaction for
 specific use cases. In this respect the independent nature of the ASF is
 key to the long term health and success of the project.

 Initial Goals

 The CloudStack project has two initial goals after the proposal is
 accepted and the incubation has begun.

 The CloudStack Project's first goal is to ensure that the CloudStack
 source includes only third party code that is licensed under the Apache
 License or open source licenses that are approved by the ASF for use in ASF
 projects. The CloudStack Project has begun the process of removing third
 party code that is not licensed under an ASF approved license. This is an
 ongoing process that will continue into the incubation period. Third party
 code contributed to CloudStack under the CloudStack contribution agreement
 was assigned to Cloud.com in exchange for distributing CloudStack under
 GPLv3. The CloudStack project has begun the process of amending the
 previous CloudStack contribution agreements to obtain consent from existing
 contributors to change the CloudStack project's license. In the event that
 an existing contributor does not consent to this change, the project is
 prepared to remove that contributor's code. Additionally, there are binary
 dependencies on redistributed libraries that are not provided with an
 ASF-approved license. Finally, CloudStack has source files incorporated
 from third parties that were not provided with an ASF-approved license. We
 have begun the process of re-writing this software. This is an ongoing
 process that will extend into the incubation period. These issues are
 discussed in more detail later in the proposal.

 Although CloudStack is open source, many design documents and discussions
 that should have been publicly available and accessible were not
 publicized. The Project's second goal will be to fix this lack of
 transparency by encouraging the initial committers to

Re: [VOTE] Release Apache Flume version 1.1.0-incubating (rc1)

2012-03-25 Thread Doug Cutting

+1 Checksums and signatures match, tests pass, RAT finds no issues.

Doug
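
The checksum part of a review like this can be sketched in a few lines.
This is an illustration under stated assumptions, not the procedure used
for this vote: the helper names are hypothetical, and the checksum file
is assumed to hold the hex digest as its first whitespace-separated
token (optionally followed by a file name). Signature (*.asc)
verification and RAT are separate steps and are not covered here.

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Compute the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, checksum_text, algorithm="sha1"):
    """Compare a file's digest with the contents of a checksum file.

    Assumes the expected digest is the first token of checksum_text.
    """
    tokens = checksum_text.split()
    if not tokens:
        return False
    return file_digest(path, algorithm) == tokens[0].lower()
```

A reviewer would call matches_checksum once per artifact and algorithm,
for example with algorithm="md5" for a *.md5sum file and "sha1" for a
*.sha1sum file.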

On 03/19/2012 05:46 PM, Arvind Prabhakar wrote:
 This is the second incubator release for Apache Flume, version
 1.1.0-incubating. We are now voting on release candidate rc1.
 
 *** Please cast your vote within the next 72 hours ***
 
 The list of fixed issues:
 https://svn.apache.org/repos/asf/incubator/flume/tags/flume-1.1.0-incubating-rc1/CHANGELOG
 
 The tarball (*.tar.gz), signature (*.asc), checksum (*.md5sum,
 *.sha1sum) for the source and binary can be found at:
 http://people.apache.org/~arvind/flume/110rc1/
 
 The tag to be voted upon:
 https://svn.apache.org/repos/asf/incubator/flume/tags/flume-1.1.0-incubating-rc1/
 
 The KEYS file:
 http://www.apache.org/dist/incubator/flume/KEYS
 
 Changes since last build:
 * FLUME-1032. Fix Flume NG build for binary distribution
 * Updated change log and release notes.
 

Re: [DISCUSS] - Packages renaming and backward compatibility (was: Re: [VOTE] Graduate Sqoop podling from Apache Incubator)

2012-03-08 Thread Doug Cutting

On 03/07/2012 11:31 PM, Alex Karasulu wrote:
 Not trying to beat a dead horse to death here but I'm starting to think
 that we might have had some basis to these package namespace issues. The
 recent private Lucene-Commons threads show what can happen if this policy
 is that hmmm liberal. Don't know if that's the right choice of words.

The differences between the cases should inform any policy.

In one case you have the inclusion of an older package name for
back-compatibility by the same community that created the older API.  In
the other case you have the inclusion of an API that conflicts with one
managed by a different, still-active community.

Doug


Re: Thoughts on Incubator board reports

2012-03-06 Thread Doug Cutting

Jukka,

This sounds like a great plan to me.  Providing the board with a summary
demonstrates that the IPMC has reviewed all of the podling reports and
assessed the progress of each podling.  Also including the full podling
reports to the board both gives supporting evidence to the podling
summaries and preserves the podling report for posterity in the minutes,
so that the board and others can easily review progress from past quarters.

+1

Doug

On 03/05/2012 03:29 PM, Jukka Zitting wrote:
 Hi,
 
 During the February board meeting there was a discussion about what
 the directors would like to see in Incubator reports. The feedback we
 got on this ranged from providing just an executive summary of all
 Incubator activity to doing that *and* including all the podling
 reports. While there was no clear single message, the overall
 impression I got was that the board expects the Incubator PMC to
 provide better and more active oversight on podlings. At the same time
 many directors also wanted to hear directly from the podlings
 themselves.
 
 After thinking about this for a while, here's what I think we should do:
 
 First of all I think we should keep including the individual podling
 reports in the Incubator board report. The main reason for doing this
 is that I think we should get the podlings into the habit of
 reporting to the board instead of just to the IPMC right from the
 beginning. The IPMC will provide extra review and feedback to help the
 podlings, but ultimately all the reports are addressed to the ASF
 board. This approach should also be in line with the ideas of scaling
 back the Incubator and making podlings more autonomous.
 
 Second, to address concerns about oversight within the Incubator as a
 whole and to provide enough information to directors who may not be
 interested in all the details of individual podling reports, the IPMC
 should also provide a report summary along the lines of what we did
 last month. In addition to basic classification of podlings based on
 their progress, we should also highlight any notable issues or other
 topics the board may want to focus on.
 
 Finally, and crucially since the above isn't too different from what
 we've been doing all along, we'll take some time to discuss the
 podling reports that need some clarification or for which some other
 kind of feedback should be given. As you've seen, I've already started
 doing some of that and I'm hoping to set an example for others to
 follow. Iterate for a few months, and I believe the result should be a
 notable increase in report quality, graduation focus and more
 generally the awareness within the IPMC of how the podlings are doing.
 
 BR,
 
 Jukka Zitting
 

Re: [VOTE] Graduate Sqoop podling from Apache Incubator

2012-02-29 Thread Doug Cutting

On 02/29/2012 01:33 AM, Alex Karasulu wrote:
 No project should be allowed to graduate without solving all issues
 pertaining to marks. It's a failure of the incubator in the past for
 allowing other projects to do so. I'm shocked it was allowed.

This is not a trademark issue.  Package names are subject to fair use.

Doug


Re: [VOTE] Graduate Sqoop podling from Apache Incubator

2012-02-29 Thread Doug Cutting

On 02/29/2012 06:19 AM, Alex Karasulu wrote:
 The class/package names are merely not being deleted. Presuming that the
  original code was part of the inceptional code grant, one can conclude that
  the company in question doesn't mind their namespace being used by ASF
  projects *for that purpose*.
 
 
 OK I'm completely content if the Co. in question does so in writing freeing
 us of any responsibility.

I don't think this is required.

 ... the names of the Java language API files, packages, classes, and
methods are not protectable as a matter of law ...

http://www.groklaw.net/articlebasic.php?story=20110915194531435

Doug


Re: [VOTE] Graduate Sqoop podling from Apache Incubator

2012-02-28 Thread Doug Cutting

On 02/28/2012 12:59 AM, Alex Karasulu wrote:
 That namespace is a mark of Cloudera. 

Package names are not generally considered to be trademarks.

Doug


Re: [VOTE] Graduate Sqoop podling from Apache Incubator

2012-02-28 Thread Doug Cutting

On 02/28/2012 06:01 AM, Ate Douma wrote:
 And specifically as this seems to concern compatibility support for
 Cloudera own API, only needed for Cloudera customers.

Sqoop was an Apache-licensed open source project at Github before it
came to Apache.  It's thus safe to assume that it had users who were not
Cloudera customers before it came to Apache.

Doug





Re: [DISCUSS] Moving forward

2012-02-09 Thread Doug Cutting

On 02/09/2012 07:42 AM, Mattmann, Chris A (388J) wrote:
 2. I wrote an Incubator deconstruction proposal here:
 http://wiki.apache.org/incubator/IncubatorDeconstructionProposal
 
 I still wholly believe in the proposal and that it should be implemented.
 It contains a series of (potentially revertible) steps that in my mind will
 remove unnecessary overheads and get us to the philosophy of 
 Incubation yes; Incubator, no. The Incubator was a success, it's 
 served its purpose. We should celebrate and move on.

An alternative to pro-actively deconstructing the Incubator might be to
try the direct-to-TLP approach on some new projects.  If this new
approach demonstrates a smoother path to TLP than traditional incubation
then new projects would prefer it and the Incubator would wither and die
a natural death.  We don't need a vote by the IPMC to try this.  Just
find a candidate project and bring a proposal to the board.

Doug


Re: [VOTE] Jukka Zitting for IPMC Chair (was Re: NOMINATIONS for Incubator PMC Chair)

2012-02-09 Thread Doug Cutting

+1

Doug

On 02/09/2012 07:16 AM, Mattmann, Chris A (388J) wrote:
 Hi Folks,
 
 OK there has been enough discussion here. It's time to VOTE for a new IPMC
 chair and it looks like the remaining folks (including me) that were in the
 running have aligned behind the following nominee: Jukka Zitting. Suffice
 to say, he was *my first choice* :)
 
 In the interest of moving the current discussion matters forward, please VOTE
 on this recommendation to the board by the IPMC. I'll leave the VOTE open
 for at least the next 72 hours:
 
 [ ] +1 Recommend Jukka Zitting for the IPMC chair position.
 [ ] +0 Don't care.
 [ ]  -1 Don't recommend Jukka Zitting for the IPMC chair position because...
 
 Note that only VOTEs from the Incubator PMC members are binding, but
 all are welcome to voice their opinion and it will be recorded in the final
 tallies. 
 
 Finally, just to note, these VOTEs on personnel are normally the only
 thing in Apache that is discussed in private (human/social issues), but
 in the interest of openness and transparency that has been demonstrated
 here during these discussions, I will hold this VOTE on the public list.
 
 Thanks!
 
 Cheers,
 Chris
 
 P.S. Here's my +1. Thanks buddy.
 
 On Feb 8, 2012, at 3:11 PM, Benson Margulies wrote:
 
 I am happy to step out of the way for Jukka. He was clever enough to
 stay out of the email s*** storm, and that alone, in my mind, renders
 him most qualified.

 On Wed, Feb 8, 2012 at 6:02 PM, Christian Grobmeier grobme...@gmail.com wrote:
 I already mentioned that I would have nominated you, and so I am
 delighted to read your message. It will be very difficult to choose
 between all these strong candidates.

 Cheers

 On Wed, Feb 8, 2012 at 11:49 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 Hi,

 After consideration and some convincing (thanks!), I've decided to
 also throw my hat into the ring as a candidate to be the next chairman
 of the IPMC.

 I believe in that role I could be more effective in focusing more of
 our collective attention where I think it would do the most good: on
 the actual podlings we're here to help.

 That said, the current incubation process clearly has problems and I
 very much support efforts to improve the way we work (even if the
 result is to replace the Incubator with something better). However,
 I'd like to leave the leadership on these efforts to others and, as
 mentioned elsewhere, rather try to act as a balancing force that helps
 achieve consensus where possible.

 Should I be elected, I'd resign as the chairman of the Jackrabbit PMC.
 In fact I think it's in any case high time for Jackrabbit to be
 rotating that role.

 Finally, if elected (and assuming the IPMC still exists), I'd serve
 for at most two years before calling for a re-election, or possibly
 much less if I don't find enough free cycles to perform the duty as
 well as it should.

 BR,

 Jukka Zitting

 
 
 ++
 Chris Mattmann, Ph.D.
 Senior Computer Scientist
 NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
 Office: 171-266B, Mailstop: 171-246
 Email: chris.a.mattm...@nasa.gov
 WWW:   http://sunset.usc.edu/~mattmann/
 ++
 Adjunct Assistant Professor, Computer Science Department
 University of Southern California, Los Angeles, CA 90089 USA
 ++
 
 

Re: [DISCUSS] Re: [VOTE] Jukka Zitting for IPMC Chair (was Re: NOMINATIONS for Incubator PMC Chair)

2012-02-09 Thread Doug Cutting

On 02/09/2012 08:39 AM, sebb wrote:
 In case it's not obvious, I agree with Ross, Andrus and Marcel - I
 think the current VOTE thread is invalid and should be cancelled.

I don't see how it is invalid.  Chris might have added more choices or
invited more discussion first, but he can call a vote.  If you prefer
that this choice not be made at this time then vote -1 and explain your
rationale, no?

Doug


Re: PMC chair vs. reorg proposals

2012-02-06 Thread Doug Cutting

On 02/05/2012 11:40 AM, Benson Margulies wrote:
 If the board decides to go that way, I am happy to see Chris in charge
 of the transition.

It's not the board's decision to make.  The folks in the Incubator need
to decide what they as volunteers want to do.  As a board member, either
approach is acceptable to me.

Also, there doesn't need to be just a single approach.  If there's not a
clear consensus one way or the other then some folks might try an
experiment with direct-to-TLP incubation while some others might try
revitalizing the IPMC.

Doug


too much traffic here

2012-02-04 Thread Doug Cutting

On 02/04/2012 09:15 AM, Mattmann, Chris A (388J) wrote:
 We both care about this stuff, which is why we keep replying. I'm happy
 to continue to reply, so long as you are when I feel it's warranted. I've
 ignored a few of them that I didn't have the energy to, but that's the point
 of a mailing list. At the end of the day, I hope with the diligence and effort
 I've provided to reply to folks' concerns (whether they think I've replied
 or addressed them or not), I am basically brain dumping and trying to 
 not leave any question as to what my opinions are. 

Chris & others,

Please do not feel obliged to answer every response to your messages
here.  That does not scale well.  There are a lot of people who read
this list.  Each message sent is equivalent to taking the floor in a
meeting with all of these people in the room.  Please pause to consider
that before responding or folks will start to leave this room.

A best practice on a list with this many members is to respond
thoughtfully on each thread just once per day.  Try to keep your
responses short, sticking to the points which are most important to you.
Then wait, giving folks who might only read the list once a day a
chance to respond to each of your messages.  Use this time to gauge the
general response of the community.  Then try to respond to the list as a
whole rather than individually to each member of the list.

Thanks,

Doug


Re: Nomination of Chris Mattmann for the IPMC Chair (was: Re: NOMINATIONS for Incubator PMC Chair)

2012-02-03 Thread Doug Cutting

On 02/02/2012 09:58 PM, Mattmann, Chris A (388J) wrote:
 What do Board members think? IPMC hats on? Great. Board 
 hats on? Great too. Would be great to get opinions now 
 rather than have to wait. 

I like the simplicity of erasing the layer of management that is the
Incubator.

The board is a stricter parent but with less attention to detail and
patience than the IPMC has shown.  Board members are not likely to
examine every proposed release tarball to check that everything is
licensed correctly.  On the other hand, if a project doesn't report or
fails to act on advice from the board for long, then the board will
replace the chair or propose closing the project.

Would it work to have the board as a single parent?  Yes, I think it would.
It would be a tough love approach.  However if there were also people
advising and monitoring young projects then things might go more
smoothly.  So if folks are willing to organize and manage this kinder,
gentler parent/teacher then I'd be happy to have a VP Incubation.

Doug


Re: [DISCUSS] eliminate vetoes on personnel votes

2012-01-31 Thread Doug Cutting

On 01/30/2012 05:12 PM, Greg Stein wrote:
 I've never liked vetoes for this. One person can hold an entire PMC hostage
 simply for disliking someone (or worse: subtle corporate concerns masked
 otherwise). People have said in the past, you should have veto so you're
 not forced to work with somebody you dislike. I respond: grow up. We work
 with annoying people all the time, and the majority says they *can* work

When this question came up in another context, Roy's concern, as I
recall it, was something to the effect that if you don't allow vetoes of
proposed PMC members then you might create a dysfunctional PMC.  (Roy,
please correct me if I misrecall.)  A PMC needs to regularly reach
consensus.  If person X has technical ideas that are incompatible with
person Y then perhaps they should not be on the same PMC.  At least
that's the way I recall Roy's argument...

Also note that if you get to the point where one person is vetoing a PMC
addition then the rest of the PMC could vote to remove that one person.
A veto is effectively asking the PMC to choose between you and the new
person, a strident move.

A less confrontational approach is to have a discussion before any vote,
where folks can air their concerns.  If folks voice significant concerns
then it might not be wise to hold a vote.

Finally I'll observe that if a supermajority vote would produce a different
result than consensus then the PMC probably has serious problems
collaborating that need to be fixed.

Doug


comments for Incubator PMC from board

2012-01-24 Thread Doug Cutting

Incubator PMC,

Recent reports from your PMC to the board do not appear to have been
thoroughly reviewed.  Prior to submitting your report to the board, the
Incubator PMC should review all podling reports, note problems, and,
when needed, take action.  Direct podling oversight is the
responsibility of the Incubator PMC.

If a podling's reports are inadequate, uninformative or missing then the
Incubator PMC should work to ensure that the podling improve its
reports.  A summary of such actions by your PMC should be included in
the Incubator report.

Please include in next month's report a summary of any new processes you
put in place to better ensure adequate review and oversight of podlings.

Thanks,

Doug


Re: Q. Forks without consensus?; A. anytime / depends / never without agreement

2012-01-03 Thread Doug Cutting

On 01/03/2012 07:35 AM, William A. Rowe Jr. wrote:
 [1] I don't see it as our place to *judge* communities. If it is a fork,
 or a corporate spin-out, or a move, or brand new... All Good. 
 
 [2] At Apache, all contributions are voluntary.  We do not accept code
 from copyright owners who don't want us to have it, even if we have
 the legal right to adopt it for other reasons.

These aren't necessarily contradictory.  At least part of what Roy's
saying is that if someone doesn't intend to distribute their software
under the Apache license then we should not take it.  But I think if
someone's clearly established their intent to publish a body of software
under the Apache license and a new community forms around that software
that's distinct from its original authors, then we can consider housing
that community.

Doug



Re: [VOTE] Release for Bigtop version 0.2.0-incubating

2011-11-11 Thread Doug Cutting

+1  Signatures and checksums look good.  Rat reports no license problems.

Doug

On 11/02/2011 06:01 PM, Roman Shaposhnik wrote:
 This is the second incubator release for Apache Bigtop, version
 0.2.0-incubating.
 
 It fixes the following issues:

 https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12317591&projectId=12311420
 
 *** Please download, test, and vote by Saturday, November 7
 
 Note that we are voting on the source (tag): release-0.2.0-incubating-RC1
 
 Source tarball, checksums, signature:
  http://people.apache.org/~rvs/bigtop-0.2.0-incubating-RC1/
 
 The tag to be voted on:
 
 https://svn.apache.org/repos/asf/incubator/bigtop/tags/release-0.2.0-incubating-RC1/
 
 Bigtop's KEYS file, containing the PGP keys used to sign the release:
 http://svn.apache.org/repos/asf/incubator/bigtop/dist/KEYS
 
 Note that the Incubator PMC needs to vote on the release after a successful
 PPMC vote before any release can be made official.
 
 Thanks!
 
 


Re: [VOTE] S4 to join the Incubator

2011-09-24 Thread Doug Cutting

+1

Doug
On Sep 20, 2011 1:57 PM, Patrick Hunt ph...@apache.org wrote:
 It's been nearly a week since the S4 proposal was submitted for
 discussion. A few questions were asked, and the proposal was clarified
 in response. Sufficient mentors have volunteered. I thus feel we are
 now ready for a vote.

 The latest proposal can be found at the end of this email and at:

 http://wiki.apache.org/incubator/S4Proposal

 The discussion regarding the proposal can be found at:

 http://s.apache.org/RMU

 Please cast your votes:

 [ ] +1 Accept S4 for incubation
 [ ] +0 Indifferent to S4 incubation
 [ ] -1 Reject S4 for incubation

 This vote will close 72 hours from now.

 Thanks,

 Patrick

 --
 = S4 Proposal =

 == Abstract ==

 S4 (Simple Scalable Streaming System) is a general-purpose,
 distributed, scalable, partially fault-tolerant, pluggable platform
 that allows programmers to easily develop applications for processing
 continuous, unbounded streams of data.

 == Proposal ==

 S4 is a software platform written in Java. Clients that send and
 receive events can be written in any programming language. S4 also
 includes a collection of modules called Processing Elements (or PEs
 for short) that implement basic functionality and can be used by
 application developers. In S4, keyed data events are routed with
 affinity to Processing Elements (PEs), which consume the events and do
 one or both of the following: (1) ''emit'' one or more events which
 may be consumed by other PEs, (2) ''publish'' results. The
 architecture resembles the Actors model, providing semantics of
 encapsulation and location transparency, thus allowing applications to
 be massively concurrent while exposing a simple programming interface
 to application developers.
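
The keyed routing described above can be illustrated with a minimal Python sketch. This is not S4's API (S4 itself is written in Java): the `CountingPE`, `Node`, `route` and `run` names are hypothetical, and hashing the key modulo the node count is only an assumed way of achieving "affinity", since the proposal does not say how S4 routes events.

```python
import zlib


class CountingPE:
    """Hypothetical Processing Element that counts the events seen for one key."""

    def __init__(self, key):
        self.key = key
        self.count = 0

    def process(self, event):
        self.count += 1
        # Here the PE "publishes" a result; it could instead emit new events.
        return (self.key, self.count)


class Node:
    """Hypothetical node holding one PE instance per key routed to it."""

    def __init__(self):
        self.pes = {}

    def dispatch(self, key, event):
        if key not in self.pes:
            self.pes[key] = CountingPE(key)
        return self.pes[key].process(event)


def route(key, nodes):
    # Assumption: a stable hash of the key picks the node, so every event
    # with the same key reaches the same node and therefore the same PE.
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


def run(events, num_nodes=3):
    nodes = [Node() for _ in range(num_nodes)]
    return [route(key, nodes).dispatch(key, value) for key, value in events]
```

Because all state for a key lives in a single PE, the PE code needs no locking, which is roughly the Actors-style encapsulation the proposal refers to.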

 To drive adoption and increase the number of contributors to the
 project, we may need to prioritize the focus based on feedback from
 the community. We believe that one of the top priorities and driving
 design principles for the S4 project is to provide a simple API that
 hides most of the complexity associated with distributed systems and
 concurrency. The project grew out of the need to provide a flexible
 platform for application developers and scientists that can be used
 for quick experimentation and production.

 S4 differs from existing Apache projects in a number of fundamental
 ways. Flume is an Incubator project that focuses on log processing,
 performing lightweight processing in a distributed fashion and
 accumulating log data in a centralized repository for batch
 processing. S4 instead performs all stream processing in a distributed
 fashion and enables applications to form arbitrary graphs to process
 streams of events. We see Flume as a complementary project. We also
 expect S4 to complement Hadoop processing and in some cases to
 supersede it. Kafka is another Incubator project that focuses on
 processing large amounts of stream data. The design of Kafka, however,
 follows the pub-sub paradigm, which focuses on delivering messages
 containing arbitrary data from source processes (publishers) to
 consumer processes (subscribers). Compared to S4, Kafka is an
 intermediate step between data generation and processing, while S4 is
 itself a platform for processing streams of events.

 S4 overall addresses a need of existing applications to process
 streams of events beyond moving data to a centralized repository for
 batch processing. It complements the features of existing Apache
 projects, such as Hadoop, Flume, and Kafka, by providing a flexible
 platform for distributed event processing.

 == Background ==

 S4 was initially developed at Yahoo! Labs starting in 2008 to process
 user feedback in the context of search advertising. The project was
 licensed under the Apache License version 2.0 in October 2010. The
 project documentation is currently available at http://s4.io .

 == Rationale ==

 Stream computing has been growing steadily over the last 20 years.
 However, recently there has been an explosion in real-time data
 sources including the Web, sensor networks, financial securities
 analysis and trading, traffic monitoring, natural language processing
 of news and social data, and much more.

 Although Hadoop has evolved into a standard open source solution for batch
 processing of massive data sets, there is no equivalent community-supported
 open source platform for processing data streams in
 real time. While various research projects have evolved into
 proprietary commercial products, S4 has the potential to fill the gap.
 Many projects that require a scalable stream processing architecture
 currently use Hadoop by segmenting the input stream into data batches.
 This solution is not efficient, results in high latency, and
 introduces unnecessary complexity.

 The S4 design is primarily driven by large scale applications for data
 mining and machine learning in a production environment. We think that
 the S4 design is surprisingly flexible and

[RESULT] [VOTE] Accumulo to join the Incubator

2011-09-12 Thread Doug Cutting

This passes, with 20 +1 votes, plenty of them binding, and no -1 votes.

Thanks to all who voted!

We can now get started creating the Apache Accumulo podling.

Doug

On 09/09/2011 09:22 AM, Doug Cutting wrote:
 It's been a week since the Accumulo proposal was submitted for
 discussion.  A few questions were asked, and the proposal was clarified
 in response.  Sufficient mentors have volunteered.  I thus feel we are
 now ready for a vote.
 
 The latest proposal can be found at the end of this email and at:
 
   http://wiki.apache.org/incubator/AccumuloProposal
 
 The discussion regarding the proposal can be found at:
 
   http://s.apache.org/oi
 
 Please cast your votes:
 
 [  ] +1 Accept Accumulo for incubation
 [  ] +0 Indifferent to Accumulo incubation
 [  ] -1 Reject Accumulo for incubation
 
 This vote will close 72 hours from now.
 
 Thanks,
 
 Doug
 
 ---
 
 = Accumulo Proposal =
 
 == Abstract ==
 Accumulo is a distributed key/value store that provides expressive,
 cell-level access labels.
 
 == Proposal ==
 Accumulo is a sorted, distributed key/value store based on Google's
 BigTable design.  It is built on top of Apache Hadoop, Zookeeper, and
 Thrift.  It features a few novel improvements on the BigTable design in
 the form of cell-level access labels and a server-side programming
 mechanism that can modify key/value pairs at various points in the data
 management process.
 
 == Background ==
 Google published the design of BigTable in 2006.  Several other open
 source projects have implemented aspects of this design including HBase,
 CloudStore, and Cassandra.  Accumulo began its development in 2008.
 
 == Rationale ==
 There is a need for a flexible, high performance distributed key/value
 store that provides expressive, fine-grained access labels.  The
 communities we expect to be most interested in such a project are
 government, health care, and other industries where privacy is a
 concern.  We have made much progress in developing this project over the
 past 3 years and believe both the project and the interested communities
 would benefit from this work being openly available and having open
 development.
 
 == Current Status ==
 
 === Meritocracy ===
 We intend to strongly encourage the community to help with and
 contribute to the code.  We will actively seek potential committers and
 help them become familiar with the codebase.
 
 === Community ===
 A strong government community has developed around Accumulo and training
 classes have been ongoing for about a year.  Hundreds of developers use
 Accumulo.
 
 === Core Developers ===
 The developers are mainly employed by the National Security Agency, but
 we anticipate interest developing among other companies.
 
 === Alignment ===
 Accumulo is built on top of Hadoop, Zookeeper, and Thrift.  It builds
 with Maven.  Due to the strong relationship with these Apache projects,
 the incubator is a good match for Accumulo.
 
 == Known Risks ==
 === Orphaned Products ===
 There is only a small risk of being orphaned.  The community is
 committed to improving the codebase of the project due to its fulfilling
 needs not addressed by any other software.
 
 === Inexperience with Open Source ===
 The codebase has been treated internally as an open source project since
 its beginning, and the initial Apache committers have been involved with
 the code for multiple years.  While our experience with public open
 source is limited, we do not anticipate difficulty in operating under
 Apache's development process.
 
 === Homogeneous Developers ===
 The committers have multiple employers and it is expected that
 committers from different companies will be recruited.
 
 === Reliance on Salaried Developers ===
 The initial committers are all paid by their employers to work on
 Accumulo and we expect such employment to continue.  Some of the initial
 committers would continue as volunteers even if no longer employed to do so.
 
 === Relationships with Other Apache Products ===
 Accumulo uses Hadoop, Zookeeper, Thrift, Maven, log4j, commons-lang,
 -net, -io, -jci, -collections, -configuration, -logging, and -codec.
 
 === Relationship to HBase ===
 Accumulo and HBase are both based on the design of Google's BigTable, so
 there is a danger that potential users will have difficulty
 distinguishing the two.  Some of the key areas in which Accumulo differs
 from HBase are discussed below.  It may be possible to incorporate the
 desired features of Accumulo into HBase.  However, the amount of work
 required would slow development of HBase and Accumulo considerably.  We
 believe this warrants a podling for Accumulo at the current time.  We
 expect active cross-pollination will occur between HBase and podling
 Accumulo and it is possible that the codebases and projects will
 ultimately converge.
 
 ==== Access Labels ====
 Accumulo has an additional portion of its key that sorts after the
 column qualifier and before the timestamp.  It is called

[VOTE] Accumulo to join the Incubator

2011-09-09 Thread Doug Cutting

), jcommon (LGPL),
slf4j (MIT), junit (CPL)

== Cryptography ==
none

== Required Resources ==
 * Mailing Lists
   * accumulo-private
   * accumulo-dev
   * accumulo-commits
   * accumulo-user

 * Subversion Directory
   * https://svn.apache.org/repos/asf/incubator/accumulo

 * Issue Tracking
   * JIRA Accumulo (ACCUMULO)

 * Continuous Integration
   * Jenkins builds on https://builds.apache.org/

 * Web
   * http://incubator.apache.org/accumulo/
   * wiki at http://wiki.apache.org or http://cwiki.apache.org

== Initial Committers ==
 * Aaron Cordova (aaron at cordovas dot org)
 * Adam Fuchs (adam.p.fuchs at ugov dot gov)
 * Eric Newton (ecn at swcomplete dot com)
 * Billie Rinaldi (billie.j.rinaldi at ugov dot gov)
 * Keith Turner (keith.turner at ptech-llc dot com)
 * John Vines (john.w.vines at ugov dot gov)
 * Chris Waring (christopher.a.waring at ugov dot gov)

== Affiliations ==
 * Aaron Cordova, The Interllective
 * Adam Fuchs, National Security Agency
 * Eric Newton, SW Complete Incorporated
 * Billie Rinaldi, National Security Agency
 * Keith Turner, Peterson Technology LLC
 * John Vines, National Security Agency
 * Chris Waring, National Security Agency

== Sponsors ==
 * Champion: Doug Cutting

== Nominated Mentors ==
 * Benson Margulies
 * Alan Cabrera
 * Bernd Fondermann
 * Owen O'Malley

== Sponsoring Entity ==
 * Apache Incubator



Re: [VOTE] Release Whirr version 0.6.0-incubating

2011-08-25 Thread Doug Cutting

On 08/24/2011 07:09 PM, Andrei Savu wrote:
 Now that the resolution was accepted by the board, is it no longer
 possible to have a last release as an incubator project?

Whirr is no longer an incubator project.  Whirr can now make releases
without permission from the Incubator PMC.  Even if the project's
resources are still located within the Incubator, Whirr is a TLP and can
release like one.

Doug


Re: [VOTE] Release for Bigtop version 0.1.0-incubating RC2

2011-08-23 Thread Doug Cutting

+1  Checksums & signatures are correct and RAT reports no serious problems.

Doug

On 08/22/2011 11:07 AM, Andrew Bayer wrote:
 This is the first incubator release for Apache Bigtop, version
 0.1.0-incubating.
 
 It fixes the following issues:
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12317549&styleName=Html&projectId=12311420
 
 *** Please download, test, and vote by Thursday, August 24 (3 working days
 from now)
 
 Note that we are voting on the source (tag).
 
 Source tarball, checksums, signature:
 http://people.apache.org/~abayer/bigtop-0.1.0-incubating-candidate-2/
 
 The tag to be voted on:
 http://svn.apache.org/repos/asf/incubator/bigtop/tags/release-0.1.0-incubating-RC2
 (svn rev. 1160352)
 
 Bigtop's KEYS file, containing the PGP keys used to sign the release:
 http://svn.apache.org/repos/asf/incubator/bigtop/dist/KEYS
 
 Note that the Incubator PMC needs to vote on the release after a successful
 PPMC vote before any release can be made official.
 
 Thanks!
 
 A.
 


Re: [VOTE] Graduation of the Whirr Podling

2011-08-08 Thread Doug Cutting

+1

Doug

On 08/03/2011 10:56 PM, Tom White wrote:
 Hi everyone,
 
 The Whirr podling joined the incubator in May 2010. Since then it has
 made 5 releases following Apache guidelines, added 4 new committers,
 and added 3 new PPMC members. The community is healthy and growing,
 and we've shown an ability to self-govern using accepted Apache
 practices.
 
 The Whirr podling has now voted to graduate:
 
 Vote: 
 http://mail-archives.apache.org/mod_mbox/incubator-whirr-dev/201107.mbox/%3ccabqr8u_+mk8w_z-4vu-+9mwg+c+r4c1kmuhxa1drmulujnu...@mail.gmail.com%3E
 Result: 
 http://mail-archives.apache.org/mod_mbox/incubator-whirr-dev/201107.mbox/%3ccabqr8u8nttkqxjerp-txnn4jljrvqyrsrqxuesjiyq2td53...@mail.gmail.com%3E
 
 The vote received 7 PPMC approvals, of which 3 were also IPMC members
 (Patrick Hunt, Doug Cutting, and myself).
 
 I would like to ask the IPMC to approve the graduation.
 
 [  ] +1 - I approve of the Whirr graduation
 [  ] +0 - I have no opinion
 [  ] -1 - There's an issue with graduation at this time, which is
 
 Voting will be open for 72 hours. Please find the proposed board
 resolution below.
 
 Thanks
 Tom
 
 ## Resolution to create a TLP from graduating Incubator podling
 
 X. Establish the Apache Whirr Project
 
WHEREAS, the Board of Directors deems it to be in the best
interests of the Foundation and consistent with the
Foundation's purpose to establish a Project Management
Committee charged with the creation and maintenance of
open-source software related to running services on cloud
infrastructure for distribution at no charge to the public.
 
NOW, THEREFORE, BE IT RESOLVED, that a Project Management
Committee (PMC), to be known as the Apache Whirr Project,
be and hereby is established pursuant to Bylaws of the
Foundation; and be it further
 
RESOLVED, that the Apache Whirr Project be and hereby is
responsible for the creation and maintenance of software
related to running services on cloud infrastructure;
and be it further
 
RESOLVED, that the office of Vice President, Apache Whirr be
and hereby is created, the person holding such office to
serve at the direction of the Board of Directors as the chair
of the Apache Whirr Project, and to have primary responsibility
for management of the projects within the scope of
responsibility of the Apache Whirr Project; and be it further
 
RESOLVED, that the persons listed immediately below be and
hereby are appointed to serve as the initial members of the
Apache Whirr Project:
 
  * Adrian Cole       adrianc...@apache.org
  * Lars George       larsgeo...@apache.org
  * Patrick Hunt      ph...@apache.org
  * Tibor Kiss        ti...@apache.org
  * Johan Oskarsson   jo...@apache.org
  * Andrew Purtell    apurt...@apache.org
  * Andrei Savu       as...@apache.org
  * Tom White         tomwh...@apache.org
 
NOW, THEREFORE, BE IT FURTHER RESOLVED, that Tom White
be appointed to the office of Vice President, Apache Whirr, to
serve in accordance with and subject to the direction of the
Board of Directors and the Bylaws of the Foundation until
death, resignation, retirement, removal or disqualification,
or until a successor is appointed; and be it further
 
RESOLVED, that the initial Apache Whirr PMC be and hereby is
tasked with the creation of a set of bylaws intended to
encourage open development and increased participation in the
Apache Whirr Project; and be it further
 
RESOLVED, that the Apache Whirr Project be and hereby
is tasked with the migration and rationalization of the Apache
Incubator Whirr podling; and be it further
 
RESOLVED, that all responsibilities pertaining to the Apache
Incubator Whirr podling encumbered upon the Apache Incubator
Project are hereafter discharged.
 
 


Re: [VOTE] Oozie to join the Incubator

2011-07-01 Thread Doug Cutting

+1

Doug

On 06/29/2011 12:10 PM, Mohammad Islam wrote:
 Hi All,
 
 The discussion about the Oozie proposal is settling down. Therefore I would
 like to initiate a vote to accept Oozie as an Apache Incubator project.
 
 The latest proposal is pasted at the end, and it can also be found in the
 wiki:
  
 http://wiki.apache.org/incubator/OozieProposal
 
 
 The related discussion thread is at:
 http://www.mail-archive.com/general@incubator.apache.org/msg29633.html
 
 
 Please cast your votes:
 
 [  ] +1 Accept Oozie for incubation
 [  ] +0 Indifferent to Oozie incubation
 [  ] -1 Reject Oozie for incubation
 
 This vote will close 72 hours  from now.
 
 Regards,
 Mohammad
 
 
 Abstract
 Oozie is a server-based workflow scheduling and coordination system to manage
 data processing jobs for Apache Hadoop™.
 
 Proposal
 Oozie is an extensible, scalable and reliable system to define, manage,
 schedule, and execute complex Hadoop workloads via web services. More
 specifically, this includes:
 
   * XML-based declarative framework to specify a job or a complex workflow
     of dependent jobs.
   * Support for different types of jobs such as Hadoop Map-Reduce, Pipe,
     Streaming, Pig, Hive and custom Java applications.
   * Workflow scheduling based on frequency and/or data availability.
   * Monitoring capability, automatic retry and failure handling of jobs.
   * Extensible and pluggable architecture to allow arbitrary grid
     programming paradigms.
   * Authentication, authorization, and capacity-aware load throttling to
     allow multi-tenant software as a service.
 
 Background
 Most data processing applications require multiple jobs to achieve their
 goals, with inherent dependencies among the jobs. A dependency could be
 sequential, where one job can only start after another job has finished. Or
 it could be conditional, where the execution of a job depends on the return
 value or status of another job. In other cases, parallel execution of
 multiple jobs may be permitted – or desired – to exploit the massive pool of
 compute nodes provided by Hadoop.
 
 These job dependencies are often expressed as a Directed Acyclic Graph, also
 called a workflow. A node in the workflow is typically a job (a computation
 on the grid) or another type of action such as an email notification.
 Computations can be expressed in map/reduce, Pig, Hive or any other
 programming paradigm available on the grid. Edges of the graph represent
 transitions from one node to the next, as the execution of a workflow
 proceeds.
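
As a rough illustration of the DAG idea above, the following Python sketch groups jobs into "waves" that could run in parallel once their dependencies have finished. It is not Oozie's API or workflow format (the proposal describes an XML-based declarative framework); the `parallel_waves` helper and the job names are hypothetical, and the ordering is delegated to the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter


def parallel_waves(dependencies):
    """Group jobs into waves; every job in a wave has all its dependencies done.

    `dependencies` maps each job name to the set of jobs that must finish first.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    waves = []
    while sorter.is_active():
        ready = sorter.get_ready()   # jobs whose dependencies are all done
        waves.append(sorted(ready))  # sorted only to make the output stable
        sorter.done(*ready)          # mark the wave as finished
    return waves


# Hypothetical workflow: two jobs run in parallel after "clean".
workflow = {
    "ingest": set(),
    "clean": {"ingest"},
    "aggregate": {"clean"},
    "report": {"clean"},
    "notify": {"aggregate", "report"},
}
```

Calling `parallel_waves(workflow)` puts `aggregate` and `report` in the same wave, which corresponds to the "parallel execution may be permitted" case in the text; a sequential dependency simply yields one job per wave.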
 
 Describing a workflow in a declarative way has the advantage of decoupling
 job dependencies and execution control from application logic. Furthermore,
 the workflow is modularized into jobs that can be reused within the same
 workflow or across different workflows. Execution of the workflow is then
 driven by a runtime system without understanding the application logic of the
 jobs. This runtime system specializes in reliable and predictable execution:
 it can retry actions that have failed or invoke a cleanup action after
 termination of the workflow; it can monitor progress, success, or failure of
 a workflow, and send appropriate alerts to an administrator. The application
 developer is relieved from implementing these generic procedures.
 
 Furthermore, some applications or workflows need to run in periodic intervals
 or when dependent data is available. For example, a workflow could be
 executed every day as soon as output data from the previous 24 instances of
 another, hourly workflow is available. The workflow coordinator provides such
 scheduling features, along with prioritization, load balancing and
 throttling to optimize utilization of resources in the cluster. This makes it
 easier to maintain, control, and coordinate complex data applications.
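
The daily-after-24-hourly-instances example above amounts to a data-availability check, which the following Python sketch spells out. It is only an illustration, not how the Oozie coordinator is implemented: the `daily_run_ready` function is hypothetical, and hourly outputs are assumed to be identified by the `datetime` of the hour they cover.

```python
from datetime import datetime, timedelta


def daily_run_ready(day_start, available_hours):
    """Return True once the 24 hourly outputs preceding `day_start` all exist.

    `available_hours` is an iterable of datetimes, one per hourly output
    that has already been produced (an assumed representation).
    """
    needed = {day_start - timedelta(hours=h) for h in range(1, 25)}
    return needed <= set(available_hours)


# Hypothetical usage: the daily workflow for 2011-06-30 waits on the
# hourly outputs from 2011-06-29 00:00 through 23:00.
day = datetime(2011, 6, 30)
hours = [day - timedelta(hours=h) for h in range(1, 25)]
```

A scheduler built on this idea would re-evaluate the check as hourly outputs arrive and trigger the daily workflow the first time it returns `True`.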
 
 Nearly three years ago, a team of Yahoo! developers addressed these critical
 requirements for Hadoop-based data processing systems by developing a new
 workflow management and scheduling system called Oozie. While it was
 initially developed as a Yahoo!-internal project, it was designed and
 implemented with the intention of open-sourcing. Oozie was released as a
 GitHub project in early 2010. Oozie is used in production within Yahoo!, and
 since it was open-sourced it has been gaining adoption with external
 developers.
 
 Rationale
 Commonly, applications that run on Hadoop require multiple Hadoop jobs in
 order to obtain the desired results. Furthermore, these Hadoop jobs are
 commonly a combination of Java map-reduce jobs, Streaming map-reduce jobs,
 Pipes map-reduce jobs, Pig jobs, Hive jobs, HDFS operations, Java programs
 and shell scripts.
 
 Because of this, developers find themselves writing ad-hoc glue programs to
 combine these Hadoop jobs. These ad-hoc programs are

Re: [VOTE] Accept Bigtop for incubation

2011-06-20 Thread Doug Cutting

Doug

On 06/17/2011 07:15 PM, Tom White wrote:
As there are no active discussions on the proposal thread, I would
like to initiate a vote to accept Bigtop as an Apache Incubator
project.

The proposal is available at

http://wiki.apache.org/incubator/BigtopProposal?action=recall&rev=13

I've also put a copy of the proposal at the end of this email.

The discussion thread is available at

http://mail-archives.apache.org/mod_mbox/incubator-general/201106.mbox/%3cbanlktimriyvs5g5maklqvinauz9h6s5...@mail.gmail.com%3E

Please cast your votes:

[ ] +1 Accept Bigtop for incubation
[ ] +0 Indifferent to Bigtop incubation
[ ] -1 Reject Bigtop for incubation

This vote will close 72 hours from now.

Thanks,
Tom

= Bigtop - Apache Hadoop Ecosystem Packaging and Test =

== Abstract ==

Bigtop - a project for the development of packaging and tests of the
Hadoop ecosystem.

== Proposal ==

The primary goal of Bigtop is to build a community around the
packaging and interoperability testing of Hadoop-related projects.
This includes testing at various levels (packaging, platform, runtime,
upgrade, etc...) developed by a community with a focus on the system
as a whole, rather than individual projects.

Build, packaging and integration test code that depends upon official
releases of the Apache Hadoop-related projects (HDFS, MapReduce,
HBase, Hive, Pig, ZooKeeper, etc...) will be developed and released by
this project. As bugs and other issues are found we expect these to be
fixed upstream.

== Background ==

The initial packaging and test code for Bigtop was developed by
Cloudera to package projects from the Apache Hadoop ecosystem and
provide a consistent, inter-operable framework.

== Rationale ==

Hadoop defines itself as:

{{{
The Apache Hadoop project develops open-source software for reliable,
scalable, distributed computing. Hadoop includes these subprojects:

* Hadoop Common: The common utilities that support the other Hadoop
subprojects.
* HDFS: A distributed file system that provides high throughput access
to application data.
* MapReduce: A software framework for distributed processing of large
data sets on compute clusters.
}}}

There are also several other Hadoop-related projects at Apache. Some
TLP examples include HBase, Hive, Mahout, ZooKeeper, and Pig. There
are also several new projects in the Incubator such as HCatalog, Hama
and Sqoop.

From a packaging and deployment perspective, the current
loosely-coupled nature of the project has limitations:
1. Insufficient building against trunk versions of dependent projects
(in the style of Apache Gump).
1. Insufficient testing against the trunk versions of dependent projects.
1. No consistent packaging for the Linux servers which provide the
main Hadoop datacenter platform.
1. No functional testing against multi-machine clusters as part of
the regular automated build process. This is due to a lack of a
physical or virtual Hadoop cluster for testing, and not enough test
suites designed to run against a live cluster with known datasets.

The intent of this project is to build a community where the projects
are brought together, packaged, and tested for interoperability.

Projects such as Apache Whirr (incubating), which deploy and use a
collection of Hadoop-related projects, would benefit from the
interoperability testing done by Bigtop, rather than picking and
testing project combinations themselves.

== Initial Goals ==

Much of the code for Bigtop has been released by Cloudera under the
Apache 2.0 license for over two years.

Some current goals include:
* create a set of packages for the Hadoop ecosystem, over a wide
range of platforms
* interoperability test these projects
* document project sets that are known to work well together

Bigtop’s release artifact would consist of a single tarball of
packaging and test code that, when built, would produce source and
binary Linux packages for the upstream projects.

= Current Status =

== Meritocracy ==

Bigtop was originally developed and released as an open source
packaging infrastructure, CDH, by Cloudera.

== Community ==

The community is primarily the original developers at Cloudera;
however, a number of contributions to the packaging specifications have
been accepted from outside contributors. Growing a diverse community
is the main reason to bring Bigtop to the Apache Incubator.

== Core Developers ==

The core developers for the Bigtop project are:
* Andrew Bayer has extensive expertise with build tools, specifically
Jenkins continuous integration and Maven.
* Peter Linnell has contributed to the RPM packaging.
* Bruno Mahé has overseen much of the development of the RPM and
Debian packaging system.
* Roman Shaposhnik and Konstantin Boudnik designed and implemented
the system testing framework.

Many of the committers to the Bigtop project have contributed

Re: [VOTE] MRUnit entry into the incubator

2011-03-02 Thread Doug Cutting

+1

Doug

On 03/01/2011 05:16 PM, Eric Sammer wrote:
 All:
 
 Discussions from the [PROPOSAL] thread seem to have tapered off so I'd like
 to call a vote on accepting MRUnit into the incubator. I'm re-pasting the
 proposal for simplicity. We'll leave the vote open for 72 hours.
 
 Thanks!
 
 = MRUnit, a library to support unit testing of Hadoop MapReduce jobs =
 
 == Abstract ==
 MRUnit is a Java library that provides mocks and infrastructure for writing
 unit tests for Hadoop MapReduce jobs and related components.
 
 == Proposal ==
 MRUnit is a Java library to facilitate unit testing of Hadoop MapReduce jobs
 by providing drivers and mock objects to simulate the Hadoop runtime
 environment of a map reduce job. This code base already exists as a
 subproject of the Apache Hadoop MapReduce project and lives in the contrib
 directory of the source tree.
 
 == Background ==
 Writing unit tests of MapReduce jobs can be a tedious process. User code can
 quickly become entangled with Hadoop APIs making testing difficult and error
 prone. In many cases, users will simply forgo testing given the complexity
 of the environment. MRUnit was created as a simple library users can use in
 conjunction with test suites like JUnit to provide a harness for injecting
 appropriate mock objects.
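
 The idea of driving user map code through an injected mock can be sketched as
 follows. This is a language-neutral illustration written in Python, not
 MRUnit's Java API; `FakeContext`, `run_map` and `word_count_map` are
 hypothetical names invented for the sketch.

```python
class FakeContext:
    """Hypothetical stand-in for the framework's output collector."""

    def __init__(self):
        self.emitted = []

    def write(self, key, value):
        # Record the pair instead of handing it to a real cluster.
        self.emitted.append((key, value))


def word_count_map(_offset, line, context):
    """User map function: emit (word, 1) for each word in the line."""
    for word in line.split():
        context.write(word.lower(), 1)


def run_map(mapper, key, value):
    """Tiny driver: run the mapper on one record and return what it emitted."""
    ctx = FakeContext()
    mapper(key, value, ctx)
    return ctx.emitted


result = run_map(word_count_map, 0, "Hadoop hadoop test")
assert result == [("hadoop", 1), ("hadoop", 1), ("test", 1)]
```

 Because the map function only talks to the context it is handed, the test
 needs no running cluster; the driver plays the role of the harness described
 above.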
 
 == Rationale ==
 MRUnit has existed as a contrib component of Apache Hadoop. This has served
 to introduce users to the library and to provide necessary functionality to
 developers in the form of development support. That said, MRUnit is not
 necessarily an intrinsic component of Hadoop proper and could benefit from
 being a standalone project in that:
  * A separate project would support an independent development and release
 schedule allowing for faster iteration and response to user requests.
  * Separating adjunct projects from the core Hadoop codebase simplifies
 Hadoop's build and release.
  * MRUnit users can get a simpler artifact in a way most appropriate to
 development time (i.e. Maven or Ivy repositories).
  * MRUnit can build out independent support for different versions of Hadoop
 without requiring circular dependencies or testing issues.
 
 Having greater development and tooling support for Hadoop makes the project
 accessible to a wider audience by reducing the chance of bugs.
 
 == Initial Goals ==
  * Provide a new home for the existing codebase.
  * Make artifacts available via Maven and / or Ivy.
  * Expand test support for other Hadoop components (e.g. Partitioners)
  * Establish a lightweight, independent release cycle.
 
 == Current Status ==
 === Meritocracy ===
 MRUnit was originally created by Aaron Kimball, and has had some
 contributions from members of the Hadoop community. By becoming its own
 project, significant contributors to MRUnit would become committers, and
 allow the project to grow.
 
 === Community ===
 The MRUnit community is predominantly composed of engineers who author
 MapReduce jobs running against Apache Hadoop. Given that this library
 appeals to a specific subset of the overall Apache Hadoop community, it
 makes sense to decouple its release cycle from that of Hadoop as a whole, to
 allow more rapid iteration in this space.
 
 === Core developers ===
 
 Aaron Kimball wrote most of the original code and is familiar with open
 source and Apache-style development, being a Hadoop committer. A number of
 other contributors have provided patches to this codebase over time. Eric
 Sammer has worked as a committer on Flume, a github-based open source
 project.
 
 === Alignment ===
 MRUnit aligns with Hadoop as it aims to be a testing harness and framework
 for the Hadoop MapReduce framework.
 
 == Known Risks ==
 === Orphaned products ===
 All members of the team are committed to making MRUnit a success.
 
 === Inexperience with Open Source ===
 The initial code comes from Hadoop where it was developed in an open-source,
 collaborative way. All the initial committers are committers on other Apache
 projects (with the exception of Eric who is experienced with open source
 development at Github and other communities), and are experienced in working
 with new contributors.
 
 === Homogenous Developers ===
 The initial set of committers is from a diverse set of organizations, and
 geographic locations. They are all experienced with developing in a
 distributed environment.
 
 === Reliance on Salaried Developers ===
 It is expected that MRUnit will be developed on a combination of volunteer
 and salaried time.
 
 === Relationships with Other Apache Products ===
 MRUnit will depend on many other Apache Projects as already mentioned above
 (e.g. Hadoop).
 
 === An Excessive Fascination with the Apache Brand ===
 We think that MRUnit will benefit from The Apache Incubator. There was
 discussion about moving this project entirely out of Apache Hadoop and into
 e.g., Github (as a fork), but after Chris Mattmann prompted some discussions
 on the Hadoop general list to stick around in the Incubator,

Re: [VOTE] Accept Lucene.Net for incubation

2011-02-04 Thread Doug Cutting


+1

Doug

On 01/26/2011 10:05 PM, Troy Howard wrote:

All,

Since posting the Lucene.Net Incubator proposal announcement on Jan
12th, we now have three mentors signed up and would like to call a
vote to accept Lucene.Net into the Apache Incubator.

The proposal is included below and can also be found at:

http://wiki.apache.org/incubator/Lucene.Net%20Proposal

Please cast your votes:

[ ] +1 Accept Lucene.Net for incubation
[ ] +0 Don't care
[ ] -1 Reject for the following reason:

Thanks,
Troy


= Lucene.Net - A .NET port of Lucene =
== Preface ==
Lucene.Net is a sub-project which is being spun off from the Lucene
TLP but is not yet ready for graduation. We propose to address certain
needs of the project by transitioning to an Incubator Podling.

== Abstract ==
Lucene.Net will be a port of the Lucene search engine library, written
in C# and targeted at .NET runtime users.

== Proposal ==
Lucene.Net has three aims. First, it will maintain the existing
line-by-line port from Java to C#, fully automating and commoditizing
the process such that the project can easily synchronize with the Java
Lucene release schedule. Second, it will be a high-performance C#
search engine library. Third, it will maximize its usability and power
when used within the .NET runtime. To that end, it will present a
highly idiomatic, carefully tailored API that takes advantage of many
of the special features of the .NET runtime.

== Background ==
Lucene.Net began as an independent project focused on creating a
line-by-line, API-for-API port of Java Lucene to C#. It continued
successfully in this way and eventually became an ASF Incubator project
in April of 2006 and graduated as a sub-project of Lucene in October
of 2009.

The last year has been challenging for the project. The committers who
originally led the project have stopped maintaining it and
development has stagnated since June of 2010. The user community has
spoken out requesting a change in philosophy and direction for the
project, but those requests have gone unheeded. This has led to a
number of forks outside of the ASF. We would like to bring those forks
back in as branches and be responsive to the needs of the community
without the need for multiple non-ASF forks.

The Lucene PMC wants to see the project continue to thrive and has
indicated that a return to the Incubator is an appropriate step, with
the end goal of building a new team of committers and maintaining a
steady release cycle meeting the previously stated goals. Because
Lucene is working to move away from being an umbrella project, a
long term goal of the Lucene.Net project is to graduate to an ASF TLP.

== Rationale ==
There is great need for a search engine library in the mode of Lucene
within the .NET runtime. Individuals naturally wish to code in their
language of choice. Organizations which do not have significant Java
expertise may not want to support Java strictly for the sake of
running a Lucene installation. Developers may want to take advantage
of C#'s unique language features and the .NET runtime's unique
execution and interoperability model. Lucene.Net will meet all these
demands.

Apache is a natural home for our project given the way it has always
operated: user-driven innovation, lively and amiable mailing list
discussions, strength through diversity, and so on. We feel
comfortable here, and we believe that we will become exemplary Apache
citizens.

== Initial Goals (to be completed before Feb 2011) ==
  * Build a new list of committers
  * Make a 2.9.2 compatible release as quickly as possible (this
already exists, it just needs to be packaged correctly)
  * Update website, documentation, etc.
  * Create a well-documented, repeatable and fully automated language
porting process
  * Start a .NET style API branch, either by incorporating some or
all existing fork projects or by starting a new branch to this end

== Current Status ==
=== Meritocracy ===
We understand meritocracy and will fully embrace this concept in our
project management methodology. One of the proposed committers, DIGY,
has been a committer on the current Lucene.Net project since November
2008. Prescott Nasser has been a contributor on the project,
submitting patches, documentation, and website enhancements. Three of
the other proposed initial committers, Troy Howard, Chris Currens and
Sergey Mirvoda, are already actively involved in other open source
projects, either as committers of code or in coordination roles. Troy,
Chris, Sergey and Prescott are currently committers on a Lucene.Net
fork known as Lucere, and as such are intimately familiar with the
code base and share a vision for the future direction of the project.
Scott Lombard and Michael Herndon are passionate about Lucene.Net as
well and have already contributed significantly in terms of project
organization and direction and discussions on the mailing list.

All of the proposed committers are familiar with the challenges faced
with starting and maintaining a

Re: [VOTE][PROPOSAL] EasyAnt incubator

2011-01-25 Thread Doug Cutting


+1

Doug

On 01/24/2011 09:14 AM, Antoine Levy-Lambert wrote:

I would like to present for a vote the following proposal to be
sponsored by
the Ant PMC for a new EasyAnt podling.

The proposal is available on the wiki at and included below:

http://wiki.apache.org/incubator/EasyAntProposal

[] +1 to accept EasyAnt into the Incubator
[] 0 don't care
[] -1 object and reason why.

Thanks,
Antoine Levy-Lambert

--- Proposal text from the wiki ---


EasyAnt Proposal

The following presents the proposal for creating a new EasyAnt project
within the Apache Software Foundation.

= Abstract =

Easyant is a build system based on Apache Ant and Apache Ivy.

= Proposal =

EasyAnt's goals are:

* to leverage popularity and flexibility of Ant.
* to integrate Apache Ivy, such that the build system combines a
ready-to-use dependency manager.
* to simplify standard build types, such as building web applications,
JARs etc, by providing ready to use builds.
* to provide conventions and guidelines.
* to make plugging in new functionality as easy as writing simple
Ant scripts as EasyAnt plugins.

To still remain adaptable,

* Though Easyant comes with a lot of conventions, we never lock you in.
* Easyant allows you to easily extend existing modules or create and use
your own modules.
* Easyant makes migration from Ant very simple. Your legacy Ant scripts
could still be leveraged with Easyant.

= Rationale =

On the Ivy and Ant mailing lists, an often-asked question is "Why is Ivy
not shipped with Ant?". Ant users (and some opponents) also complain
about the bootstrapping of an Ant-based build system: it is mainly about
copying an existing one. EasyAnt is intended to respond to both of
these requirements: a prepackaged Ant + Ivy solution with standard build
scripts ready to be used.

Also taking inspiration from the success of Apache Maven, EasyAnt is
adopting the convention-over-configuration principle. It should then be
easy to build a standard project, at least for all common steps (no more
need to reinvent the wheel for each project). The common parts
should be easy to tune through parameters without deep Ant
knowledge (for example, changing the default source directory or forcing
compilation to be Java 1.4 compatible).
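
The convention-over-configuration idea can be sketched as a table of defaults
that a project overrides only where it differs. This is a minimal illustration
in Python, not EasyAnt's implementation; the setting names and default values
below are hypothetical and are not EasyAnt's real properties.

```python
# Hypothetical convention defaults; these keys are illustrative only and
# are not EasyAnt's real property names.
CONVENTIONS = {
    "source_dir": "src/main/java",
    "output_dir": "target",
    "java_target": "1.6",
}


def effective_settings(overrides=None):
    """Start from the conventions and apply only what the project overrides."""
    overrides = overrides or {}
    unknown = set(overrides) - set(CONVENTIONS)
    if unknown:
        raise KeyError(f"unknown setting(s): {sorted(unknown)}")
    return {**CONVENTIONS, **overrides}


# A project that changes the source directory and targets Java 1.4.
settings = effective_settings({"source_dir": "src", "java_target": "1.4"})
print(settings)
# {'source_dir': 'src', 'output_dir': 'target', 'java_target': '1.4'}
```

A project that follows every convention passes no overrides at all, which is
the case the principle is meant to make the common one.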

Last but not least, EasyAnt is intended to provide a plugin based
architecture to make it easy to contribute on a specific step of the
build. Build plugins are pieces of functionality that can be plugged
into or removed from a project. Plugins could actually perform a piece
of your regular build, e.g. compile java classes during build of a
complete war. Or, do a utility action, e.g. deploy your built web
application onto a packaged Jetty server!

= Current Status =

== Meritocracy ==

Some of the core developers are already committers and members of the
Apache Ant PMC, so they understand what it means to have a process based
on meritocracy.

== Community ==

EasyAnt has a really small community (around 100 downloads per
release). This is not a problem, as the team is currently making
restructuring changes. The team plans to do more promotion after those
changes and strongly believes that community is the priority, as the tool
is designed to be easy to use.

== Core Developers ==

Xavier Hanin and Nicolas Lalevée are members of the PMC of Apache Ant.
Jerome Benois is an Acceleo committer; he was a committer on Eclipse MDT
Papyrus for two years and is an active contributor in the Eclipse Modeling
and Model Driven community. He is a committer on the Bushel project, now
contributed to the Ivy code base. He leads the EasyAnt for Eclipse plugin
development.
Jason Trump leads the Beet project on SourceForge
(http://beet.sourceforge.net/).
Jean-Louis Boudart is a Hudson committer.

== Alignment ==

EasyAnt is based on Apache Ant and Ivy. Being part of Apache could help
for a closer collaboration between projects.
The team plans to contribute as much as possible back into Ant or Ivy,
as they have done in the past with:
* extensionPoint : kind of IoC for targets (Ant)
* import/include mechanism (Ant)
* module inheritance (Ivy)

= Known risks =

== Orphaned products ==

Jean-Louis Boudart is the main developer of EasyAnt. Other developers
have become interested in this project and now touch every aspect of
EasyAnt. Thus the risk of being orphaned is quite limited.

== Inexperience with Open Source ==

Many of the committers have experience working on open source projects.
Two of them have experience as committers on other Apache projects.

== Homogenous Developers ==

The existing committers are spread over a number of countries and
employers.

== Reliance on Salaried Developers ==

None of the developers rely on EasyAnt for consulting work.

== Relationships with Other Apache Products ==

As already stated above, EasyAnt is intended to have a quite good
integration with both Apache Ant and Apache Ivy.

== An Excessive Fascination with the Apache Brand ==

As we're already based on many Apache project (Ant + Ivy),

Re: [PROPOSAL] Mesos Project

2010-12-15 Thread Doug Cutting


+1

Doug

On 12/13/2010 02:08 PM, Matei Zaharia wrote:

We would like to propose Mesos for the Apache Incubator.

Mesos is a resource manager for clusters that provides resource sharing and 
isolation across distributed applications like Apache Hadoop, MPI, or web 
applications. It started as a research project at UC Berkeley, but is now being 
used both by other Berkeley groups and at Twitter. We open sourced Mesos in 
August and would like to grow a broader community around it.

Our proposal is included below and available on the wiki at 
http://wiki.apache.org/incubator/MesosProposal. We look forward to hearing 
feedback and questions on the proposal. Also, let us know if you're interested 
in being a mentor.

Thanks,

Matei Zaharia, Benjamin Hindman, Andy Konwinski, and Ali Ghodsi



= Abstract =

Mesos is a cluster manager that provides resource sharing and
isolation across cluster applications.



= Proposal =

Mesos is a system for sharing resources between cluster applications such
as Hadoop MapReduce, HBase, MPI, and web applications.
It is motivated by three use cases. First, organizations that use
several of these applications can use Mesos to share nodes between them,
increasing utilization and simplifying management. Second, inspired by
MapReduce, a wide array of new cluster programming frameworks are being
proposed, such as Apache Hama, Microsoft Dryad, and Google's Pregel and
Caffeine. Mesos provides a common interface for such frameworks to share
resources, allowing organizations to use multiple frameworks in the same
cluster. Third, Mesos allows users of a framework such as Hadoop to have
multiple instances of the framework on the same cluster, facilitating
workload isolation and incremental deployment of upgrades.



= Background =

Mesos was inspired by operational issues experienced in large Apache Hadoop
deployments as well as a desire to provide a management system for a
wider range of cluster applications. The Apache Hadoop community has long
realized that the current model of having one instance of MapReduce
control a whole cluster leads to problems with isolation (one job may
cause the master to crash, killing all the other jobs), scalability,
and software upgrades (an upgrade must be deployed on the whole cluster).
Statically partitioning resources into multiple fixed-size MapReduce clusters
is unattractive because it lowers both utilization and data locality.
The community has discussed a two-level scheduling model where a simple,
robust low-level layer enables multiple applications to launch tasks
(https://issues.apache.org/jira/browse/MAPREDUCE-279). Mesos is such a layer,
with the additional goal of supporting non-Hadoop applications as well.

Mesos started as a research project at UC Berkeley, but is now being
tested at several companies (including Twitter and Facebook), and has attracted
interest from other industry users and researchers as well. We are
therefore proposing to place Mesos in the Apache incubator and build an
open source community around it.



= Rationale =

Although a variety of cluster schedulers (e.g. Torque, Sun Grid Engine)
already exist in the scientific computing community, they are not well
suited for today's data center environment.
These schedulers generally give jobs coarse-grained static allocations of
the cluster (e.g. X nodes for the full duration of the job).
This is problematic because many cluster applications are elastic
(can scale up and down), so utilization is not optimal under static
partitioning, and because data-intensive applications such as MapReduce
need to run a few tasks on every node of the cluster to read data locally.
To address these challenges, Mesos is designed around two principles:

  * Fine-grained sharing: Mesos allocates resources at the level of tasks
within a job, allowing applications to scale up and down over time and
to take turns accessing data on cluster nodes.
  * Application-controlled scheduling: Applications control which nodes
their tasks run on, allowing them to achieve placement goals such as
data locality.
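
The two principles can be illustrated with a toy allocation loop. This is a
sketch in Python under stated assumptions, not Mesos code: the idea that the
manager "offers" free nodes to each application, and the names `allocate` and
`pick_nodes`, are assumptions made for the example and are not described in
this proposal.

```python
def pick_nodes(offers, preferred, tasks_needed):
    """Application-side policy: take offered nodes, data-local ones first."""
    local = [n for n in offers if n in preferred]
    other = [n for n in offers if n not in preferred]
    return (local + other)[:tasks_needed]


def allocate(free_slots, apps):
    """Manager-side loop: offer nodes with free slots to each app in turn.

    Resources are handed out per task (one slot at a time), not as a
    static partition of the cluster.
    """
    placements = {}
    for name, (preferred, tasks_needed) in apps.items():
        offers = [n for n, slots in free_slots.items() if slots > 0]
        chosen = pick_nodes(offers, preferred, tasks_needed)
        for node in chosen:
            free_slots[node] -= 1
        placements[name] = chosen
    return placements


# Hypothetical cluster: free task slots per node.
free = {"node1": 1, "node2": 1, "node3": 2}
# Each app: (nodes holding its data, number of tasks it wants to launch).
apps = {"hadoop": ({"node3"}, 2), "mpi": (set(), 2)}

print(allocate(free, apps))
# {'hadoop': ['node3', 'node1'], 'mpi': ['node2', 'node3']}
```

The manager only tracks free slots; the decision about which node suits a task
stays with the application, which is how the Hadoop-like application ends up
on the node holding its data.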

In addition to these principles, Mesos is designed to be simple, scalable
and robust, because a cluster manager must be highly available to support
applications and should not become a bottleneck. Application-controlled
scheduling already simplifies our design by pushing much of the complex
logic of tracking job state to applications. In addition, Mesos employs an
optimized C++ message-passing library to achieve scalability and supports
master failover using Apache ZooKeeper.

Mesos already supports running Hadoop and MPI. We plan to add support
for other systems as requested (and contributed) by the community.



= Current Status =

== Meritocracy ==

Our intent with this incubator proposal is to start building a diverse
developer community around Mesos following the Apache meritocracy model.
We have wanted to make the project open source and encourage contributors
from multiple

Re: [VOTE] Accept Wave into the incubator

2010-12-02 Thread Doug Cutting

Doug

On 11/29/2010 10:52 PM, Dan Peterson wrote:

Hi everyone,

Please vote on the acceptance of Wave into the Apache incubator.

The proposal is available at: http://wiki.apache.org/incubator/WaveProposal
(for your convenience, a snapshot is also copied below)

The earlier discussion thread can be found at:
http://apache.markmail.org/message/3ebtccdxvipp2732?q=general%40incubator.apache.org+list:org.apache.incubator.general+order:date-backward&page=2

The vote options:

[ ] +1 Accept Wave for incubation
[ ] +0 Don't care
[ ] -1 Reject for the following reason:

The vote is open for 72 hours.

Thanks,
-Dan

Apache Wave Proposal (Apache Incubator)

= Abstract =

Apache Wave is the project where wave technology is developed at Apache.
Wave in a Box (WIAB) is the name of the main product at the moment, which is
a server that hosts and federates waves, supports extensive APIs, and
provides a rich web client. This project also includes an implementation of
the Wave Federation protocol, to enable federated collaboration systems
(such as multiple interoperable Wave In a Box instances).

= Proposal =

A wave is a hosted, live, concurrent data structure for rich communication.
It can be used like email, chat, or a document.

WIAB is a server that hosts waves. The best analogy for this is a mail
server with a web client. WIAB comprises two high-level components:
the client and the server. They have the following major functionality
(though this is not an exhaustive list):

* Client
  * A dynamic web client for users to create, edit, and search waves. Users
    can access this client by directly visiting the server in a browser.
  * Gadgets provide the ability to insert, view, and modify the UI,
    exposing the Wave Gadgets API
    (http://code.google.com/apis/wave/extensions/gadgets/guide.html).
  * A console client that can create and edit waves via a command-line-like
    interface.
* Server
  * Hosts and stores waves. WIAB comes with a default storage mechanism. The
    administrators of the server may configure it to use alternative storage
    mechanisms.
  * Indexing, allowing for searching the waves a user has access to.
  * Basic authentication, configurable to delegate to other systems.
  * Federation, allowing separate Wave in a Box servers to communicate with
    each other using the Wave Federation Protocol
    (http://www.waveprotocol.org/federation).
  * Robots, using the Wave Robots API
    (http://code.google.com/apis/wave/extensions/robots/), may interact with
    waves on a WIAB instance.

= Background =

Wave expresses a new metaphor for communication: hosted conversations. This
was created by Lars and Jens Rasmussen after observation of people's use of
many separate forms of communication to get something done, e.g., email,
chat, docs, blogs, Twitter, etc.

The vision has always been to better the way people communicate and
collaborate. Building open protocols and making code available in an open
and free way is a critical part of that vision. Anyone should be able to
bring up their own wave server and communicate with others (much like SMTP).

We hope this project will allow everyone to easily gain the benefits of Wave
with a standard implementation of Wave – in a box.

= Rationale =

Wave has shown it excels at small group collaboration when hosted by Google.
Although Wave will not continue as a standalone Google product, there is a
lot of interest from many organizations in both running Wave and building
upon the technology for new products.

We are confident that with the community-centric development environment
fostered by the Apache Software Foundation, WIAB will thrive.

= Initial Goals =

The initial goals of the project are:

1. To migrate the codebase from code.google.com and integrate the project
with the ASF infrastructure (issue management, build, project site, etc).
1. To quickly reach a state where it is possible to continue the
development of the Wave In a Box implementation under the ASF project.
1. To add new committers to the project and grow the community in The
Apache Way.

= Current Status =

The open source Wave in a Box project has existed in various forms for
approximately 16 months (starting out life as the FedOne open source
project).

FedOne began in July 2009 in order to accelerate adoption of the wave
federation protocol, and serve as a proof of concept that a non-Google
implementation of the wave federation protocol could interoperate with the
Google production instance. It worked. FedOne's existence led to a
prototype by Novell that demonstrated federation between Google Wave and
Novell Pulse (now known as Vibe). In addition, in May of 2010, SAP unveiled
a prototype version of SAP StreamWork that federated with both Novell Pulse
and Google Wave. All three systems interoperated, sharing real-time state,
and gadget updates. In May 2010 Google released significantly more code
(including the cross-browser rich text editor) to connect with other
components that were

Re: [VOTE] Release Whirr version 0.2.0-incubating

2010-11-12 Thread Doug Cutting


+1 Checksums & sigs are correct.  Licensing looks good.

Doug

On 11/10/2010 08:59 AM, Patrick Hunt wrote:

This is the second incubator release for Apache Whirr, version
0.2.0-incubating.

PPMC release vote thread:
http://markmail.org/message/kdfnohhod6wdrqaz

The issues fixed for 0.2.0-incubating
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=1230&styleName=Html&version=12315339

Source and binary files:
http://people.apache.org/~phunt/whirr-0.2.0-candidate-0/

Maven staging repo:
https://repository.apache.org/content/repositories/orgapachewhirr-032

The tag to be voted upon:
http://svn.apache.org/repos/asf/incubator/whirr/tags/release-0.2.0-incubating

The vote is open for 72 hours.

[ ] +1
[ ] +0
[ ] -1

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org




Re: [VOTE] Thrift 0.5.0 RC1

2010-10-05 Thread Doug Cutting


+1 Looks good to me!

Doug

On 10/04/2010 09:47 AM, Bryan Duxbury wrote:

I propose that we accept
http://people.apache.org/~bryanduxbury/thrift-0.5.0-rc1.tar.gz as
the official Thrift 0.5.0 release.

I produced this tarball by checking out a clean copy of the 0.5.x branch and
running ./bootstrap.sh && ./configure && make dist.

The GPG signature can be found at
http://people.apache.org/~bryanduxbury/thrift-0.5.0-rc1.tar.gz.asc.
It has an MD5 sum of 14c97adefb4efc209285f63b4c7f51f2.

Please download, verify sig/sum, and install/test the libraries of your
choice.
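
For the checksum half of that verification, a minimal Python sketch follows.
It uses only the standard `hashlib` module; the helper names are hypothetical,
and the example in the usage note assumes that the MD5 sum quoted above refers
to the release tarball. It does not replace checking the GPG signature.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare a file's MD5 with a published sum, ignoring case and spaces."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

With the tarball downloaded to the current directory, a check could look like
checksum_matches("thrift-0.5.0-rc1.tar.gz", "14c97adefb4efc209285f63b4c7f51f2").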

We held the podling vote on thrift-dev and had 5 +1 votes, including 3 PPMC
votes, and no -1 votes.
(Message-ID:aanlktimghmkrsxg-iw7jujr9pw=+so6uvq77yr-gj...@mail.gmail.com)

This vote will be open for 72 hours.




Re: [VOTE] Release Whirr version 0.1.0-incubating

2010-09-17 Thread Doug Cutting


+1

Checked that the src tarball has a correct signature and md5sum.  Also ran RAT
over the extracted sources and the licensing looks good.
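
A license audit like this is normally done with Apache RAT itself. Purely as an
illustration of what such a check looks for, here is a rough Python sketch that
lists files whose first lines lack the standard ASF header phrase. The function
name, the 30-line window and the suffix list are assumptions, and this is not
how RAT is implemented:

```python
import os

# Phrase that appears in the standard Apache License 2.0 source header.
HEADER_PHRASE = "Licensed to the Apache Software Foundation"


def files_missing_header(root, suffixes=(".java", ".py", ".sh"), max_lines=30):
    """Return sorted paths under `root` whose first `max_lines` lines lack the header."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffixes):
                continue  # Only inspect the source file types we were asked about.
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                # Join at most `max_lines` lines from the top of the file.
                head = "".join(line for _, line in zip(range(max_lines), handle))
            if HEADER_PHRASE not in head:
                missing.append(path)
    return sorted(missing)
```

An empty result from files_missing_header("path/to/extracted/sources") would
mean every inspected file carries the header phrase; it says nothing about
files with other suffixes or about third-party licenses.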


Doug

On 09/14/2010 11:19 AM, Tom White wrote:

This is the first incubator release for Apache Whirr, version
0.1.0-incubating. We already received one binding IPMC +1 vote for the
PPMC release vote on whirr-dev, so are looking for two more.

PPMC release vote thread:
http://mail-archives.apache.org/mod_mbox/incubator-whirr-dev/201009.mbox/%3caanlktinio1np6d+gbnm4w6jjcg-6koe7x8begkuxr...@mail.gmail.com%3e

The issues fixed for 0.1.0-incubating:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12315111&styleName=Html&projectId=1230

Source and binary files:
http://people.apache.org/~tomwhite/whirr-releases/

Maven staging repo:
https://repository.apache.org/content/repositories/orgapachewhirr-009

The tag to be voted upon:
http://svn.apache.org/repos/asf/incubator/whirr/tags/release-0.1.0-incubating

The vote is open for 72 hours.

[ ] +1
[ ] +0
[ ] -1

Thanks,
Tom


Re: [PROPOSAL] Gora to enter Incubator

2010-09-14 Thread Doug Cutting


+1 Sounds like a great project.

Doug

On 09/13/2010 06:10 AM, Enis Soztutar wrote:

Hi all,

We would like to announce the proposal for Gora, an ORM for column stores,
for the Apache Incubator. We believe that Gora can find a nice home at
Apache.

Wiki of the proposal can be found at
http://wiki.apache.org/incubator/GoraProposal

The proposal is as below.


= Gora Proposal for Apache Incubation =

== Abstract ==
Gora is an ORM framework for column stores such as Apache HBase and Apache
Cassandra with a specific focus on Hadoop.

== Proposal ==
Although there are various excellent ORM frameworks for relational
databases, data modeling in NoSQL data stores differs profoundly from that in
their relational cousins. Moreover, data-model-agnostic frameworks such as JDO
are not sufficient for use cases where one needs to use the full power of the
data models in column stores. Gora fills this gap by giving the user an
easy-to-use ORM framework with data-store-specific mappings and built-in
Apache Hadoop support.

The overall goal for Gora is to become the standard data representation and
persistence framework for big data. The roadmap of Gora can be grouped as
follows.

  * Data Persistence : Persisting objects to column stores such as HBase,
Cassandra, Hypertable; key-value stores such as Voldemort, Redis, etc.; SQL
databases such as MySQL, HSQLDB; and flat files in the local file system or
Hadoop HDFS.
  * Data Access : An easy-to-use, Java-friendly common API for accessing the
data regardless of its location.
  * Indexing : Persisting objects to Lucene and Solr indexes,
accessing/querying the data with the Gora API.
  * Analysis : Accessing the data and analyzing it through adapters for
Apache Pig, Apache Hive and Cascading.
  * MapReduce support : Out-of-the-box and extensive MapReduce (Apache
Hadoop) support for data in the data store.

== Background ==
ORM stands for Object-Relational Mapping. It is a technology that abstracts
the persistence layer (mostly relational databases) so that plain domain-level
objects can be used without the cumbersome effort of saving and loading the
data to and from the database. Gora differs from current solutions in that:
  * Gora is specifically focused on NoSQL data stores, but also has limited
support for SQL databases.
  * The main use case for Gora is to access/analyze big data using Hadoop.
  * Gora uses Avro for bean definition, not byte code enhancement or
annotations.
  * Object-to-data-store mappings are backend specific, so that the full data
model can be utilized.
  * Gora is simple since it ignores complex SQL mappings.
  * Gora will support persistence, indexing and analysis of data, using Pig,
Lucene, Hive, etc.
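
To make the idea of backend-specific object-to-data-store mappings concrete,
here is a small Python sketch. It is not Gora's API (Gora is a Java framework
that defines its beans with Avro); the WebPage record, the HBASE_MAPPING table
and to_hbase_cells are hypothetical names invented for illustration:

```python
from dataclasses import dataclass, asdict


@dataclass
class WebPage:
    """Plain domain object; it knows nothing about where it is stored."""
    url: str
    title: str
    content: str


# Hypothetical mapping for one column-store backend: field -> (column family, qualifier).
HBASE_MAPPING = {
    "title": ("meta", "title"),
    "content": ("body", "raw"),
}


def to_hbase_cells(page, mapping=HBASE_MAPPING, key_field="url"):
    """Translate an object into (row key, {"family:qualifier": value}) for a column store."""
    fields = asdict(page)
    row_key = fields.pop(key_field)  # The key field becomes the row key, not a cell.
    cells = {}
    for name, value in fields.items():
        family, qualifier = mapping[name]
        cells[f"{family}:{qualifier}"] = value
    return row_key, cells
```

A different backend would supply its own mapping table while the domain object
stays unchanged, which is the point of keeping mappings backend specific.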

== Rationale ==
ORM frameworks are nothing new. But with the explosion of data generated, in
terabytes and even petabytes, NoSQL data stores are gaining ever-increasing
popularity. Coupled with the limited support for the already-proven Apache
Hadoop in current ORM frameworks, this created a need for a new project.

Gora is currently hosted at GitHub. However, Gora has ties to the ASF in many
ways. As detailed in the Proposal section, Gora will be a high-level client
for many Apache projects and subprojects including Hadoop (Common, HDFS, and
MapReduce), HBase, Cassandra, Avro, Lucene, Solr, Pig, and Hive. Gora
already uses Hadoop, HBase, Cassandra and Avro. Moreover, Gora started its
life inside the Apache Nutch project, and Nutch trunk now uses Gora as a
library. Furthermore, the initial set of committers are all ASF members.
Therefore, we think that Apache will be an excellent home for Gora.

== Initial Goals ==
Initial goals for Gora can be summarized as:
  * Iron out the remaining issues with HBase, Cassandra and SQL support.
  * Make the first release before the end of the year.
  * Improve documentation
  * Support for Cascading

== Current Status ==
=== Meritocracy ===
Current commit rights belong to the initial list of committers, four of whom
are also ASF members. All the developers have extensive experience with
Apache projects. We honor the meritocracy policy of the ASF.

=== Community ===
Gora’s community mostly overlaps with that of Nutch, Hadoop, HBase, Avro and
Cassandra. We have a small community for now (5 initial committers, 18 people
tracking the project at GitHub), but have been piggybacking on the Nutch
community for a while. If Gora is accepted into the Apache Incubator, we
expect more traction. Moreover, with the increasing popularity of NoSQL
databases, we expect more users.

=== Core Developers ===
Gora was started from the initial code base inside Apache Nutch by Doğacan
Güney. Enis Söztutar then refactored and re-architected the project out
of Nutch. Later Julien Nioche, Andrzej Bialecki and Doğacan ported Nutch
to use the newly formed project, and Sertan Alkan joined after that. Doğacan
and Julien are Nutch PMC members, Andrzej is the Nutch PMC chair. Enis is an
Apache Hadoop PMC member.

=== Alignment ===
As discussed in the second paragraph of the Rationale section, all of the

Re: Thrift 0.3.0 RC6

2010-08-02 Thread Doug Cutting

+1 Checked signature, checksum and ran RAT. All looked good.

Doug

On 07/28/2010 12:28 PM, Bryan Duxbury wrote:

All,

RC5 went out to gene...@incubator and met some resistance. I've made fixes
to the branch and I believe we're ready to go again.

I propose we accept RC6 as the official version of Thrift 0.3.0.

You can find the tarball at
http://people.apache.org/~bryanduxbury/thrift-0.3.0-rc6.tar.gz,
which was created by checking out
https://svn.apache.org/repos/asf/incubator/thrift/branches/0.3.x and running
make dist.

The GPG signature can be found at
http://people.apache.org/~bryanduxbury/thrift-0.3.0-rc6.tar.gz.asc.

The md5 sum is a6c80ab3d8c7827365a9b40f5c9d66a3, which can also be found at
http://people.apache.org/~bryanduxbury/thrift-0.3.0-rc6.tar.gz.md5.

The sha1 sum is 671e6913c86dbac5a7c84a82d94045b3649d38f1, which can also be
found at
http://people.apache.org/~bryanduxbury/thrift-0.3.0-rc6.tar.gz.sha1.

Please download, verify signatures and sums, untar, and install the
libraries of your choice.
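
Since this candidate publishes both an md5 and a sha1 sum, the two can be
checked in a single pass over the file. A sketch in Python, assuming a local
copy of the tarball; verify_digests is an illustrative name, and the GPG
signature still has to be checked separately with gpg:

```python
import hashlib


def verify_digests(path, expected, chunk_size=1 << 20):
    """Check a file against several published digests in one read.

    `expected` maps a hashlib algorithm name (e.g. "md5", "sha1") to its hex digest.
    Returns a dict of algorithm name -> True/False.
    """
    hashers = {name: hashlib.new(name) for name in expected}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            # Feed every chunk to each hash so the file is read only once.
            for hasher in hashers.values():
                hasher.update(chunk)
    return {
        name: hashers[name].hexdigest() == expected[name].strip().lower()
        for name in expected
    }
```

For this release the `expected` argument would hold the two sums quoted above
under the keys "md5" and "sha1"; any False in the result means the download
should not be trusted.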

-Bryan

Note: I spent some time looking at this release with RAT, and while it still
produces some warnings, that is because of my inability to figure out how to
make it exclude the things it complains about. Please refer to the LICENSE
file for information on files with ambiguous licenses.


Re: [VOTE] Move Lucy to the Incubator

2010-07-19 Thread Doug Cutting


+1

Doug

On 07/17/2010 03:23 PM, Chris Hostetter wrote:


I would like to call a vote for accepting Apache Lucy for incubation
in the Apache Incubator. The full proposal is available below. We ask
the Incubator PMC to sponsor it, with myself (hossman) as Champion, and
mattmann, upayavira, mikemccand, and hossman volunteering to be Mentors.

Please cast your vote:

[ ] +1, bring Lucy into Incubator
[ ] +0, I don't care either way,
[ ] -1, do not bring Lucy into Incubator, because...

This vote will be open for 72 hours and only votes from the Incubator
PMC are binding.

http://wiki.apache.org/incubator/LucyProposal

PREFACE
Lucy is a sub-project which is being spun off from the Lucene TLP but is
not yet ready for graduation. We propose to address certain needs of the
project by transitioning to an Incubator Podling, and assimilating the
KinoSearch codebase.

ABSTRACT
Lucy will be a loose port of the Lucene search engine library, written in
C and targeted at dynamic language users.

PROPOSAL
Lucy has two aims. First, it will be a high-performance C search engine
library. Second, it will maximize its usability and power when accessed
via dynamic language bindings. To that end, it will present highly
idiomatic, carefully tailored APIs for each of its host binding
languages, including support for subclasses written entirely in the
host language.

BACKGROUND
Lucy, a loose C port of Java Lucene, began as an ambitious,
from-scratch Lucene sub-project, with David Balmain (author of Ferret, a
Ruby/C port of Lucene), Doug Cutting, and Marvin Humphrey (founder of
KinoSearch, a Perl/C port) as committers. During an initial burst of
activity, the overall architecture for Lucy was sketched out by Dave and
Marvin. Unfortunately, Dave became unavailable soon after, and without a
working codebase to release or any users, it proved difficult to replace
him. Still, Marvin carried on their work throughout a period of
seemingly low activity.

In the last year, that work has come to fruition: major technical
milestones have been achieved and Lucy's underpinnings have been
completed. Additionally, other developers from the KinoSearch community
have taken an interest in Lucy and have begun to ramp up their
contributions. The next steps for Lucy were articulated by the Lucene
PMC in a recent review: make releases, acquire users, grow community.

To implement the Lucene PMC's recommendations and get to a release as
quickly as possible, the Lucy community proposes to assimilate the
KinoSearch codebase, which has been retrofitted to use Lucy's core. Lucy
still lacks a number of important indexing and search classes; we wish to
flesh these out via IP clearance work rather than software development.

Because Lucene is working to move away from being an umbrella project,
a long term goal of the Lucy project is to graduate to an ASF TLP. With
that in mind, it seems more appropriate for the KinoSearch software grant
to take place within the context of the Incubator, and that a Lucy
podling and PPMC be established which will ultimately take responsibility
for the codebase.

RATIONALE
There is great hunger for a search engine library in the mode of Lucene
which is accessible from various dynamic languages, and for one
accessible from pure C. Individuals naturally wish to code in their
language of choice. Organizations which do not have significant Java
expertise may not want to support Java strictly for the sake of running a
Lucene installation. Developers may want to take advantage of C's
interoperability and fine-grained control. Lucy will meet all these
demands.

Apache is a natural home for our project given the way it has always
operated: user-driven innovation, security as a requirement, lively and
amiable mailing list discussions, strength through diversity, and so on.
We feel comfortable here, and we believe that we will become exemplary
Apache citizens.

INITIAL GOALS
* Make a 1.0 stable release as quickly as possible.
* Concentrate on community expansion.
* Expose a public C API.

CURRENT STATUS
Meritocracy
Our initial committer list includes two individuals (Peter Karman and
Nathan Kurz) who started off as KinoSearch users, demonstrated merit
through constructive forum participation, adept negotiation, consensus
building, and submission of high-quality contributions, and were invited
to become committers. Peter now rolls most releases.

We look forward to continuing to operate as a meritocracy under the
established traditions and rules of the ASF.

Community
Lucy's most active participants of late have been drawn from the
KinoSearch and Lucene communities. Having been focused on features and
technical goals for a long time, we are considerably overdue for a stable
release, and anticipate rapid growth in its wake.

Core Developers
* Marvin Humphrey is the project founder of KinoSearch, and co-founded
the existing Lucy sub-project. He is presently employed by Eventful,
Inc.
* Peter Karman has

Re: [VOTE] Accept Whirr for Incubation

2010-05-07 Thread Doug Cutting

,
Cassandra, and hopefully more.

== Known Risks ==
=== Orphaned products ===
There is a risk that Whirr will not gain adoption. However, the
current Hadoop scripts seem to be fairly widely used. The small number
of initial committers is also a risk, although by starting the project
it is expected that new contributors will quickly be attracted to the
project and help it grow.

=== Inexperience with Open Source ===
The initial code comes from Hadoop where it was developed in an
open-source, collaborative way. All the initial committers are
committers on other Apache projects, and are experienced in working
with new contributors.

=== Homogenous Developers ===
The initial set of committers is from a diverse set of organizations,
and geographic locations. They are all experienced with developing in
a distributed environment.

=== Reliance on Salaried Developers ===
It is expected that Whirr will be developed on salaried and volunteer
time, although all of the initial developers will work on it mainly on
salaried time.

=== Relationships with Other Apache Products ===
Whirr will depend on many other Apache projects as already mentioned
above (e.g. Hadoop, !ZooKeeper). If the project develops some common
infrastructure then it is possible that it becomes a dependency of a
project that wishes to use that infrastructure for running in the
cloud.

=== An Excessive Fascination with the Apache Brand ===
We think that Whirr will benefit from the community sharing ideas and
best practices for running cloud services. The ASF does a great job at
building communities, which is why we want to build Whirr at Apache.

== Documentation ==
Information on the current scripts and general background can be found at
  * http://wiki.apache.org/hadoop/AmazonEC2
  * http://archive.cloudera.com/docs/ec2.html
  * http://hbase.s3.amazonaws.com/hbase/HBase-EC2-HUG9.pdf
  * http://www.slideshare.net/steve_l/new-roles-for-the-cloud

== Initial Source ==
  * http://svn.apache.org/viewvc/hadoop/common/trunk/src/contrib/cloud/
  * http://github.com/tomwhite/whirr

== Source and Intellectual Property Submission Plan ==
The initial source is already in an Apache project's SVN repository
(Hadoop), so there should be no action required here.

== External Dependencies ==
The existing external dependencies all have Apache compatible
licenses: boto (MIT), libcloud (Apache 2.0), simplejson (MIT). Jclouds
is not a dependency of the current source, but it is Apache 2.0
licensed, so it will be possible to use it in the future if required.

== Cryptography ==
Whirr uses standard APIs and tools for SSH and SSL.

== Required Resources ==
=== Mailing lists ===
  * whirr-private (with moderated subscriptions)
  * whirr-dev
  * whirr-commits
  * whirr-user

=== Subversion Directory ===
  * https://svn.apache.org/repos/asf/incubator/whirr

=== Issue Tracking ===
  * JIRA Whirr (WHIRR)

=== Other Resources ===
The existing code already has unit and integration tests so we would
like a Hudson instance to run them whenever a new patch is submitted.
This can be added after project creation.

== Initial Committers ==
  * Tom White (tomwhite at apache dot org)
  * Andrew Purtell (apurtell at apache dot org)
  * Johan Oskarsson (johan at apache dot org)
  * Steve Loughran (stevel at apache dot org)
  * Patrick Hunt (phunt at apache dot org)

== Affiliations ==
  * Tom White, Cloudera
  * Andrew Purtell, Trend Micro
  * Johan Oskarsson, Twitter
  * Steve Loughran, HP Labs
  * Patrick Hunt, Yahoo!

== Sponsors ==
=== Champion ===
  * Tom White

=== Nominated Mentors ===
  * Doug Cutting
  * Tom White
  * Steve Loughran

=== Sponsoring Entity ===
  * Incubator PMC


Re: [PROPOSAL] Whirr Project

2010-04-22 Thread Doug Cutting


Tom White wrote:

The proposal is on the incubator wiki at
http://wiki.apache.org/incubator/WhirrProposal.


This sounds useful to me.  I'd be willing to help mentor.

Doug


Re: [VOTE] Apache Traffic Server as a TLP

2010-04-09 Thread Doug Cutting


+1 Traffic Server appears to have met graduation requirements.

Doug

Bryan Call wrote:

Greetings,

As no issues have been raised in our previous post to discuss graduation,
the Apache Traffic Server community requests that the IPMC vote on
recommending this resolution to the ASF Board.

Traffic Server community vote to graduate:
http://www.mail-archive.com/trafficserver-...@incubator.apache.org/msg01750.html 



Incubation status:
http://incubator.apache.org/projects/trafficserver.html

Please cast your vote:
[ ] +1 to recommend Traffic Server's graduation
[ ]  0 don't care
[ ] -1 no, don't recommend yet, (because...)

-Bryan Call






Re: Question on tlps using incubator releases

2010-03-23 Thread Doug Cutting


Patrick Hunt wrote:
> Are there any issues with Apache TLPs using incubator releases? I've
> heard, but cannot find any official documentation, that TLPs should not.
> Is this really the case? Are there any rules/guidelines for this?


I don't think this is a problem.  One project can even release otherwise 
unreleased code from another project, but, in doing so, it takes on the 
onus of ensuring that the code was obtained legally, is licensed 
correctly, etc.  Incubator releases are approved by the Incubator PMC, 
and have had as much (or more) legal review as releases by other TLPs.


Doug


Re: [VOTE] Approve the release of apache-trafficserver-incubating-2.0.0-alpha

2010-03-10 Thread Doug Cutting


+1

Doug

Leif Hedstrom wrote:

Hi,

Trying this again, the artifact is still the same as the previous 
attempt, but I'm trying to improve this request email,  to hopefully get 
a few more votes.


The Traffic Server PPMC has voted on and approved the release of TS 
v2.0.0-alpha. We would now like to request the approval from the 
Incubator PMC for this release. The original vote thread is


Message-ID: 4b8d5deb.2050...@apache.org


The release candidate artifact and checksums / signature can be found at:

http://people.apache.org/~zwoop/
http://people.apache.org/~zwoop/apache-trafficserver-incubating-2.0.0-alpha.tar.bz2
http://people.apache.org/~zwoop/apache-trafficserver-incubating-2.0.0-alpha.tar.bz2.md5
http://people.apache.org/~zwoop/apache-trafficserver-incubating-2.0.0-alpha.tar.bz2.sha1
http://people.apache.org/~zwoop/apache-trafficserver-incubating-2.0.0-alpha.tar.bz2.asc




The .asc signature is signed by me (Leif Hedstrom (CODE SIGNING KEY) 
zw...@apache.org, fingerprint is 5D7BBC5A). I have verified that this 
key is available on at least two key servers, and the KEYS file for the 
project includes it as well:


http://www.apache.org/dist/incubator/trafficserver/KEYS


The checksums of the release artifact are

md5sum: 18f914f3873bc4d22c5f3115b9db011f
sha1sum: f735a19a70c2aa69d97d4ea239a5192258dadf17


Note that this release only supports the Linux platform, but has been 
tested on a number of distros, both 32- and 64-bit. Apache TS v2.2.0 
will support many more Unix flavours (trunk already does), but no 
support for Windows is planned thus far. If someone wants to start 
working on that, most of the code base is Windows compatible, just needs 
a little tender love and care.


The artifact is only released as a bzip2'd tar-ball, and probably needs 
to be analyzed using tools available on a Unix platform (I'm not a 
Windows person, so I don't know what is available there).


Please vote on releasing this package as Apache 
trafficserver-incubating-2.0.0-alpha:


[  ] +1 Publish
[  ]   0 Abstain
[  ] -1 Don't publish, because...


Below is a summary of the vote on the mailing list.

Thanks,

-- Leif


Subject:  [RESULT] [VOTE] Release candidate for Traffic Server 2.0.0-alpha

The vote for release passes with six +1 votes, and no -1 or 0 votes. I  
will request for the IPMC to release this.


Thanks!

-- leif

[+1] Leif Hedstrom
[+1] George Paul
[+1] John Plevyak
[+1] Bryan Call
[+1] Jean-Frederic Clere
[+1] Doug Cutting
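
A tally like this is easy to check mechanically. The sketch below, in Python,
counts votes from lines in the "[+1] Name" format used in this summary; the
function name and the regular expression are illustrative, not part of any
Apache tooling:

```python
import re
from collections import Counter

# Matches lines such as "[+1] Leif Hedstrom", "[-1] Someone" or "[0] Someone".
VOTE_LINE = re.compile(r"^\[\s*([+-]1|[+-]?0)\s*\]\s+(.+?)\s*$")


def tally_votes(lines):
    """Return a Counter keyed by "+1", "0" and "-1" for the vote lines found."""
    counts = Counter()
    for line in lines:
        match = VOTE_LINE.match(line.strip())
        if not match:
            continue  # Skip anything that is not a cast vote, e.g. an empty "[ ] +1" ballot.
        vote = match.group(1)
        # Treat "+0", "-0" and "0" as the same abstention.
        counts["0" if vote.lstrip("+-") == "0" else vote] += 1
    return counts
```

Applied to the six lines above it reports six +1 votes and no 0 or -1 votes,
which agrees with the stated result.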


Re: Droid IP clearance?

2010-02-22 Thread Doug Cutting


Thorsten Scherler wrote:

The initial code was based on Apache Nutch, so all the IP was cleared
there. The modifications that I made myself were all done as an
ASF committer. Some code was adopted from Henri Yandell's patch to
HttpComponents, which he had likewise cleared himself as an ASF
committer. The second version was a rewrite by various ASF committers
or based on patch submissions for which we have a software grant.

IMHO there are not (and never have been) any issues about IP clearance, but
ATM AFAIK there is no one actively pursuing the matter. Any help is highly
appreciated.


It sounds like there are in fact no Droid IP clearance issues.  Perhaps 
IP clearance was only mentioned as an issue in the board report because 
it had not yet been explicitly addressed by the podling?


Thanks,

Doug


Droid IP clearance?

2010-02-17 Thread Doug Cutting


Incubator PMC,

On reading this month's report to the Board from the Incubator, the 
Board was curious what, if anything, is blocking Droid's IP clearance 
process and requested that I look into this.  Is someone actively 
pursuing this?  If so, are there any difficulties in obtaining the 
clearance?


Thanks!

Doug



WSRP4J stuck?

2010-02-17 Thread Doug Cutting


Incubator PMC,

On reading this month's report to the Board from the Incubator, the 
Board was concerned about the inactivity in WSRP4J.  Has this project 
been abandoned, or is there some other explanation for the lack of any 
activity in the last three months?


Thanks!

Doug



avro in mapreduce

2010-01-27 Thread Doug Cutting


I would like to call folks' attention to MAPREDUCE-1126.

https://issues.apache.org/jira/browse/MAPREDUCE-1126

This is a key link in a series of issues involved in integrating Avro in 
Mapreduce.  Aaron proposed a design in early December, building on the 
design Tom developed last summer and committed in September in 
HADOOP-6165.  Aaron's design was approved, and, after several rounds of 
reviews, I committed Aaron's patch on 11 January.


On 15 January Owen reverted this commit without warning.  It seems that 
Owen objects to the path initiated last July in HADOOP-6165.


Aaron has also contributed MAPREDUCE-815, which permits one to use Avro 
for all phases of Mapreduce.  When that issue is committed, the primary 
chain of Avro integration into Mapreduce will be complete.


Can others please take the time to read this issue and express their 
opinions?


Thank you,

Doug


avro in mapreduce

2010-01-26 Thread Doug Cutting


I would like to call folks' attention to MAPREDUCE-1126.

https://issues.apache.org/jira/browse/MAPREDUCE-1126

This is a key link in a series of issues involved in integrating Avro in
Mapreduce.  Aaron proposed a design in early December, building on the
design Tom developed last summer and committed in September in
HADOOP-6165.  Aaron's design was approved, and, after several rounds of
reviews, I committed Aaron's patch on 11 January.

On 15 January Owen reverted this commit without warning.  It seems that
Owen objects to the path initiated last July in HADOOP-6165.

Aaron has also contributed MAPREDUCE-815, which permits one to use Avro
for all phases of Mapreduce.  When that issue is committed, the primary
chain of Avro integration into Mapreduce will be complete.

Can others please take the time to read this issue and express their
opinions?

Thank you,

Doug



Re: avro in mapreduce

2010-01-26 Thread Doug Cutting


Oops.  Wrong list.  Nothing to see here.

Doug

Doug Cutting wrote:

I would like to call folks' attention to MAPREDUCE-1126.

https://issues.apache.org/jira/browse/MAPREDUCE-1126

This is a key link in a series of issues involved in integrating Avro in
Mapreduce.  Aaron proposed a design in early December, building on the
design Tom developed last summer and committed in September in
HADOOP-6165.  Aaron's design was approved, and, after several rounds of
reviews, I committed Aaron's patch on 11 January.

On 15 January Owen reverted this commit without warning.  It seems that
Owen objects to the path initiated last July in HADOOP-6165.

Aaron has also contributed MAPREDUCE-815, which permits one to use Avro
for all phases of Mapreduce.  When that issue is committed, the primary
chain of Avro integration into Mapreduce will be complete.

Can others please take the time to read this issue and express their
opinions?

Thank you,

Doug



Re: Publishing api docs for Subversion

2009-12-07 Thread Doug Cutting


Branko Čibej wrote:

> Actually, we're talking about API documentation which in Subversion's
> case is generated from the sources, so yes, it is subject to release
> votes. But only for actual releases.
>
> Restricting the publishing of generated API documentation would imply
> that we should restrict access to ViewVC, too, because anyone can browse
> that exact same documentation through that, albeit formatted a bit
> differently.


No, that's not required.  The best practice for our official project 
websites is to distinguish content that's intended for the general 
public from content that's intended for developers.  We should only link 
to subversion and nightly builds from the developer portions of our 
sites.  We must, for example, not link to subversion or to nightly 
builds from the release download page.


We have a two-step process for licensing.  First, we try to make sure 
that things are suitable for release before we commit them to 
subversion.  But some things fall through the cracks in this step. 
Sometimes files lack license headers.  Sometimes, especially in the 
incubator, we'll commit something before we have all of the ICLAs on 
file.  Sometimes we commit software that we cannot release and must 
remove.  So as the second step, we review everything prior to release to 
double-check that the licensing is correct.


Between these two steps the content is technically accessible by anyone, 
but our legal claim is that we're not yet distributing it to the general 
public under the Apache license, but only to our developers as a working 
draft.  To support this claim, we only link to pre-release content from 
developer-specific portions of our sites.  ViewVC is a developer tool, 
not a tool we promote for use by the general public.


Posting content intended for release that has not cleared the second 
hurdle bypasses the release process and should be avoided.


We do not have this two-step process for non-released website content. 
Our project home pages are not formally released and we trust that folks 
will be careful about what's posted there.  But we should not use this
as grounds to publish to the general public, without a release vote,
content that is otherwise subject to release.


Doug




Re: Publishing api docs for Subversion

2009-12-07 Thread Doug Cutting


Branko Čibej wrote:

> So I'm not too clear on what your objections are.
> * Do you object to publishing non-released documentation on the
>   project Web pages?


I object to posting these outside of a clearly-marked developer portion 
of the project's web site.



>   Then you should start by cleaning up the
>   existing ASF TLPs; begin with HTTPD, for example.


I am here responding to a question about an incubating project, not 
about HTTPD.  I agree that HTTPD does not appear to be following 
best-practice here.  All other developer-specific links for HTTPD seem 
to be in http://httpd.apache.org/dev/.  This one should be too, although 
the sidebar link does at least have dev in it.



> * Do you object to publishing the link and not marking it as
>   development or unstable or whatever? AFAICS nobody suggested that.


I think the best practice is not merely to mark such links, but to put 
all developer-specific links in a separate section of the website for 
developers.


Doug


Re: How documentation != code, and how to do policy (was: Re: Publishing api docs for Subversion)

2009-12-07 Thread Doug Cutting


Leo Simons wrote:

> So, subversion publishes their trunk API docs nightly, for the
> convenience of their own developers and the surrounding tool developer
> community. All those people mostly want trunk API docs, and they want
> them mostly so they don't have to run doxygen themselves. There's
> really no need to protect the normal users of the subversion website
> from bad API docs, they won't be using those docs at all.


It's fine to make nightly builds available, including of documentation.
All I'm suggesting is that, just as nightly builds should not be
linked to from the general download page, nightly documentation should
not be linked to from the general documentation page.  Both, like links
to ViewVC, should only be linked to from developer-specific pages.



The best response in this case is probably to look for a similar
project around the ASF that has already figured out a similar process
and see if things are compatible. Like, httpd or apr. Ah, they do the
same. Cool, done.


Just because HTTPD or any other project does something does not always 
mean it's best practice.  It often does, but, in this case, I think 
adding "dev" to a link in the sidebar is a poor substitute for moving 
this link to http://httpd.apache.org/dev/.



If you have an idea about what the policy is, check your idea against
the extensive docs on www.apache.org/dev/ and incubator.apache.org. If
your idea is in there, point people at the documented policy.


I believe I cited this earlier in the thread:

http://www.apache.org/dev/release.html#what

Do not include any links on the project website that might encourage 
non-developers to download and use nightly builds, snapshots, release 
candidates, or any other similar package.


This is motivated by legal reasons.  Copyright and license issues are 
possible for documentation as well as code, so I see no reason to make 
an exception for nightly documentation builds.



Always remember the incubator is not here to invent policy and apply
it to incubating projects. The incubator is here to help incubating
projects navigate the ASF so they can create and distribute software
ASF style.


I'm not inventing policy.  I'm describing the way every project I'm 
involved with operates and interpreting the rules posted at 
http://www.apache.org/dev/.


Doug


Re: How documentation != code, and how to do policy (was: Re: Publishing api docs for Subversion)

2009-12-07 Thread Doug Cutting


Niall Pemberton wrote:

You're taking a
policy that applies to release artifacts and stretching it to
something it wasn't intended to cover.


Applying the rules for releases to significant subsets of releases 
doesn't seem like much of a stretch to me.  Subsets are subject to the 
same copyright and license concerns, the motivations for the rules.



In the absence of specific
policy then *objections* are out of order since it's up to the PMC of a
project to decide these things.


What?  I can't state what I believe to be a best practice?

I have not objected to anything.  Someone asked about posting 
pre-release documentation, and I remarked that, like pre-release code, 
they should keep it distant from released documentation, ideally only 
linked from the developer portion of their site.  Is that really a bad idea?


Doug


Re: How documentation != code, and how to do policy (was: Re: Publishing api docs for Subversion)

2009-12-07 Thread Doug Cutting


Doug Cutting wrote:

 In the absence of specific policy then *objections* are out of order
I have not objected to anything.


Forgive me.  I did in fact use the verb "object" in a prior message:


   * Do you object to publishing non-released documentation on the
 project Web pages?

I object to posting these outside of a clearly-marked developer portion of the 
project's web site.


I didn't mean this as a veto or formal objection. I was rather using a 
parallel construction to better make clear my view.  I meant this in the 
informal sense that I would argue this is not the best approach.


Sorry for any misunderstanding.

Doug


Re: Publishing api docs for Subversion

2009-12-07 Thread Doug Cutting


William A. Rowe Jr. wrote:

I suspect that renaming /docs/trunk/ to /docs/dev/ would be sufficient and
follow this best practice?


I don't know how much folks look at the URL, but I think I've heard Roy 
indicate that all developer-specific stuff should be under a dev/ URL.


I think it would be better yet not to link to it from the side bar, 
which appears on every page, but rather just from the 
http://httpd.apache.org/dev/ page.  If the primary point of posting it 
is so that developers can refer to it without having to build it 
themselves, it doesn't need to be posted so prominently, does it?


Doug


Re: Publishing api docs for Subversion

2009-12-07 Thread Doug Cutting


Joe Schaefer wrote:

Exactly.  That's the key difference between a release and a website, we
can't take the release back.


Good point.  We don't mirror the website on 3rd party sites like we do 
releases, nor does HTTPD currently package pre-release docs as an 
archive that folks might download and install locally.  So this is less 
risky than promoting complete nightly builds.  But what if a project 
starts posting the nightly documentation as a tarball, so that folks can 
access it while offline?


So I still worry that it sets a bad precedent to permit publishing a 
significant subset of a nightly build on a public website.  I as yet see 
no reason why it's a problem to link to it from the developer portion of 
the site, like links to subversion, except that developers might already 
be used to finding it on the primary site.  Which is precisely why, when 
a new project asks how to post its nightly documentation, we should tell 
them the best practice is to confine pre-release stuff to the developer 
portion of the site.  There they can post it as individual pages, 
archives, a big PDF or whatever.  We can keep this line clear: if it's 
content destined for release but that hasn't been released, it should 
only be available from the developer portion of the site.


Doug




Re: Publishing api docs for Subversion

2009-12-04 Thread Doug Cutting


Paul Querna wrote:

http://httpd.apache.org/docs/trunk/

Which is linked from the sidebar everywhere, and on the docs page:
http://httpd.apache.org/docs/


That trunk documentation is at least labelled "dev".  I'd argue it 
should only be linked to from http://httpd.apache.org/dev/ and that it 
should reside there.  That would be consistent with:


http://www.apache.org/dev/release.html#what

Do not include any links on the project website that might encourage 
non-developers to download and use nightly builds, snapshots, release 
candidates, or any other similar package.


Documentation that's released is released under the Apache license and 
is not in general strongly distinguished from code that's released: we 
require a CLA on file for documentation contributions, we require each 
documentation source file to contain the license, etc.  Publishing trunk 
documentation to the non-developer portion of a web site is 
encouragement for non-developers to use trunk, which is something we 
should avoid.


Doug


Re: Publishing api docs for Subversion

2009-12-04 Thread Doug Cutting


Niall Pemberton wrote:

It might be a good idea to not confuse users trying to find docs
that relate to a release from that of the current trunk, but it's
doing incubating projects a disservice to try and make out that
release policy covers the docs they publish on their web site.


Don't our release policies cover all the stuff that's in releases and 
derivations thereof?  What's the rationale to separate documentation 
from everything else that's in a release?



This is
something for the project to decide for themselves and we should keep
our noses out and avoid trying to find new ways to beat them up with
policy.


I'm actually trying to keep the policy straightforward and consistent: 
ASF content that's subject to release votes should only be published to 
non-developers after it's been released.  Posting trunk documentation 
seems very similar to posting a nightly build.  Both are permitted, but 
only via developer-specific pages, not in a way that can be 
construed as an official distribution.


Doug


Re: Publishing api docs for Subversion

2009-12-04 Thread Doug Cutting


Niall Pemberton wrote:

What we publish on the ASF websites
doesn't have to conform to the licensing policy that releases do.


I'm not talking about the website in general.  I'm talking specifically 
about publishing content primarily intended for inclusion in releases. 
Would we permit someone to mirror other files from trunk on the website? 
 What's special about documentation?  It comes from contributors, we 
require an ICLA or some other indication of intent to submit under the 
Apache license, its changes are subject to vetos, etc.  It's otherwise 
treated just like code.


Doug


Re: Publishing api docs for Subversion

2009-12-04 Thread Doug Cutting


Niall Pemberton wrote:

I would prefer what I say isn't distorted by selective editing.


Sorry, that was not my intent.


I'm not talking about the website in general.  I'm talking specifically
about publishing content primarily intended for inclusion in releases. Would


Publication & release are two different things - that's the point.


I don't see that yet.  Can you tell me more about the difference?  I use 
"publish", "distribute" and "release" more or less synonymously when 
referring to project content.  Subversion contains only working drafts.



we permit someone to mirror other files from trunk on the website?  What's


Yes and I bet every project provides a link to browse subversion which
itself is just another web site.


Yes, but such links are meant to be confined to developer-oriented 
pages.  We specifically do not encourage anyone but developers to use 
code in subversion.  We provide extra diligence for releases, and that 
only makes sense if we don't otherwise distribute their content to the 
general public.  Subversion is a service for our developers.


Doug


Re: Publishing api docs for Subversion

2009-12-03 Thread Doug Cutting


Paul Querna wrote:

httpd and apr have published doxygen of their trunks periodically,
they aren't based on any release.


Were these published on the official public website or in the dev/ 
section?


I was under the impression that released documentation should be treated 
similarly to released code.  The convention I've used is that stuff 
that's in trunk, stuff that's intended to be included in releases, is 
only published after release.  Other pages on the website that are not 
included in releases, e.g., the project's home page, are clearly 
published without a release vote.


In particular, I think it's a bad practice to publish automatic nightly 
builds on the official website of content that's otherwise the subject 
of release votes.  Is it forbidden?  Perhaps not, but it's not a 
practice we should encourage in the incubator.  Often documentation 
includes code.  Do we want to publish that code without a vote?


Doug


Re: Publishing api docs for Subversion

2009-12-02 Thread Doug Cutting


Bhuvaneswaran A wrote:

We tend to update the api docs generated using doxygen and java doc on
a nightly basis.


Unreleased artifacts should be linked only from the developer portion of 
the site and should not be hosted on the official project site.  You 
might, e.g., just link to them on the Hudson server rather than copy 
them to people.apache.org.  Documentation should only be published to 
the official website after it's been included in an Apache release. 
This is for legal reasons: we work hard to ensure that releases have 
licensing in order, but do not in general guarantee that licensing is 
correct at all times in source code repositories.


Doug

