Thanks Alan, I believe Sebastian had added me to the list.

- Henry

On Mon, Apr 7, 2014 at 7:59 AM, Alan Gates <ga...@hortonworks.com> wrote:
> Henry, definitely glad to have you on board.  Ashutosh Chauhan (new to the 
> IPMC) has also expressed his interest to me offline in being a mentor.  I’ll 
> add both of you the proposal.
>
> Alan.
>
> On Apr 6, 2014, at 10:34 AM, Henry Saputra <henry.sapu...@gmail.com> wrote:
>
>> Hi Guys,
>>
>> The proposal looks great and I would love to help to sign up as a
>> Mentor if you guys still have space for one.
>>
>>
>> - Henry
>>
>>
>> On Sun, Mar 30, 2014 at 12:14 AM, Alan Gates <ga...@hortonworks.com> wrote:
>>> I would like to propose Stratosphere as an Apache Incubator project.  I 
>>> have posted the proposal to 
>>> https://wiki.apache.org/incubator/StratosphereProposal and posted the text 
>>> of the proposal below.
>>>
>>> Alan.
>>>
>>> = Stratosphere =
>>>
>>> == Abstract ==
>>> Stratosphere is an open source system for parallel data analysis. 
>>> Stratosphere deeply integrates MapReduce and database technologies to 
>>> provide expressive and optimizable programming interfaces and at the same 
>>> time efficient and scalable execution.
>>>
>>> == Proposal ==
>>> Stratosphere is an open source system for expressive, declarative, fast, 
>>> and efficient data analysis. Stratosphere combines the scalability and 
>>> programming flexibility of distributed MapReduce-like platforms with the 
>>> efficiency, out-of-core execution, and query optimization capabilities 
>>> found in parallel databases.
>>>
>>> == Background ==
>>> There is currently a need for general-purpose cluster computing platforms 
>>> that are compatible with the Hadoop ecosystem, are more efficient, easier 
>>> to use, and can support more applications than Hadoop MapReduce, but are 
>>> not restricted to a specific data model and language (such as the 
>>> relational model and a variant of SQL). Stratosphere fulfils these needs.
>>>
>>> Stratosphere exposes expressive APIs in Java and Scala (conceptually 
>>> similar to Spark, Cascading, Scalding) that allow arbitrary user-defined 
>>> functions in the same language and data model that the program is written 
>>> in. Stratosphere programs pass through a cost-based optimizer that finds 
>>> the best execution path for these programs depending on the data and 
>>> cluster characteristics. The design and implementation of Stratosphere is 
>>> based on research that generalizes query optimizers in relational 
>>> databases. Stratosphere has a distributed runtime that is architected upon 
>>> the principles of parallel databases, providing true pipelining (a basis 
>>> for stream processing) and efficient out-of-core algorithms for grouping, 
>>> sorting, joining, and aggregating data. Stratosphere provides first-class 
>>> support for iterative algorithms via a built-in iterate operator, covering 
>>> Machine Learning and graph analysis use cases. It achieves performance 
>>> similar to Apache Giraph without being a specialized graph processing 
>>> system.
>>>
>>> Stratosphere has undergone three major releases (v0.1, v0.2, v0.4) and some 
>>> minor ones.
>>>
>>> == Rationale ==
>>> Stratosphere started out in 2008 as a research project by the Technical 
>>> University of Berlin, the Humboldt University of Berlin, and the Hasso 
>>> Plattner Institute, and has received subsequent funding from the German 
>>> Research Council, the European Institute of Innovation and Technology, the 
>>> European Commision, and industry.
>>>
>>> The traction of Stratosphere has by far exceeded our initial expectations, 
>>> and we are therefore seeking an organizational long-term home for 
>>> Stratosphere beyond the University walls that will house and further 
>>> encourage contributors from companies and other organizations that are 
>>> interested in Stratosphere. We believe that the Apache Software Foundation 
>>> is the ideal home for Stratosphere. Stratosphere integrates with several 
>>> existing Apache projects, such as HDFS, YARN, HBase, and Avro. The team is 
>>> familiar with the Apache processes and fully subscribes to the Apache 
>>> mission. One of the proposing members is a long-time Apache contributor and 
>>> PMC member.
>>>
>>> == Initial Goals ==
>>> * Move the existing codebase to Apache
>>> * Integrate with the Apache development process
>>> * Ensure all dependencies are compliant with Apache License version 2.0
>>> * Incremental development and releases per Apache guidelines
>>>
>>>
>>> == Current Status ==
>>> === Meritocracy ===
>>> Stratosphere operated on meritocratic principles from the get go. The 
>>> initial project proposal submitted to the German Research Council
>>> in 2008 stated that all code developed in the project will be released as 
>>> open source under the Apache 2 license. Currently, all the
>>> discussions pertaining to Stratosphere development are public on 
>>> [[https://github.com/stratosphere/stratosphere|GitHub]]  and our 
>>> [[https://groups.google.com/forum/#!forum/stratosphere-dev|mailing list]]. 
>>> The current incubation proposal includes the major code contributors to 
>>> Stratosphere. Several additional people have worked on the Stratosphere 
>>> codebase for research prototypes and industry use cases and would be 
>>> interested in becoming committers. We are starting with a small committer 
>>> group and we plan to add additional committers following an open 
>>> merit-based decision process during the incubation phase.
>>>
>>> === Community ===
>>> Currently, the core of Stratosphere is developed at TU Berlin, mainly by 
>>> the committers listed in this proposal. Additional people from several 
>>> Universities and companies in Europe are working with Stratosphere and are 
>>> interested in becoming committers to the project.
>>>
>>> During the years, Stratosphere has been adopted as a platform for research 
>>> and teaching in several Universities (TU Berlin, HU Berlin, HPI, RWTH, 
>>> Inria, KTH, U. Trento, UCSD, and others), and it is currently witnessing 
>>> its first industrial installations. We are seeing a rapidly growing 
>>> interest in Stratosphere by both startups and large companies, as well as a 
>>> growing community (our first 
>>> [[http://stratosphere.eu/events/2013/summit.html|Stratosphere Summit]] in 
>>> November 2013 attracted over 80 participants). Stratosphere was recently 
>>> accepted as a mentoring organization in Google Summer of Code 2014.
>>>
>>> We believe that acceptance in the Apache Software Foundation will 
>>> consolidate the current community under one organizational umbrella, and 
>>> most importantly accelerate the growth of the community.
>>>
>>> === Core developers ===
>>> The core developers of the system are Stephan Ewen, Fabian Hueske, Daniel 
>>> Warneke, Robert Metzger, Ufuk Celebi, and Aljoscha Krettek, who are all 
>>> committers in the current proposal.
>>>
>>> === Alignment ===
>>> Stratosphere is compatible with, and related to several Apache projects. 
>>> Stratosphere re-uses parts of Apache Hadoop, in particular HDFS and YARN, 
>>> as well as Apache HBase and Apache Avro. Stratosphere is a very good 
>>> compilation target for query languages such as Apache Hive and Apache Pig.
>>>
>>> == Known Risks ==
>>> === Orphaned Products ===
>>> There is strong interest in Stratosphere by several companies and 
>>> organizations, and there is currently a long-term commitment to fund 
>>> salaried developers for Stratosphere by public and private organizations in 
>>> Europe.
>>>
>>> === Inexperience with Open Source ===
>>> Sebastian Schelter is a committer and PMC member of Apache Mahout and 
>>> Apache Giraph, member of the Apache Software Foundation, member of the 
>>> Incubator PMC and project mentor for Apache Drill. Sebastian, along with 
>>> our mentors, will guide the rest of the committers that have experience 
>>> with releasing software as open source but little experience in 
>>> participating in an open source project besides Stratosphere itself.
>>>
>>> In mid-2013 Stratosphere transitioned from an “open source project with 
>>> publicly accessible source code” to an open source project that puts the 
>>> community first. We moved from a University-hosted git repository to 
>>> GitHub, where we discuss all issues publicly. This also includes release 
>>> planning (via GitHub’s milestone feature) and code reviews. We also moved 
>>> our build system to the publicly available Travis-CI. The mailing lists are 
>>> hosted with Google Groups, we use the public Maven repository 
>>> infrastructure of Sonatype. The source code of the www.stratosphere.eu 
>>> website is publicly available and is meant to be changed by external 
>>> contributors (for example for documentation purposes).
>>>
>>> === Homogeneous Developers ===
>>> Most committers in this proposal belong to the same institution (TU 
>>> Berlin). The engagement of these committers goes well beyond the necessary 
>>> development to support research, and all committers work on Stratosphere in 
>>> their free time. Several people from other institutions are working on and 
>>> are familiar with the Stratosphere codebase. We will work to attract them 
>>> as future committers during the incubation phase, following a merit-based 
>>> approach.
>>>
>>> === Reliance on Salaried Developers ===
>>> Currently, Stratosphere receives support from salaried developers, in 
>>> particular from graduate students at TU Berlin that are funded by the 
>>> German Research Council, the European Institute of Technology, and the 
>>> European Commission. These students work in their free time on Stratosphere 
>>> in addition to their employment.
>>>
>>> We expect that Stratosphere development will occur on both salaried and 
>>> volunteer time. We will recruit additional committers, including 
>>> non-salaried developers, and we will work to ensure that the project will 
>>> move forward independently of salaried developers.
>>>
>>> === Relationship with Other Apache Products ===
>>> Stratosphere interfaces with several existing Apache projects: Apache HBase 
>>> for storage, Apache Hadoop (HDFS for storage, YARN for resource management, 
>>> and Stratosphere contains a generic wrapper for Hadoop MapReduce input 
>>> formats), and Apache Avro (for serialization). Stratosphere uses Apache 
>>> Maven and Apache Commons libraries internally. Stratosphere can be a great 
>>> compilation target for Apache Pig and Apache Hive, although such 
>>> functionality is not yet implemented.
>>>
>>> Stratosphere is also related with several projects undergoing incubation in 
>>> the Apache Incubation project, such as Tez, Drill, and Spark (graduated). 
>>> While all these projects target sufficiently different spaces and have 
>>> different architectures, it would be interesting to explore code reuse 
>>> possibilities. For example, we are currently basing our design for 
>>> compiling SQL to Stratosphere on the Optiq library, also used by Apache 
>>> Drill.
>>>
>>> === An Excessive Fascination with the Apache Brand ===
>>> We believe that the Apache brand will help us attract contributors to 
>>> Stratosphere, by giving us a well-defined, transparent development process 
>>> under a known brand. At the same time, Stratosphere already has a healthy 
>>> community and current funding guarantees the further codebase development 
>>> and growth of the project for the next 3-5 years. The reason for this 
>>> proposal is not to gain publicity, but to further strengthen the longevity 
>>> of the project as explained in the Rationale section.
>>>
>>> == Documentation ==
>>> * [[https://stratosphere.eu|Project website]]
>>> * [[http://stratosphere.eu/docs/0.4/|Documentation]]
>>> * [[https://github.com/stratosphere/stratosphere|Codebase]]
>>> * [[https://groups.google.com/forum/#!forum/stratosphere-dev|Mailing list]]
>>>
>>> == Initial Source ==
>>> Stratosphere is hosted on 
>>> [[https://github.com/stratosphere/stratosphere|GitHub]] . This is the 
>>> codebase that we will migrate to the Apache Foundation. The code was 
>>> previously hosted on a TU Berlin’s own git infrastructure. It has always 
>>> been Apache 2.0 licensed.
>>>
>>> === Source and Intellectual Property Submission Plan ===
>>> All initial and past committers will sign a CLA with the ASF while the 
>>> incubator proposal for Stratosphere is being discussed. All organizations 
>>> that have employed Stratosphere contributors in the past will sign a SGA. 
>>> Current contributors will sign a CCLA. All major contributors are still 
>>> active in the project.
>>>
>>> === External Dependencies ===
>>> All critical dependencies are, to the extend of our knowledge, from other 
>>> Apache projects. These include Apache Hadoop (for YARN and HDFS) and some 
>>> libraries (log4j, commons codec, junit and more). Our web frontend uses 
>>> some MIT-licensed JavaScript libraries.
>>>
>>> == Required Resources ==
>>>
>>> === Mailing list ===
>>> We will migrate our mailing lists to the following:
>>> * us...@stratosphere.incubator.apache.org
>>> * d...@stratosphere.incubator.apache.org
>>> * priv...@stratosphere.incubator.apache.org
>>> * comm...@stratosphere.incubator.apache.org
>>>
>>> === Source control ===
>>> We would like to use Git for source control and enable GitHib mirroring 
>>> functionality, where code reviews on GitHub are automatically
>>> forwarded to the developer mailing list. (See also: 
>>> [[https://blogs.apache.org/infra/entry/improved_integration_between_apache_and]])
>>>
>>>
>>> === Issue tracking ===
>>> We are currently using GitHub for issue tracking. We request an 
>>> Apache-hosted JIRA, and we will import existing issues there.
>>>
>>>
>>> == Initial committers ==
>>> * Stephan Ewen - stephan.e...@tu-berlin.de
>>> * Fabian Hueske - fabian.hue...@tu-berlin.de
>>> * Daniel Warneke - warn...@posteo.de
>>> * Robert Metzger - metrob...@gmail.com
>>> * Ufuk Celebi - u.cel...@fu-berlin.de
>>> * Aljoscha Krettek - aljoscha.kret...@gmail.com
>>> * Kostas Tzoumas - kostas.tzou...@tu-berlin.de
>>> * Sebastian Schelter  - s...@apache.org
>>>
>>> === Affiliations ===
>>> * Stephan Ewen (TU Berlin)
>>> * Fabian Hueske (TU Berlin)
>>> * Daniel Warneke (Amadeus IT Group)
>>> * Robert Metzger (TU Berlin)
>>> * Ufuk Celebi (FU Berlin)
>>> * Aljoscha Krettek (TU Berlin)
>>> * Kostas Tzoumas (TU Berlin)
>>> * Sebastian Schelter (TU Berlin)
>>>
>>> == Sponsors ==
>>> === Champion ===
>>> Alan Gates (ga...@apache.org)
>>>
>>> === Nominated Mentors ===
>>> * Sean Owen (sro...@apache.org) (Note: Sean is an Apache member but not 
>>> currently on the IPC, he will need to request IPMC membership)
>>> * Ted Dunning (tdunn...@apache.org)
>>> * Owen O'Malley (omal...@apache.org)
>>>
>>> === Sponsoring Entity ===
>>> The Apache Incubator
>>>
>>>
>>> --
>>> CONFIDENTIALITY NOTICE
>>> NOTICE: This message is intended for the use of the individual or entity to
>>> which it is addressed and may contain information that is confidential,
>>> privileged and exempt from disclosure under applicable law. If the reader
>>> of this message is not the intended recipient, you are hereby notified that
>>> any printing, copying, dissemination, distribution, disclosure or
>>> forwarding of this communication is strictly prohibited. If you have
>>> received this communication in error, please contact the sender immediately
>>> and delete it from your system. Thank You.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>>> For additional commands, e-mail: general-h...@incubator.apache.org
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>> For additional commands, e-mail: general-h...@incubator.apache.org
>>
>
>
> --
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity to
> which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org

Reply via email to