I would suggest mesosphere, but... Apache Mesos! Haha. What about Troposphere, or Tropos? ;)
On 31 March 2014 11:28, Afkham Azeez <[email protected]> wrote: > > > > On Mon, Mar 31, 2014 at 2:52 AM, Sanjiva Weerawarana <[email protected]> > wrote: >> >> Hmmm will Stratos and Stratosphere be confusing? If so we should give >> feedback on the name ... > > > This will definitely be confusing, IMO > >> >> >> ---------- Forwarded message ---------- >> From: Alan Gates <[email protected]> >> Date: Sun, Mar 30, 2014 at 11:14 AM >> Subject: [PROPOSAL] Stratosphere >> To: [email protected] >> >> >> I would like to propose Stratosphere as an Apache Incubator project. I >> have posted the proposal to >> https://wiki.apache.org/incubator/StratosphereProposal and posted the text >> of the proposal below. >> >> Alan. >> >> = Stratosphere = >> >> == Abstract == >> Stratosphere is an open source system for parallel data analysis. >> Stratosphere deeply integrates MapReduce and database technologies to >> provide expressive and optimizable programming interfaces and at the same >> time efficient and scalable execution. >> >> == Proposal == >> Stratosphere is an open source system for expressive, declarative, fast, >> and efficient data analysis. Stratosphere combines the scalability and >> programming flexibility of distributed MapReduce-like platforms with the >> efficiency, out-of-core execution, and query optimization capabilities found >> in parallel databases. >> >> == Background == >> There is currently a need for general-purpose cluster computing platforms >> that are compatible with the Hadoop ecosystem, are more efficient, easier to >> use, and can support more applications than Hadoop MapReduce, but are not >> restricted to a specific data model and language (such as the relational >> model and a variant of SQL). Stratosphere fulfils these needs. 
>> >> Stratosphere exposes expressive APIs in Java and Scala (conceptually >> similar to Spark, Cascading, Scalding) that allow arbitrary user-defined >> functions in the same language and data model that the program is written >> in. Stratosphere programs pass through a cost-based optimizer that finds the >> best execution path for these programs depending on the data and cluster >> characteristics. The design and implementation of Stratosphere is based on >> research that generalizes query optimizers in relational databases. >> Stratosphere has a distributed runtime that is architected upon the >> principles of parallel databases, providing true pipelining (a basis for >> stream processing) and efficient out-of-core algorithms for grouping, >> sorting, joining, and aggregating data. Stratosphere provides first-class >> support for iterative algorithms via a built-in iterate operator, covering >> Machine Learning and graph analysis use cases. It achieves performance >> similar to Apache Giraph without being a specialized graph processing >> system. >> >> Stratosphere has undergone three major releases (v0.1, v0.2, v0.4) and >> some minor ones. >> >> == Rationale == >> Stratosphere started out in 2008 as a research project by the Technical >> University of Berlin, the Humboldt University of Berlin, and the Hasso >> Plattner Institute, and has received subsequent funding from the German >> Research Council, the European Institute of Innovation and Technology, the >> European Commission, and industry. >> >> The traction of Stratosphere has by far exceeded our initial expectations, >> and we are therefore seeking an organizational long-term home for >> Stratosphere beyond the University walls that will house and further >> encourage contributors from companies and other organizations that are >> interested in Stratosphere. We believe that the Apache Software Foundation >> is the ideal home for Stratosphere. 
Stratosphere integrates with several >> existing Apache projects, such as HDFS, YARN, HBase, and Avro. The team is >> familiar with the Apache processes and fully subscribes to the Apache >> mission. One of the proposing members is a long-time Apache contributor and >> PMC member. >> >> == Initial Goals == >> * Move the existing codebase to Apache >> * Integrate with the Apache development process >> * Ensure all dependencies are compliant with Apache License version 2.0 >> * Incremental development and releases per Apache guidelines >> >> >> == Current Status == >> === Meritocracy === >> Stratosphere operated on meritocratic principles from the get go. The >> initial project proposal submitted to the German Research Council >> in 2008 stated that all code developed in the project will be released as >> open source under the Apache 2 license. Currently, all the >> discussions pertaining to Stratosphere development are public on >> [[https://github.com/stratosphere/stratosphere|GitHub]] and our >> [[https://groups.google.com/forum/#!forum/stratosphere-dev|mailing list]]. >> The current incubation proposal includes the major code contributors to >> Stratosphere. Several additional people have worked on the Stratosphere >> codebase for research prototypes and industry use cases and would be >> interested in becoming committers. We are starting with a small committer >> group and we plan to add additional committers following an open merit-based >> decision process during the incubation phase. >> >> === Community === >> Currently, the core of Stratosphere is developed at TU Berlin, mainly by >> the committers listed in this proposal. Additional people from several >> Universities and companies in Europe are working with Stratosphere and are >> interested in becoming committers to the project. >> >> During the years, Stratosphere has been adopted as a platform for research >> and teaching in several Universities (TU Berlin, HU Berlin, HPI, RWTH, >> Inria, KTH, U. 
Trento, UCSD, and others), and it is currently witnessing its >> first industrial installations. We are seeing a rapidly growing interest in >> Stratosphere by both startups and large companies, as well as a growing >> community (our first >> [[http://stratosphere.eu/events/2013/summit.html|Stratosphere Summit]] in >> November 2013 attracted over 80 participants). Stratosphere was recently >> accepted as a mentoring organization in Google Summer of Code 2014. >> >> We believe that acceptance in the Apache Software Foundation will >> consolidate the current community under one organizational umbrella, and >> most importantly accelerate the growth of the community. >> >> === Core developers === >> The core developers of the system are Stephan Ewen, Fabian Hueske, Daniel >> Warneke, Robert Metzger, Ufuk Celebi, and Aljoscha Krettek, who are all >> committers in the current proposal. >> >> === Alignment === >> Stratosphere is compatible with, and related to several Apache projects. >> Stratosphere re-uses parts of Apache Hadoop, in particular HDFS and YARN, as >> well as Apache HBase and Apache Avro. Stratosphere is a very good >> compilation target for query languages such as Apache Hive and Apache Pig. >> >> == Known Risks == >> === Orphaned Products === >> There is strong interest in Stratosphere by several companies and >> organizations, and there is currently a long-term commitment to fund >> salaried developers for Stratosphere by public and private organizations in >> Europe. >> >> === Inexperience with Open Source === >> Sebastian Schelter is a committer and PMC member of Apache Mahout and >> Apache Giraph, member of the Apache Software Foundation, member of the >> Incubator PMC and project mentor for Apache Drill. Sebastian, along with our >> mentors, will guide the rest of the committers that have experience with >> releasing software as open source but little experience in participating in >> an open source project besides Stratosphere itself. 
>> >> In mid-2013 Stratosphere transitioned from an "open source project with >> publicly accessible source code" to an open source project that puts the >> community first. We moved from a University-hosted git repository to GitHub, >> where we discuss all issues publicly. This also includes release planning >> (via GitHub's milestone feature) and code reviews. We also moved our build >> system to the publicly available Travis-CI. The mailing lists are hosted >> with Google Groups, we use the public Maven repository infrastructure of >> Sonatype. The source code of the www.stratosphere.eu website is publicly >> available and is meant to be changed by external contributors (for example >> for documentation purposes). >> >> === Homogeneous Developers === >> Most committers in this proposal belong to the same institution (TU >> Berlin). The engagement of these committers goes well beyond the necessary >> development to support research, and all committers work on Stratosphere in >> their free time. Several people from other institutions are working on and >> are familiar with the Stratosphere codebase. We will work to attract them as >> future committers during the incubation phase, following a merit-based >> approach. >> >> === Reliance on Salaried Developers === >> Currently, Stratosphere receives support from salaried developers, in >> particular from graduate students at TU Berlin that are funded by the German >> Research Council, the European Institute of Technology, and the European >> Commission. These students work in their free time on Stratosphere in >> addition to their employment. >> >> We expect that Stratosphere development will occur on both salaried and >> volunteer time. We will recruit additional committers, including >> non-salaried developers, and we will work to ensure that the project will >> move forward independently of salaried developers. 
>> >> === Relationship with Other Apache Products === >> Stratosphere interfaces with several existing Apache projects: Apache >> HBase for storage, Apache Hadoop (HDFS for storage, YARN for resource >> management, plus a generic wrapper for Hadoop MapReduce >> input formats), and Apache Avro (for serialization). Stratosphere uses >> Apache Maven and Apache Commons libraries internally. Stratosphere can be a >> great compilation target for Apache Pig and Apache Hive, although such >> functionality is not yet implemented. >> >> Stratosphere is also related to several projects undergoing incubation >> in the Apache Incubator, such as Tez, Drill, and Spark (graduated). >> While all these projects target sufficiently different spaces and have >> different architectures, it would be interesting to explore code reuse >> possibilities. For example, we are currently basing our design for compiling >> SQL to Stratosphere on the Optiq library, also used by Apache Drill. >> >> === An Excessive Fascination with the Apache Brand === >> We believe that the Apache brand will help us attract contributors to >> Stratosphere, by giving us a well-defined, transparent development process >> under a known brand. At the same time, Stratosphere already has a healthy >> community, and current funding guarantees the further codebase development >> and growth of the project for the next 3-5 years. The reason for this >> proposal is not to gain publicity, but to further strengthen the longevity >> of the project as explained in the Rationale section. >> >> == Documentation == >> * [[https://stratosphere.eu|Project website]] >> * [[http://stratosphere.eu/docs/0.4/|Documentation]] >> * [[https://github.com/stratosphere/stratosphere|Codebase]] >> * [[https://groups.google.com/forum/#!forum/stratosphere-dev|Mailing >> list]] >> >> == Initial Source == >> Stratosphere is hosted on >> [[https://github.com/stratosphere/stratosphere|GitHub]] . 
This is the >> codebase that we will migrate to the Apache Foundation. The code was >> previously hosted on TU Berlin's own git infrastructure. It has always >> been Apache 2.0 licensed. >> >> === Source and Intellectual Property Submission Plan === >> All initial and past committers will sign a CLA with the ASF while the >> incubator proposal for Stratosphere is being discussed. All organizations >> that have employed Stratosphere contributors in the past will sign an SGA. >> Current contributors will sign a CCLA. All major contributors are still >> active in the project. >> >> === External Dependencies === >> All critical dependencies are, to the extent of our knowledge, from other >> Apache projects. These include Apache Hadoop (for YARN and HDFS) and some >> libraries (log4j, commons codec, junit and more). Our web frontend uses some >> MIT-licensed JavaScript libraries. >> >> == Required Resources == >> >> === Mailing list === >> We will migrate our mailing lists to the following: >> * [email protected] >> * [email protected] >> * [email protected] >> * [email protected] >> >> === Source control === >> We would like to use Git for source control and enable GitHub mirroring >> functionality, where code reviews on GitHub are automatically >> forwarded to the developer mailing list. (See also: >> [[https://blogs.apache.org/infra/entry/improved_integration_between_apache_and]]) >> >> >> === Issue tracking === >> We are currently using GitHub for issue tracking. We request an >> Apache-hosted JIRA, and we will import existing issues there. 
>> >> >> == Initial committers == >> * Stephan Ewen - [email protected] >> * Fabian Hueske - [email protected] >> * Daniel Warneke - [email protected] >> * Robert Metzger - [email protected] >> * Ufuk Celebi - [email protected] >> * Aljoscha Krettek - [email protected] >> * Kostas Tzoumas - [email protected] >> * Sebastian Schelter - [email protected] >> >> === Affiliations === >> * Stephan Ewen (TU Berlin) >> * Fabian Hueske (TU Berlin) >> * Daniel Warneke (Amadeus IT Group) >> * Robert Metzger (TU Berlin) >> * Ufuk Celebi (FU Berlin) >> * Aljoscha Krettek (TU Berlin) >> * Kostas Tzoumas (TU Berlin) >> * Sebastian Schelter (TU Berlin) >> >> == Sponsors == >> === Champion === >> Alan Gates ([email protected]) >> >> === Nominated Mentors === >> * Sean Owen ([email protected]) (Note: Sean is an Apache member but not >> currently on the IPMC; he will need to request IPMC membership) >> * Ted Dunning ([email protected]) >> * Owen O'Malley ([email protected]) >> >> === Sponsoring Entity === >> The Apache Incubator >> >> >> -- >> Sanjiva Weerawarana, Ph.D. 
>> Founder, Chairman & CEO; WSO2, Inc.; http://wso2.com/ >> email: [email protected]; office: (+1 650 745 4499 | +94 11 214 5345) >> x5700; cell: +94 77 787 6880 | +1 408 466 5099; voip: +1 650 265 8311 >> blog: http://sanjiva.weerawarana.org/; twitter: @sanjiva >> Lean . Enterprise . Middleware > > > > > -- > Afkham Azeez > Director of Architecture; WSO2, Inc.; http://wso2.com, > Member; Apache Software Foundation; http://www.apache.org/ > > email: [email protected] cell: +94 77 3320919 > blog: http://blog.afkham.org > twitter: http://twitter.com/afkham_azeez > linked-in: http://lk.linkedin.com/in/afkhamazeez > > > Lean . Enterprise . Middleware > -- Noah Slater https://twitter.com/nslater
