Re: [VOTE] Accept Optiq into the incubator

2014-05-19 Thread Ashutosh Chauhan
With 6 +1s, the vote passes. Thanks, everyone, for taking the time to vote.
The vote thread is now closed. I will proceed with the next steps now.

Thanks,
Ashutosh


On Mon, May 12, 2014 at 12:53 PM, Suresh Srinivas sur...@hortonworks.com wrote:

 +1 (binding)



Re: [VOTE] Accept Optiq into the incubator

2014-05-12 Thread Sharad Agarwal
+1 (non-binding)



Re: [VOTE] Accept Optiq into the incubator

2014-05-12 Thread Alan Cabrera
+1 - binding


Regards,
Alan

On May 9, 2014, at 11:03 AM, Ashutosh Chauhan hashut...@apache.org wrote:

 [ ] +1 Accept Optiq into the Incubator
 [ ] +0 Indifferent to the acceptance of Optiq
 [ ] -1 Do not accept Optiq because ...



Re: [VOTE] Accept Optiq into the incubator

2014-05-12 Thread Alan Gates
+1.

Alan.


Re: [VOTE] Accept Optiq into the incubator

2014-05-12 Thread Fabian Hueske
+1 (non-binding)


2014-05-12 17:11 GMT+02:00 Alan Gates ga...@hortonworks.com:

 +1.

 Alan.


Re: [VOTE] Accept Optiq into the incubator

2014-05-12 Thread Suresh Srinivas
+1 (binding)



Re: [VOTE] Accept Optiq into the incubator

2014-05-11 Thread Andrew Purtell
+1


On Sat, May 10, 2014 at 2:03 AM, Ashutosh Chauhan hashut...@apache.org wrote:

 Based on the results of the discussion thread (

 http://mail-archives.apache.org/mod_mbox/incubator-general/201404.mbox/%3CCA%2BFBdFQA4TghLRdh9GgDKaMtKLQHxE_QZV%3DoZ7HfiDSA_jyqwg%40mail.gmail.com%3E
  ),  I would like to call a vote on accepting Optiq into the incubator.

 [ ] +1 Accept Optiq into the Incubator
 [ ] +0 Indifferent to the acceptance of Optiq
 [ ] -1 Do not accept Optiq because ...

 The vote will be open until Tuesday May 13 18:00 UTC.

 https://wiki.apache.org/incubator/OptiqProposal

 = Optiq =
 == Abstract ==

 Optiq is a framework that allows efficient translation of queries involving
 heterogeneous and federated data.

 == Proposal ==

 Optiq is a highly customizable engine for parsing and planning queries on
 data in a wide variety of formats. It allows database-like access, and in
 particular a SQL interface and advanced query optimization, for data not
 residing in a traditional database.

 == Background ==

 Databases were traditionally engineered in a monolithic stack, providing a
 data storage format, data processing algorithms, query parser, query
 planner, built-in functions, metadata repository and connectivity layer.
 They innovate in some areas but rarely in all.

 Modern data management systems are decomposing that stack into separate
 components, separating data, processing engine, metadata, and query
 language support. They are highly heterogeneous, with data in multiple
 locations and formats, caching and redundant data, different workloads, and
 processing occurring in different engines.

 Query planning (sometimes called query optimization) has always been a key
 function of a DBMS, because it allows the implementors to introduce new
 query-processing algorithms, and allows data administrators to re-organize
 the data without affecting applications built on that data. In a
 componentized system, the query planner integrates the components (data
 formats, engines, algorithms) without introducing unnecessary coupling or
 performance tradeoffs.

 But building a query planner is hard; many systems muddle along without a
 planner, and indeed a SQL interface, until the demand from their customers
 is overwhelming.

 There is an opportunity to make this process more efficient by creating a
 re-usable framework.

 == Rationale ==

 Optiq allows database-like access, and in particular a SQL interface and
 advanced query optimization, for data not residing in a traditional
 database. It is complementary to many current Hadoop and NoSQL systems,
 which have innovative and performant storage and runtime systems but lack a
 SQL interface and intelligent query translation.

 Optiq is already in use by several projects, including Apache Drill, Apache
 Hive and Cascading Lingual, and commercial products.

 Optiq's architecture consists of:

  * An extensible relational algebra.
  * SPIs (service-provider interfaces) for metadata (schemas and tables),
 planner rules, statistics, cost-estimates, user-defined functions.
  * Built-in sets of rules for logical transformations and common
 data-sources.
  * Two query planning engines driven by rules, statistics, etc. One engine
 is cost-based, the other rule-based.
  * Optional SQL parser, validator and translator to relational algebra.
  * Optional JDBC driver.

 == Initial Goals ==

 The initial goals are to move the existing codebase to Apache and
 integrate with the Apache development process. Once this is accomplished,
 we plan for incremental development and releases that follow the Apache
 guidelines.

 As we move the code into the org.apache namespace, we will restructure
 components as necessary to allow clients to use just the components of
 Optiq that they need.

 A version 1.0 release, including pre-built binaries, will foster wider
 adoption.

 == Current Status ==

 Optiq has had over a dozen minor releases over the last 18 months. Its core
 SQL parser and validator, and its planning engine and core rules, are
 mature and robust and are the basis for several production systems; but
 other components and SPIs are still undergoing rapid evolution.

 === Meritocracy ===

 We plan to invest in supporting a meritocracy. We will discuss the
 requirements in an open forum. We encourage the companies and projects
 using Optiq to discuss their requirements in an open forum and to
 participate in development. We will encourage and monitor community
 participation so that privileges can be extended to those that contribute.

 Optiq's pluggable architecture encourages developers to contribute
 extensions such as adapters for data sources, new planning rules, and
 better statistics and cost-estimation functions. We look forward to
 fostering a rich ecosystem of extensions.

 === Community ===

 Building a data management system requires a high degree of technical
 skill, and correspondingly, the community of developers directly using
 Optiq is 

[VOTE] Accept Optiq into the incubator

2014-05-10 Thread Ashutosh Chauhan
Based on the results of the discussion thread (
http://mail-archives.apache.org/mod_mbox/incubator-general/201404.mbox/%3CCA%2BFBdFQA4TghLRdh9GgDKaMtKLQHxE_QZV%3DoZ7HfiDSA_jyqwg%40mail.gmail.com%3E
 ),  I would like to call a vote on accepting Optiq into the incubator.

[ ] +1 Accept Optiq into the Incubator
[ ] +0 Indifferent to the acceptance of Optiq
[ ] -1 Do not accept Optiq because ...

The vote will be open until Tuesday May 13 18:00 UTC.

https://wiki.apache.org/incubator/OptiqProposal

= Optiq =
== Abstract ==

Optiq is a framework that allows efficient translation of queries involving
heterogeneous and federated data.

== Proposal ==

Optiq is a highly customizable engine for parsing and planning queries on
data in a wide variety of formats. It allows database-like access, and in
particular a SQL interface and advanced query optimization, for data not
residing in a traditional database.

== Background ==

Databases were traditionally engineered in a monolithic stack, providing a
data storage format, data processing algorithms, query parser, query
planner, built-in functions, metadata repository and connectivity layer.
They innovate in some areas but rarely in all.

Modern data management systems are decomposing that stack into separate
components, separating data, processing engine, metadata, and query
language support. They are highly heterogeneous, with data in multiple
locations and formats, caching and redundant data, different workloads, and
processing occurring in different engines.

Query planning (sometimes called query optimization) has always been a key
function of a DBMS, because it allows the implementors to introduce new
query-processing algorithms, and allows data administrators to re-organize
the data without affecting applications built on that data. In a
componentized system, the query planner integrates the components (data
formats, engines, algorithms) without introducing unnecessary coupling or
performance tradeoffs.

But building a query planner is hard; many systems muddle along without a
planner, and indeed without a SQL interface, until the demand from their customers
is overwhelming.

There is an opportunity to make this process more efficient by creating a
re-usable framework.

== Rationale ==

Optiq allows database-like access, and in particular a SQL interface and
advanced query optimization, for data not residing in a traditional
database. It is complementary to many current Hadoop and NoSQL systems,
which have innovative and performant storage and runtime systems but lack a
SQL interface and intelligent query translation.

Optiq is already in use by several projects, including Apache Drill, Apache
Hive and Cascading Lingual, and commercial products.

Optiq's architecture consists of:

 * An extensible relational algebra.
 * SPIs (service-provider interfaces) for metadata (schemas and tables),
planner rules, statistics, cost-estimates, user-defined functions.
 * Built-in sets of rules for logical transformations and common
data-sources.
 * Two query planning engines driven by rules, statistics, etc. One engine
is cost-based, the other rule-based.
 * Optional SQL parser, validator and translator to relational algebra.
 * Optional JDBC driver.

== Initial Goals ==

The initial goals are to move the existing codebase to Apache and
integrate with the Apache development process. Once this is accomplished,
we plan for incremental development and releases that follow the Apache
guidelines.

As we move the code into the org.apache namespace, we will restructure
components as necessary to allow clients to use just the components of
Optiq that they need.

A version 1.0 release, including pre-built binaries, will foster wider
adoption.

== Current Status ==

Optiq has had over a dozen minor releases over the last 18 months. Its core
SQL parser and validator, and its planning engine and core rules, are
mature and robust and are the basis for several production systems; but
other components and SPIs are still undergoing rapid evolution.

=== Meritocracy ===

We plan to invest in supporting a meritocracy. We will discuss the
requirements in an open forum. We encourage the companies and projects
using Optiq to discuss their requirements in an open forum and to
participate in development. We will encourage and monitor community
participation so that privileges can be extended to those that contribute.

Optiq's pluggable architecture encourages developers to contribute
extensions such as adapters for data sources, new planning rules, and
better statistics and cost-estimation functions. We look forward to
fostering a rich ecosystem of extensions.

=== Community ===

Building a data management system requires a high degree of technical
skill, and correspondingly, the community of developers directly using
Optiq is potentially fairly small, albeit highly technical and engaged. But
we also expect engagement from members of the communities of projects that
use Optiq, such as Drill and Hive. 

Re: [VOTE] Accept Optiq into the incubator

2014-05-10 Thread Ted Dunning
On Fri, May 9, 2014 at 8:03 PM, Ashutosh Chauhan hashut...@apache.org wrote:

 [X] +1 Accept Optiq into the Incubator
 [ ] +0 Indifferent to the acceptance of Stratosphere
 [ ] -1 Do not accept Optiq because ...


The appearance of Stratosphere is a clear typo.

I still vote to accept.


Re: [VOTE] Accept Optiq into the incubator

2014-05-10 Thread Chris Douglas
+1 -C

On Fri, May 9, 2014 at 11:03 AM, Ashutosh Chauhan hashut...@apache.org wrote:
 Based on the results of the discussion thread (
 http://mail-archives.apache.org/mod_mbox/incubator-general/201404.mbox/%3CCA%2BFBdFQA4TghLRdh9GgDKaMtKLQHxE_QZV%3DoZ7HfiDSA_jyqwg%40mail.gmail.com%3E
  ),  I would like to call a vote on accepting Optiq into the incubator.

 [ ] +1 Accept Optiq into the Incubator
 [ ] +0 Indifferent to the acceptance of Stratosphere
 [ ] -1 Do not accept Optiq because ...

 The vote will be open until Tuesday May 13 18:00 UTC.

 https://wiki.apache.org/incubator/OptiqProposal

 = Optiq =
 == Abstract ==

 Optiq is a framework that allows efficient translation of queries involving
 heterogeneous and federated data.

 == Proposal ==

 Optiq is a highly customizable engine for parsing and planning queries on
 data in a wide variety of formats. It allows database-like access, and in
 particular a SQL interface and advanced query optimization, for data not
 residing in a traditional database.

 == Background ==

 Databases were traditionally engineered in a monolithic stack, providing a
 data storage format, data processing algorithms, query parser, query
 planner, built-in functions, metadata repository and connectivity layer.
 They innovate in some areas but rarely in all.

 Modern data management systems are decomposing that stack into separate
 components, separating data, processing engine, metadata, and query
 language support. They are highly heterogeneous, with data in multiple
 locations and formats, caching and redundant data, different workloads, and
 processing occurring in different engines.

 Query planning (sometimes called query optimization) has always been a key
 function of a DBMS, because it allows the implementors to introduce new
 query-processing algorithms, and allows data administrators to re-organize
 the data without affecting applications built on that data. In a
 componentized system, the query planner integrates the components (data
 formats, engines, algorithms) without introducing unnecessary coupling or
 performance tradeoffs.

 But building a query planner is hard; many systems muddle along without a
 planner, and indeed without a SQL interface, until the demand from their customers
 is overwhelming.

 There is an opportunity to make this process more efficient by creating a
 re-usable framework.

 == Rationale ==

 Optiq allows database-like access, and in particular a SQL interface and
 advanced query optimization, for data not residing in a traditional
 database. It is complementary to many current Hadoop and NoSQL systems,
 which have innovative and performant storage and runtime systems but lack a
 SQL interface and intelligent query translation.

 Optiq is already in use by several projects, including Apache Drill, Apache
 Hive and Cascading Lingual, and commercial products.

 Optiq's architecture consists of:

  * An extensible relational algebra.
  * SPIs (service-provider interfaces) for metadata (schemas and tables),
 planner rules, statistics, cost-estimates, user-defined functions.
  * Built-in sets of rules for logical transformations and common
 data-sources.
  * Two query planning engines driven by rules, statistics, etc. One engine
 is cost-based, the other rule-based.
  * Optional SQL parser, validator and translator to relational algebra.
  * Optional JDBC driver.

 == Initial Goals ==

 The initial goals are to move the existing codebase to Apache and
 integrate with the Apache development process. Once this is accomplished,
 we plan for incremental development and releases that follow the Apache
 guidelines.

 As we move the code into the org.apache namespace, we will restructure
 components as necessary to allow clients to use just the components of
 Optiq that they need.

 A version 1.0 release, including pre-built binaries, will foster wider
 adoption.

 == Current Status ==

 Optiq has had over a dozen minor releases over the last 18 months. Its core
 SQL parser and validator, and its planning engine and core rules, are
 mature and robust and are the basis for several production systems; but
 other components and SPIs are still undergoing rapid evolution.

 === Meritocracy ===

 We plan to invest in supporting a meritocracy. We will discuss the
 requirements in an open forum. We encourage the companies and projects
 using Optiq to discuss their requirements in an open forum and to
 participate in development. We will encourage and monitor community
 participation so that privileges can be extended to those that contribute.

 Optiq's pluggable architecture encourages developers to contribute
 extensions such as adapters for data sources, new planning rules, and
 better statistics and cost-estimation functions. We look forward to
 fostering a rich ecosystem of extensions.

 === Community ===

 Building a data management system requires a high degree of technical
 skill, and correspondingly, the community of developers directly using
 Optiq is