Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-08 Thread Seetharam Venkatesh
Hi Jake,

Sorry that I missed your comment, and for the delay in my response. Thanks for
the heads-up; I will take this up in the podling name search JIRA.

Thanks!

On Sun, May 3, 2015 at 6:52 PM, Jake Farrell  wrote:

> Sorry I missed the discussion thread for this proposed podling. The name
> for this project may have an issue with Netflix Atlas [1] when it comes
> time to graduate. If the podling is voted in, it may be worth discussing a
> name change before any infra resources are set up.
>
> -Jake
>
> [1]:
> http://techblog.netflix.com/2014/12/introducing-atlas-netflixs-primary.html
> [2]: https://github.com/netflix/atlas
>
>
> On Fri, May 1, 2015 at 3:26 AM, Seetharam Venkatesh <
> venkat...@innerzeal.com
> > wrote:
>
> > Hello folks,
> >
> > Following the discussion earlier in the thread: http://s.apache.org/r2
> >
> > I would like to call a VOTE for accepting Apache Atlas as a new incubator
> > project.
> >
> > The proposal is available at:
> > https://wiki.apache.org/incubator/AtlasProposal
> > Also, the text of the latest wiki proposal is included at the bottom of
> > this email.
> >
> > The VOTE is open for at least the next 72 hours:
> >
> >  [ ] +1 accept Apache Atlas into the Apache Incubator
> >  [ ] ±0 Abstain
> >  [ ] -1 because...
> >
> > Of course I am +1! (non-binding)
> >
> > Thanks!
> >
> >
> > = Apache Atlas Proposal =
> >
> > == Abstract ==
> >
> > Apache Atlas is a scalable and extensible set of core foundational
> > governance services that enables enterprises to effectively and efficiently
> > meet their compliance requirements within Hadoop and allows integration
> > with the complete enterprise data ecosystem.
> >
> > == Proposal ==
> >
> > Apache Atlas allows agnostic governance visibility into Hadoop. These
> > abilities are enabled through a set of core foundational services powered
> > by a flexible metadata repository.
> >
> > These services include:
> >
> >  * Search and Lineage for datasets
> >  * Metadata driven data access control
> >  * Indexed and Searchable Centralized Auditing operational Events
> >  * Data lifecycle management – ingestion to disposition
> >  * Metadata interchange with other metadata tools
> >
> > == Background ==
> >
> > Hadoop is one of many platforms in the modern enterprise data ecosystem and
> > requires governance controls commensurate with this reality.
> >
> > Currently, there is no easy or complete way to provide comprehensive
> > visibility and control into Hadoop audit, lineage, and security for
> > workflows that require Hadoop and non-Hadoop processing.
> >
> > Many solutions are usually point based, and require a monolithic
> > application workflow.  Multi-tenancy and concurrency are problematic as
> > these offerings are not aware of activity outside of their narrow focus.
> >
> > As Hadoop gains greater popularity, governance concerns will become
> > increasingly vital to increasing maturity and furthering adoption. It is a
> > particular barrier to expanding enterprise data under management.
> >
> > == Rationale ==
> >
> > Atlas will address issues previously discussed by providing governance
> > capabilities in Hadoop -- using both a prescriptive and forensic model
> > enriched by business taxonomical metadata. Atlas, at its core, is
> > designed to exchange metadata with other tools and processes within and
> > outside of the Hadoop stack -- enabling governance controls that are truly
> > platform agnostic and effectively (and defensibly) address compliance
> > concerns.
> >
> > Initially working with a group of leading partners in several industries,
> > Atlas is built to solve specific real world governance problems that
> > accelerate product maturity and time to value.
> >
> > Atlas aims to grow a community to help build a widely adopted pattern for
> > governance, metadata modeling and exchange in Hadoop – which will advance
> > the interests for the whole community.
> >
> > == Current Status ==
> >
> > An initial version with a valuable set of features has been developed by
> > the list of initial committers and is hosted on GitHub.
> >
> > === Meritocracy ===
> >
> > Our intent with this proposal is to start building a diverse developer
> > community around Atlas following the Apache meritocracy model. We have
> > wanted to make the project open source and encourage contributors from
> > multiple organizations from the start.
> >
> > We plan to provide plenty of support to new developers and to quickly
> > recruit those who make solid contributions to committer status.
> >
> > === Community ===
> >
> > We are happy to report that the initial team already represents multiple
> > organizations. We hope to extend the user and developer base further in the
> > future and build a solid open source community around Atlas.
> >
> > === Core Developers ===
> >
> > Atlas development is currently being led by engineers from Hortonworks –
> > Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the
> > engineers have deep expertise i

Re: [DISCUSS] Mysos Incubation proposal

2015-05-08 Thread jan i
On Friday, May 8, 2015, Jiang Yan Xu  wrote:

> Maybe in as many places as possible we could spell out the portmanteau as
> Mysos: MySQL on Mesos.

+1 count me in as being confused, but this suggestion helps.

rgds
jan i

>
> Yan
> ---
> Jiang Yan Xu | @xujyan <https://twitter.com/xujyan>
>
> On Thu, May 7, 2015 at 5:47 PM, Ted Dunning wrote:
>
> > No.  I mean Mysos / Mesos phonetic and typographic similarity.
> >
> > It took 5-10 lines of reading for me to understand that Mysos was not a
> > typo.  Until then, I was just confused.  And I had heard of this effort
> > before.
> >
> > It is just a data point.  Not any kind of strong comment.
> >
> >
> >
> > On Fri, May 8, 2015 at 1:39 AM, Henry Saputra wrote:
> >
> > > Do you mean about it being an "Apache Mesos framework"?
> > >
> > > In the Mesos world, a framework is like an application built to run on
> > > top of Mesos.
> > >
> > > We should probably change the definition to be clearer to those
> > > outside of the Mesos community.
> > >
> > > - Henry
> > >
> > > On Thu, May 7, 2015 at 4:01 PM, Ted Dunning wrote:
> > > > On Thu, May 7, 2015 at 7:47 PM, Dave Lester wrote:
> > > >
> > > >> Mysos is an Apache Mesos framework
> > > >
> > > >
> > > > That is a very confusing name (to me) at least.
> > >
> > > -
> > > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > > For additional commands, e-mail: general-h...@incubator.apache.org
> > >
> > >
> >
>


-- 
Sent from My iPad, sorry for any misspellings.


[DISCUSS] Trafodion Incubation Proposal

2015-05-08 Thread Stack
I would like to start up a discussion on Trafodion joining the ASF as an
incubating project.

Trafodion is a webscale SQL-on-Hadoop solution that enables transactional
or operational workloads on Hadoop.

The proposal is available on the wiki here:
https://wiki.apache.org/incubator/TrafodionProposal#preview

The proposal text is also attached to the end of this email.

Trafodion is a rich, storied SQL engine that has recently been ported to
run on HBase and Hadoop. I think it would make for a fine addition to the
Apache family of projects. It would be good to hear what others think.

Thank you in advance for giving the proposal a read.

Yours,
St.Ack


Trafodion Apache Incubator Proposal

Abstract

Trafodion is a webscale SQL-on-Hadoop solution enabling transactional or
operational workloads on Hadoop.

Proposal

Apache Trafodion builds on the scalability, elasticity, and flexibility of
Hadoop. Trafodion extends Hadoop to provide guaranteed transactional
integrity, enabling new kinds of big data applications to run on Hadoop. Key
features of Apache Trafodion include:

* Full-functioned ANSI SQL language support
* JDBC/ODBC connectivity for Linux/Windows clients
* Distributed ACID transaction protection across multiple statements,
tables and rows
* Performance improvements for OLTP workloads with compile-time and
run-time optimizations
* Support for large data sets using a parallel-aware query optimizer
* ANSI SQL security and data integrity constraints including referential
integrity

Hewlett-Packard Company submits this proposal to donate its Apache License,
Version 2.0 open source project known as Trafodion, its source code,
documentation, and web site content to the Apache Software Foundation in
order to build an open source community.

Background

Trafodion is an open source project sponsored by HP, incubated at HP Labs
and HP-IT, to develop an enterprise-class SQL-on-Hadoop solution targeting
big data transactional or operational workloads. HP publicly announced
the open source project and uploaded the source code to GitHub in June 2014.

The SQL compiler, optimizer and executor components of Trafodion have a
rich heritage. Under development since 1993, they were released as
commercial closed source software in various flavors such as HP NonStop
SQL/MX and HP Neoview. NonStop SQL/MX was designed for online transaction
processing on HP’s NonStop (formerly Tandem) fault-tolerant servers and is
known for its high availability, scalability, and performance. Hundreds of
companies and thousands of servers are running mission-critical
applications today on NonStop SQL/MX. In addition, many of these components
are running today inside HP as the core of its Enterprise Data
Warehouse (EDW), managing over a PB of data.

Starting in 2013, the software was modified to run on HBase and a new
distributed transaction manager was written to run as an HBase co-processor.

Unlike most NoSQL and other SQL-on-Hadoop open source projects, Trafodion
provides comprehensive ANSI SQL language support including full-functioned
data definition (DDL), data manipulation (DML), transaction control (TCL)
and database utility support.

Trafodion provides comprehensive and standard SQL data manipulation support
including SELECT, INSERT, UPDATE, DELETE, and UPSERT/MERGE syntax with
language options including join variants, unions, where predicates,
aggregations (group by and having), sort ordering, sampling, correlated and
nested sub-queries, cursors, and many SQL functions.

Utilities are provided for updating the table statistics (i.e.
selectivity/cardinality estimates) that the optimizer uses to cost plan
alternatives, for displaying the chosen SQL execution plan, for plan shaping,
for backing up and restoring the database, and for loading and unloading
data. A command line utility is also provided for interfacing with the
database engine.

Explicit control statements are provided to allow applications to define
transaction boundaries and to abort transactions when warranted, including
BEGIN WORK, COMMIT WORK, ROLLBACK WORK and SET TRANSACTION.
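
As a rough illustration of the transaction boundaries described above, here
is a minimal Python sketch. It uses the standard-library sqlite3 module purely
as a stand-in engine (an assumption for illustration, not Trafodion itself);
SQLite spells the statements BEGIN, COMMIT and ROLLBACK rather than
BEGIN WORK, COMMIT WORK and ROLLBACK WORK. The accounts table and the
transfer helper are hypothetical names.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` between two rows atomically; undo everything on failure."""
    cur = conn.cursor()
    cur.execute("BEGIN")  # explicit transaction start (Trafodion: BEGIN WORK)
    try:
        cur.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        cur.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        (lowest,) = cur.execute("SELECT MIN(balance) FROM accounts").fetchone()
        if lowest < 0:
            raise ValueError("insufficient funds")
        cur.execute("COMMIT")  # make both updates visible together
        return True
    except Exception:
        cur.execute("ROLLBACK")  # abort: neither update survives
        return False


def demo():
    # isolation_level=None stops the sqlite3 module from opening transactions
    # implicitly, so the boundaries are exactly the ones issued in transfer().
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    ok = transfer(conn, 1, 2, 30)       # fits within the balance, so it commits
    failed = transfer(conn, 1, 2, 500)  # would overdraw, so it is rolled back
    balances = dict(conn.execute("SELECT id, balance FROM accounts"))
    conn.close()
    return ok, failed, balances
```

Calling demo() runs one transfer that commits and one that is rolled back, so
the second transfer leaves the balances exactly as the first one left them.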

Trafodion supports ANSI’s grant/revoke semantics to define user and role
privileges in terms of managing and accessing the database objects.

Rationale

The name “Trafodion” (the Welsh word for transactions, pronounced
“Tra-vod-eee-on”) was chosen specifically to emphasize the differentiation
that Trafodion provides in closing a critical gap in the Hadoop ecosystem.
Trafodion builds on the scalability, elasticity, and flexibility of Hadoop.
Trafodion extends Hadoop to provide guaranteed transactional integrity,
enabling new kinds of big data applications to run on Hadoop.

Current Status

HP released the Trafodion code under the Apache License, Version 2, in June
of 2014. Since that time, we have had one major release in January 2015 and
one minor release in April 2015. The focus of these releases has been in
getting our base functionality, including security, working on top of
Apache HBase, as well as improving perfo

Re: [DISCUSS] Mysos Incubation proposal

2015-05-08 Thread Jiang Yan Xu
Maybe in as many places as possible we could spell out the portmanteau as
Mysos: MySQL on Mesos.

Yan
---
Jiang Yan Xu  | @xujyan 

On Thu, May 7, 2015 at 5:47 PM, Ted Dunning  wrote:

> No.  I mean Mysos / Mesos phonetic and typographic similarity.
>
> It took 5-10 lines of reading for me to understand that Mysos was not a
> typo.  Until then, I was just confused.  And I had heard of this effort
> before.
>
> It is just a data point.  Not any kind of strong comment.
>
>
>
> On Fri, May 8, 2015 at 1:39 AM, Henry Saputra wrote:
>
> > Do you mean about it being an "Apache Mesos framework"?
> >
> > In the Mesos world, a framework is like an application built to run on
> > top of Mesos.
> >
> > We should probably change the definition to be clearer to those
> > outside of the Mesos community.
> >
> > - Henry
> >
> > On Thu, May 7, 2015 at 4:01 PM, Ted Dunning wrote:
> > > On Thu, May 7, 2015 at 7:47 PM, Dave Lester wrote:
> > >
> > >> Mysos is an Apache Mesos framework
> > >
> > >
> > > That is a very confusing name (to me) at least.
> >
> >
> >
>


[RESULT] [VOTE] Climate Model Diagnostic Analyzer

2015-05-08 Thread Mattmann, Chris A (3980)
Hi Everyone,

This VOTE has passed with the following tallies:

+1

Chris Mattmann*
Lei Pan
Louis Suárez-Potts
Jan Iversen*
Kim Whitehall*
Ted Dunning*
John D. Ament*
Jake Farrell*
Alan Cabrera*
Suresh Marru*
Michael Joyce*
Henry Saputra*
Roman Shaposhnik*
Greg Reddin*
Jia Zhang

* indicates IPMC member

I will now get going on bootstrapping the podling. Congrats
everyone! Welcome to the Incubator :)

Cheers,
Chris

++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++






-Original Message-
From: Chris Mattmann
Reply-To: "general@incubator.apache.org"
Date: Saturday, April 18, 2015 at 7:00 PM
To: "general@incubator.apache.org"
Cc: "Pan, Lei (398K)", "Lee, Seungwon (398K)", "Zhai, Chengxing (398K)",
"Tang, Benyang (398J)", "jia.zh...@west.cmu.edu"
Subject: [VOTE] Climate Model Diagnostic Analyzer

>OK all, discussion has died down, we have 3 mentors, I think it’s
>time to proceed to a VOTE.
>
>I am calling a VOTE now to accept the Climate Model Diagnostic
>Analyzer (CMDA) into the Apache Incubator. The VOTE is open for
>at least the next 72 hours:
>
>[ ] +1 Accept Apache Climate Model Diagnostic Analyzer into the Apache
>Incubator.
>[ ] +0 Abstain.
>[ ] -1 Don’t accept Apache Climate Model Diagnostic Analyzer into the
>Apache Incubator
>because…
>
>I’ll try and close the VOTE out on Friday.
>
>Of course I am +1!
>
>P.S. the text of the latest wiki proposal is pasted below:
>
>Cheers,
>Chris
>
>
>= Apache ClimateModelDiagnosticAnalyzer Proposal =
>
>== Abstract ==
>
>The Climate Model Diagnostic Analyzer (CMDA) provides web services for
>multi-aspect physics-based and phenomenon-oriented climate model
>performance evaluation and diagnosis through the comprehensive and
>synergistic use of multiple observational data, reanalysis data, and model
>outputs.
>
>== Proposal ==
>
>The proposed web-based tools let users display, analyze, and download
>earth science data interactively. These tools help scientists quickly
>examine data to identify specific features, e.g., trends, geographical
>distributions, etc., and determine whether a further study is needed. All
>of the tools are designed and implemented to be general so that data from
>models, observation, and reanalysis are processed and displayed in a
>unified way to facilitate fair comparisons. The services prepare and
>display data as a colored map or an X-Y plot and allow users to download
>the analyzed data. Basic visual capabilities include 1) displaying a
>two-dimensional variable as a map, zonal mean, and time series, and 2)
>displaying a three-dimensional variable’s zonal mean, a two-dimensional
>slice at a specific altitude, and a vertical profile. General analysis can
>be done using the difference, scatter plot, and conditional sampling
>services. All the tools support display options for using linear or
>logarithmic scales and allow users to specify a temporal range and months
>in a year. The source/input datasets for these tools are CMIP5 model
>outputs, Obs4MIP observational datasets, and ECMWF reanalysis datasets.
>They are stored on the server and are selectable by a user through the web
>services.
>
>=== Service descriptions ===
>
>1. '''Two dimensional variable services'''
>
>* Map of two-dimensional variable:  This service displays a
>two-dimensional variable as a colored longitude and latitude map with values
>represented by a color scheme. Longitude and latitude ranges can be
>specified to magnify a specific region.
>
>* Two dimensional variable zonal mean:  This service plots the zonal mean
>value of a two-dimensional variable as a function of the latitude in terms
>of an X-Y plot.
>
>* Two dimensional variable time series:  This service displays the average
>of a two-dimensional variable over the specified region as a function of time
>as an X-Y plot.
>
>2. '''Three dimensional variable services'''
>
>* Map of a two dimensional slice of a three-dimensional variable:  This
>service displays a two-dimensional slice of a three-dimensional variable
>at a specific altitude as a colored longitude and latitude map with values
>represented by a color scheme.
>
>* Three dimensional zonal mean:  Zonal mean of the specified
>three-dimensional variable is computed and displayed as a colored
>altitude-latitude map.
>
>* Vertical profile of a three-dimensional variable:  Compute the area
>weighted average of a three-dimensional variable over the specified region
>and display the average as a function of pressure level (altitude) as an X-Y