Re: [VOTE] Release Apache Parquet Format 2.3.0 RC2

2015-02-19 Thread Chris Aniszczyk
+1 binding

verified sigs + tests pass

On Thu, Feb 19, 2015 at 12:36 PM, Ted Dunning  wrote:

> OK.  I will change my vote to +1.  The sha that Brock demonstrates is the
> same one that I saw when I was checking sums.
>
> I would *strongly* encourage the project to improve things before the next
> release by:
>
> - fix the sha hash to not be compressed.  This is unconventional and thus
> not a good thing.  Anything that discourages the checking of a secure
> signature is a bad thing
>
> - add either a COMPILING or INSTALL file or more info in the README that
> describes the obscure method for compilation
>
> - check in the generated code or at least include it in the source artifact
> so that thrift and protoc are not necessary for a simple compilation.  It
> is better that downstream users have an easy experience than that a hard
> experience is documented
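
The checksum check discussed above can be sketched in Python. This is only a minimal illustration: the thread says "sha" without naming the variant, so SHA-512 is an assumption, as is the `sha512sum`-style layout of the checksum text ("<hex>  <filename>"). The function names are hypothetical and not part of any Apache release tooling.

```python
import hashlib
import hmac


def file_digest(path, algorithm="sha512", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_text(text):
    """Extract the hex digest from the text of a plain checksum file.

    Assumption: the digest is the first whitespace-separated token.
    Other layouts would need different parsing.
    """
    return text.split()[0].lower()


def checksum_matches(artifact_path, checksum_text, algorithm="sha512"):
    """Compare the artifact's computed digest with the published one."""
    expected = parse_checksum_text(checksum_text)
    actual = file_digest(artifact_path, algorithm)
    return hmac.compare_digest(actual, expected)
```

A matching checksum only shows the download is intact. The detached signature still has to be verified separately against the KEYS file.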
>
>
>
>
>
> On Wed, Feb 18, 2015 at 10:58 AM, Brock Noland  wrote:
>
> > [x] +1 Release this as Apache Parquet Format 2.3.0
> >
> > verified sigs, hashes, no jars, and inspected tarball.
> >
> > On Mon, Feb 16, 2015 at 8:03 AM, Ryan Blue  wrote:
> > > Hi everyone,
> > >
> > > I propose RC2 to be released as official Apache Parquet 2.3.0 release.
> > >
> > > A similar vote has passed in the podling, with 4 +1 votes and 0 -1 or +0
> > > votes. (With one binding +1 from an IPMC member.)
> > >
> > > The commit id is 7a6079ed5ddfa98a59cf8ac8728bcf5b0a1233b9
> > > * This corresponds to the tag: apache-parquet-format-2.3.0-incubating
> > > * https://github.com/apache/incubator-parquet-format/tree/7a6079ed
> > > * https://git-wip-us.apache.org/repos/asf/incubator/repo?p=incubator-parquet-format.git&a=commit&h=7a6079ed5ddfa98a59cf8ac8728bcf5b0a1233b9
> > >
> > > The release tarball, signature, and checksums are here:
> > > * https://dist.apache.org/repos/dist/dev/incubator/parquet/apache-parquet-format-2.3.0-incubating-rc2
> > >
> > > You can find the KEYS file here:
> > > * https://dist.apache.org/repos/dist/dev/incubator/parquet/KEYS
> > >
> > > Binary artifacts are staged in Nexus here:
> > > * https://repository.apache.org/content/groups/staging/org/apache/parquet/parquet-format/
> > >
> > > Parquet Format 2.3.0 is functionally identical to the 2.2.0 release, but
> > > the classes have been moved to the org.apache.parquet package and the
> > > maven groupId is now org.apache.parquet.
> > >
> > > Please download, verify, and test. This vote will close on 19 Feb.
> > >
> > > [ ] +1 Release this as Apache Parquet Format 2.3.0
> > > [ ] +0
> > > [ ] -1 Do not release this because...
> > >
> > > --
> > > Ryan Blue
> > >
> > > -
> > > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > > For additional commands, e-mail: general-h...@incubator.apache.org
> > >
> >
>



-- 
Cheers,

Chris Aniszczyk
http://aniszczyk.org
+1 512 961 6719


Re: [DISCUSS] Commons RDF to join the Apache Incubator

2015-02-19 Thread Gary Gregory
On Sun, Feb 15, 2015 at 2:29 AM, Benedikt Ritter  wrote:

> Hi all,
>
> first of all, sorry for the delay. I've been on vacation the last 10 days with
> no access to my emails.
>

Same for me.

Gary


>
> 2015-02-10 21:31 GMT+01:00 Marvin Humphrey :
>
> > On Tue, Feb 10, 2015 at 7:21 AM, Stian Soiland-Reyes 
> > wrote:
> >
> > > The natural path to Apache Commons Sandbox has been studied, but we
> > > think that in this phase of the project, which focuses on the API
> > > design and actively involves the developers of existing toolkits, it
> > > is better to have a more focused community and infrastructure. Rather
> > > than a new Top-Level Project, the goal is still to graduate as part of
> > > > Apache Commons, that is, when the API has achieved the required maturity and
> > > the project goes into maintenance mode.
> >
> > If Commons is OK with this, I imagine this is a fine plan -- good enough
> > for entering incubation.
> >
>
> Short answer: The Apache Commons community is fine with this.
>
> Long answer: There has been some confusion (and misunderstanding?) about
> the way the Apache Commons project works. The Commons RDF community wanted
> to use either github or a separate mailing list for shaping out the initial
> API. The first in our opinion doesn't work since Apache projects have to
> use Apache infrastructure. The latter wasn't possible since we don't want
> to create sub communities inside commons [1]. This is a lesson learned from
> Jakarta (note that I was not around when Jakarta shut down, so
> I'm just writing down what I've learned from others). This eventually led
> to the suggestion to go through the incubator. [2]
>
> We would like to underline that we have no experience with the RDF
> specification. From a technical point of view we can help to develop the
> proposed API (according to our design guidelines [3]). But we need the
> people in the RDF space to review contributions from a semantic PoV. So
> this should not end up like developing the RDF library at the incubator and
> then handing it off for maintenance to the Commons community. I think all
> people involved here have pointed out that they are willing to work on the
> project even after its initial release. Note that we have recently
> granted write access to all ASF committers [4]. So if Commons RDF
> eventually moves to Apache Commons, anybody from the Jena/Sesame/Clerezza
> projects may join the development.
>
>
> >
> > I also think it would be OK for the project to decide it wants to become a
> > TLP.  Whether the project joins Commons or becomes its own TLP won't impact
> > the number of people qualified to work on it.  Some Apache TLPs are
> > effectively in maintenance mode and have very low activity, but still have
> > PMC members willing to answer user questions, make security releases and
> > file "still here" quarterly reports.  That seems like a legitimate
> > aspiration for this project.
> >
>
> In the case of Commons RDF going TLP we would like to ask the project to
> choose a different name to avoid confusion. But I think this has already
> been discussed in this thread.
>
> Regards,
> Benedikt
>
> [1] http://markmail.org/message/mnlh64qod7cuuj56
> [2] http://markmail.org/message/wl6hpkb4nhsroro5
> [3] http://commons.apache.org/releases/versioning.html
> [4] http://markmail.org/message/ylmw7qzx23br4ver
>
>
> >
> > A potential Jena destination also seems as though it would have certain
> > advantages, though my naive speculation is that it might be sub-optimal in
> > terms of providing neutral territory for negotiating a common API for Jena
> > and Sesame.
> >
> > In any case it seems likely that if the project achieves its design goal,
> > there will be people willing to work on it as long as both Jena and Sesame
> > remain viable.  That makes it different from other potential "maintenance
> > mode" TLPs which are in danger of stagnation because they cannot renew
> > their communities.
> >
> > Is that take roughly accurate, Sergio et al?
>
>
> > > === Mailing lists ===
> > >
> > >  * commons-rdf-dev
> > >  * commons-rdf-commits
> >
> > Those sound like final mailing lists rather than Incubator ones.  I might
> > have expected these instead:
> >
> > d...@commons-rdf.incubator.apache.org
> > comm...@commons-rdf.incubator.apache.org
> >
> > Do you expect to keep separate mailing lists after graduation, or will
> > traffic be shunted onto existing Commons mailing lists like
> > d...@commons.apache.org and comm...@commons.apache.org?
> >
> > >  * Sergio Fernández (wikier dot apache dot org)
> > >  * Andy Seaborne (andy dot apache dot org)
> > >  * Peter Ansell (ansell dot apache dot org)
> > >  * Stian Soiland-Reyes (stain at apache dot org)
> > >  * Reto Gmür (reto at apache dot org)
> >
> > Lots of Apache experience in this group.  Four are PMC members of at least
> > one Apache project.  Andy and Reto are ASF Members.  Andy and Sergio are
> > both IPMC
> >

[VOTE] Accept Apache AsterixDB in to the Incubator

2015-02-19 Thread Mattmann, Chris A (3980)
Hi Everyone,

OK, discussion has died down on this thread. I was originally
suggesting that the pTLP option may be best for this community,
but after some discussions with the existing community of
AsterixDB’ers proposing to bring the project here to the ASF,
AsterixDB would like to move forward independent of whatever
comes of the pTLP discussions.

That said, I would like to propose Apache AsterixDB as an
Incubator project. I am now calling a VOTE to accept AsterixDB
into the Apache Incubator. This VOTE will run for at least 72 hours.

[ ] +1 Accept Apache AsterixDB into the Incubator
[ ] +0 Don’t care.
[ ] -1 Don’t accept Apache AsterixDB into the Incubator because..

Thanks for the feedback so far and looking forward to the VOTE!

You can count my binding +1.

Cheers,
Chris

++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++






-Original Message-
From: , Chris Mattmann 
Date: Wednesday, January 14, 2015 at 6:20 PM
To: "general@incubator.apache.org" 
Cc: Michael Carey , Ian Maxon , Till
Westmann 
Subject: [PROPOSAL] Apache AsterixDB Incubator

>Hi Folks,
>
>I am pleased to bring forth the Apache AsterixDB proposal to the
>Apache Incubator as Champion, working in collaboration with the
>team. Please find the wiki proposal here:
>
>https://wiki.apache.org/incubator/AsterixDBProposal
>
>
>Full text of the proposal is below. Please discuss and enjoy. I’ll
>leave the discussion open for a week, and then look to call a VOTE
>hopefully end of next week if all is well.
>
>Cheers!
>Chris Mattmann
>
>=
>Apache AsterixDB Proposal
>
>Abstract
>
>Apache AsterixDB is a scalable big data management system (BDMS) that
>provides storage, management, and query capabilities for large
>collections of semi-structured data.
>
>Proposal
>
>AsterixDB is a big data management system (BDMS) that is
>well-suited to needs such as web data warehousing and social data
>storage and analysis. Feature-wise, AsterixDB has:
>
>* A NoSQL style data model (ADM) based on extending JSON with object
>  database concepts.
>* An expressive and declarative query language (AQL) for querying
>  semi-structured data.
>* A runtime query execution engine, Hyracks, for partitioned-parallel
>  execution of query plans.
>* Partitioned LSM-based data storage and indexing for efficient
>  ingestion of newly arriving data.
>* Support for querying and indexing external data (e.g., in HDFS) as
>  well as data stored within AsterixDB.
>* A rich set of primitive data types, including support for spatial,
>  temporal, and textual data.
>* Indexing options that include B+ trees, R trees, and inverted
>  keyword index support.
>* Basic transactional (concurrency and recovery) capabilities akin to
>  those of a NoSQL store.
>
>
>Background and Rationale
>
>In the world of relational databases, the need to tackle data volumes
>that exceed the capabilities of a single server led to the
>development of “shared-nothing” parallel database systems several
>decades ago. These systems spread data over a cluster based on a
>partitioning strategy, such as hash partitioning, and queries are
>processed by employing partitioned-parallel divide-and-conquer
>techniques. Since these systems are fronted by a high-level,
>declarative language (SQL), their users are shielded from the
>complexities of parallel programming. Parallel database systems have
>been an extremely successful application of parallel computing, and
>quite a number of commercial products exist today.
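
The hash partitioning and divide-and-conquer processing described above can be sketched in a few lines of Python. This is a toy illustration rather than how AsterixDB or any parallel database product implements it. The record layout, the choice of CRC32 as the hash, and all function names are assumptions made for the example.

```python
import zlib
from collections import defaultdict


def partition_for(key, num_partitions):
    """Map a key to a partition number using a stable hash (CRC32 here)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def hash_partition(records, key_field, num_partitions):
    """Spread records over partitions by hashing one field of each record."""
    partitions = defaultdict(list)
    for record in records:
        target = partition_for(record[key_field], num_partitions)
        partitions[target].append(record)
    return partitions


def parallel_count(partitions):
    """Divide and conquer: count rows per partition, then combine the results.

    In a real cluster each partial count would run on a different node;
    here the per-partition work simply runs in a loop.
    """
    partial_counts = [len(rows) for rows in partitions.values()]
    return sum(partial_counts)
```

Because all records with the same key land in the same partition, per-key work such as grouping can be done locally on each node before the partial results are combined.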
>
>In the distributed systems world, the Web brought a need to index and
>query its huge content. SQL and relational databases were not the
>answer, though shared-nothing clusters again emerged as the hardware
>platform of choice. Google developed the Google File System (GFS) and
>the MapReduce programming model to allow programmers to store and process
>Big Data by writing a few user-defined functions. The MapReduce
>framework applies these functions in parallel to data instances in
>distributed files (map) and to sorted groups of instances sharing a
>common key (reduce) -- not unlike the partitioned parallelism in
>parallel database systems. Apache's Hadoop MapReduce platform is the
>most prominent implementation of this paradigm for the rest of the
>Big Data community. On top of Hadoop and HDFS sit declarative
>languages like Pig and Hive that each compile down to Hadoop
>MapReduce jobs.
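
The map and reduce steps described above can be imitated on a single machine. The following Python sketch is not Hadoop's API. The word-count functions and the `run_mapreduce` driver are hypothetical names chosen for illustration, and a real framework would spread each phase across a cluster.

```python
from itertools import groupby
from operator import itemgetter


def map_words(line):
    """User-defined map function: emit (word, 1) for every word in a line."""
    for word in line.split():
        yield word.lower(), 1


def reduce_counts(word, counts):
    """User-defined reduce function: sum the values that share one key."""
    return word, sum(counts)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Apply map_fn to each input, sort and group by key, then apply reduce_fn."""
    # Map phase: every input item may emit any number of (key, value) pairs.
    pairs = [pair for item in inputs for pair in map_fn(item)]
    # Shuffle phase: sorting by key puts equal keys next to each other.
    pairs.sort(key=itemgetter(0))
    # Reduce phase: one reduce_fn call per group of pairs sharing a key.
    return [
        reduce_fn(key, [value for _, value in group])
        for key, group in groupby(pairs, key=itemgetter(0))
    ]
```

Calling `run_mapreduce(lines, map_words, reduce_counts)` on a list of text lines returns one `(word, count)` pair per distinct word, sorted by word.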
>
>The big Web companies were also cha