Re: [RESULT] Vote on Impala 2.10.0 release candidate 2

2017-09-13 Thread Tim Armstrong
Thanks for your hard work on this Bharath.

On Tue, Sep 12, 2017 at 11:14 PM, Bharath Vissapragada <
bhara...@cloudera.com> wrote:

> The vote has passed with the following tally.
>
> +1 (binding)
>
> - Brock Noland
> - Carl Steinbach
> - John D. Ament
>
> -1 (binding) - None
> 0 - None
>
> Thanks everyone for testing and voting on the release.
>


Re: Looking for Champion

2018-06-08 Thread Tim Armstrong
> Meanwhile we found Impala is a very good MPP SQL query engine, so we
> integrated them together.

Palo didn't integrate with Impala; it forked Impala's codebase and embedded
it in its own repository. I don't remember any attempts from the Palo team
to engage with the Impala community or attempt to work with us to
contribute any improvements.

It looks like Palo is still pulling in new code from Impala.  E.g. this
commit includes a bunch of code I wrote as part of IMPALA-3200:
https://github.com/baidu/palo/commit/2419384e8a211f10e7636afc6d3423700ba22b5a#diff-1c501d9a8b5c3d1d1cce48d5e1fb0edf

The code isn't owned by any individual; I contributed it to Apache and it's
free for anyone to do what they want with it. But pulling in
improvements from other projects without any attempt to attribute them or
contribute improvements back seems contrary to the Apache way.

Anyway, maybe incubation is an opportunity for us to work together, but I'd
hope that if Palo does go into incubation that it will rethink some of the
practices it's been following.

On Fri, Jun 8, 2018 at 9:12 AM, Todd Lipcon wrote:

> On Thu, Jun 7, 2018 at 11:55 PM, Li,De(BDG) wrote:
>
> > Hi, Jim
> >
> > Thank you for your response.
> > Actually, we started Palo several years ago, and at that time we developed
> > the storage engine based on Mesa technology.
> > Meanwhile we found Impala is a very good MPP SQL query engine, so we
> > integrated them together.
> >
>
> From what I can tell of the Palo source, it's not so much an integration as
> a copied-and-modified codebase, right? i.e., Palo does not use Impala as a
> dependency, but rather shares a lot of code from the Impala project that
> has since diverged.
>
>
> >
> > With this integration, the goal of Palo is to implement a single,
> > full-featured, MySQL-protocol-compatible data warehouse.
> >
>
> That sounds pretty similar to the goals of the Impala project. Impala isn't
> MySQL-compatible at the moment but that seems more like a particular
> feature that could be added rather than a distinct identity of the project.
> Otherwise, Impala's goal is to be a full featured data warehouse engine as
> well.
>
> Generally Apache has no rules against multiple projects fulfilling similar
> goals or use cases, even when those projects might compete. However I think
> it would be relatively unusual to incubate a project that appears to be
> derived from a fork of an existing project, at least without first
> considering whether the additional feature set could be contributed back to
> the existing community.
>
> -Todd
>
>
> > On 2018/6/8, 1:55 PM, "Jim Apple" wrote:
> >
> > >Hello! As a contributor to Impala, I’d be interested in hearing thoughts
> > >from the Palo community about integration between Impala and Palo.
> > >
> > >For instance, are there any apparent design goals of Impala that the
> > >Palo community thinks are fundamentally incompatible with Palo?
> > >
> > >Thanks,
> > >Jim
> > >
> > >On 2018/06/08 04:45:32, "Li,De(BDG)" wrote:
> > >> Hi all,
> > >>
> > >> I am Reed, a developer working with the team on Palo (an MPP-based
> > >>interactive SQL data warehouse).
> > >> https://github.com/baidu/palo/wiki/Palo-Overview
> > >>
> > >> We propose to contribute Palo as an Apache Incubator project, and
> > >> we are still looking for a possible Champion, if anyone would like to
> > >>volunteer. Thanks a lot.
> > >>
> > >> Best Regards,
> > >> Reed
> > >>
> > >> ===
> > >> The draft of the proposal is below:
> > >>
> > >> #Apache Palo
> > >>
> > >> ##Abstract
> > >>
> > >> Palo is an MPP-based interactive SQL data warehouse for reporting and
> > >>analysis.
> > >>
> > >> ##Proposal
> > >>
> > >> We propose to contribute the Palo codebase and associated artifacts
> > >>(e.g. documentation, web-site content etc.) to the Apache Software
> > >>Foundation with the intent of forming a productive, meritocratic and
> > >>open community around Palo’s continued development, according to the
> > >>‘Apache Way’.
> > >>
> > >> Baidu owns several trademarks regarding Palo, and proposes to transfer
> > >>ownership of those trademarks in full to the ASF.
> > >>
> > >> ###Overview of Palo
> > >>
> > >> Palo’s implementation consists of two daemons: Frontend (FE) and
> > >>Backend (BE).
> > >>
> > >> **Frontend daemon** consists of a query coordinator and a catalog manager.
> > >>The query coordinator is responsible for receiving users’ SQL queries,
> > >>compiling queries, and managing query execution. The catalog manager is
> > >>responsible for managing metadata such as databases, tables, partitions,
> > >>and replicas. Several frontend daemons can be deployed to
> > >>guarantee fault tolerance and load balancing.
> > >>
> > >> **Backend daemon** stores the data and executes the query fragments.
> > >>Many backend daemons can also be deployed to provide scalability and
> > >>fault tolerance.
> > >>
> > >> A typical Palo cluster generally consists of several frontend daemons
> > >>and