Yes, I understand your concerns. I will start with the design, and I will try
to change as little of the Impala core code as possible, extending it with new
code instead. In short, I will first write up a brief design so that the
community can review it and give me some advice. Thanks again.
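
To make the rewrite idea from step 3 of my original mail (quoted below) a bit more concrete, here is a minimal sketch of it. It is only an illustration under my own assumptions: the classes are hypothetical stand-ins written in Python, not Impala's real planner classes, and the actual design may look quite different.

```python
from dataclasses import dataclass, field
from typing import List


# All classes here are hypothetical stand-ins for illustration only;
# they do not correspond to Impala's real planner or ExecNode classes.
@dataclass
class PlanNode:
    children: List["PlanNode"] = field(default_factory=list)


@dataclass
class ESScanNode(PlanNode):
    index: str = ""
    filters: List[str] = field(default_factory=list)


@dataclass
class Aggregation(PlanNode):
    group_by: List[str] = field(default_factory=list)
    agg_exprs: List[str] = field(default_factory=list)


@dataclass
class ESAggregation(PlanNode):
    index: str = ""
    filters: List[str] = field(default_factory=list)
    group_by: List[str] = field(default_factory=list)
    agg_exprs: List[str] = field(default_factory=list)


def push_down_aggregation(node: PlanNode) -> PlanNode:
    """Replace each Aggregation(parent) -> ESScanNode(child) pair with one ESAggregation."""
    # Rewrite the children first so the pattern is matched bottom-up.
    node.children = [push_down_aggregation(c) for c in node.children]
    if (
        isinstance(node, Aggregation)
        and len(node.children) == 1
        and isinstance(node.children[0], ESScanNode)
    ):
        scan = node.children[0]
        # Fold the scan's index and filters into the combined node.
        return ESAggregation(
            index=scan.index,
            filters=list(scan.filters),
            group_by=list(node.group_by),
            agg_exprs=list(node.agg_exprs),
        )
    # Any other shape is left unchanged.
    return node
```

A plan that does not match the pattern (for example an ESScanNode with no Aggregation above it) is returned as it is, so the rewrite is purely an optimization on top of the plain scan.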

2017-11-04 1:56 GMT+08:00 Dimitris Tsirogiannis <dtsirogian...@cloudera.com>:

> Hi Yu,
>
> First of all thank you for your interest in extending Impala. That said,
> what you're proposing is not a trivial task as it will affect multiple
> Impala components (metadata, planner, query exec, etc.). Also, it adds
> another dependency to Impala that will have to be continuously tested and
> maintained. So, before jumping into the code, I think you need to submit a
> proposal that outlines:
> 1. The design; see how Impala interacts with other storage engines such as
> Kudu or HBase to understand which components are affected by such a change.
> 2. How are you going to test this and what kind of testing infrastructure
> will be in place in order to ensure that future commits don't break the
> integration with ElasticSearch?
> 3. Timeline and milestones (sub-tasks) for this project.
>
> I suggest submitting your proposal as a google doc so that it's easier to
> comment on. At the same time, I think it's very important for you to get
> more experience in modifying the Impala codebase. So, before embarking on
> such a big task, it may be worth spending some time working on a few smaller
> (ramp-up) tasks.
>
> Thanks,
> Dimitris
>
>
>
> On Thu, Nov 2, 2017 at 11:32 PM, yu feng <olaptes...@gmail.com> wrote:
>
> > Hi All:
> >
> >    We are trying to query data from Elasticsearch using Impala. We want
> > to take advantage of the speed of the Impala engine and of the fast
> > filtering and aggregation in Elasticsearch.
> >
> > I want to do it in the following way:
> >
> > 1. Add a new table type (metadata) called ES Table.
> > 2. Add two new ExecNodes (ESScanNode and ESAggregation) to implement
> > queries against ES.
> > 3. When a query targets an ES Table, try to rewrite any part of the
> > execution plan that consists of an Aggregation (parent) over an
> > ESScanNode (child) into a single ESAggregation.
> >
> > This way, I think both the scan and the aggregation can be done by ES.
> >
> > I would like to know what the community thinks of this combination, and
> > whether there is a better way to implement it.
> >
> > Thanks a lot.
> >
>
