Hi all,

We are trying to query data stored in Elasticsearch from Impala. We want to combine the speed of the Impala execution engine with the fast filtering and aggregation of Elasticsearch.
I plan to do it in the following way:

1. Add a new table type (metadata) called ES Table.
2. Add two new ExecNodes, ESScanNode and ESAggregation, that implement queries against ES.
3. When a query targets an ES Table, try to rewrite the execution plan: wherever it contains an Aggregation (parent) over an ESScanNode (child), replace the pair with a single ESAggregation.

This way, I think both the scan and the aggregation can be done by ES. I would like to know what you think of this approach, and whether there is a better way to implement it.

Thanks a lot.
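
To make step 3 concrete, here is a rough sketch of the rewrite I have in mind. It is plain Python with hypothetical classes (`ESScanNode`, `AggregationNode`, `ESAggregationNode` and `rewrite` are illustrative names only, not Impala's actual planner or ExecNode code). It assumes each aggregate maps directly to an ES metric aggregation such as `sum` or `avg`, and that each GROUP BY column becomes a nested `terms` aggregation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ESScanNode:
    """Hypothetical scan over one ES index."""
    index: str
    query: Optional[dict] = None  # filter pushed down from WHERE; None means match_all


@dataclass
class AggregationNode:
    """Hypothetical aggregation node sitting on top of a child node."""
    child: object
    group_by: List[str]
    # output name -> (ES metric aggregation, field), e.g. {"total": ("sum", "price")}
    aggs: Dict[str, Tuple[str, str]]


@dataclass
class ESAggregationNode:
    """Hypothetical node that sends a single aggregation request to ES."""
    index: str
    body: dict


def rewrite(node):
    """Collapse Aggregation(parent) + ESScanNode(child) into one ESAggregationNode."""
    if isinstance(node, AggregationNode) and isinstance(node.child, ESScanNode):
        scan = node.child
        # Innermost level: one metric aggregation per output column.
        inner = {name: {fn: {"field": fld}} for name, (fn, fld) in node.aggs.items()}
        # Wrap in one terms aggregation per GROUP BY column, first column outermost.
        for col in reversed(node.group_by):
            inner = {f"by_{col}": {"terms": {"field": col}, "aggs": inner}}
        body = {
            "size": 0,  # only buckets are needed, not the matching documents
            "query": scan.query or {"match_all": {}},
            "aggs": inner,
        }
        return ESAggregationNode(scan.index, body)
    return node  # anything else is left unchanged


plan = AggregationNode(
    child=ESScanNode("sales", {"term": {"region": "eu"}}),
    group_by=["country"],
    aggs={"total": ("sum", "price")},
)
es_plan = rewrite(plan)
print(es_plan.body)
```

One thing I am unsure about: a `terms` aggregation returns only the top buckets by default (10 unless `size` is set), so an exact GROUP BY would probably need an explicit `size` or a `composite` aggregation with paging.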