Hello Impala devs!
Let me say that I have used Impala a lot and am very impressed with it.
I know Impala is moving into the Apache Incubator (I have an incubator
podling, Gossip, so I know this is challenging). There are a few things I
want to bring to your attention and discuss, so that they do not become an
issue or blocker in the future.
1) Code
Your proposal https://wiki.apache.org/incubator/ImpalaProposal lists hive
as a dependency.
External Dependencies
Apache Hive (Apache Software License v2.0)
I notice that Cloudera Impala has CDH "hive" jars (which are rather old)
in its source tree:
https://github.com/cloudera/Impala/tree/8b621a301329d91fbe10a8aac5e39a2b14d6d25f/thirdparty/hive-1.1.0-cdh5.12.0-SNAPSHOT
A quick search did not find any evidence of that in incubator-impala (which
is good):
https://github.com/apache/incubator-impala/
We (Hive) want people using only official Apache Hive releases for
dependencies. We want to avoid:
1) Full or partial code forks of Apache Hive which still carry the Hive name
2) Artifacts published to central repositories named "*Hive*" which could
be confusing
I am not asserting that Impala is affected by case #1 or #2 currently, but
it is something to be aware of. If you need guidance, feel free to discuss
further with the Hive PMC.
2) Next topic, the Hive name and statements that imply compatibility:
http://impala.apache.org/
For Apache Hive users, Impala utilizes the same metadata, ODBC driver, SQL
syntax, and user interface as Hive -- so you don't have to worry about
re-inventing the implementation wheel.
Apache Hive proposes and adds syntax all the time. For example, this
feature is in the works now (
https://issues.apache.org/jira/browse/HIVE-15986). Even if every effort were
made to keep the languages and features in sync, no one would be able to
make this claim. This is because Apache Hive does not have compatibility
tests for any of these things (we do not have anything like ANSI SQL 92).
This text needs to be replaced. It is probably fine to make statements such
as "Impala can run many of the same queries as Apache Hive", or "users of
Apache Hive will find many familiar features in Impala".
Again, welcome to the incubator. I am sure getting Impala through is fun
with the C++-ness of it all!
Thanks,
Edward