Hi, community.

I'd like to start a discussion on a proposal: the Hive dialect shouldn't fall
back to Flink's default dialect.

Currently, when the HiveParser fails to parse a SQL statement in Hive dialect,
it falls back to Flink's default parser[1] to handle Flink-specific statements
like "CREATE CATALOG xx with (xx);".

As I've been working on the Hive dialect and have recently talked with
community users who use it, I'm thinking we should throw an exception directly
instead of falling back to Flink's default dialect when a SQL statement fails
to parse in Hive dialect.

Here are some reasons:

First of all, the fallback hides errors in the Hive dialect. For example,
during the release validation phase[2] we found that the Hive dialect could no
longer be used with the Flink SQL client. It turned out that a modification in
the Flink SQL client caused it, but our test cases couldn't catch it earlier:
although HiveParser failed to parse the statement, it then fell back to the
default parser and the test case passed.
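
To make the masking effect concrete, here is a minimal, self-contained Java
sketch. It is not the actual HiveParser code: `Parser`, `withFallback` and
`strict` are hypothetical names made up for illustration, and the "broken"
Hive parser is simulated with a lambda that always throws.

```java
public class DialectFallbackSketch {

    /** Hypothetical parser abstraction: returns a description of the statement or throws. */
    interface Parser {
        String parse(String sql);
    }

    /** Fallback style: try the Hive parser first, use the default parser if it fails. */
    static Parser withFallback(Parser hive, Parser fallback) {
        return sql -> {
            try {
                return hive.parse(sql);
            } catch (RuntimeException e) {
                // The Hive-side failure is swallowed here, so callers never see it.
                return fallback.parse(sql);
            }
        };
    }

    /** Proposed style: surface the Hive parser's failure directly. */
    static Parser strict(Parser hive) {
        return sql -> {
            try {
                return hive.parse(sql);
            } catch (RuntimeException e) {
                throw new IllegalStateException(
                        "Failed to parse statement in Hive dialect: " + sql, e);
            }
        };
    }

    public static void main(String[] args) {
        // Simulated regression: the Hive parser rejects every statement.
        Parser brokenHive = sql -> {
            throw new UnsupportedOperationException("simulated Hive parser regression");
        };
        // Stand-in for the default parser, which happens to accept the statement.
        Parser defaultParser = sql -> "default:" + sql;

        String stmt = "SHOW TABLES";

        // With fallback, the statement still "succeeds", so a test would pass.
        System.out.println(withFallback(brokenHive, defaultParser).parse(stmt));

        // Without fallback, the regression is reported immediately.
        try {
            strict(brokenHive).parse(stmt);
        } catch (IllegalStateException e) {
            System.out.println("strict mode reported: " + e.getMessage());
        }
    }
}
```

In the fallback variant, a test that only checks whether the statement
executes cannot tell which parser handled it. The strict variant fails at the
first Hive-side error.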

Second, conceptually, the Hive dialect should have nothing to do with Flink's
default dialect. They are two totally different dialects. If we do need a
dialect that mixes the Hive dialect and the default dialect, maybe we should
propose a new hybrid dialect and announce the hybrid behavior to users.
Also, the fallback behavior has confused some users; I have been asked about
it by community users. Throwing an exception directly when a SQL statement
fails to parse in Hive dialect would be more intuitive.

Last but not least, it's important to decouple Hive from the Flink planner[3]
before we can externalize the Hive connector[4]. If we still fall back to
Flink's default dialect, we will need to depend on `ParserImpl` in the Flink
planner, which will block us from removing the provided dependency of the Hive
dialect as well as from externalizing the Hive connector.

Although we have never announced the fallback behavior, some users may
implicitly depend on it in their SQL jobs. So I hereby open the discussion
about abandoning the fallback behavior to make the Hive dialect clear and
isolated.
Please note that this won't break Hive syntax, but Flink-specific syntax may
fail afterwards. For the failed SQL, you can use
`SET table.sql-dialect=default;` to switch to the Flink dialect.
If we find some Flink-specific statements that should be included in the Hive
dialect for ease of use, I think we can still add them to the Hive dialect as
special cases.

Looking forward to your feedback. I'd love to hear from the community before
taking the next steps.

[1]:https://github.com/apache/flink/blob/678370b18e1b6c4a23e5ce08f8efd05675a0cc17/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParser.java#L348
 
[2]:https://issues.apache.org/jira/browse/FLINK-26681 
[3]:https://issues.apache.org/jira/browse/FLINK-31413 
[4]:https://issues.apache.org/jira/browse/FLINK-30064 



Best regards, 
Yuxia 
