Hi Hive developers,

While testing integration between Hive 4 and other software, I noticed that HIVE-22046 introduced an incompatibility in the Metastore APIs. If I understand correctly, clients are now required to specify the engine name whenever they get or update column statistics.
- https://issues.apache.org/jira/browse/HIVE-22046
- https://github.com/apache/hive/pull/741

For example, Trino has a feature that uses column stats, and it fails against Hive 4. Note that I am not 100% sure about Trino's implementation or behavior.

```
trino> create table hive.default.test_trino (id int);
Query 20230810_152236_00004_t9n6h failed: Required field 'engine' is unset! Struct:TableStatsRequest(dbName:default, tblName:test_trino, colNames:[id], engine:null)
```

I have two questions about this feature.

(1) Should every engine use a unique engine name?

I guess some software can store or use stats in a way that is compatible with Hive. I wonder whether it can reuse engine=hive in that case, or whether it should use a different name such as engine=trino. I see that Impala passes a unique engine name to the metastore. At a glance, Spark seems unlikely to use Hive's column stats directly.

- https://issues.apache.org/jira/browse/IMPALA-8842

(2) Should Hive Metastore use engine=hive as a default value?

If other compatible software can reuse engine=hive, one option would be to accept requests in the old format and assume the engine is "hive" for compatibility. Or should clients explicitly specify engine=hive when using Hive 4?

Regards,
Okumin
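
P.S. To make question (2) concrete, here is a rough sketch of the fallback I have in mind. It is plain Python rather than actual Metastore code: `StatsRequest`, `with_default_engine`, and `DEFAULT_ENGINE` are hypothetical names I made up for illustration, and the class is only a simplified stand-in for the Thrift struct `TableStatsRequest` shown in the error above.

```python
from dataclasses import dataclass, replace
from typing import List, Optional

# Assumption: "hive" is the fallback proposed in question (2).
DEFAULT_ENGINE = "hive"


@dataclass(frozen=True)
class StatsRequest:
    """Hypothetical stand-in for TableStatsRequest, not the real Thrift struct."""
    db_name: str
    tbl_name: str
    col_names: List[str]
    engine: Optional[str] = None  # clients using the old format leave this unset


def with_default_engine(req: StatsRequest) -> StatsRequest:
    """Return a copy with the engine filled in when the client omitted it."""
    if req.engine is None:
        return replace(req, engine=DEFAULT_ENGINE)
    return req


# An old-format request, like the one Trino sends in the error above.
old_style = StatsRequest("default", "test_trino", ["id"])
print(with_default_engine(old_style).engine)  # hive

# A request that names its engine explicitly is left untouched.
explicit = StatsRequest("default", "test_trino", ["id"], engine="trino")
print(with_default_engine(explicit).engine)  # trino
```

Whether such a default is acceptable depends on the answer to question (1): it only makes sense if stats written by other engines can safely be treated as Hive's.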