mrhaoqi opened a new issue, #55253:
URL: https://github.com/apache/spark/issues/55253
## Problem
Spark 4.0.0 bundles `hadoop-client-api-3.4.1.jar` and
`hadoop-client-runtime-3.4.1.jar` which are incompatible with Hive Metastore
3.1.x (and earlier 3.x versions).
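
To confirm which Hadoop client JARs a given Spark installation actually bundles, something like the following could be used. This is a minimal sketch, not part of Spark: the function name `list_hadoop_client_jars` is hypothetical, and the default directory `/opt/spark/jars` is only the install path used in this report, so adjust it for other layouts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the file names of the hadoop-client-* JARs
# found in a Spark jars directory.
# The default /opt/spark/jars is an assumption taken from this report.
list_hadoop_client_jars() {
  local jars_dir="${1:-/opt/spark/jars}"
  local jar
  for jar in "$jars_dir"/hadoop-client-*.jar; do
    # When nothing matches, the glob stays unexpanded; skip that entry.
    [ -e "$jar" ] || continue
    basename "$jar"
  done
}
```

Calling `list_hadoop_client_jars` with no argument inspects the default directory; passing a path (for example `list_hadoop_client_jars "$SPARK_HOME/jars"`) inspects that directory instead. If the version suffix in the output is `3.4.1`, the setup matches the one described here.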
## Symptoms
When connecting to HMS 3.1.3 with Spark 4.0.0, the following error occurs:
org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table
...
Caused by: org.apache.thrift.TApplicationException: Invalid method name: 'get_table_objects_by_name'
Or:
Invalid method name: 'get_table'
## Root Cause
1. Hadoop 3.4.1's `HiveMetaStoreClient` uses newer Thrift protocol methods
2. Hive Metastore 3.1.x uses an older Thrift protocol that doesn't implement
methods like `get_table_objects_by_name`
3. Spark 4.0.0 upgraded Hive Service RPC from 3.1.3 to 4.0.0 (per
SPARK-45265), targeting HMS 4.0 compatibility
4. This implicitly breaks compatibility with HMS 3.1.x
## Workaround
Removing Spark's bundled Hadoop 3.4.1 JARs and replacing them with the 3.3.4
versions resolves the issue:
```bash
rm -f /opt/spark/jars/hadoop-client-api-3.4.1.jar
rm -f /opt/spark/jars/hadoop-client-runtime-3.4.1.jar
# Replace with Hadoop 3.3.4 versions
```
## Environment
- Spark: 4.0.0
- Hive Metastore: 3.1.3
- Hadoop (Spark bundled): 3.4.1
- Iceberg: 1.7.x (using HiveCatalog)
## Documentation Gap
The Spark 4.0.0 release notes mention:
- "Support Hive 4.0 metastore" (SPARK-45265)
- "Remove Hive support prior to 2.0.0" (SPARK-45328)
But there's no explicit documentation stating that HMS 3.1.x support is
deprecated or broken.
## Suggested Resolution
One of:
1. Add explicit documentation warning about HMS 3.1.x incompatibility
2. Provide a hadoop-3.3 profile option for users who need HMS 3.x
compatibility
3. Consider bundling both Hadoop client versions with profile switch
capability
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]