Hello, I came across an issue[1] in PyHive involving the SHOW TABLES output from the Thrift Server.
When you run a SHOW TABLES statement in beeline, it returns a table with three columns: (i) schema name, (ii) table name, and (iii) a temporary-table flag. This differs from what Hive does, which is to return a single column containing only the table names. From the Spark docs[2]: "The Thrift JDBC/ODBC server implemented here corresponds to the HiveServer2 in built-in Hive." Given that, this particular statement has a compatibility issue, because it breaks libraries like PyHive.

My questions:

1) Is the Thrift Server expected to be 100% Hive compatible?
2) If the answer to the previous question is yes, is this a bug in Spark?
3) What problems could it cause for Spark if we made SHOW TABLES return exactly what Hive returns, and made the Thrift Server resolve a SHOW TABLES EXTENDED statement to return what Spark SQL currently returns?

[1] https://github.com/dropbox/PyHive/issues/146
[2] https://spark.apache.org/docs/latest/sql-distributed-sql-engine.html

--
Ricardo Martinelli De Oliveira
Data Engineer, AI CoE
Red Hat Brazil
https://www.redhat.com/
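P.S. To make the breakage concrete, here is a minimal Python sketch of the mismatch. The row shapes are illustrative (not captured from a live server) and the helper name is hypothetical; it mimics a client that, like PyHive's Hive dialect, assumes Hive's one-column SHOW TABLES layout:

```python
# Illustrative row shapes (assumptions, not live server output):
# Hive's SHOW TABLES returns a single column holding the table name.
hive_rows = [("my_table",)]
# Spark Thrift Server returns (database, tableName, isTemporary).
spark_rows = [("default", "my_table", False)]

def table_names_hive_style(rows):
    """Hypothetical client helper that assumes Hive's layout:
    the table name is always column 0 of each row."""
    return [row[0] for row in rows]

print(table_names_hive_style(hive_rows))   # ['my_table']  -- correct
print(table_names_hive_style(spark_rows))  # ['default']   -- schema name, not the table name
```

Against Hive the helper yields the table names; against the Spark Thrift Server the same code silently yields schema names instead, which is the class of breakage reported in [1].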