Github user rezasafi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22614#discussion_r222349989

    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala ---
    @@ -746,34 +746,20 @@ private[client] class Shim_v0_13 extends Shim_v0_12 {
           getAllPartitionsMethod.invoke(hive, table).asInstanceOf[JSet[Partition]]
         } else {
           logDebug(s"Hive metastore filter is '$filter'.")
    -      val tryDirectSqlConfVar = HiveConf.ConfVars.METASTORE_TRY_DIRECT_SQL
    -      // We should get this config value from the metaStore. otherwise hit SPARK-18681.
    -      // To be compatible with hive-0.12 and hive-0.13, In the future we can achieve this by:
    -      // val tryDirectSql = hive.getMetaConf(tryDirectSqlConfVar.varname).toBoolean
    -      val tryDirectSql = hive.getMSC.getConfigValue(tryDirectSqlConfVar.varname,
    -        tryDirectSqlConfVar.defaultBoolVal.toString).toBoolean
           try {
             // Hive may throw an exception when calling this method in some circumstances, such as
    -        // when filtering on a non-string partition column when the hive config key
    -        // hive.metastore.try.direct.sql is false
    +        // when filtering on a non-string partition column.
             getPartitionsByFilterMethod.invoke(hive, table, filter)
               .asInstanceOf[JArrayList[Partition]]
           } catch {
    -        case ex: InvocationTargetException if ex.getCause.isInstanceOf[MetaException] &&
    -            !tryDirectSql =>
    +        case ex: InvocationTargetException if ex.getCause.isInstanceOf[MetaException] =>
               logWarning("Caught Hive MetaException attempting to get partition metadata by " +
                 "filter from Hive. Falling back to fetching all partition metadata, which will " +
    -            "degrade performance. Modifying your Hive metastore configuration to set " +
    -            s"${tryDirectSqlConfVar.varname} to true may resolve this problem.", ex)
    +            "degrade performance. Enable direct SQL mode in hive metastore to attempt " +
    +            "to improve performance. However, Hive's direct SQL mode is an optimistic " +
    +            "optimization and does not guarantee improved performance.")
    --- End diff --

    Hive has a config `hive.metastore.limit.partition.request` that can limit the number of partitions that can be requested from the HMS, so I think there is no need for a new config on the Spark side. Also, since direct SQL is a best-effort approach, just failing when direct SQL is enabled is not good.
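
    The fallback pattern in the diff can be sketched in isolation. The following is a minimal, self-contained illustration in Java (Hive's own implementation language) rather than Spark's Scala; the `MetaException` class and partition values here are hypothetical stand-ins for Hive's real `org.apache.hadoop.hive.metastore.api` types, not Spark code:

    ```java
    import java.lang.reflect.InvocationTargetException;
    import java.lang.reflect.Method;
    import java.util.Arrays;
    import java.util.List;

    class PartitionFetcher {
        // Hypothetical stand-in for Hive's MetaException, used only to
        // illustrate the catch pattern.
        static class MetaException extends RuntimeException {
            MetaException(String msg) { super(msg); }
        }

        // Simulates Hive.getPartitionsByFilter failing, e.g. when filtering
        // on a non-string partition column without direct SQL support.
        public List<String> getPartitionsByFilter(String filter) {
            throw new MetaException("cannot push down filter: " + filter);
        }

        public List<String> getAllPartitions() {
            return Arrays.asList("part=1", "part=2");
        }

        // The fallback from the diff: a MetaException thrown by the
        // reflectively-invoked method arrives wrapped in an
        // InvocationTargetException; on that condition, fall back to
        // fetching all partition metadata instead of failing.
        @SuppressWarnings("unchecked")
        public List<String> fetch(String filter) {
            try {
                Method m = getClass().getMethod("getPartitionsByFilter", String.class);
                return (List<String>) m.invoke(this, filter);
            } catch (InvocationTargetException ex) {
                if (ex.getCause() instanceof MetaException) {
                    // Degrades performance, but keeps the query working even
                    // when the metastore cannot evaluate the filter.
                    return getAllPartitions();
                }
                throw new RuntimeException(ex.getCause());
            } catch (ReflectiveOperationException ex) {
                throw new RuntimeException(ex);
            }
        }
    }
    ```

    Here `new PartitionFetcher().fetch("intCol=1")` hits the simulated MetaException and silently returns the full partition list, which is exactly the best-effort behavior the diff argues for instead of failing outright.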
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org