Spark does not use Hive for execution, so Hive parameters like hive.mapred.mode will have no effect. I don't think you can enforce that in Spark itself. Typically you enforce a rule like that at a layer above your SQL engine, which is usually possible because there is probably other access you need to lock down anyway.
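As an illustration of "a layer above your SQL engine": a proxy or submission hook could refuse queries that lack a predicate on a known partition column before they ever reach the Spark Thrift Server. This is only a naive sketch; the table-to-partition-column mapping, the function name, and the regex-based check are all hypothetical, and a real implementation would want a proper SQL parser rather than string matching.

```python
import re

# Hypothetical mapping of table name -> partition columns, maintained
# outside Spark (e.g. pulled from the metastore by the gatekeeper service).
PARTITION_COLS = {
    "events": ["event_date"],
}

def has_partition_filter(sql: str, table: str) -> bool:
    """Naive check: does the query's WHERE clause mention at least one
    partition column of the given table? Illustration only -- a regex
    cannot handle subqueries, aliases, or quoted identifiers correctly."""
    cols = PARTITION_COLS.get(table, [])
    match = re.search(r"\bwhere\b(.*)", sql, re.IGNORECASE | re.DOTALL)
    if not match:
        return False
    clause = match.group(1).lower()
    return any(col.lower() in clause for col in cols)

def submit_query(sql: str, table: str) -> str:
    """Gatekeeper wrapper: reject unpartitioned scans before forwarding
    the query to the real execution engine (forwarding not shown)."""
    if not has_partition_filter(sql, table):
        raise ValueError(
            f"Query on '{table}' must filter on a partition column: "
            f"{PARTITION_COLS.get(table)}"
        )
    return sql  # in a real system, forward to the Thrift Server here
```

Superset itself could host such a check via its SQL Lab query mutators, but the exact hook and its capabilities should be verified against the Superset version in use.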
On Tue, Feb 22, 2022 at 6:35 AM Saurabh Gulati <saurabh.gul...@fedex.com.invalid> wrote:
> Hello,
> We are trying to set up Spark as the execution engine for exposing our data
> stored in the lake. We have the Hive metastore running along with the Spark
> Thrift Server, and we use Superset as the UI.
>
> We save all tables as external tables in the Hive metastore, with storage
> being on cloud.
>
> We see that right now, when users run a query in Superset SQL Lab, it scans
> the whole table. What we want is to limit the data scan by setting
> something like hive.mapred.mode=strict in Spark, so that the user gets an
> exception if they don't specify a partition column.
>
> We tried setting spark.hadoop.hive.mapred.mode=strict in
> spark-defaults.conf on the Thrift Server, but it still scans the whole
> table. We also tried setting hive.mapred.mode=strict in hive-defaults.conf
> for the metastore container.
>
> We use Spark 3.2 with hive-metastore version 3.1.2.
>
> Is there a way in Spark settings to make this happen?
>
> TIA
> Saurabh