Spark does not use Hive for execution, so Hive parameters like hive.mapred.mode will have no effect. I don't think you can enforce that in Spark itself. Typically you enforce a rule like that at a layer above your SQL engine, which is usually possible because there is probably other access you need to lock down anyway.
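As an illustration of "a layer above your SQL engine": a proxy or submission hook could refuse queries that lack a predicate on a known partition column before they ever reach the Spark Thrift Server. This is only a naive sketch; the table-to-partition-column mapping, the function name, and the regex-based check are all hypothetical, and a real implementation would want a proper SQL parser rather than string matching.

```python
import re

# Hypothetical mapping of table name -> partition columns, maintained
# outside Spark (e.g. pulled from the metastore by the gatekeeper service).
PARTITION_COLS = {
    "events": ["event_date"],
}

def has_partition_filter(sql: str, table: str) -> bool:
    """Naive check: does the query's WHERE clause mention at least one
    partition column of the given table? Illustration only -- a regex
    cannot handle subqueries, aliases, or quoted identifiers correctly."""
    cols = PARTITION_COLS.get(table, [])
    match = re.search(r"\bwhere\b(.*)", sql, re.IGNORECASE | re.DOTALL)
    if not match:
        return False
    clause = match.group(1).lower()
    return any(col.lower() in clause for col in cols)

def submit_query(sql: str, table: str) -> str:
    """Gatekeeper wrapper: reject unpartitioned scans before forwarding
    the query to the real execution engine (forwarding not shown)."""
    if not has_partition_filter(sql, table):
        raise ValueError(
            f"Query on '{table}' must filter on a partition column: "
            f"{PARTITION_COLS.get(table)}"
        )
    return sql  # in a real system, forward to the Thrift Server here
```

Superset itself could host such a check via its SQL Lab query mutators, but the exact hook and its capabilities should be verified against the Superset version in use.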
On Tue, Feb 22, 2022 at 6:35 AM Saurabh Gulati <saurabh.gul...@fedex.com.invalid> wrote:
> Hello,
> We are trying to set up Spark as the execution engine for exposing our data
> stored in the lake. We have the Hive metastore running along with the Spark
> Thrift Server, and we use Superset as the UI.
>
> We save all tables as external tables in the Hive metastore, with storage
> being on cloud.
>
> We see that right now, when users run a query in Superset SQL Lab, it scans
> the whole table. What we want is to limit the data scan by setting
> something like hive.mapred.mode=strict in Spark, so that the user gets an
> exception if they don't specify a partition column.
>
> We tried setting spark.hadoop.hive.mapred.mode=strict in
> spark-defaults.conf on the Thrift Server, but it still scans the whole
> table. We also tried setting hive.mapred.mode=strict in hive-defaults.conf
> for the metastore container.
>
> We use Spark 3.2 with hive-metastore version 3.1.2.
>
> Is there a way in Spark settings to make this happen?
>
> TIA
> Saurabh