Re: How to set a config for a single query?

Khalid Mammadov Thu, 05 Jan 2023 01:04:16 -0800

Hi

I believe Spark has a feature for exactly this purpose: you can create a
new Spark session and set those configs on it.
Note that this is not the same as starting separate driver processes with
separate sessions. All sessions created this way still share the same
SparkContext, which acts as the backend for both (or more) of them and
does all the heavy work.


*spark.newSession()*

https://spark.apache.org/docs/3.1.1/api/python/reference/api/pyspark.sql.SparkSession.newSession.html#pyspark.sql.SparkSession.newSession
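
Since the question was about Scala, here is a rough sketch of how that
could look. The object name, the helper withIsolatedConf and the sample
aggregation are made up for illustration, and it assumes a local master;
only newSession(), conf.set() and the config key come from Spark itself.
One thing to keep in mind is that a DataFrame stays bound to the session
that created it, so the query has to be built from the new session to
pick up its config.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object PerQueryConfigSketch {
  // Hypothetical helper (not part of Spark): runs `body` against a fresh
  // session that shares the parent's SparkContext but has its own SQL conf.
  def withIsolatedConf[T](parent: SparkSession, settings: Map[String, String])(
      body: SparkSession => T): T = {
    val isolated = parent.newSession()
    settings.foreach { case (k, v) => isolated.conf.set(k, v) }
    body(isolated)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, just so the sketch runs on its own.
    val spark = SparkSession.builder()
      .appName("per-query-config-sketch")
      .master("local[*]")
      .getOrCreate()

    val key = "spark.sql.adaptive.coalescePartitions.enabled"

    // "Query A": built and collected inside the isolated session, so the
    // override applies to it and to nothing else.
    val rowsA = withIsolatedConf(spark, Map(key -> "false")) { session =>
      session.range(0, 1000)
        .groupBy((col("id") % 10).as("bucket"))
        .count()
        .collect()
    }

    // "Query B": runs on the original session, whose config was never touched.
    val rowsB = spark.range(0, 1000)
      .groupBy((col("id") % 10).as("bucket"))
      .count()
      .collect()

    println(s"query A rows: ${rowsA.length}, query B rows: ${rowsB.length}")
    spark.stop()
  }
}
```

Each thread can call the helper on its own, so no thread ever changes a
config that another thread is reading. As far as I know, temporary views
and registered functions are also per session, so anything Query A needs
would have to be registered on the new session as well.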

Hope this helps
Khalid


On Wed, 4 Jan 2023, 00:25 Felipe Pessoto, <felipepess...@hotmail.com> wrote:

> Hi,
>
>
>
> In Scala, is it possible to set a config value for a single query?
>
>
>
> I could set/unset the value, but it won’t work for multithreading
> scenarios.
>
>
>
> Example:
>
>
>
> spark.sql.adaptive.coalescePartitions.enabled = false
>
>                 queryA_df.collect
>
> spark.sql.adaptive.coalescePartitions.enabled=original value
>
>                 queryB_df.collect
>
>                 queryC_df.collect
>
>                 queryD_df.collect
>
>
>
>
>
> If I execute that block of code multiple times using multiple threads, I
> can end up executing Query A with coalescePartitions.enabled=true, and
> Queries B, C and D with the config set to false, because another thread
> could set it between the executions.
>
>
>
> Is there any good alternative to this?
>
>
>
> Thanks.
>
