Hi Team,

I am trying to read a Parquet file, cache it, apply a transformation, and then overwrite the same Parquet file, all within one session. However, the first count action does not appear to cache the dataframe; the data only gets cached when I cache and count the transformed dataframe. And even with spark.sql.parquet.cacheMetadata = true, the write operation destroys the cache. Is this expected? What is the relevance of this configuration setting?
We are using PySpark with Spark in cluster mode.

Regards,
Parag Mohanty