Hi Team,

I am trying to read a Parquet file, cache it, apply a transformation, and then overwrite the same Parquet file, all within one session. However, the first count action does not appear to cache the dataframe; the data only gets cached when I cache and count the transformed dataframe. And even with spark.sql.parquet.cacheMetadata = true, the write operation destroys the cache. Is this expected? What is the relevance of this configuration setting?
We are using PySpark with Spark in cluster mode.

Regards,
Parag Mohanty