Hello,
I have a cached DataFrame:

spark.read.format("delta").load("/data")
  .groupBy(col("event_hour")).count()
  .cache()
I would like to access the "live" data for this DataFrame without dropping
the cache (i.e. without calling unpersist()). Whatever I do, I always get
the cached data on subsequent queries. Even adding a new column to the
query doesn't help:
spark.read.format("delta").load("/data")
  .groupBy(col("event_hour")).count()
  .withColumn("dummy", lit("dummy"))
I'm able to work around this with a cached SQL view, but I couldn't find a
pure DataFrame solution.
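For reference, the view-based workaround looks roughly like this (a sketch;
the view name "hourly_counts" is illustrative, not from my actual job):

```scala
import org.apache.spark.sql.functions.col

// Register the aggregation as a temp view and cache it through SQL
// ("hourly_counts" is an illustrative name).
spark.read.format("delta").load("/data")
  .groupBy(col("event_hour")).count()
  .createOrReplaceTempView("hourly_counts")
spark.sql("CACHE TABLE hourly_counts")

// Queries against the view are served from the cache:
spark.sql("SELECT * FROM hourly_counts").show()
```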
Thank you,
Tomas