Peter Andrew created SPARK-50569:
------------------------------------

             Summary: Clearing dataframes/views persisted in isolated Connect session
                 Key: SPARK-50569
                 URL: https://issues.apache.org/jira/browse/SPARK-50569
             Project: Spark
          Issue Type: Improvement
          Components: Connect
    Affects Versions: 3.5.3
            Reporter: Peter Andrew
With Spark Connect, `sparkSession.catalog.clearCache` clears all dataframes that have been persisted on the Spark Connect server, including those persisted by other isolated sessions. Similarly, views created by `DataFrame.createOrReplaceTempView` are not removed when an isolated session is terminated, even though the documentation can be read as implying they should be:

> The lifetime of this temporary table is tied to the
> [{{SparkSession}}|https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.SparkSession.html#pyspark.sql.SparkSession]
> that was used to create this
> [{{DataFrame}}|https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.html#pyspark.sql.DataFrame].

It would be very useful to be able to do the following:
- When calling `clearCache`, control whether to clear only the dataframes persisted in the current isolated session, or all dataframes.
- Configure the Spark Connect server to clean up all dataframes/views persisted by an isolated session when that session is terminated.
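To make the proposal concrete, here is a minimal plain-Python sketch of the requested semantics (this is not Spark code; `CacheRegistry`, the `scope` parameter, and `release_session` are all hypothetical names for illustration): each isolated session tracks what it persisted, `clear_cache` can target either the current session or everything, and a cleanup hook drops a terminated session's entries.

```python
class CacheRegistry:
    """Hypothetical per-session cache bookkeeping for a Connect server."""

    def __init__(self):
        # Maps session_id -> set of relation names persisted by that session.
        self._cached = {}

    def persist(self, session_id, name):
        # Record that this isolated session persisted a relation.
        self._cached.setdefault(session_id, set()).add(name)

    def clear_cache(self, session_id, scope="session"):
        # scope="session": clear only this session's entries (the requested
        # behavior); scope="all": mirror today's behavior and clear everything.
        if scope == "all":
            self._cached.clear()
        else:
            self._cached.pop(session_id, None)

    def release_session(self, session_id):
        # Proposed cleanup hook: drop everything a terminated isolated
        # session persisted, so nothing leaks across sessions.
        self._cached.pop(session_id, None)

    def cached(self):
        # All relation names currently cached, across sessions.
        return {n for names in self._cached.values() for n in names}
```

With this shape, a session-scoped `clear_cache("a")` would leave session "b"'s persisted dataframes intact, while `scope="all"` reproduces the current server-wide clearing.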