Maxim Gekk created SPARK-34213: ---------------------------------- Summary: LOAD DATA doesn't refresh v1 table cache Key: SPARK-34213 URL: https://issues.apache.org/jira/browse/SPARK-34213 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.0.2, 3.2.0, 3.1.1 Reporter: Maxim Gekk
The example below portraits the issue: 1. Create a source table: {code:sql} spark-sql> CREATE TABLE src_tbl (c0 int, part int) USING hive PARTITIONED BY (part); spark-sql> INSERT INTO src_tbl PARTITION (part=0) SELECT 0; spark-sql> SHOW TABLE EXTENDED LIKE 'src_tbl' PARTITION (part=0); default src_tbl false Partition Values: [part=0] Location: file:/Users/maximgekk/proj/load-data-refresh-cache/spark-warehouse/src_tbl/part=0 ... {code} 2. Load data from the source table to a cached destination table: {code:sql} spark-sql> CREATE TABLE dst_tbl (c0 int, part int) USING hive PARTITIONED BY (part); spark-sql> INSERT INTO dst_tbl PARTITION (part=1) SELECT 1; spark-sql> CACHE TABLE dst_tbl; spark-sql> SELECT * FROM dst_tbl; 1 1 spark-sql> LOAD DATA LOCAL INPATH '/Users/maximgekk/proj/load-data-refresh-cache/spark-warehouse/src_tbl/part=0' INTO TABLE dst_tbl PARTITION (part=0); spark-sql> SELECT * FROM dst_tbl; 1 1 {code} The last query does not show recently loaded data from the source table. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org