Maxim Gekk created SPARK-34213:
----------------------------------

             Summary: LOAD DATA doesn't refresh v1 table cache
                 Key: SPARK-34213
                 URL: https://issues.apache.org/jira/browse/SPARK-34213
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.0.2, 3.2.0, 3.1.1
            Reporter: Maxim Gekk


The example below illustrates the issue:
1. Create a source table:
{code:sql}
spark-sql> CREATE TABLE src_tbl (c0 int, part int) USING hive PARTITIONED BY (part);
spark-sql> INSERT INTO src_tbl PARTITION (part=0) SELECT 0;
spark-sql> SHOW TABLE EXTENDED LIKE 'src_tbl' PARTITION (part=0);
default src_tbl false   Partition Values: [part=0]
Location: file:/Users/maximgekk/proj/load-data-refresh-cache/spark-warehouse/src_tbl/part=0
...
{code}
2. Load data from the source table to a cached destination table:
{code:sql}
spark-sql> CREATE TABLE dst_tbl (c0 int, part int) USING hive PARTITIONED BY (part);
spark-sql> INSERT INTO dst_tbl PARTITION (part=1) SELECT 1;
spark-sql> CACHE TABLE dst_tbl;
spark-sql> SELECT * FROM dst_tbl;
1       1
spark-sql> LOAD DATA LOCAL INPATH '/Users/maximgekk/proj/load-data-refresh-cache/spark-warehouse/src_tbl/part=0' INTO TABLE dst_tbl PARTITION (part=0);
spark-sql> SELECT * FROM dst_tbl;
1       1
{code}
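The last query still returns only the cached row (1, 1). The row loaded from the source table into partition part=0 is missing, because LOAD DATA did not refresh the cache of dst_tbl.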
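
A possible workaround, sketched here and not part of the original report: explicitly refresh the cached table after LOAD DATA. This assumes that REFRESH TABLE invalidates the cached entries of dst_tbl, as the Spark SQL documentation describes; it has not been verified against the affected versions.
{code:sql}
-- Assumption: REFRESH TABLE drops the stale cached data of dst_tbl,
-- so the next scan re-reads all partitions, including part=0.
spark-sql> REFRESH TABLE dst_tbl;
spark-sql> SELECT * FROM dst_tbl;
{code}
If the workaround holds, the second query should list the rows of both partitions. Running UNCACHE TABLE dst_tbl followed by CACHE TABLE dst_tbl would be an alternative with the same intent.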


