[ https://issues.apache.org/jira/browse/SPARK-24669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dong Jiang updated SPARK-24669:
-------------------------------
    Description:
I can do the following in sequence
# Create a managed table using path options
# Drop the table via dropping the parent database cascade
# Re-create the database and table with a different path
# The new table shows data from the old path, not the new path
{code}
echo "first" > /tmp/first.csv
echo "second" > /tmp/second.csv
spark-shell
spark.version
res0: String = 2.3.0
spark.sql("create database foo")
spark.sql("create table foo.first (id string) using csv options (path='/tmp/first.csv')")
spark.table("foo.first").show()
+-----+
|   id|
+-----+
|first|
+-----+
spark.sql("drop database foo cascade")
spark.sql("create database foo")
spark.sql("create table foo.first (id string) using csv options (path='/tmp/second.csv')")
"note, the path is different now, pointing to second.csv, but still showing data from first file"
spark.table("foo.first").show()
+-----+
|   id|
+-----+
|first|
+-----+
"now, if I drop the table explicitly, instead of via dropping database cascade, then it will be the correct result"
spark.sql("drop table foo.first")
spark.sql("create table foo.first (id string) using csv options (path='/tmp/second.csv')")
spark.table("foo.first").show()
+------+
|    id|
+------+
|second|
+------+
{code}

  was:
I can do the following in sequence
# Create a managed table using path options
# Drop the table via dropping the parent database cascade
# Re-create the database and table with a different path
# The new table shows data from the old path, not the new path
{code}
echo "first" > /tmp/first.csv
echo "second" > /tmp/second.csv
spark-shell
spark.version
res0: String = 2.3.0
spark.sql("create database foo")
spark.sql("create table foo.first (id string) using csv options (path='/tmp/first.csv')")
spark.table("foo.first").show()
+-----+
|   id|
+-----+
|first|
+-----+
spark.sql("drop database foo cascade")
spark.sql("create database foo")
spark.sql("create table foo.first (id string) using csv options (path='/tmp/second.csv')")
"note, the path is different now, pointing to second.csv, but still showing data from first file"
spark.table("foo.first").show()
+-----+
|   id|
+-----+
|first|
+-----+
"now, if I drop the table explicitly, then it will be correct"
spark.sql("drop table foo.first")
spark.sql("create table foo.first (id string) using csv options (path='/tmp/second.csv')")
spark.table("foo.first").show()
+------+
|    id|
+------+
|second|
+------+
{code}

> Managed table was not cleared of path after drop database cascade
> -----------------------------------------------------------------
>
>                 Key: SPARK-24669
>                 URL: https://issues.apache.org/jira/browse/SPARK-24669
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Dong Jiang
>            Priority: Major