chinnaraolalam commented on PR #9187:
URL: https://github.com/apache/iceberg/pull/9187#issuecomment-2010411781
@ajantha-bhat @RussellSpitzer The test case was failing for a non-Iceberg table,
but this change should not affect non-Iceberg tables. This might be an issue.
I can see 2 cases:
CASE 1: Launching a spark-sql session with the default Spark session catalog
(Iceberg tables do not work here, but non-Iceberg tables can be managed).
When a non-Iceberg table such as a Parquet table is dropped, the data is purged
and nothing is left on disk.
CASE 2: Launching a spark-sql session with the Iceberg-provided
SparkSessionCatalog (both Iceberg and non-Iceberg tables can be managed here).
Iceberg tables work fine, but when a non-Iceberg table such as a Parquet table
is dropped, the data is not purged and is leaked until manual cleanup
(the launch configurations I am assuming for each case are sketched below).
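For reference, this is roughly how I launch the two sessions; the runtime
package version and the catalog type=hive setting are assumptions, so adjust
them to your environment:

CASE 1 (default Spark session catalog):
  spark-sql

CASE 2 (Iceberg SparkSessionCatalog replacing spark_catalog):
  spark-sql \
    --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.0 \
    --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
    --conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog \
    --conf spark.sql.catalog.spark_catalog.type=hive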
So the behaviour of CASE 1 and CASE 2 is different; moreover, launching
spark-sql with the Iceberg SparkSessionCatalog changes the behaviour for
non-Iceberg tables (which is an issue).
In CASE 2:
CREATE TABLE parquettable (id bigint, data string) USING parquet;
INSERT INTO parquettable VALUES (1,'A'),(2,'B'),(3,'C');
SELECT id, data FROM parquettable WHERE length(data) = 1;
DROP TABLE parquettable;
CREATE TABLE parquettable (id bigint, data string) USING parquet;
--> This last query fails and throws an exception like
[LOCATION_ALREADY_EXISTS] (as the DROP TABLE did not purge the data).
Whereas in CASE 1, the same sequence passes and does not throw any error.
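As a side note, one workaround I have not verified in this setup would be to
request the purge explicitly (DROP TABLE ... PURGE is standard Spark SQL
syntax); whether SparkSessionCatalog passes the purge flag through to the
underlying catalog for non-Iceberg tables is exactly what needs confirming
here:

  DROP TABLE parquettable PURGE;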
Please add your thoughts.