stuxuhai commented on PR #14930: URL: https://github.com/apache/iceberg/pull/14930#issuecomment-3749765484
@nastra Thanks a lot for the thoughtful feedback and for taking the time to review this.

Based on our testing, this change should not affect the behavior of `DESCRIBE` or `SHOW VIEWS`. The impact is limited to `CREATE VIEW` statements.

While using `SparkSessionCatalog` in practice, we encountered a number of confusing and unintuitive behaviors, especially when working with non-Iceberg views and tables. This PR is intended to address one of those cases. For example, with the following sequence:

```sql
-- create hive view
create view test_hive_view as select 1 as id, 'hive_view' as name;

-- use SparkSessionCatalog to create iceberg view
create view test_iceberg_view as select 2 as id, 'iceberg_view' as name;

-- ERROR: org.apache.iceberg.exceptions.NoSuchIcebergViewException: Not an iceberg view
create view if not exists test_hive_view as select 1 as id, 'iceberg' as name;

-- ERROR: [VIEW_NOT_FOUND] The view test_hive_view cannot be found.
create or replace view test_hive_view as select 2 as id, 'create or replace by iceberg' as name;

-- ERROR: [VIEW_NOT_FOUND] The view test_hive_view cannot be found.
drop view test_hive_view;

-- Succeeds, but actually queries the Hive view instead of the Iceberg view
select * from test_iceberg_view;

-- Succeeds, but should fail with WRONG_COMMAND_FOR_OBJECT_TYPE
drop table test_iceberg_view;

-- ERROR: [VIEW_NOT_FOUND] The view test_iceberg_table cannot be found (should be WRONG_COMMAND_FOR_OBJECT_TYPE)
drop view test_iceberg_table;

-- Only drops metadata but does not delete data unless PURGE is specified, which is also different in behavior
drop table test_hive_managed_table;
```

From a user perspective, these behaviors are quite surprising and make it difficult to reason about how `SparkSessionCatalog` should be used safely.

My understanding is that `SparkSessionCatalog` is primarily intended to manage Iceberg tables and views.
For non-Iceberg objects, it would be ideal if the behavior could fall back to Spark’s session catalog so that the results remain consistent with running Spark without Iceberg. At the moment, `getSessionCatalog()` returns a `V2SessionCatalog` instance, and due to current Spark design constraints, `V2SessionCatalog` does not implement the `ViewCatalog` interface, which makes fallback handling for Hive views more complicated.

For the specific `CREATE VIEW IF NOT EXISTS` case, it appears that the issue can be resolved with a small and localized change, which is what this PR focuses on.

I’m very happy to iterate on this further and explore the best long-term approach together. Thanks again for the review and for the discussion. I really appreciate it.
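
As a rough illustration of the `CREATE VIEW IF NOT EXISTS` semantics discussed above, here is a toy model in plain Java. It is not the change in this PR and does not use any Iceberg or Spark API: `ToySessionCatalog`, `ViewKind`, and every method name are hypothetical. It also assumes that the desired outcome is a silent no-op when a non-Iceberg view with the same name already exists.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: none of these types exist in Iceberg or Spark. */
final class ToySessionCatalog {

    /** Which side owns a view: the Iceberg catalog or Spark's session catalog (e.g. a Hive view). */
    enum ViewKind { ICEBERG, SESSION }

    // View name -> owner. A single map stands in for both catalogs sharing one namespace.
    private final Map<String, ViewKind> views = new HashMap<>();

    /** Simulates a view that was created outside Iceberg, such as a Hive view. */
    void registerSessionView(String name) {
        views.put(name, ViewKind.SESSION);
    }

    /**
     * Simulates CREATE VIEW IF NOT EXISTS.
     * Returns true if a new Iceberg view was created, or false if a view of
     * any kind already existed, in which case nothing changes and no error is raised.
     */
    boolean createViewIfNotExists(String name) {
        if (views.containsKey(name)) {
            return false; // assumed behavior: no-op, even for a non-Iceberg view
        }
        views.put(name, ViewKind.ICEBERG);
        return true;
    }

    /** Returns the owner of the view, or null if no such view exists. */
    ViewKind kindOf(String name) {
        return views.get(name);
    }

    public static void main(String[] args) {
        ToySessionCatalog catalog = new ToySessionCatalog();
        catalog.registerSessionView("test_hive_view");

        // Existing Hive view: the statement is a no-op instead of an error.
        System.out.println(catalog.createViewIfNotExists("test_hive_view"));    // false
        // No existing view: an Iceberg view is created.
        System.out.println(catalog.createViewIfNotExists("test_iceberg_view")); // true
    }
}
```

The point of the sketch is only that the existence check covers views of both kinds before anything Iceberg-specific is loaded. How that check would be wired into `SparkSessionCatalog` is left open here.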
