stuxuhai commented on PR #14930:
URL: https://github.com/apache/iceberg/pull/14930#issuecomment-3749765484

   @nastra  Thanks a lot for the thoughtful feedback and for taking the time to 
review this.
   
   Based on our testing, applying this change should not affect the behavior of 
`DESCRIBE` or `SHOW VIEWS`. The impact is limited to `CREATE VIEW` statements. 
While using `SparkSessionCatalog` in practice, we encountered a number of 
confusing and unintuitive behaviors, especially when working with non-Iceberg 
views and tables. This PR is intended to address one of those cases.
   
   For example, with the following sequence:
   
   ```sql
   -- create hive view
   create view test_hive_view as select 1 as id, 'hive_view' as name;
   
   -- use SparkSessionCatalog to create iceberg view
   create view test_iceberg_view as select 2 as id, 'iceberg_view' as name;
   
   -- ERROR: org.apache.iceberg.exceptions.NoSuchIcebergViewException: Not an iceberg view
   create view if not exists test_hive_view as select 1 as id, 'iceberg' as name;
   
   -- ERROR: [VIEW_NOT_FOUND] The view test_hive_view cannot be found.
   create or replace view test_hive_view as select 2 as id, 'create or replace by iceberg' as name;
   
   -- ERROR: [VIEW_NOT_FOUND] The view test_hive_view cannot be found.
   drop view test_hive_view;
   
   -- Succeeds, but actually queries the Hive view instead of the Iceberg view
   select * from test_iceberg_view;
   
   -- Succeeds, but should fail with WRONG_COMMAND_FOR_OBJECT_TYPE
   drop table test_iceberg_view;
   
   -- ERROR: [VIEW_NOT_FOUND] The view test_iceberg_table cannot be found (should be WRONG_COMMAND_FOR_OBJECT_TYPE)
   drop view test_iceberg_table;
   
   -- Only drops metadata but does not delete data unless PURGE is specified, which is also different in behavior
   drop table test_hive_managed_table;
   ```
   
   From a user perspective, these behaviors are quite surprising and make it 
difficult to reason about how `SparkSessionCatalog` should be used safely.
   
   My understanding is that `SparkSessionCatalog` is primarily intended to 
manage Iceberg tables and views. For non-Iceberg objects, it would be ideal if 
the behavior could fall back to Spark’s session catalog so that the results 
remain consistent with running Spark without Iceberg.
   
   At the moment, `getSessionCatalog()` returns a `V2SessionCatalog` instance, 
and due to current Spark design constraints, `V2SessionCatalog` does not 
implement the `ViewCatalog` interface, which makes fallback handling for Hive 
views more complicated.
   
   For the specific `CREATE VIEW IF NOT EXISTS` case, it appears that the issue 
can be resolved with a small and localized change, which is what this PR 
focuses on. I’m very happy to iterate on this further and explore the best 
long-term approach together.
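
   To make the intended `CREATE VIEW IF NOT EXISTS` semantics concrete, here is a rough, self-contained sketch. It is not the code in this PR and it does not use Spark or Iceberg APIs: `ViewCreationSketch`, `Outcome`, `registerSessionView`, and the two in-memory sets are hypothetical stand-ins for the Iceberg view catalog and the session (Hive) catalog. It only illustrates the idea that an existing non-Iceberg view with the same name should make the statement a no-op rather than an error.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model only: these types are NOT Spark or Iceberg classes.
final class ViewCreationSketch {
  enum Outcome { CREATED, SKIPPED_EXISTING }

  // Stand-in for views known to the session catalog (e.g. Hive views).
  private final Set<String> sessionViews = new HashSet<>();
  // Stand-in for views managed by the Iceberg view catalog.
  private final Set<String> icebergViews = new HashSet<>();

  void registerSessionView(String name) {
    sessionViews.add(name);
  }

  boolean isIcebergView(String name) {
    return icebergViews.contains(name);
  }

  // Models CREATE VIEW IF NOT EXISTS: a view of either kind with the same
  // name turns the statement into a no-op instead of raising an error.
  Outcome createViewIfNotExists(String name) {
    if (icebergViews.contains(name) || sessionViews.contains(name)) {
      return Outcome.SKIPPED_EXISTING;
    }
    icebergViews.add(name);
    return Outcome.CREATED;
  }
}
```

   In this model, calling `createViewIfNotExists("test_hive_view")` after the Hive view has been registered is skipped and the Hive view is left untouched, which is the behavior a user would expect from the third statement in the SQL sequence above.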
   
   Thanks again for the review and for the discussion — I really appreciate it.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

