rdblue commented on PR #9487:
URL: https://github.com/apache/iceberg/pull/9487#issuecomment-1911227076
> I don't know what the behavior would be when trying to load a table (which
> is actually a view) in an older client, but it would likely break in unexpected
> ways.
The older client would try to load a view metadata file as table metadata
and would fail on a required field, like `snapshots`. It would be annoying, but
it shouldn't be dangerous. We should definitely double check this, though.
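A minimal sketch of that failure mode, not the real parser: it assumes a
hypothetical `parse_table_metadata` function and a made-up set of required
fields that includes `snapshots`. Which field an actual older client trips on
first is exactly what still needs to be checked.

```python
import json

# Hypothetical set of fields an older client treats as required when it
# parses table metadata. Illustrative only, not the actual parser's list.
REQUIRED_TABLE_FIELDS = ("format-version", "location", "snapshots")


def parse_table_metadata(raw):
    """Parse a metadata document, failing if a required table field is absent."""
    doc = json.loads(raw)
    missing = [field for field in REQUIRED_TABLE_FIELDS if field not in doc]
    if missing:
        raise ValueError(
            "Cannot parse table metadata, missing required field(s): "
            + ", ".join(missing)
        )
    return doc


# A view metadata document (placeholder content) has no `snapshots` field.
view_metadata = json.dumps(
    {"format-version": 1, "location": "placeholder/views/v", "versions": []}
)

try:
    parse_table_metadata(view_metadata)
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
```

The load fails before anything is read or written, which is why this is
annoying rather than dangerous.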
> I'm not sure we can provide the necessary guarantees around namespace
> collisions
This is my primary argument against using a different table. We rely on
having a single table with a uniqueness constraint for atomic operations. With
a separate table, a create has to check for existence in one table and then
insert into the other, and those two steps are not atomic, so two concurrent
creates can both succeed. What's worse is that if we have a conflict, we don't
have well-defined behavior: if you call `loadView` you'd get a view and if you
call `loadTable` you'd get a table.
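The difference can be sketched with SQLite standing in for the catalog
database. The table names (`catalog_entries`, `tables_only`, `views_only`) and
the `record_type` column are hypothetical, and the interleaving of the two
writers is simulated sequentially on one connection rather than with real
concurrency.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Option A: one table, one uniqueness constraint over (namespace, name).
conn.execute(
    "CREATE TABLE catalog_entries ("
    " namespace TEXT NOT NULL, name TEXT NOT NULL, record_type TEXT NOT NULL,"
    " PRIMARY KEY (namespace, name))"
)


def create_entry(namespace, name, record_type):
    """Return True if the entry was created, False if the name was taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO catalog_entries VALUES (?, ?, ?)",
                (namespace, name, record_type),
            )
        return True
    except sqlite3.IntegrityError:
        return False


table_created = create_entry("db", "events", "TABLE")
view_created = create_entry("db", "events", "VIEW")  # rejected by the database

# Option B: separate tables, so uniqueness across them needs check-then-insert.
for t in ("tables_only", "views_only"):
    conn.execute(
        f"CREATE TABLE {t} ("
        " namespace TEXT NOT NULL, name TEXT NOT NULL,"
        " PRIMARY KEY (namespace, name))"
    )


def exists(table, namespace, name):
    row = conn.execute(
        f"SELECT 1 FROM {table} WHERE namespace = ? AND name = ?",
        (namespace, name),
    ).fetchone()
    return row is not None


# Simulated interleaving: both writers run their check before either inserts.
table_writer_sees_view = exists("views_only", "db", "events")
view_writer_sees_table = exists("tables_only", "db", "events")
if not table_writer_sees_view:
    conn.execute("INSERT INTO tables_only VALUES ('db', 'events')")
if not view_writer_sees_table:
    conn.execute("INSERT INTO views_only VALUES ('db', 'events')")

both_exist = exists("tables_only", "db", "events") and exists(
    "views_only", "db", "events"
)
print(table_created, view_created, both_exist)
```

With the single table the database itself rejects the second create. With two
tables nothing rejects it, and the same identifier ends up resolving to both a
table and a view.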
It may seem like the risk is small because we're talking about simultaneous
create events, and normal commits would not be affected. But older clients
don't know about the view table, so they can't deconflict by checking it before
creating tables. Those clients would blindly create conflicting tables.
----
It sounds like the two primary concerns about the single-table option are:
1. Failure in clients that don't know about views
2. Migration
The first seems like the biggest decision point to me because migration
isn't so difficult that it is a blocker. As long as we can create a column with
a default ('TABLE'), older clients will continue to work with tables just
fine (again, we need to test and validate this!). The only issue would be the
failure when they load a view.
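A rough sketch of that migration, again using SQLite as a stand-in. The table
name `catalog_entries`, the column name `record_type`, and the metadata paths
are all placeholders, and real databases may differ in how they handle adding a
column with a default, so this does not replace testing against the actual
backends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Pre-migration schema, as an older client expects it (hypothetical names).
conn.execute(
    "CREATE TABLE catalog_entries ("
    " namespace TEXT NOT NULL,"
    " name TEXT NOT NULL,"
    " metadata_location TEXT,"
    " PRIMARY KEY (namespace, name))"
)
conn.execute(
    "INSERT INTO catalog_entries (namespace, name, metadata_location)"
    " VALUES ('db', 'events', 'placeholder/events.metadata.json')"
)

# Migration: add the type column with a default so existing rows become tables.
conn.execute(
    "ALTER TABLE catalog_entries"
    " ADD COLUMN record_type TEXT NOT NULL DEFAULT 'TABLE'"
)

# An older client keeps inserting without the new column and gets the default.
conn.execute(
    "INSERT INTO catalog_entries (namespace, name, metadata_location)"
    " VALUES ('db', 'clicks', 'placeholder/clicks.metadata.json')"
)

# A newer client records a view explicitly.
conn.execute(
    "INSERT INTO catalog_entries"
    " (namespace, name, metadata_location, record_type)"
    " VALUES ('db', 'daily_clicks', 'placeholder/daily_clicks.view.json', 'VIEW')"
)

types = dict(conn.execute("SELECT name, record_type FROM catalog_entries"))


def old_client_lookup(namespace, name):
    """Lookup as an older client would do it: no filter on record_type."""
    row = conn.execute(
        "SELECT metadata_location FROM catalog_entries"
        " WHERE namespace = ? AND name = ?",
        (namespace, name),
    ).fetchone()
    return row[0] if row else None


view_location_seen_by_old_client = old_client_lookup("db", "daily_clicks")
print(types, view_location_seen_by_old_client)
```

Existing and newly created tables keep working for the older client. The last
lookup shows the remaining problem: the older client is handed a view's
metadata location and will try to read it as table metadata.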
I think that failure when loading a view from an older client is unavoidable
if we go this route. However, it isn't terrible. If we can't produce a good
enough error message, I would at least want administrators to know that this is
going to happen. Coming back to migration, maybe enabling views should be
opt-in, by running the `ALTER TABLE` command. But that depends on how bad the
user experience is when trying to read from a table that is actually a view.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]