edwinchoi commented on pull request #1508:
URL: https://github.com/apache/iceberg/pull/1508#issuecomment-702272183
Thanks for elaborating. My thoughts...
> For the write-audit-publish (WAP) pattern, there is an option to only
> stage a commit and not update the table's current-snapshot-id. In this case,
> the writer updates the table by creating a new snapshot. Then an auditor reads
> the snapshot and validates it (with row counts, for example), and if the
> snapshot looks good, commits the snapshot as the current table state. This
> allows reports to be validated before going live.
I don't think WAP is working as expected under RTAS (replace table as select).
```scala
spark.sql("""
CREATE TABLE test.ns.tbl
USING iceberg
TBLPROPERTIES ('write.wap.enabled'='true')
AS SELECT * FROM VALUES (1, "Alice"), (2, "Bob") AS (id, fname)
""")
spark.conf.set("spark.wap.id", "12345")
spark.sql("""
CREATE OR REPLACE TABLE test.ns.tbl
USING iceberg
AS SELECT * FROM VALUES (1, 5, "alice"), (2, 3, "bob") AS (id, name_len, name)
""")
spark.conf.unset("spark.wap.id")
```
After running this, the schema from the staged change shows up, but the
data that _should_ exist isn't accessible: `DESC test.ns.tbl` shows the
new schema, while `SELECT * FROM test.ns.tbl` comes back empty. Even with a
simple schema change, `ALTER TABLE ... ADD COLUMN ...`, the change takes
effect prematurely.
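One way to check what was actually staged versus published is to look at the table's metadata tables. This is only a sketch: it assumes the same `spark` session and `test` catalog as the snippet above, and that the `snapshots` and `history` metadata tables are reachable through that catalog. I have not confirmed what RTAS records in the summary, so treat the `wap.id` lookup as something to inspect rather than an expected result.

```scala
// Assumes the spark-shell session and the `test` catalog from the snippet above.

// Every snapshot the table knows about; a staged WAP snapshot is expected to
// carry the id from `spark.wap.id` under the `wap.id` summary key.
spark.sql("""
SELECT snapshot_id, parent_id, operation, summary['wap.id'] AS wap_id
FROM test.ns.tbl.snapshots
""").show(false)

// Only the snapshots that were actually made current, in order.
spark.sql("""
SELECT made_current_at, snapshot_id, is_current_ancestor
FROM test.ns.tbl.history
""").show(false)
```

If the replace were staged correctly, I would expect its snapshot to appear in `snapshots` but not in `history`, with `DESC` still reporting the old schema.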
> Also, the table state can be rolled back to a previous snapshot and new
> commits will form a new history afterwards.
I don't believe this changes the notion of what was current at some point in
time. If you view the timeline from the database's point of view, then
rolling back doesn't change the fact that at _some point in time_ a snapshot
that is now inaccessible was visible in the database.
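To make the point concrete, here is a toy model of that timeline. It is not Iceberg's implementation; `HistoryEntry`, `TableHistory`, `makeCurrent` and `currentAsOf` are hypothetical names used only for illustration. A rollback is recorded as one more entry in an append-only log, so a question about what was current at an earlier time still has the same answer.

```scala
// Toy model only: the table's history as an append-only log of
// "snapshot X became current at time T" entries.
final case class HistoryEntry(madeCurrentAt: Long, snapshotId: Long)

final case class TableHistory(log: Vector[HistoryEntry]) {
  // A commit and a rollback both append an entry; nothing is rewritten.
  // Entries are assumed to be appended in increasing time order.
  def makeCurrent(at: Long, snapshotId: Long): TableHistory =
    TableHistory(log :+ HistoryEntry(at, snapshotId))

  // The snapshot readers would have seen at time `at`, if any.
  def currentAsOf(at: Long): Option[Long] =
    log.takeWhile(_.madeCurrentAt <= at).lastOption.map(_.snapshotId)
}

object TimelineDemo extends App {
  val history = TableHistory(Vector.empty)
    .makeCurrent(at = 100L, snapshotId = 1L) // initial commit
    .makeCurrent(at = 200L, snapshotId = 2L) // second commit
    .makeCurrent(at = 300L, snapshotId = 1L) // rollback to snapshot 1

  println(history.currentAsOf(250L)) // Some(2): snapshot 2 was visible then
  println(history.currentAsOf(350L)) // Some(1): snapshot 1 is current again
}
```

After the rollback, snapshot 2 is no longer the current state, but the log still says it was current between times 200 and 300.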