Hi All,
I would like to add change data feed support to Spark SQL. This feature can be
achieved in two ways.
1. Call Procedure Command
SQL syntax:
CALL system.table_changes('tableName', start_timestamp, end_timestamp)
Example:
CALL system.table_changes('tableName', TIMESTAMP '2021-01-23 04:30:45',
TIMESTAMP '2021-02-23 6:00:00')
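
For comparison, the Hudi DataFrame reader already supports incremental queries bounded by a begin and an end instant. Below is a rough sketch of how the two procedure arguments might map onto those reader options. It assumes instants are written as yyyyMMddHHmmss and that the timestamps are already in the timezone of the table's timeline. The helper names to_instant and table_changes_options are hypothetical and not part of Hudi.

```python
from datetime import datetime


def to_instant(ts: str) -> str:
    """Convert 'YYYY-MM-DD HH:MM:SS' into a Hudi-style instant (yyyyMMddHHmmss).

    Assumption: no timezone conversion is needed.
    """
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").strftime("%Y%m%d%H%M%S")


def table_changes_options(start_ts: str, end_ts: str) -> dict:
    """Build reader options for an incremental query over [start_ts, end_ts].

    Hypothetical helper: it only illustrates one possible mapping from the
    proposed procedure arguments to the existing incremental-query options.
    """
    return {
        "hoodie.datasource.query.type": "incremental",
        "hoodie.datasource.read.begin.instanttime": to_instant(start_ts),
        "hoodie.datasource.read.end.instanttime": to_instant(end_ts),
    }


if __name__ == "__main__":
    opts = table_changes_options("2021-01-23 04:30:45", "2021-02-23 6:00:00")
    for key, value in opts.items():
        print(key, "=", value)
```

With a Spark session available, these options could be passed as spark.read.format("hudi").options(**opts).load(base_path), where base_path stands for the table location.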
2. Support querying a MOR (CDC) table as of a savepoint
SELECT * FROM A.B TIMESTAMP AS OF 1643119574;
SELECT * FROM A.B TIMESTAMP AS OF '2019-01-29 00:37:58' ;
SELECT * FROM A.B TIMESTAMP AS OF '2019-01-29 00:37:58' AND '2021-02-23 6:00:00' ;
SELECT * FROM A.B VERSION AS OF 'Snapshot123456789';
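
The TIMESTAMP AS OF forms above accept either epoch seconds or a timestamp string, so the implementation would need to normalize both into a single instant. Here is a minimal sketch of that normalization, assuming epoch values are interpreted as UTC seconds and the target is a yyyyMMddHHmmss instant. As far as I understand, that is one of the formats the existing as.of.instant reader option accepts. The function name as_of_instant is hypothetical.

```python
from datetime import datetime, timezone
from typing import Union


def as_of_instant(value: Union[int, str]) -> str:
    """Normalize a TIMESTAMP AS OF argument to a yyyyMMddHHmmss instant.

    Assumptions: integers are epoch seconds in UTC; strings use the
    'YYYY-MM-DD HH:MM:SS' layout and need no timezone conversion.
    """
    if isinstance(value, int):
        dt = datetime.fromtimestamp(value, tz=timezone.utc)
    else:
        dt = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    return dt.strftime("%Y%m%d%H%M%S")


if __name__ == "__main__":
    print(as_of_instant(1643119574))
    print(as_of_instant("2019-01-29 00:37:58"))
```

The resulting string could then be handed to the reader, for example via option("as.of.instant", instant), if the SQL syntax is implemented on top of the existing time travel read path.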
Any feedback is welcome!
Thank you.
Regards,
Forward Xu
Related Links:
[1] Call Procedure Command <https://issues.apache.org/jira/browse/HUDI-3161>
[2] Support querying a table as of a savepoint
<https://issues.apache.org/jira/browse/HUDI-3221>
[3] Change data feed
<https://docs.databricks.com/delta/delta-change-data-feed.html#language-sql>