alamb commented on PR #9450:
URL: https://github.com/apache/arrow-rs/pull/9450#issuecomment-4085357250

   For this PR, I think we need an end-to-end test that demonstrates the use
case the CDC code is intended to solve.
   
   For example, such a test could write two Parquet files containing the same
data except for a few chosen rows in the middle, and verify that most of the
pages are identical.
   
   It is not entirely clear to me how a "content-addressable filesystem" works
(i.e., how does it know where the Parquet pages start and end?), so having
that documented or mocked out would also be nice.
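
   To make the request concrete, here is a rough, std-only sketch of the kind
of mock I have in mind. It is an illustration under stated assumptions, not
the PR's CDC implementation and not the parquet writer API: raw bytes stand in
for Parquet pages, a toy gear hash stands in for whatever rolling hash the PR
uses, the store is assumed to simply hash whatever chunks it is handed, and
`splitmix64`, `cdc_chunks`, `ChunkStore`, and `dedup_demo` are hypothetical
names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Small deterministic PRNG (splitmix64). Used only to build the gear
/// table and the sample data; it is not part of any real CDC API.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Split `data` at content-defined boundaries using a toy gear hash.
/// A cut is made after byte `i` when the top 8 bits of the rolling hash
/// are zero, so chunks average roughly 256 bytes on random input.
fn cdc_chunks(data: &[u8]) -> Vec<&[u8]> {
    let mut seed = 42u64;
    let gear: Vec<u64> = (0..256).map(|_| splitmix64(&mut seed)).collect();
    let mut chunks = Vec::new();
    let (mut start, mut hash) = (0usize, 0u64);
    for (i, &byte) in data.iter().enumerate() {
        // The left shift drops contributions older than 64 bytes, so the
        // hash depends only on a trailing 64-byte window of the input.
        hash = (hash << 1).wrapping_add(gear[byte as usize]);
        if hash >> 56 == 0 {
            chunks.push(&data[start..=i]);
            start = i + 1;
        }
    }
    if start < data.len() {
        chunks.push(&data[start..]); // trailing partial chunk
    }
    chunks
}

/// Mock content-addressed store: a chunk's address is a hash of its
/// bytes. (Assumption: a real store would use a cryptographic digest.)
#[derive(Default)]
struct ChunkStore {
    addresses: HashSet<u64>,
}

impl ChunkStore {
    /// Store every chunk of `data`; returns (chunks seen, chunks newly stored).
    fn put(&mut self, data: &[u8]) -> (usize, usize) {
        let chunks = cdc_chunks(data);
        let mut newly_stored = 0;
        for chunk in &chunks {
            let mut hasher = DefaultHasher::new();
            chunk.hash(&mut hasher);
            if self.addresses.insert(hasher.finish()) {
                newly_stored += 1;
            }
        }
        (chunks.len(), newly_stored)
    }
}

/// "Write" two buffers that differ only in 16 bytes in the middle and
/// report (total chunks, newly stored chunks) for the second one.
fn dedup_demo() -> (usize, usize) {
    let mut seed = 7u64;
    let original: Vec<u8> = (0..64 * 1024)
        .map(|_| splitmix64(&mut seed) as u8)
        .collect();
    let mut edited = original.clone();
    for byte in &mut edited[32_768..32_784] {
        *byte ^= 0xFF; // flip a small run of bytes in the middle
    }
    let mut store = ChunkStore::default();
    store.put(&original);
    store.put(&edited)
}
```

   Calling `dedup_demo()` from `main` or a unit test should show that only a
handful of the edited buffer's chunks are new to the store, because cut points
depend only on nearby bytes and realign shortly after the change. An
end-to-end test for this PR could assert the same property on real Parquet
pages instead of raw bytes.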


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
