bvaradar commented on issue #1556:
URL: https://github.com/apache/incubator-hudi/issues/1556#issuecomment-620303776

@HariprasadAllaka1612 I am not sure I completely understand the context here. A few questions about the steps you described:

1. Reading the CDC table from Hive (hoodie table) to get the latest marker: what do you mean by marker? Is it the Hudi commit time, or a timestamped directory that you are using as the input folder?
2. Reading the files from S3 based on the latest marker read in step 1: are you reading the files directly, or by running an incremental query?

In general, this could also be an eventual consistency issue. Does the path s3a://gat-datalake-refined-dev/reports/player/dat/2020/04/23 belong to the CDC table? Does it actually exist when you run aws s3 ls? Did the CDC pipeline run with the consistency guard enabled?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org