PhatakN1 commented on issue #1549: URL: https://github.com/apache/incubator-hudi/issues/1549#issuecomment-618292312
If MOR inserts go to a parquet file but updates go to a log file, then a query on the _ro table will show the inserts since the last compaction but not the updates. Isn't that like providing an inconsistent state of the data? So I still see all inserts since the last compaction but none of the updates?

These are the contents of the log file, using show logfile records in hudi-cli:

{"_hoodie_commit_time": "20200422083923", "_hoodie_commit_seqno": "20200422083923_1_2", "_hoodie_record_key": "11", "_hoodie_partition_path": "2019-03-14", "_hoodie_file_name": "c9df1d00-5dda-4bf7-8f27-1d4534bbbe4c-0", "dms_received_ts": "2020-04-22T08:38:36.873970Z", "tran_id": 11, "tran_date": "2019-03-14", "store_id": 5, "store_city": "CHICAGO", "store_state": "IL", "item_code": "XXXXXX", "quantity": 15, "total": 106.25, "Op": "D"}

This is the log file metadata:

║ 20200422083923 │ 1 │ AVRO_DATA_BLOCK │ {"SCHEMA":"{\"type\":\"record\",\"name\":\"retail_transactions\",\"fields\":[{\"name\":\"_hoodie_commit_time\",\"type\":[\"null\",\"string\"],\"doc\":\"\",\"default\":null},{\"name\":\"_hoodie_commit_seqno\",\"type\":[\"null\",\"string\"],\"doc\":\"\",\"default\":null},{\"name\":\"_hoodie_record_key\",\"type\":[\"null\",\"string\"],\"doc\":\"\",\"default\":null},{\"name\":\"_hoodie_partition_path\",\"type\":[\"null\",\"string\"],\"doc\":\"\",\"default\":null},{\"name\":\"_hoodie_file_name\",\"type\":[\"null\",\"string\"],\"doc\":\"\",\"default\":null},{\"name\":\"dms_received_ts\",\"type\":\"string\"},{\"name\":\"tran_id\",\"type\":\"int\"},{\"name\":\"tran_date\",\"type\":\"string\"},{\"name\":\"store_id\",\"type\":\"int\"},{\"name\":\"store_city\",\"type\":\"string\"},{\"name\":\"store_state\",\"type\":\"string\"},{\"name\":\"item_code\",\"type\":\"string\"},{\"name\":\"quantity\",\"type\":\"int\"},{\"name\":\"total\",\"type\":\"float\"},{\"name\":\"Op\",\"type\":\"string\"}]}","INSTANT_TIME":"20200422083923"} │ {} ║

The name of the parquet file in the partition is
c9df1d00-5dda-4bf7-8f27-1d4534bbbe4c-0_3-23-40_20200422072539.parquet

and the log file name is

.c9df1d00-5dda-4bf7-8f27-1d4534bbbe4c-0_20200422072539.log.1_1-24-33

The partition metadata contents are:

commitTime=20200422072539
partitionDepth=1

Not sure why a query on the _rt table does not reflect the delete.

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org
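
One way to sanity-check the base file and log file names quoted in this comment is to split them into their parts and confirm that the log file is attached to the same file group and base instant as the parquet file. The sketch below is only illustrative: the two name layouts are assumptions inferred from the names themselves, and the helper names (parse_base_file, parse_log_file, same_file_slice) are hypothetical, not part of Hudi or hudi-cli.

```python
import re

# Assumed layouts, inferred from the two names quoted in the comment:
#   base file: <fileId>_<writeToken>_<instantTime>.parquet
#   log file:  .<fileId>_<baseInstantTime>.log.<version>_<writeToken>
BASE_RE = re.compile(
    r"^(?P<file_id>.+)_(?P<write_token>\d+-\d+-\d+)_(?P<instant>\d{14})\.parquet$"
)
LOG_RE = re.compile(
    r"^\.(?P<file_id>.+)_(?P<instant>\d{14})\.log\.(?P<version>\d+)"
    r"_(?P<write_token>\d+-\d+-\d+)$"
)


def parse_base_file(name):
    """Split a base (parquet) file name into its assumed components."""
    match = BASE_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised base file name: {name}")
    return match.groupdict()


def parse_log_file(name):
    """Split a log file name into its assumed components."""
    match = LOG_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised log file name: {name}")
    return match.groupdict()


def same_file_slice(base_name, log_name):
    """True if the log file shares the base file's file id and base instant."""
    base = parse_base_file(base_name)
    log = parse_log_file(log_name)
    return base["file_id"] == log["file_id"] and base["instant"] == log["instant"]


if __name__ == "__main__":
    base_name = "c9df1d00-5dda-4bf7-8f27-1d4534bbbe4c-0_3-23-40_20200422072539.parquet"
    log_name = ".c9df1d00-5dda-4bf7-8f27-1d4534bbbe4c-0_20200422072539.log.1_1-24-33"
    record_commit_time = "20200422083923"  # _hoodie_commit_time of the log record

    print(parse_base_file(base_name))
    print(parse_log_file(log_name))
    print("same file slice:", same_file_slice(base_name, log_name))
    # Instant times are fixed-width digit strings, so string comparison orders them.
    print("log record is newer than base file:",
          record_commit_time > parse_base_file(base_name)["instant"])
```

Under these assumptions both names carry the file id c9df1d00-5dda-4bf7-8f27-1d4534bbbe4c-0 (matching the record's _hoodie_file_name) and the base instant 20200422072539, and the logged record's commit time 20200422083923 is later than that instant. This only confirms that the log file belongs to the base file; it does not by itself explain how the _rt query treats the record.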