[GitHub] [hudi] nsivabalan commented on issue #6591: [SUPPORT]Duplicate records in MOR
nsivabalan commented on issue #6591:
URL: https://github.com/apache/hudi/issues/6591#issuecomment-1526991688

It's already fixed with this patch: https://github.com/apache/hudi/pull/8490

```
scala> spark.read.format("hudi").load(basePath).show(false)
+-------------------+---------------------+------------------+----------------------+---------------------------------------------------------------------------+----------------+-------+------------+-----------------------+----------+
|_hoodie_commit_time|_hoodie_commit_seqno |_hoodie_record_key|_hoodie_partition_path|_hoodie_file_name                                                          |game_schedule_id|game_id|game_date_cn|insert_date            |dt        |
+-------------------+---------------------+------------------+----------------------+---------------------------------------------------------------------------+----------------+-------+------------+-----------------------+----------+
|20230427215728276  |20230427215728276_0_3|game_schedule_id:5|2022-08-30            |bd4d1121-57bc-4103-91a0-5541a474ef9e-0_0-28-369_20230427215728276.parquet  |5               |10005  |2022-08-31  |2022-08-30 12:00:00.000|2022-08-30|
|20230427215728276  |20230427215728276_0_4|game_schedule_id:6|2022-08-30            |bd4d1121-57bc-4103-91a0-5541a474ef9e-0_0-28-369_20230427215728276.parquet  |6               |10006  |2022-08-31  |2022-08-30 12:00:00.000|2022-08-30|
|20230427215728276  |20230427215728276_0_5|game_schedule_id:1|2022-08-30            |bd4d1121-57bc-4103-91a0-5541a474ef9e-0_0-28-369_20230427215728276.parquet  |1               |10001  |2022-08-30  |2022-08-30 12:00:00.000|2022-08-30|
|20230427215753406  |20230427215753406_1_0|game_schedule_id:2|2022-08-30            |484347af-e681-4b1e-ad99-e7c1cd9adeea-0_1-150-1051_20230427215753406.parquet|2               |10002  |2022-08-31  |2022-08-30 12:00:00.000|2022-08-30|
|20230427215753406  |20230427215753406_1_1|game_schedule_id:3|2022-08-30            |484347af-e681-4b1e-ad99-e7c1cd9adeea-0_1-150-1051_20230427215753406.parquet|3               |10003  |2022-08-31  |2022-08-30 12:00:00.000|2022-08-30|
|20230427215753406  |20230427215753406_1_2|game_schedule_id:4|2022-08-30            |484347af-e681-4b1e-ad99-e7c1cd9adeea-0_1-150-1051_20230427215753406.parquet|4               |10004  |2022-08-31  |2022-08-30 12:00:00.000|2022-08-30|
+-------------------+---------------------+------------------+----------------------+---------------------------------------------------------------------------+----------------+-------+------------+-----------------------+----------+

scala> spark.read.format("hudi").load(basePath).count
res9: Long = 6
```

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] nsivabalan commented on issue #6591: [SUPPORT]Duplicate records in MOR
nsivabalan commented on issue #6591:
URL: https://github.com/apache/hudi/issues/6591#issuecomment-1247367471

yes, I get it.
[GitHub] [hudi] nsivabalan commented on issue #6591: [SUPPORT]Duplicate records in MOR
nsivabalan commented on issue #6591:
URL: https://github.com/apache/hudi/issues/6591#issuecomment-1246366467

At least I will file a follow-up JIRA so that we don't miss this. Thanks for filing the issue.
[GitHub] [hudi] nsivabalan commented on issue #6591: [SUPPORT]Duplicate records in MOR
nsivabalan commented on issue #6591:
URL: https://github.com/apache/hudi/issues/6591#issuecomment-1246261284

Yes, this is a known limitation we have with MOR tables. Once a compaction kicks in, you may no longer see the update in the older partitions where it was deleted. Until then, it will remain an issue.
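
Editorial note: to check whether a table currently shows the duplicates discussed in this thread, one option is to group a snapshot read by the Hudi metadata column `_hoodie_record_key` and look for keys that occur more than once. The following is only a minimal sketch for spark-shell, not something taken from the Hudi project; the helper name `duplicateKeys` and the output column names `copies` and `partitions` are hypothetical.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, count, countDistinct}

// Hypothetical helper: returns one row per record key that appears more than
// once in the given snapshot, with the number of copies and the number of
// distinct partition paths those copies live in.
def duplicateKeys(df: DataFrame): DataFrame =
  df.groupBy(col("_hoodie_record_key"))
    .agg(
      count("*").as("copies"),
      countDistinct(col("_hoodie_partition_path")).as("partitions"))
    .filter(col("copies") > 1)
```

In a session like the one quoted in this thread, it could be called as `duplicateKeys(spark.read.format("hudi").load(basePath)).show(false)`, assuming `basePath` points at the table. An empty result means no record key is duplicated in the snapshot.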