[GitHub] [hudi] nsivabalan commented on issue #6591: [SUPPORT]Duplicate records in MOR

2023-04-27 Thread via GitHub


nsivabalan commented on issue #6591:
URL: https://github.com/apache/hudi/issues/6591#issuecomment-1526991688

   It's already fixed with this patch: https://github.com/apache/hudi/pull/8490
   
   ```
   scala> spark.read.format("hudi").load(basePath).show(false)
   
   +-------------------+---------------------+------------------+----------------------+---------------------------------------------------------------------------+----------------+-------+------------+-----------------------+----------+
   |_hoodie_commit_time|_hoodie_commit_seqno |_hoodie_record_key|_hoodie_partition_path|_hoodie_file_name                                                          |game_schedule_id|game_id|game_date_cn|insert_date            |dt        |
   +-------------------+---------------------+------------------+----------------------+---------------------------------------------------------------------------+----------------+-------+------------+-----------------------+----------+
   |20230427215728276  |20230427215728276_0_3|game_schedule_id:5|2022-08-30            |bd4d1121-57bc-4103-91a0-5541a474ef9e-0_0-28-369_20230427215728276.parquet  |5               |10005  |2022-08-31  |2022-08-30 12:00:00.000|2022-08-30|
   |20230427215728276  |20230427215728276_0_4|game_schedule_id:6|2022-08-30            |bd4d1121-57bc-4103-91a0-5541a474ef9e-0_0-28-369_20230427215728276.parquet  |6               |10006  |2022-08-31  |2022-08-30 12:00:00.000|2022-08-30|
   |20230427215728276  |20230427215728276_0_5|game_schedule_id:1|2022-08-30            |bd4d1121-57bc-4103-91a0-5541a474ef9e-0_0-28-369_20230427215728276.parquet  |1               |10001  |2022-08-30  |2022-08-30 12:00:00.000|2022-08-30|
   |20230427215753406  |20230427215753406_1_0|game_schedule_id:2|2022-08-30            |484347af-e681-4b1e-ad99-e7c1cd9adeea-0_1-150-1051_20230427215753406.parquet|2               |10002  |2022-08-31  |2022-08-30 12:00:00.000|2022-08-30|
   |20230427215753406  |20230427215753406_1_1|game_schedule_id:3|2022-08-30            |484347af-e681-4b1e-ad99-e7c1cd9adeea-0_1-150-1051_20230427215753406.parquet|3               |10003  |2022-08-31  |2022-08-30 12:00:00.000|2022-08-30|
   |20230427215753406  |20230427215753406_1_2|game_schedule_id:4|2022-08-30            |484347af-e681-4b1e-ad99-e7c1cd9adeea-0_1-150-1051_20230427215753406.parquet|4               |10004  |2022-08-31  |2022-08-30 12:00:00.000|2022-08-30|
   +-------------------+---------------------+------------------+----------------------+---------------------------------------------------------------------------+----------------+-------+------------+-----------------------+----------+
   
   
   scala> spark.read.format("hudi").load(basePath).count
   res9: Long = 6
   
   ```
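
   The row count alone does not prove that every key is unique, so a per-key check can help confirm the result. The following is a minimal sketch rather than part of the original comment: it assumes the same spark-shell session (`spark` and `basePath` as used above) and relies only on the `_hoodie_record_key` and `_hoodie_partition_path` metadata columns visible in the output. The names `snapshot`, `duplicates`, `occurrences` and `partitions` are illustrative.

   ```scala
   // Sketch: run in the same spark-shell session, where `spark` and `basePath` are already defined.
   import org.apache.spark.sql.functions.{col, collect_set, count}

   // Snapshot read of the table, same as in the comment above.
   val snapshot = spark.read.format("hudi").load(basePath)

   // Group by the Hudi record key and keep only keys that appear more than once,
   // listing the partition paths in which each repeated key was found.
   val duplicates = snapshot
     .groupBy(col("_hoodie_record_key"))
     .agg(
       count("*").as("occurrences"),
       collect_set(col("_hoodie_partition_path")).as("partitions"))
     .filter(col("occurrences") > 1)

   // An empty result means no record key is duplicated in the snapshot view.
   duplicates.show(false)
   ```

   If the table is free of duplicates, `duplicates` should contain no rows; any row that does appear shows which partitions hold copies of the same key.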


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] nsivabalan commented on issue #6591: [SUPPORT]Duplicate records in MOR

2022-09-14 Thread GitBox


nsivabalan commented on issue #6591:
URL: https://github.com/apache/hudi/issues/6591#issuecomment-1247367471

   Yes, I get it.





[GitHub] [hudi] nsivabalan commented on issue #6591: [SUPPORT]Duplicate records in MOR

2022-09-14 Thread GitBox


nsivabalan commented on issue #6591:
URL: https://github.com/apache/hudi/issues/6591#issuecomment-1246366467

   At least I will file a follow-up Jira so that we don't miss this. Thanks for filing the issue.





[GitHub] [hudi] nsivabalan commented on issue #6591: [SUPPORT]Duplicate records in MOR

2022-09-13 Thread GitBox


nsivabalan commented on issue #6591:
URL: https://github.com/apache/hudi/issues/6591#issuecomment-1246261284

   Yes, this is a known limitation we have with MOR tables. Once a compaction kicks in, you may no longer see the record in the older partitions where it was deleted. It will be an issue until then.
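
   Since the stale copy is said to persist until compaction runs, one possible way to shorten that window is to schedule compaction inline with writes. This is a sketch under assumptions, not something recommended in this thread: `hoodie.compact.inline` and `hoodie.compact.inline.max.delta.commits` are Hudi write configs, while the map name `compactionOpts` and the value `1` are chosen purely for illustration, and whether this is acceptable depends on your write latency budget.

   ```scala
   // Sketch: write options that ask Hudi to compact a MOR table as part of the write path.
   // `compactionOpts` is an illustrative name; merge these into your existing writer options.
   val compactionOpts: Map[String, String] = Map(
     // The table discussed in this issue is Merge-On-Read.
     "hoodie.datasource.write.table.type" -> "MERGE_ON_READ",
     // Run compaction inline after a write instead of waiting for a separate job.
     "hoodie.compact.inline" -> "true",
     // Assumed value: compact after every delta commit. A larger number trades
     // a longer window of stale records for less compaction overhead.
     "hoodie.compact.inline.max.delta.commits" -> "1"
   )
   ```

   These options would be passed alongside the usual record key, partition path and table name settings, for example via `.options(compactionOpts)` on the DataFrame writer. Compacting after every delta commit adds work to each write, so the threshold is a trade-off rather than a fixed recommendation.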
   

