[GitHub] [hudi] bvaradar commented on issue #2338: [SUPPORT] MOR table found duplicate and process so slowly

2021-01-16 Thread GitBox
bvaradar commented on issue #2338: URL: https://github.com/apache/hudi/issues/2338#issuecomment-761532496 cc @nsivabalan Not sure if you saw this blog about index usage: https://hudi.apache.org/blog/hudi-indexing-mechanisms/ The stage names could be misleading. It is likely

[GitHub] [hudi] bvaradar commented on issue #2338: [SUPPORT] MOR table found duplicate and process so slowly

2021-01-09 Thread GitBox
bvaradar commented on issue #2338: URL: https://github.com/apache/hudi/issues/2338#issuecomment-757125096 @so-lazy: When you query through the Spark datasource (not just a single file), are you able to see unique records? val df = spark.read.format("hudi").load("hdfs://hadoop01:9

[GitHub] [hudi] bvaradar commented on issue #2338: [SUPPORT] MOR table found duplicate and process so slowly

2021-01-04 Thread GitBox
bvaradar commented on issue #2338: URL: https://github.com/apache/hudi/issues/2338#issuecomment-754448424 @so-lazy: I think the Hive table may not be a Hudi table. Can you show the output of the following Hive command? desc formatted table Also, can you please attach th

[GitHub] [hudi] bvaradar commented on issue #2338: [SUPPORT] MOR table found duplicate and process so slowly

2020-12-15 Thread GitBox
bvaradar commented on issue #2338: URL: https://github.com/apache/hudi/issues/2338#issuecomment-745713009 @nsivabalan: Can you take a look at this? Thanks, Balaji.V This is an automated message from the Apache Git