[GitHub] [incubator-hudi] vinothchandar commented on issue #1328: Hudi upsert hangs

2020-03-03 Thread GitBox
vinothchandar commented on issue #1328: Hudi upsert hangs URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-594114291 Fix landed on master

[GitHub] [incubator-hudi] vinothchandar commented on issue #1328: Hudi upsert hangs

2020-02-20 Thread GitBox
vinothchandar commented on issue #1328: Hudi upsert hangs URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-589452198 @bwu2 Got it.. I think the root issue is that the map is spilling more than needed. I am trying to understand why.. Will update the JIRA as I uncover s

[GitHub] [incubator-hudi] vinothchandar commented on issue #1328: Hudi upsert hangs

2020-02-20 Thread GitBox
vinothchandar commented on issue #1328: Hudi upsert hangs URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-589195840 Filed https://issues.apache.org/jira/browse/HUDI-625 to look into this scenario.. @bwu2 In the meantime, could you run your benchmark again

[GitHub] [incubator-hudi] vinothchandar commented on issue #1328: Hudi upsert hangs

2020-02-20 Thread GitBox
vinothchandar commented on issue #1328: Hudi upsert hangs URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-589152895 @lamber-ken is right.. I am looking into why the DiskBasedMap is so slow (there was a recent change.. wondering if it's a regression.. ) Will raise a JI

[GitHub] [incubator-hudi] vinothchandar commented on issue #1328: Hudi upsert hangs

2020-02-19 Thread GitBox
vinothchandar commented on issue #1328: Hudi upsert hangs URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-588341138 I ported your code to Scala and am looking into the issue now.. Will keep you posted. ``` val HUDI_FORMAT = "org.apache.hudi" val TABLE_NAME

[GitHub] [incubator-hudi] vinothchandar commented on issue #1328: Hudi upsert hangs

2020-02-18 Thread GitBox
vinothchandar commented on issue #1328: Hudi upsert hangs URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-587663160 Started on this.. Was trying to port to Scala, since I am not super familiar with PySpark. Will resume today and circle back. :)

[GitHub] [incubator-hudi] vinothchandar commented on issue #1328: Hudi upsert hangs

2020-02-14 Thread GitBox
vinothchandar commented on issue #1328: Hudi upsert hangs URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-586451527 Even #800 is a reasonable workload.. I don't understand what's going on here .. It's just a single file being versioned.. same as the next two commits, w

[GitHub] [incubator-hudi] vinothchandar commented on issue #1328: Hudi upsert hangs

2020-02-13 Thread GitBox
vinothchandar commented on issue #1328: Hudi upsert hangs URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-585991857 There must be something else going on.. just used my own benchmark jobs to generate a pattern where the records are fully overwritten in a second (and a

[GitHub] [incubator-hudi] vinothchandar commented on issue #1328: Hudi upsert hangs

2020-02-12 Thread GitBox
vinothchandar commented on issue #1328: Hudi upsert hangs URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-585503340 Reposting my response here.. There seem to be a lot of common concerns here.. https://cwiki.apache.org/confluence/display/HUDI/Tuning+Guide is