vinothchandar commented on issue #1328: Hudi upsert hangs
URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-594114291
Fix landed on master
vinothchandar commented on issue #1328: Hudi upsert hangs
URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-589452198
@bwu2 Got it.. I think the root issue is that the map is spilling more than
needed. I am trying to understand why.. Will update the JIRA as I uncover
s
vinothchandar commented on issue #1328: Hudi upsert hangs
URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-589195840
Filed https://issues.apache.org/jira/browse/HUDI-625 to look into this
scenario..
@bwu2 In the meantime, could you run your benchmark again
vinothchandar commented on issue #1328: Hudi upsert hangs
URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-589152895
@lamber-ken is right.. I am looking into why the DiskBasedMap is so slow
(there was a recent change.. wondering if it's a regression..) Will raise a
JI
vinothchandar commented on issue #1328: Hudi upsert hangs
URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-588341138
I ported your code to Scala and am looking into the issue now.. Will keep you
posted.
```
val HUDI_FORMAT = "org.apache.hudi"
val TABLE_NAME
vinothchandar commented on issue #1328: Hudi upsert hangs
URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-587663160
Started on this.. Was trying to port to Scala, since I am not super familiar
with PySpark. Will resume today and circle back. :)
vinothchandar commented on issue #1328: Hudi upsert hangs
URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-586451527
Even #800 is a reasonable workload.. I don't understand what's going on
here.. It's just a single file being versioned.. same as the next two commits, w
vinothchandar commented on issue #1328: Hudi upsert hangs
URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-585991857
There must be something else going on.. I just used my own benchmark jobs to
generate a pattern where the records are fully overwritten in a second (and a
vinothchandar commented on issue #1328: Hudi upsert hangs
URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-585503340
Reposting my response here..
There seem to be a lot of common concerns here..
https://cwiki.apache.org/confluence/display/HUDI/Tuning+Guide is