bwu2 edited a comment on issue #1328: Hudi upsert hangs
URL: https://github.com/apache/incubator-hudi/issues/1328#issuecomment-586108720

Ok, after reading https://github.com/apache/incubator-hudi/issues/800 (which is similar), I think this is a memory issue, even though GC seems to be a small proportion of the total time according to the Spark history server and we're not getting any OOME. When I double `spark.executor.memory`, the job runs much faster. There are probably other optimizations and tuning options I can consider as well. Is it normal/expected that an upsert with mostly updates would require more memory than an upsert with mostly inserts?
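
For anyone wanting to repeat the experiment, here is a minimal sketch of doubling `spark.executor.memory` and passing it to `spark-submit` via `--conf`. The helper names `double_memory` and `executor_memory_conf` and the starting value `4g` are assumptions for illustration only; the comment above does not say what the job was actually using.

```python
import re


def double_memory(setting):
    """Double a JVM-style memory string such as '4g' or '4096m'."""
    match = re.fullmatch(r"(\d+)([kmgt])b?", setting.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized memory setting: {setting!r}")
    amount, unit = match.groups()
    return f"{int(amount) * 2}{unit}"


def executor_memory_conf(current):
    """Build the spark-submit arguments that apply the doubled value."""
    return ["--conf", f"spark.executor.memory={double_memory(current)}"]


# Hypothetical starting value; not taken from the original report.
print(" ".join(executor_memory_conf("4g")))
```

The printed arguments can be appended to an existing `spark-submit` command. This only changes the executor heap size; whether it is the right fix for a given upsert workload is a separate question.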