Hi, I am using Spark with Iceberg to update a table that has 1700 columns and 16 buckets. We are loading 0.6 million rows from Parquet files (in the future this will grow to 16 million rows). We use Spark's default partitioner and do not repartition the dataset on the bucketing column. When we use Iceberg's MERGE INTO strategy, one of the executors fails with an OutOfMemoryError, recovers, and then fails again. The statement looks like:

MERGE INTO target t
USING (SELECT * FROM source) s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET ...
WHEN NOT MATCHED THEN INSERT ...
A blind append, on the other hand, works fine.

Questions:
1. How can we find out what the issue is? We run Spark on an EKS cluster, and when an executor dies with an OOME its logs are gone as well, so we are unable to inspect them.
2. Do we need to repartition the dataset on the bucketing column? If so, should that happen at load time or after the data is loaded? Need help to understand.
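For context on the repartitioning question: a bucketed Iceberg table assigns each row to one of the 16 buckets by hashing the bucket column (Iceberg's real bucket transform is Murmur3-based; the sketch below uses a generic hash purely to illustrate the idea, and the `NUM_BUCKETS` value and toy `rows` data are assumptions). Repartitioning the source on that same column before the MERGE means each Spark partition holds rows destined for a single bucket, instead of every task buffering rows for all 16 buckets at once:

```python
# Illustrative sketch: how hash-bucketing groups rows. This is NOT
# Iceberg's actual Murmur3 bucket transform -- just a generic hash
# to show why repartitioning on the bucket column co-locates rows.

NUM_BUCKETS = 16  # matches the 16 buckets on the target table

def bucket_of(row_id: int) -> int:
    """Assign a row to a bucket by hashing its id (illustrative hash)."""
    return hash(row_id) % NUM_BUCKETS

# Toy source rows (hypothetical ids standing in for the real dataset).
rows = list(range(100))

# Without repartitioning, any Spark partition may contain rows for all
# 16 buckets. Grouping rows by bucket_of(id) -- what a repartition on
# the id column does -- puts all rows for one bucket in one partition:
partitions: dict[int, list[int]] = {}
for row_id in rows:
    partitions.setdefault(bucket_of(row_id), []).append(row_id)

# Every row in a given partition now maps to the same bucket.
assert all(
    len({bucket_of(r) for r in part}) == 1 for part in partitions.values()
)
print(f"{len(partitions)} partitions, at most one bucket each")
```

In Spark terms this would correspond to something like `source_df.repartition(16, "id")` before the MERGE; whether it is needed in practice also depends on the table's write distribution settings (e.g. Iceberg's `write.distribution-mode` property), which may already request a hash distribution for writes.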