We have encountered some special OOM cases with "cogroup" when the data in one partition is unbalanced:

1. The estimated size of used memory is inaccurate when some special keys have too many values. Because SizeEstimator.visitArray samples at most 100 cells of an array, it may miss most of these special keys and produce a wrong estimate of the memory used. In our case, it reported that a CompactBuffer was only 27M, but it was actually more than 5G. Since the estimate is far too low, the CompactBuffer is never spilled, which causes the OOM.

2. There are too many values for a special key, and these values cannot fit into memory. Spilling data to disk does not help, because cogroup needs to read all values for a key into memory.

Any suggestions for solving these OOM cases?

Thank you.

Best Regards,
Shixiong Zhu
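
P.S. To make the first case concrete, here is a minimal sketch of how a sampling-based estimator can go wrong on skewed data. It is a simplified model written for illustration, not Spark's actual SizeEstimator code: the function names, the per-element byte sizes, and the extrapolation formula are all assumptions. The only detail taken from the description above is the cap of 100 sampled cells.

```python
import random

# Assumed cap, mirroring the "at most 100 cells" described above.
SAMPLE_LIMIT = 100


def estimate_total_size(element_sizes, rng, sample_limit=SAMPLE_LIMIT):
    """Estimate the total size of an array from a bounded random sample.

    element_sizes is a hypothetical list of per-element sizes in bytes.
    Small arrays are measured exactly; larger ones are sampled and the
    sampled total is scaled up to the full length.
    """
    n = len(element_sizes)
    if n <= sample_limit:
        return sum(element_sizes)
    sampled = rng.sample(element_sizes, sample_limit)
    return sum(sampled) * n // sample_limit


def skewed_buffer(num_small, small_size, huge_size):
    """Build a hypothetical buffer: many small elements plus one huge one."""
    return [small_size] * num_small + [huge_size]


# One 5 GiB element hidden among a million 32-byte elements.
sizes = skewed_buffer(num_small=1_000_000, small_size=32, huge_size=5 * 1024**3)
actual = sum(sizes)
estimate = estimate_total_size(sizes, random.Random(0))

print(f"actual bytes:    {actual}")
print(f"estimated bytes: {estimate}")
```

With 100 samples out of about a million elements, the huge element is almost always missed, so the estimate reflects only the small elements. In the rare run where it is sampled, the estimate is far too high instead. Either way, a decision to spill that depends on this number is unreliable when the data is this skewed.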