Hi, I'm benchmarking a job with large state across various window sizes (hourly, daily). I noticed that it consistently slows down about 30 minutes into the benchmark due to high disk read IOPS. The first 30 minutes are fine, with close to 0 disk read IOPS; after that, read IOPS gradually climb to as high as 10k/s. At that point the job is bottlenecked on disk IOPS, since I'm using a 2TB EBS-backed volume.
Another thread on the mailing list suggested that running out of EBS burst IOPS credits could cause this kind of slowdown. That's not the case here, since the volume is 2TB and its baseline IOPS is above the range where burst credits apply. Someone also mentioned that RocksDB compaction can increase read IOPS significantly. I'm currently running the job with these RocksDB settings:

    @Override
    public DBOptions createDBOptions(DBOptions currentOptions) {
        return currentOptions
            .setIncreaseParallelism(4)
            .setUseFsync(false)
            .setMaxOpenFiles(-1);
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions) {
        final long blockCacheSize = 64 * 1024 * 1024;  // 64 MB block cache
        return currentOptions
            .setTableFormatConfig(
                new BlockBasedTableConfig()
                    .setBlockCacheSize(blockCacheSize));
    }

Any insights into how I can diagnose this further? Is there any way to see compaction stats, or any settings I should try?

Thanks,
Ning
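
P.S. One thing I plan to try for getting compaction stats: a sketch, assuming the setStatsDumpPeriodSec setter on RocksJava's DBOptions is available in the RocksDB build bundled with Flink, which should periodically dump statistics (including compaction stats) into the RocksDB LOG file:

    @Override
    public DBOptions createDBOptions(DBOptions currentOptions) {
        return currentOptions
            .setIncreaseParallelism(4)
            .setUseFsync(false)
            .setMaxOpenFiles(-1)
            // Dump DB statistics (compaction, cache, stall counters)
            // into the RocksDB LOG file every 5 minutes.
            .setStatsDumpPeriodSec(300);
    }

If that works, I should be able to correlate the read-IOPS climb after 30 minutes with compaction activity in the dumped stats.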