Hi,

I'm benchmarking a job with large state across various window sizes
(hourly, daily). I noticed that it would consistently slow down about
30 minutes into the benchmark due to high disk read IOPS. The first 30
minutes were fine, with close to zero disk IOPS; after that, read IOPS
would gradually climb as high as 10k/s. At that point the job was
bottlenecked on disk IOPS, since I'm running on a 2TB EBS-backed
volume.

Another thread on the mailing list suggested that running out of
burst IOPS credits could be the cause of a slowdown like this. That's
not the case here, because I'm using a 2TB EBS volume.
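(For reference: gp2 volumes get a baseline of 3 IOPS per GB and only
fall back on burst credits when that baseline is below the 3,000 IOPS
burst rate, so a 2TB volume should never touch burst credits at all. A
quick sanity check of the arithmetic:)

```java
public class Gp2Baseline {
    public static void main(String[] args) {
        final int iopsPerGb = 3;    // gp2 baseline: 3 IOPS per GB
        final int volumeGb = 2000;  // 2TB volume
        final int burstRate = 3000; // gp2 burst rate for small volumes

        int baselineIops = iopsPerGb * volumeGb; // sustained IOPS
        System.out.println("baseline=" + baselineIops);
        System.out.println("exceedsBurst=" + (baselineIops > burstRate));
    }
}
```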

Someone also mentioned that RocksDB compaction can drive read IOPS up
significantly, since compaction re-reads existing SST files in order
to merge them.

I'm currently running the job with the following RocksDB settings:

@Override
public DBOptions createDBOptions(DBOptions currentOptions) {
    return currentOptions
        .setIncreaseParallelism(4)   // background flush/compaction threads
        .setUseFsync(false)          // use fdatasync rather than fsync
        .setMaxOpenFiles(-1);        // keep all SST files open
}

@Override
public ColumnFamilyOptions createColumnOptions(
        ColumnFamilyOptions currentOptions) {
    final long blockCacheSize = 64 * 1024 * 1024; // 64MB block cache
    return currentOptions
        .setTableFormatConfig(
            new BlockBasedTableConfig()
                .setBlockCacheSize(blockCacheSize)
        );
}
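One thing I'm considering (untested) is turning on RocksDB's built-in
statistics and the periodic stats dump, so that compaction activity
shows up in the RocksDB LOG file. A sketch, assuming the RocksJava
version bundled with Flink exposes the Statistics API:

```java
import org.rocksdb.DBOptions;
import org.rocksdb.Statistics;

@Override
public DBOptions createDBOptions(DBOptions currentOptions) {
    // Assumption: setStatistics / setStatsDumpPeriodSec are available
    // in the bundled RocksJava; stats then land in the RocksDB LOG.
    return currentOptions
        .setIncreaseParallelism(4)
        .setUseFsync(false)
        .setMaxOpenFiles(-1)
        .setStatistics(new Statistics()) // collect ticker/histogram stats
        .setStatsDumpPeriodSec(600);     // dump stats to LOG every 10 min
}
```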

Any insights into how I can further diagnose this? Is there any way to
see compaction stats, or any settings I should try?

Thanks,

Ning
