Yun,

> Then I would share some experience about tuning RocksDB performance. Since 
> you did not cache index and filter in block cache, it's no worry about the 
> competition between data blocks and index&filter blocks[1]. And to improve 
> the read performance, you should increase your block cache size to 256MB or 
> even 512MB. What's more, writer buffer in rocksDB also acts as a role for 
> reading, from our experience, we use 4 max write buffers and 32MB each, e.g.  
> setMaxWriteBufferNumber(4) and setWriteBufferSize(32*1024*1024)

Thank you very much for the hints. I read that tuning guide and added
some settings, and it's now doing much better. IOPS stays under 300
except when checkpoints are taken, when it spikes to about 1.5k, which
is expected.

For reference, these are the settings I'm using right now. I didn't
bump the block cache size because we have a limited amount of memory
per instance (30GB).

    @Override
    public DBOptions createDBOptions(DBOptions currentOptions) {
        return currentOptions
                .setIncreaseParallelism(4)
                .setMaxBackgroundFlushes(1)
                .setMaxBackgroundCompactions(1)
                .setUseFsync(false)
                .setMaxOpenFiles(-1);
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions) {
        final long blockCacheSize = 64 * 1024 * 1024;
        final long writeBufferSize = 64 * 1024 * 1024;

        return currentOptions
                .setCompressionType(CompressionType.LZ4_COMPRESSION)

                .setCompactionStyle(CompactionStyle.LEVEL)
                .setLevel0FileNumCompactionTrigger(10)
                .setLevel0SlowdownWritesTrigger(20)
                .setLevel0StopWritesTrigger(40)

                .setWriteBufferSize(writeBufferSize) // In-memory memtable size
                .setMaxWriteBufferNumber(5) // Max number of memtables before stalling writes
                .setMinWriteBufferNumberToMerge(2) // Merge two memtables together to reduce duplicate keys

                .setTargetFileSizeBase(writeBufferSize) // Target file size at L1, same as memtable size
                .setMaxBytesForLevelBase(writeBufferSize * 8)

                .setTableFormatConfig(
                        new BlockBasedTableConfig()
                                .setFilter(new BloomFilter())
                                .setBlockCacheSize(blockCacheSize)
                );
    }
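
As a rough sanity check against the memory limit, the sketch below
estimates how much memory these settings can pin per column family. It
assumes each column family gets its own memtables and its own block
cache, and it ignores index and filter blocks and other overhead. The
class and method names are illustrative, and statesPerInstance is a
hypothetical figure, not a number from this setup.

```java
// Back-of-the-envelope memory estimate for the settings above.
// Plain arithmetic only; no RocksDB dependency is needed to run it.
class RocksDbMemoryEstimate {
    static final long MIB = 1024L * 1024L;

    /** Upper bound on memtable memory for one column family. */
    static long maxMemtableBytes(long writeBufferSize, int maxWriteBufferNumber) {
        return writeBufferSize * maxWriteBufferNumber;
    }

    /** Memtable upper bound plus block cache for one column family. */
    static long perColumnFamilyBytes(long writeBufferSize,
                                     int maxWriteBufferNumber,
                                     long blockCacheSize) {
        return maxMemtableBytes(writeBufferSize, maxWriteBufferNumber) + blockCacheSize;
    }

    public static void main(String[] args) {
        long writeBufferSize = 64 * MIB;   // setWriteBufferSize
        int maxWriteBufferNumber = 5;      // setMaxWriteBufferNumber
        long blockCacheSize = 64 * MIB;    // setBlockCacheSize
        int statesPerInstance = 20;        // hypothetical, adjust to your job

        long perCf = perColumnFamilyBytes(writeBufferSize, maxWriteBufferNumber, blockCacheSize);
        System.out.println("Per column family: " + perCf / MIB + " MiB");
        System.out.println("Per instance (" + statesPerInstance + " column families): "
                + perCf * statesPerInstance / MIB + " MiB");
    }
}
```

The memtable term is a worst case that is only reached when all write
buffers are full, so actual usage is normally lower. Even so, it shows
why raising the block cache to 256MB or 512MB per column family adds up
quickly when many column families share one machine.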

Ning
