[
https://issues.apache.org/jira/browse/HDDS-15530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ivan Andika updated HDDS-15530:
-------------------------------
Summary: Write small file directly to RocksDB with deferred writes to disk
(was: Write small file data RocksDB with deferred writes)
> Write small file directly to RocksDB with deferred writes to disk
> -----------------------------------------------------------------
>
> Key: HDDS-15530
> URL: https://issues.apache.org/jira/browse/HDDS-15530
> Project: Apache Ozone
> Issue Type: Improvement
> Reporter: Ivan Andika
> Assignee: Ivan Andika
> Priority: Major
>
> We have previously discussed storing small files in RocksDB.
> The Ceph BlueStore paper
> ([https://pdl.cmu.edu/PDL-FTP/Storage/ceph-exp-sosp19.pdf]) describes a
> similar idea: small writes are first inserted into RocksDB and written to
> disk later as deferred writes:
> {quote}For writes smaller than the minimum allocation size, both data and
> metadata are first inserted to RocksDB as promises of future I/O, and then
> asynchronously written to disk after the transaction commits. This deferred
> write mechanism has two purposes. First, it batches small writes to increase
> efficiency, because new data writes require two I/O operations whereas an
> insert to RocksDB requires one. Second, it optimizes I/O based on the device
> type. 64 KiB (or smaller) overwrites of a large object on an HDD are
> performed asynchronously in place to avoid seeks during reads, whereas
> in-place overwrites only happen for I/O sizes less than 16 KiB on SSDs
> {quote}
> We can consider this approach. Previously, we also considered storing small
> file data inline in the OM DB, but I think we should always store data in
> datanodes rather than in the OM DB: with billions of small files, inline data
> could quickly overwhelm the OM DB.
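
To make the deferred-write idea quoted above concrete, here is a minimal Java
sketch. It is not Ozone or BlueStore code: the class name, the method names,
and the 64 KiB threshold are all hypothetical, and plain in-memory maps stand
in for RocksDB and for on-disk block files. A real implementation would flush
asynchronously after the transaction commits; the sketch uses an explicit
flush call to keep the control flow visible.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of deferred small writes; no name here exists in Ozone. */
class DeferredWriteSketch {
    // Assumed threshold standing in for the "minimum allocation size".
    static final int MIN_ALLOC_SIZE = 64 * 1024;

    // Stand-in for RocksDB: pending small writes kept as key-value pairs.
    private final Map<String, byte[]> kvStore = new TreeMap<>();
    // Stand-in for block files on the datanode disk.
    private final Map<String, byte[]> disk = new TreeMap<>();

    /** Returns true if the write was deferred (stored in the KV store). */
    boolean write(String key, byte[] data) {
        if (data.length < MIN_ALLOC_SIZE) {
            kvStore.put(key, data.clone()); // one KV insert; disk I/O comes later
            return true;
        }
        kvStore.remove(key);                // drop any stale pending small write
        disk.put(key, data.clone());        // large write goes straight to disk
        return false;
    }

    /** Reads check pending deferred writes first, then fall back to disk. */
    byte[] read(String key) {
        byte[] pending = kvStore.get(key);
        return pending != null ? pending : disk.get(key);
    }

    /** Moves all pending writes to disk in one batch; returns how many moved. */
    int flushDeferred() {
        int flushed = kvStore.size();
        disk.putAll(kvStore);
        kvStore.clear();
        return flushed;
    }

    int pendingCount() { return kvStore.size(); }

    int diskCount() { return disk.size(); }

    public static void main(String[] args) {
        DeferredWriteSketch store = new DeferredWriteSketch();
        store.write("small-1", new byte[4 * 1024]);
        store.write("small-2", new byte[8 * 1024]);
        store.write("large-1", new byte[1024 * 1024]);
        System.out.println("pending before flush: " + store.pendingCount());
        System.out.println("flushed: " + store.flushDeferred());
        System.out.println("on disk after flush: " + store.diskCount());
    }
}
```

The point of the sketch is the batching: many small writes each cost one
key-value insert up front, and the disk sees them together at flush time.
Where the threshold sits, and whether it should differ between HDD and SSD as
in the quoted passage, is left open here.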
--
This message was sent by Atlassian Jira
(v8.20.10#820010)