[ https://issues.apache.org/jira/browse/HDDS-15530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ivan Andika updated HDDS-15530:
-------------------------------
Description:
We have previously discussed storing small files in RocksDB.
The Ceph BlueStore paper
([https://pdl.cmu.edu/PDL-FTP/Storage/ceph-exp-sosp19.pdf] ) describes a
related idea: small writes go to RocksDB first and are written to disk later
as deferred writes.
{quote}For writes smaller than the minimum allocation size, both data and
metadata are first inserted to RocksDB as promises of future I/O, and then
asynchronously written to disk after the transaction commits. This deferred
write mechanism has two purposes. First, it batches small writes to increase
efficiency, because new data writes require two I/O operations whereas an
insert to RocksDB requires one. Second, it optimizes I/O based on the device
type. 64 KiB (or smaller) overwrites of a large object on an HDD are performed
asynchronously in place to avoid seeks during reads, whereas in-place
overwrites only happen for I/O sizes less than 16 KiB on SSDs
{quote}
We can consider a similar deferred-write approach for small files in Ozone.
was:
We have previously discussed of storing small files in the RocksDB.
When reading the Ceph BlueStore paper
([https://pdl.cmu.edu/PDL-FTP/Storage/ceph-exp-sosp19.pdf] ), there is an idea
of writing small files to RocksDB as well with deferred writes to disk
{quote}For writes smaller than the minimum allocation size, both data and
metadata are first inserted to RocksDB as promises of future I/O, and then
asynchronously written to disk after the transaction commits. This deferred
write mechanism has two purposes. First, it batches small writes to increase
efficiency, because new data writes require two I/O operations whereas an
insert to RocksDB requires one. Second, it optimizes I/O based on the device
type. 64 KiB (or smaller) overwrites of a large object on an HDD are performed
asynchronously in place to avoid seeks during reads, whereas in-place over
{quote}
We can consider this.
> Write small file data to RocksDB with deferred writes
> -----------------------------------------------------
>
> Key: HDDS-15530
> URL: https://issues.apache.org/jira/browse/HDDS-15530
> Project: Apache Ozone
> Issue Type: Improvement
> Reporter: Ivan Andika
> Assignee: Ivan Andika
> Priority: Major
>
> We have previously discussed storing small files in RocksDB.
> The Ceph BlueStore paper
> ([https://pdl.cmu.edu/PDL-FTP/Storage/ceph-exp-sosp19.pdf] ) describes a
> related idea: small writes go to RocksDB first and are written to disk later
> as deferred writes.
> {quote}For writes smaller than the minimum allocation size, both data and
> metadata are first inserted to RocksDB as promises of future I/O, and then
> asynchronously written to disk after the transaction commits. This deferred
> write mechanism has two purposes. First, it batches small writes to increase
> efficiency, because new data writes require two I/O operations whereas an
> insert to RocksDB requires one. Second, it optimizes I/O based on the device
> type. 64 KiB (or smaller) overwrites of a large object on an HDD are
> performed asynchronously in place to avoid seeks during reads, whereas
> in-place overwrites only happen for I/O sizes less than 16 KiB on SSDs
> {quote}
> We can consider a similar deferred-write approach for small files in Ozone.
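
To make the deferred-write mechanism in the quoted passage concrete, here is a minimal sketch in Python. It is a toy model under stated assumptions, not Ozone or BlueStore code: the `DeferredWriteStore` class, its method names, and the 64 KiB `MIN_ALLOC_SIZE` threshold are all hypothetical, a plain dict stands in for RocksDB, and another dict stands in for the disk. The flush is an explicit call here, whereas a real system would run it asynchronously after the transaction commits.

```python
from dataclasses import dataclass, field

# Hypothetical threshold for illustration; a real value would depend on the device.
MIN_ALLOC_SIZE = 64 * 1024


@dataclass
class DeferredWriteStore:
    """Toy model: `kv` stands in for RocksDB, `disk` for the block device."""

    min_alloc_size: int = MIN_ALLOC_SIZE
    kv: dict = field(default_factory=dict)
    disk: dict = field(default_factory=dict)
    disk_writes: int = 0  # number of writes issued to the simulated device

    def write(self, key: str, data: bytes) -> None:
        if len(data) < self.min_alloc_size:
            # Small write: data and metadata are inserted into the key-value
            # store together, as a promise of future disk I/O.
            self.kv[("deferred", key)] = data
            self.kv[("meta", key)] = {"size": len(data), "on_disk": False}
        else:
            # Large write: data goes to disk, then metadata is committed.
            # Any stale deferred payload for the same key is dropped.
            self.kv.pop(("deferred", key), None)
            self._write_disk(key, data)
            self.kv[("meta", key)] = {"size": len(data), "on_disk": True}

    def flush_deferred(self) -> int:
        """Move all deferred payloads to disk; return how many were flushed."""
        pending = [k for k in self.kv if k[0] == "deferred"]
        for _, key in pending:
            data = self.kv.pop(("deferred", key))
            self._write_disk(key, data)
            self.kv[("meta", key)]["on_disk"] = True
        return len(pending)

    def read(self, key: str) -> bytes:
        # Until flushed, deferred data must be served from the key-value store.
        deferred = self.kv.get(("deferred", key))
        return deferred if deferred is not None else self.disk[key]

    def _write_disk(self, key: str, data: bytes) -> None:
        self.disk[key] = data
        self.disk_writes += 1
```

In this model a small write touches only the key-value store, so several small writes can be batched into one later pass over the device by `flush_deferred()`. Reads stay correct before the flush because `read()` checks the deferred entry first.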