[jira] [Updated] (HDDS-12903) Separate OM namespace and blockspace into separate Column Families

Ivan Andika (Jira) Thu, 24 Apr 2025 21:11:08 -0700


     [ https://issues.apache.org/jira/browse/HDDS-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Ivan Andika updated HDDS-12903:
-------------------------------
    Description: 
Currently, when OM lists keys, it needs to create a RocksDB iterator, which 
loads both the RocksDB key and value. 

For a key with a lot of blocks (for example, an MPU key with a few hundred 
parts), this can take a considerable amount of OM heap memory. These blocks are 
loaded into OM memory even when we use listKeysLight, which removes the key 
block info.

One possible way to handle this is to split the keyspace and blockspace into 
two separate column families. The keyspace CF will only store the basic key 
info (similar to BasicOmKeyInfo), while the blockspace CF stores the blocks 
associated with the key. Therefore, during listing, we will only load the 
BasicOmKeyInfo, which will result in lower memory overhead.
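
To make the proposed layout concrete, here is a minimal Python sketch. It is a toy in-memory model, not Ozone or RocksDB code: the two dicts stand in for the two column families, and BasicKeyInfo, BlockList, put_key and list_keys are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BasicKeyInfo:
    """Hypothetical slim record, loosely modeled on BasicOmKeyInfo."""
    name: str
    size: int


@dataclass
class BlockList:
    """Hypothetical record holding the block IDs of one key."""
    block_ids: List[int] = field(default_factory=list)


# Two dicts stand in for the two column families (assumption: same key in both).
keyspace_cf: Dict[str, BasicKeyInfo] = {}
blockspace_cf: Dict[str, BlockList] = {}


def put_key(name: str, size: int, block_ids) -> None:
    """Store the slim record and the block list under the same key."""
    keyspace_cf[name] = BasicKeyInfo(name, size)
    blockspace_cf[name] = BlockList(list(block_ids))


def list_keys(prefix: str) -> List[BasicKeyInfo]:
    """Scan only the keyspace CF in sorted key order; block lists are never read."""
    return [info for key, info in sorted(keyspace_cf.items())
            if key.startswith(prefix)]


# An MPU-like key with 300 blocks next to a small key and an unrelated key.
put_key("/vol/bucket/mpu-key", 3 * 1024, range(300))
put_key("/vol/bucket/small", 10, [1])
put_key("/vol/other/x", 5, [2])

listing = list_keys("/vol/bucket/")
```

In this model the listing returns only BasicKeyInfo records, so the 300 block IDs of the MPU-like key are never touched by the scan.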

The downside is that we now have two CFs to update or query for key get, 
creation, update, and deletion. For get, we can use RocksDB multiget to read 
from both the keyspace and blockspace CFs.
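
The two-CF read and write paths could look roughly like the Python sketch below. It is an in-memory toy, not the RocksDB API: TwoCfStore, write_batch and multi_get are hypothetical stand-ins, and applying both CF updates in one all-or-nothing batch is an assumption of this sketch rather than something the proposal specifies.

```python
from typing import Any, Dict, List, Optional, Tuple

KEYSPACE = "keyspace"      # hypothetical CF names
BLOCKSPACE = "blockspace"


class TwoCfStore:
    """Toy in-memory stand-in for a store with two column families."""

    def __init__(self) -> None:
        self.cfs: Dict[str, Dict[str, Any]] = {KEYSPACE: {}, BLOCKSPACE: {}}

    def write_batch(self, ops: List[Tuple[str, str, Optional[Any]]]) -> None:
        """Apply (cf, key, value) ops all-or-nothing; value None means delete."""
        staged = {cf: dict(rows) for cf, rows in self.cfs.items()}
        for cf, key, value in ops:
            if value is None:
                staged[cf].pop(key, None)
            else:
                staged[cf][key] = value  # raises KeyError for an unknown CF
        self.cfs = staged  # swap in only after every op succeeded

    def multi_get(self, lookups: List[Tuple[str, str]]) -> List[Optional[Any]]:
        """Look up several (cf, key) pairs in one call; missing keys give None."""
        return [self.cfs[cf].get(key) for cf, key in lookups]


def create_key(store: TwoCfStore, name: str, size: int, block_ids) -> None:
    """Creation touches both CFs in a single batch."""
    store.write_batch([
        (KEYSPACE, name, {"name": name, "size": size}),
        (BLOCKSPACE, name, list(block_ids)),
    ])


def get_key(store: TwoCfStore, name: str):
    """Get reads the slim record and the block list with one multi-get."""
    info, blocks = store.multi_get([(KEYSPACE, name), (BLOCKSPACE, name)])
    return info, blocks


def delete_key(store: TwoCfStore, name: str) -> None:
    """Deletion removes the entry from both CFs in a single batch."""
    store.write_batch([(KEYSPACE, name, None), (BLOCKSPACE, name, None)])


store = TwoCfStore()
create_key(store, "/vol/bucket/key1", 42, [101, 102])
```

Batching the two updates keeps the keyspace and blockspace entries from drifting apart if a write fails halfway, which is the main new risk that the split introduces.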

This is inspired by the Tectonic Filesystem's Namespace and Block layers 
(https://www.usenix.org/system/files/fast21-pan.pdf).

  was:
Currently, when OM lists keys, it needs to create a RocksDB iterator, which 
loads both the RocksDB key and value. 

For a key with a lot of blocks (for example, an MPU key with a few hundred 
parts), this can take a considerable amount of OM heap memory. These blocks are 
loaded into OM memory even when we use listKeysLight, which removes the key 
block info.

One possible way to handle this is to split the keyspace and blockspace into 
two separate column families. The keyspace CF will only store the basic key 
info (similar to BasicOmKeyInfo), while the blockspace CF stores the blocks 
associated with the key. Therefore, during listing, we will only load the 
BasicOmKeyInfo, which will result in lower memory overhead.

The downside is that we now have two CFs to update or query for key get, 
creation, update, and deletion. For get, we can use RocksDB multiget to read 
from both the keyspace and blockspace CFs.

This is inspired by Tectonic Filesystem Namespace and Block layer.


> Separate OM namespace and blockspace into separate Column Families
> ------------------------------------------------------------------
>
>                 Key: HDDS-12903
>                 URL: https://issues.apache.org/jira/browse/HDDS-12903
>             Project: Apache Ozone
>          Issue Type: Wish
>          Components: Ozone Manager
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Major
>
> Currently, when OM lists keys, it needs to create a RocksDB iterator, which 
> loads both the RocksDB key and value. 
> For a key with a lot of blocks (for example, an MPU key with a few hundred 
> parts), this can take a considerable amount of OM heap memory. These blocks 
> are loaded into OM memory even when we use listKeysLight, which removes the 
> key block info.
> One possible way to handle this is to split the keyspace and blockspace 
> into two separate column families. The keyspace CF will only store the basic 
> key info (similar to BasicOmKeyInfo), while the blockspace CF stores the 
> blocks associated with the key. Therefore, during listing, we will only load 
> the BasicOmKeyInfo, which will result in lower memory overhead.
> The downside is that we now have two CFs to update or query for key get, 
> creation, update, and deletion. For get, we can use RocksDB multiget to read 
> from both the keyspace and blockspace CFs.
> This is inspired by the Tectonic Filesystem's Namespace and Block layers 
> (https://www.usenix.org/system/files/fast21-pan.pdf).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

