(kvrocks-website) branch main updated: Add docs for Timeseries (#324)

twice Mon, 15 Sep 2025 20:37:25 -0700

This is an automated email from the ASF dual-hosted git repository.

twice pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/kvrocks-website.git



The following commit(s) were added to refs/heads/main by this push:
     new ed056cc2 Add docs for Timeseries (#324)
ed056cc2 is described below

commit ed056cc2d7d4f50374cdda94a9178c4c69ff387d
Author: RX Xiao <[email protected]>
AuthorDate: Tue Sep 16 10:01:56 2025 +0800

    Add docs for Timeseries (#324)
---
 community/data-structure-on-rocksdb.md | 85 ++++++++++++++++++++++++++++++++++
 1 file changed, 85 insertions(+)

diff --git a/community/data-structure-on-rocksdb.md b/community/data-structure-on-rocksdb.md
index d7b86f0e..2f195a04 100644
--- a/community/data-structure-on-rocksdb.md
+++ b/community/data-structure-on-rocksdb.md
@@ -453,3 +453,88 @@ During each merge, we will flush the buffer to the centroids and merge the centr
                                        | (8byte) double |
                                        +----------------+
 ```
+
+## TimeSeries
+
+RedisTimeSeries is a Redis module that enables a full-featured time-series database within Redis. To bring this powerful capability to Kvrocks, we've implemented a compatible TimeSeries data structure, leveraging RocksDB for efficient storage and retrieval.
+
+#### TimeSeries metadata
+
+The metadata stores the overall configuration for a single time series, such as retention, duplicate policy and chunk settings.
+
+```text
+        +----------+----------+-----------+------------------+----------------+-----------+-----------+----------------+----------------+-----------+
+key =>  |  flags   |  expire  |  version  | size(chunkCount) | retentionTime  | chunkSize | chunkType | duplicatePolicy| sourceKey_size | sourceKey |
+        | (1byte)  |  (Ebyte) |  (8byte)  |      (Sbyte)     |    (8byte)     |  (8byte)  |  (1byte)  |    (1byte)     |    (4byte)     |  (Xbyte)  |
+        +----------+----------+-----------+------------------+----------------+-----------+-----------+----------------+----------------+-----------+
+```
+- `retentionTime`: Maximum age (in milliseconds) of samples relative to the latest timestamp. A value of `0` disables retention.
+- `chunkSize`: The preferred number of samples per data chunk.
+- `chunkType`: The storage format of the chunk (compressed or uncompressed).
+- `duplicatePolicy`: An enum representing the policy for handling samples with duplicate timestamps (e.g., BLOCK, FIRST, LAST).
+- `sourceKey`: If this series is a downstream target for compaction, this field stores the key of the source series.
+
+#### TimeSeries sub keys-values
+
+Internally, the TimeSeries data structure uses several types of sub-keys to store its components: **time chunks**, **labels**, and **downstream rule metadata**. A **key type** enum is used as a prefix in the key to differentiate between them.
+
+| key type     | enum value |
+| ------------ | ---------- |
+| `CHUNK`      | 0          |
+| `LABEL`      | 1          |
+| `DOWNSTREAM` | 2          |
+
+##### CHUNK sub keys
+The actual time series data is stored in sequential blocks called **chunks**. Each chunk is identified by a `chunk_id`, which corresponds to the timestamp of the first sample within that chunk.
+
+```text
+                              +--------+------------+----------+------------+----------+   +------------+----------+
+key|version|CHUNK|chunk_id => | count  | timestamp1 |  value1  | timestamp2 |  value2  |...| timestampN |  valueN  |
+                              |(8byte) |  (8byte)   |  (8byte) |  (8byte)   |  (8byte) |...|  (8byte)   |  (8byte) |
+                              +--------+------------+----------+------------+----------+   +------------+----------+
+```
+
+##### LABEL sub keys
+These sub keys store label key-value pairs associated with the time series.
+
+```text
+                                   +----------------+
+key|version|LABEL|label_key1 =>    |  label_value1  |
+                                   |    (Xbyte)     |
+                                   +----------------+
+                                   +----------------+
+key|version|LABEL|label_key2 =>    |  label_value2  |
+                                   |    (Xbyte)     |
+                                   +----------------+
+...
+```
+
+##### DOWNSTREAM sub keys
+Kvrocks supports RedisTimeSeries's compaction rules, which automatically aggregate data from a source series into a destination (or downstream) series. This sub-key stores the configuration for each compaction rule applied to the source series.
+
+```text
+                                          +-------------+----------------+-------------+---------------------+------------+
+key|version|DOWNSTREAM|downstream_key =>  |  aggregator |bucket_duration |  alignment  | latest_bucket_index |  auxinfo   |
+                                          |   (1byte)   |   (8byte)      |   (8byte)   |       (8byte)       |  (Xbyte)   |
+                                          +-------------+----------------+-------------+---------------------+------------+
+```
+- `aggregator`, `bucket_duration`, and `alignment` are the parameters defined by the [compaction rule](https://redis.io/docs/latest/commands/ts.createrule/).
+- `latest_bucket_index`: Tracks the index of the latest bucket to optimize ongoing aggregations.
+- `auxinfo`: Auxiliary data to speed up aggregations without re-scanning the entire bucket. Upon appending samples, it is [updated whenever a new chunk is created](https://github.com/apache/kvrocks/blob/b5b419995c8327bd07a6d63090da367d98f59b72/src/types/redis_timeseries.cc#L322) in the source series.
+
+#### Label-based reverse index
+
+To enable fast, label-based queries, a reverse index is maintained. This index is critical for efficiently locating all time series that match a given set of label filters. It is stored in a dedicated `Index` Column Family.
+
+```text
++-------------+-------------+------------+--------------+---------+
+|  namespace  | index type  | label_key  | label_value  |   key   | => null
+|  (1+Xbyte)  |  (1byte)    | (4+Ybyte)  |  (4+Zbyte)   | (Kbyte) |
++-------------+-------------+------------+--------------+---------+
+
+```
+`index type` is an enum that distinguishes between different types of indexes. For TimeSeries, it currently includes the following value:
+
+| index type     | enum value |
+| ------------   | ---------- |
+| `TS_LABEL`     | 0          |
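
Editorial sketch (not part of the commit): the reverse-index key layout documented above can be illustrated with a few lines of Python. This is a minimal sketch, not Kvrocks code. The function names `encode_label_index_key` and `label_scan_prefix` are hypothetical, and two details are assumptions the diff does not state: that the `1+X` and `4+Y`/`4+Z` fields are a length prefix followed by the raw bytes, and that the length prefixes are big-endian.

```python
import struct

TS_LABEL = 0  # index type enum value from the table in the docs


def encode_label_index_key(namespace: bytes, label_key: bytes,
                           label_value: bytes, user_key: bytes) -> bytes:
    """Build namespace | index type | label_key | label_value | key.

    Assumption: length prefixes are big-endian; the docs only give field sizes.
    """
    if len(namespace) > 255:
        raise ValueError("namespace must fit a 1-byte length prefix")
    return b"".join([
        struct.pack(">B", len(namespace)), namespace,      # (1+Xbyte)
        struct.pack(">B", TS_LABEL),                       # (1byte)
        struct.pack(">I", len(label_key)), label_key,      # (4+Ybyte)
        struct.pack(">I", len(label_value)), label_value,  # (4+Zbyte)
        user_key,                                          # (Kbyte), no length prefix
    ])


def label_scan_prefix(namespace: bytes, label_key: bytes,
                      label_value: bytes) -> bytes:
    """Prefix shared by every index entry for one label=value pair."""
    return encode_label_index_key(namespace, label_key, label_value, b"")


key = encode_label_index_key(b"ns", b"region", b"us", b"temp:1")
prefix = label_scan_prefix(b"ns", b"region", b"us")
print(key.startswith(prefix), key[len(prefix):])
```

Because the series key comes last and the mapped value is null, every series carrying a given `label=value` pair shares one key prefix. Under these assumptions, a prefix scan over the `Index` column family would return the matching series keys directly.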
