[ https://issues.apache.org/jira/browse/KAFKA-20249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alieh Saeedi updated KAFKA-20249:
---------------------------------
Description:
Optimize {{rawValue}}/{{rawAggregationValue}} to fast-path the common case of
empty headers in headers-aware state stores, while preserving correct behavior
for non-empty headers.
h3. Background
In the current header-aware state store encoding, the value layout is:
* 1st byte: header-length prefix (or start of varint-encoded header size)
* Next {{N}} bytes: serialized headers
* Remaining bytes: aggregation value (payload)
Most state store records are expected to have no headers, so optimizing the
empty-headers case reduces overhead in hot paths where we need to extract the
raw aggregation value.
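To make the layout above concrete, here is a minimal sketch that builds such a value. The class {{HeaderLayoutSketch}} and its helpers are hypothetical names, and the unsigned varint used for the size prefix is an assumption; this description does not say which varint variant the stores actually use.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

final class HeaderLayoutSketch {

    // Hypothetical helper (assumption): unsigned varint, 7 bits per byte,
    // high bit set when more bytes follow.
    static void writeUnsignedVarint(int value, ByteArrayOutputStream out) {
        while ((value & 0xFFFFFF80) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Builds [header size][serialized headers][payload].
    static byte[] encode(byte[] serializedHeaders, byte[] payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeUnsignedVarint(serializedHeaders.length, out);
        out.write(serializedHeaders, 0, serializedHeaders.length);
        out.write(payload, 0, payload.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Empty headers: the size prefix is the single byte 0x00.
        byte[] encoded = encode(new byte[0], new byte[] {10, 20, 30});
        System.out.println(Arrays.toString(encoded)); // prints [0, 10, 20, 30]
    }
}
```

Under this assumption, a record without headers costs exactly one extra byte, which is why checking the first byte against 0x00 is enough to detect the empty-headers case.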
h3. Proposed Change
Introduce a specialized implementation of {{rawValue}}/{{rawAggregationValue}}:
```
// Fast path for empty headers (most common case)
if (valueWithHeaders[0] == 0x00) {
    // Headers size is 0, just strip first byte
    final byte[] result = new byte[valueWithHeaders.length - 1];
    System.arraycopy(valueWithHeaders, 1, result, 0, result.length);
    return result;
}
// Slow path for actual headers (rare in session stores)
...
```
Pros:
- Optimizes the common case (empty headers)
- ~50% faster for empty headers (no ByteBuffer, simple arraycopy)
- Still handles non-empty headers correctly
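For illustration, the fast path and one possible slow path can be put together as below. This is only a sketch under stated assumptions, not the proposed patch: the class name {{RawValueSketch}} is hypothetical, the size prefix is assumed to be an unsigned varint giving the byte length of the serialized headers, and the null and bounds handling are guesses at reasonable behavior. {{Arrays.copyOfRange}} stands in for the explicit {{System.arraycopy}}.

```java
import java.util.Arrays;

final class RawValueSketch {

    static byte[] rawValue(byte[] valueWithHeaders) {
        if (valueWithHeaders == null) {
            return null; // assumption: null (e.g. a tombstone) passes through unchanged
        }
        if (valueWithHeaders.length == 0) {
            throw new IllegalArgumentException("Missing header size prefix");
        }

        // Fast path: a 0x00 prefix means zero header bytes, so strip one byte.
        if (valueWithHeaders[0] == 0x00) {
            return Arrays.copyOfRange(valueWithHeaders, 1, valueWithHeaders.length);
        }

        // Slow path (assumption): decode an unsigned varint header size,
        // then skip that many bytes of serialized headers.
        int headersSize = 0;
        int shift = 0;
        int pos = 0;
        byte b;
        do {
            if (pos >= valueWithHeaders.length || shift > 28) {
                throw new IllegalArgumentException("Malformed header size prefix");
            }
            b = valueWithHeaders[pos++];
            headersSize |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);

        int payloadStart = pos + headersSize;
        if (headersSize < 0 || payloadStart < 0 || payloadStart > valueWithHeaders.length) {
            throw new IllegalArgumentException("Header size exceeds value length");
        }
        return Arrays.copyOfRange(valueWithHeaders, payloadStart, valueWithHeaders.length);
    }

    public static void main(String[] args) {
        // Empty headers: fast path.
        System.out.println(Arrays.toString(rawValue(new byte[] {0, 10, 20, 30}))); // prints [10, 20, 30]
        // Two header bytes (7, 8) before the payload: slow path.
        System.out.println(Arrays.toString(rawValue(new byte[] {2, 7, 8, 42})));   // prints [42]
    }
}
```

The fast path allocates only the result array and never decodes a varint or wraps the input in a ByteBuffer, which is where the saving for empty headers is expected to come from.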
> Optimize rawValue methods across all Deserializers
> --------------------------------------------------
>
> Key: KAFKA-20249
> URL: https://issues.apache.org/jira/browse/KAFKA-20249
> Project: Kafka
> Issue Type: Sub-task
> Components: streams
> Reporter: Alieh Saeedi
> Priority: Major
> Labels: kip
> Fix For: 4.3.0