[ https://issues.apache.org/jira/browse/FLINK-38137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18010347#comment-18010347 ]

Rui Fan edited comment on FLINK-38137 at 7/30/25 8:58 AM:
----------------------------------------------------------

Merged to master(2.2.0) via: 70b673406299fa518300a8ef3bdfcd7899358f63 and 8c8bcecd8d564f6da3ecad521aaab86c5b9fa960

2.1.1 via: 9faac10f84f56029afbbf6e6a4459762428d2160 and d2a88f6b0e87a637fd0759c7205e93af2f4d1a6f

2.0.1 via: fca72ea1acd2b8974e2aca1682034f47e08927e7 and db43791a5af26b3e222117fdf67a2dcf064ddc33

1.20.3 via: fca72ea1acd2b8974e2aca1682034f47e08927e7 and db43791a5af26b3e222117fdf67a2dcf064ddc33


was (Author: fanrui):
Merged to master(2.2.0) via: 70b673406299fa518300a8ef3bdfcd7899358f63 and 8c8bcecd8d564f6da3ecad521aaab86c5b9fa960

> RocksDB State Backend Null Serialization Causes NPE and Asymmetric (De)Serialization Logic
> ------------------------------------------------------------------------------------------
>
>                 Key: FLINK-38137
>                 URL: https://issues.apache.org/jira/browse/FLINK-38137
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / State Backends
>            Reporter: Ramin Gharib
>            Assignee: Rui Fan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.0.1, 1.20.3, 2.2.0, 2.1.1
>
>
> The RocksDB state backend has a critical flaw in its handling of null values,
> which can cause NullPointerExceptions and create unnecessary serialization
> overhead.
> h3. Problem Description
> In [AbstractRocksDBState, the
> serializeValueNullSensitive()|https://github.com/raminqaf/flink/blob/main/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/state/rocksdb/AbstractRocksDBState.java#L179-L179]
> method unconditionally calls the serializer even when the value is null:
> {code:java}
> <T> byte[] serializeValueNullSensitive(T value, TypeSerializer<T> serializer)
>         throws IOException {
>     dataOutputView.clear();
>     dataOutputView.writeBoolean(value == null); // Write null flag
>     return serializeValueInternal(value, serializer); // Always serialize, even if null
> }
>
> private <T> byte[] serializeValueInternal(T value, TypeSerializer<T> serializer)
>         throws IOException {
>     serializer.serialize(value, dataOutputView); // Can throw NPE if value is null
>     return dataOutputView.getCopyOfBuffer();
> } {code}
> This design has two major flaws:
> # NPE Risk: The behavior becomes dependent on the TypeSerializer
> implementation. Serializers not designed to handle null inputs will throw
> NullPointerException.
> # Asymmetric Logic: There's a critical mismatch between serialization and
> deserialization:
> ** Serialization: Writes null flag + always attempts to serialize the value
> object (even if null)
> ** Deserialization: Reads null flag + immediately returns null without
> attempting deserialization if flag is true
> h3. Evidence from Deserialization Code
> In
> [RocksDBMapState|https://github.com/apache/flink/blob/bf1cd860617f7b51ac91516814c0e931e5bba241/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/state/rocksdb/RocksDBMapState.java#L413],
> the deserialization logic shows this asymmetry:
> {code:java}
> private static <UV> UV deserializeValueNullSensitive(
>         byte[] rawValueBytes,
>         TypeSerializer<UV> valueSerializer)
>         throws IOException {
>     dataInputView.setBuffer(rawValueBytes);
>     boolean isNull = dataInputView.readBoolean();
>     return isNull ? null : valueSerializer.deserialize(dataInputView); // Never deserializes if null
> } {code}
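
For illustration, the symmetry the description asks for can be sketched in plain Java. This is not the merged Flink patch and it does not use Flink classes: `ValueSerializer`, `STRING_SERIALIZER`, and `NullSensitiveSketch` are hypothetical stand-ins for `TypeSerializer` and the RocksDB state classes, and `java.io` streams replace Flink's data views. The only point is that the write path skips the serializer when the null flag is set, mirroring what the read path already does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class NullSensitiveSketch {

    /** Hypothetical stand-in for Flink's TypeSerializer; not the real interface. */
    interface ValueSerializer<T> {
        void serialize(T value, DataOutputStream out) throws IOException;

        T deserialize(DataInputStream in) throws IOException;
    }

    /** A serializer that does not tolerate null input, as many real ones do not. */
    static final ValueSerializer<String> STRING_SERIALIZER = new ValueSerializer<String>() {
        @Override
        public void serialize(String value, DataOutputStream out) throws IOException {
            out.writeUTF(value); // throws NullPointerException when value is null
        }

        @Override
        public String deserialize(DataInputStream in) throws IOException {
            return in.readUTF();
        }
    };

    /** Writes the null flag, then the payload only when the value is non-null. */
    static <T> byte[] serializeValueNullSensitive(T value, ValueSerializer<T> serializer)
            throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeBoolean(value == null);
        if (value != null) {
            serializer.serialize(value, out); // never reached for null values
        }
        out.flush();
        return buffer.toByteArray();
    }

    /** Reads the null flag, then the payload only when the flag is false. */
    static <T> T deserializeValueNullSensitive(byte[] raw, ValueSerializer<T> serializer)
            throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw));
        boolean isNull = in.readBoolean();
        return isNull ? null : serializer.deserialize(in);
    }

    public static void main(String[] args) throws IOException {
        byte[] nullBytes = serializeValueNullSensitive(null, STRING_SERIALIZER);
        byte[] helloBytes = serializeValueNullSensitive("hello", STRING_SERIALIZER);
        System.out.println(nullBytes.length); // a null value is just the one-byte flag
        System.out.println(deserializeValueNullSensitive(helloBytes, STRING_SERIALIZER));
    }
}
```

With the guard in place, a null value is encoded as the flag byte alone, so the result no longer depends on whether a particular serializer happens to accept null.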