Ramin Gharib created FLINK-38137:
------------------------------------

             Summary: RocksDB State Backend Null Serialization Causes NPE and Asymmetric (De)Serialization Logic
                 Key: FLINK-38137
                 URL: https://issues.apache.org/jira/browse/FLINK-38137
             Project: Flink
          Issue Type: Bug
          Components: Runtime / State Backends
            Reporter: Ramin Gharib


The RocksDB state backend has a critical flaw in its handling of null values, 
which can cause NullPointerExceptions and create unnecessary serialization 
overhead.
h3. Problem Description

In 
[AbstractRocksDBState|https://github.com/raminqaf/flink/blob/main/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/state/rocksdb/AbstractRocksDBState.java#L179-L179],
 the serializeValueNullSensitive() method unconditionally calls the serializer 
even when the value is null:


{code:java}
<T> byte[] serializeValueNullSensitive(T value, TypeSerializer<T> serializer)
        throws IOException {
    dataOutputView.clear();
    dataOutputView.writeBoolean(value == null);  // Write null flag
    return serializeValueInternal(value, serializer);  // Always serializes, even if null
}

private <T> byte[] serializeValueInternal(T value, TypeSerializer<T> serializer)
        throws IOException {
    serializer.serialize(value, dataOutputView);  // Can throw NPE if value is null
    return dataOutputView.getCopyOfBuffer();
} {code}
 

 

This design has two major flaws:
 # NPE Risk: The behavior depends on the TypeSerializer implementation. 
Serializers that are not designed to handle null inputs will throw a 
NullPointerException.
 # Asymmetric Logic: The serialization and deserialization paths do not mirror 
each other:
 ** Serialization: Writes the null flag, then always attempts to serialize the 
value (even if it is null).
 ** Deserialization: Reads the null flag and, if it is true, immediately 
returns null without attempting deserialization.
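
The NPE risk can be reproduced outside Flink. The following is only a minimal 
sketch, not Flink code: SimpleSerializer is a hypothetical stand-in for 
TypeSerializer, plain java.io streams replace dataOutputView, and the string 
serializer is just one example of a serializer that assumes a non-null input.
{code:java}
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class NullFlagNpeDemo {

    /** Hypothetical stand-in for Flink's TypeSerializer; not the real interface. */
    interface SimpleSerializer<T> {
        void serialize(T value, DataOutputStream out) throws IOException;
    }

    /** Example serializer that assumes a non-null input (writeUTF rejects null). */
    static final SimpleSerializer<String> STRING_SERIALIZER =
            (value, out) -> out.writeUTF(value);

    /** Mirrors the reported pattern: write the null flag, then always call the serializer. */
    static <T> byte[] serializeValueNullSensitive(T value, SimpleSerializer<T> serializer)
            throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeBoolean(value == null);  // null flag
        serializer.serialize(value, out); // called even when value is null
        out.flush();
        return buffer.toByteArray();
    }

    /** Returns true if serializing the given value throws a NullPointerException. */
    static boolean throwsNpe(String value) {
        try {
            serializeValueNullSensitive(value, STRING_SERIALIZER);
            return false;
        } catch (NullPointerException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("non-null value throws NPE: " + throwsNpe("abc"));
        System.out.println("null value throws NPE: " + throwsNpe(null));
    }
}
{code}
Whether a real TypeSerializer fails in the same way depends on its 
implementation, which is exactly the dependency described in the first flaw.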

h3. Evidence from Deserialization Code

In 
[RocksDBMapState|https://github.com/apache/flink/blob/bf1cd860617f7b51ac91516814c0e931e5bba241/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/state/rocksdb/RocksDBMapState.java#L413],
 the deserialization logic shows this asymmetry:
{code:java}
private static <UV> UV deserializeValueNullSensitive(
        byte[] rawValueBytes,
        TypeSerializer<UV> valueSerializer)
        throws IOException {
    dataInputView.setBuffer(rawValueBytes);
    boolean isNull = dataInputView.readBoolean();
    return isNull ? null : valueSerializer.deserialize(dataInputView);  // Never deserializes if null
} {code}
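
One possible way to make the write path mirror this read path would be to skip 
the serializer call when the value is null. The following is a sketch under 
stated assumptions, not the actual patch: SimpleSerializer is a hypothetical 
stand-in for TypeSerializer, and java.io streams replace Flink's 
dataOutputView and dataInputView.
{code:java}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class SymmetricNullFlagSketch {

    /** Hypothetical stand-in for Flink's TypeSerializer; not the real interface. */
    interface SimpleSerializer<T> {
        void serialize(T value, DataOutputStream out) throws IOException;

        T deserialize(DataInputStream in) throws IOException;
    }

    /** Example serializer that does not accept null inputs. */
    static final SimpleSerializer<String> STRING_SERIALIZER =
            new SimpleSerializer<String>() {
                @Override
                public void serialize(String value, DataOutputStream out) throws IOException {
                    out.writeUTF(value);
                }

                @Override
                public String deserialize(DataInputStream in) throws IOException {
                    return in.readUTF();
                }
            };

    /** Writes the null flag and calls the serializer only for non-null values. */
    static <T> byte[] serializeValueNullSensitive(T value, SimpleSerializer<T> serializer) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeBoolean(value == null);
            if (value != null) {
                serializer.serialize(value, out);
            }
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the null flag and calls the serializer only if the flag is false. */
    static <T> T deserializeValueNullSensitive(byte[] bytes, SimpleSerializer<T> serializer) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            boolean isNull = in.readBoolean();
            return isNull ? null : serializer.deserialize(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] nullBytes = serializeValueNullSensitive(null, STRING_SERIALIZER);
        byte[] textBytes = serializeValueNullSensitive("abc", STRING_SERIALIZER);
        System.out.println(deserializeValueNullSensitive(nullBytes, STRING_SERIALIZER));
        System.out.println(deserializeValueNullSensitive(textBytes, STRING_SERIALIZER));
    }
}
{code}
With this shape a null value is stored as the flag byte alone, so the 
serializer is never asked to handle null on either path.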
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
