Hello,

Using dictionary encoding, it is very easy to create a compression bomb simply by setting the bit width to 0. A virtually unlimited number of values can then be encoded in a constant (very small) data size. This is an ideal payload for a potential denial of service, either through CPU or memory exhaustion.

Looking at the dictionary encoder in Arrow C++, bit width == 0 is only emitted when there are 0 physical values to encode. Do other encoders have different policies? Would it be reasonable to state that bit width == 0 is only allowed if there are zero physical values in the page?
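
To make the concern concrete, here is a minimal sketch in Python. It assumes my reading of the RLE/bit-packing hybrid layout used for dictionary indices: one leading bit-width byte, then runs whose header is a ULEB128 varint. The function names are hypothetical and not taken from any Parquet implementation, and the last function is only one possible shape for the check proposed above.

```python
def uleb128(n: int) -> bytes:
    """Encode a non-negative integer as a ULEB128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def bit_packed_run(num_groups: int, bit_width: int) -> bytes:
    """Build a bit-packed run of num_groups * 8 values, all equal to 0.

    Each group of 8 values occupies exactly bit_width bytes, so the
    payload is empty when bit_width is 0.
    """
    header = uleb128((num_groups << 1) | 1)  # low bit 1 marks a bit-packed run
    payload = bytes(num_groups * bit_width)  # zero-filled payload
    return header + payload


def indices_page(num_groups: int, bit_width: int) -> bytes:
    """Dictionary indices body: bit-width byte followed by a single run."""
    return bytes([bit_width]) + bit_packed_run(num_groups, bit_width)


def validate_bit_width(bit_width: int, num_values: int) -> None:
    """Hypothetical reader-side check: reject zero width on non-empty pages."""
    if bit_width == 0 and num_values > 0:
        raise ValueError("bit width 0 is only allowed for pages with zero values")


# 2**27 groups of 8 values is 2**30 logical values.
bomb = indices_page(1 << 27, 0)
print(len(bomb), "bytes encode", (1 << 27) * 8, "values")
```

With bit width 0, the encoded size depends only on the length of the varint header, not on the number of values the run claims to contain.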

Regards

Antoine.

