Hi

I think the best way to support it would be to modify the VDO target
to use the asynchronous compression API (so that it could use arbitrary
algorithms). Then support for IAA could be plugged in easily, with
little or no extra code.

It is not good to have branches like "if (iaa_enabled) ... else ...;",
because that would just blow up into an unmaintainable pile of code when
other accelerators (maybe s390x?) are added.
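
As a rough illustration of that point, here is a minimal userspace sketch
of dispatching through a table of compressors selected by name, so that
the caller contains no per-backend branches. Everything in it is
hypothetical: struct compressor_ops, find_compressor(), compress_block()
and the toy "copy" and "rle" backends are stand-ins invented for the
example, not dm-vdo code and not the kernel crypto API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; not dm-vdo or kernel crypto API code. */
struct compressor_ops {
	const char *name;
	/* Returns the number of bytes written to dst, or -1 on failure. */
	int (*compress)(const unsigned char *src, size_t src_len,
			unsigned char *dst, size_t dst_cap);
};

/* Toy backend 1: stores the data verbatim. */
static int copy_compress(const unsigned char *src, size_t src_len,
			 unsigned char *dst, size_t dst_cap)
{
	if (src_len > dst_cap)
		return -1;
	memcpy(dst, src, src_len);
	return (int)src_len;
}

/* Toy backend 2: run-length encoding as (count, value) byte pairs. */
static int rle_compress(const unsigned char *src, size_t src_len,
			unsigned char *dst, size_t dst_cap)
{
	size_t in = 0, out = 0;

	while (in < src_len) {
		unsigned char value = src[in];
		size_t run = 1;

		while (in + run < src_len && src[in + run] == value && run < 255)
			run++;
		if (out + 2 > dst_cap)
			return -1;
		dst[out++] = (unsigned char)run;
		dst[out++] = value;
		in += run;
	}
	return (int)out;
}

/* Adding a backend means adding one table entry, not one more branch. */
static const struct compressor_ops compressors[] = {
	{ "copy", copy_compress },
	{ "rle", rle_compress },
};

/* Look up a backend by name; returns NULL if it is not registered. */
const struct compressor_ops *find_compressor(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(compressors) / sizeof(compressors[0]); i++)
		if (strcmp(compressors[i].name, name) == 0)
			return &compressors[i];
	return NULL;
}

/* The caller's path is identical no matter which backend was selected. */
int compress_block(const struct compressor_ops *ops,
		   const unsigned char *src, size_t src_len,
		   unsigned char *dst, size_t dst_cap)
{
	return ops->compress(src, src_len, dst, dst_cap);
}
```

In this model the backend is chosen once by name and the write path only
ever calls compress_block(); a new accelerator would show up as another
table entry rather than another "if".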

Also, to ease reviewing, it would be good to split the patch into
several smaller patches.

Mikulas


On Wed, 29 Apr 2026, ... wrote:

> 
> Hello,
> 
> I am following up here after an earlier reply suggested that Linux dm-vdo 
> changes should be discussed on the device-mapper mailing list rather than as 
> a GitHub PR.
> 
> Intel IAA, the In-Memory Analytics Accelerator, is a built-in accelerator in
> recent Intel Xeon processors. One of its main uses is offloading compression
> and decompression work from CPU cores. This is relevant to dm-vdo because VDO
> already spends CPU time in the compressed write and read paths, and the
> kernel already exposes IAA compression through the crypto API as the
> deflate-iaa algorithm. The existing IAA crypto driver documentation
> (dm-linux/Documentation/driver-api/crypto/iaa/iaa-crypto.rst) also describes
> zswap as one consumer of this interface, so my prototype tries the same
> general model for dm-vdo.
> 
> Before changing dm-vdo, I compared the current LZ4 path with IAA on a set of
> single-thread compression tests. For compression, IAA hardware was faster
> than LZ4 on most tested datasets, with the measured compression time often
> reduced by about 1.3x to 3.5x compared with LZ4. The compression ratio was
> also generally comparable to or better than LZ4, because the IAA path uses
> DEFLATE rather than LZ4. Decompression was more mixed: IAA hardware was close
> to LZ4 or faster on some datasets, but slower on others.
> [IMAGE]
> Please see figure1 in the attachments.
> 
> The prototype is in:
> 
>   https://github.com/dm-vdo/dm-linux/pull/96
> 
> It contains two commits:
> 
>   dm vdo: add minimal IAA compression support
> 
>   dm vdo: preserve compressed block format for IAA
> 
> The change is intentionally narrow and only touches the dm-vdo compression and
> decompression path under `drivers/md/dm-vdo/`. The design is as shown in the
> figure below:
> 
> [IMAGE]
> 
> Please see figure2 in the attachments.
> 
> The current result is:
> 
> 1. dm-vdo gets an optional iaa_enabled module parameter.
> 
> 2. On writes, compress_data_vio() first tries deflate-iaa through the async
> compression crypto API. If IAA is disabled, unavailable, or the IAA
> compression attempt fails, it falls back to the existing LZ4 path.
> 
> 3. On reads, uncompress_data_vio() can decode data produced by the IAA path. 
> It tries IAA-assisted DEFLATE first, then software zlib inflate, and then the 
> existing LZ4 path.
> 
> 4. The prototype preserves the existing compressed block format. I did not 
> add a new on-disk compressor field in this version.
> 
> 5. Data written through the IAA path is not dependent on IAA hardware being 
> present later, because it can still be decoded by the software DEFLATE 
> fallback.
> 
> In local testing, the prototype was able to write and read IAA-compressed VDO
> data correctly, including the software fallback case when IAA was not used
> for the read. The performance is as shown in the figure below. As can be
> seen, the performance remains largely consistent with the original after
> adding IAA.
> 
> Write path:
> 
> Command: dd if=../mnt_ori/SRR.fastq of=./SSR.fastq bs=1M oflag=direct
> 
> IAA: 6430986673 bytes (6.4 GB, 6.0 GiB) copied, 4.98476 s, 1.3 GB/s
> 
> LZ4: 6430986673 bytes (6.4 GB, 6.0 GiB) copied, 4.91684 s, 1.3 GB/s
> 
> Read path:
> 
> [IMAGE]
> 
> Please see figure3 in the attachments.
> 
> The reason I think this may be worth discussing is that IAA gives dm-vdo a
> way to offload compression work on systems that already have the accelerator,
> while keeping the existing LZ4 path as the compatibility and fallback path.
> The goal of this RFC is not to propose a final format or policy yet, but to
> check whether this direction is acceptable before I spend more time preparing
> a proper patch series.
> 
> I would appreciate feedback on whether using the existing kernel IAA crypto
> API from dm-vdo is a reasonable direction, and whether preserving the current
> compressed block format is acceptable for an initial RFC.
> 
>  
> 
> Best regards,
> 
> 
> 
> 
