On 29-04-2026 07:12 pm, Gregory Price wrote:
>>
>> Great! I believe "writable budget" could be an interesting idea which
>> can solve the 'bus error' sort of scenarios due to device not capable of
>> taking any more writes. The write budget could be replenished using the
>> control path and writes will not go ahead without the budget available,
>> right?
>>
>
> Write budget is simple
>
> budget=1 (up to 1 page can be writable)
> 1) swap 1 page -> cram alloc 1 page, put VSWAP_CRAM in PTE
> 2) read-fault -> cram upgrades VSWAP_CRAM to R/O PTE
> 3) write-fault ->
> a) if (writable_cnt < budget) { writable_cnt++; mkwrite(pte); }
> b) else: normal swap semantic -> promote to normal memory
>
> Meanwhile - use ballooning and a simple shrinker to dynamically size the
> region to respond to real compression ratio.
>
>
> All said and done - you get something close to zswap but with R/O
> mappings for all entries, and optional R/W-mappings for administrators
> who know something about their workload and can afford to take the risk
> of some amount of capacity being written to uncontended in exchange for
> performance.
>
> The writable-budget is a risk-dial: How much do you trust your workload
> to not spew un/poorly-compressible memory? The write-budget is a direct
> measure of that. (so take P99.99999 compression ratios, and you can make
> a good chunk of that writable).
>
> ~Gregory
>
>
I believe we are converging. I agree with most of the points you
mentioned.

I think this problem can be solved by a 'write-control + write budget'
approach similar to what you have described, whether or not we take the
swap path.

However, I see this 'write budget' (a budget in terms of the number of
write operations the device can handle, not capacity) as being provided
by the device through the control plane, not by the workloads on the
host. The device can communicate the budget periodically over its
control plane, to be handled in the specific cram back-end driver,
perhaps by translating the device's back-pressure indications into a
write budget value. Even if the control plane breaks down, the host
does not run into issues, except that it will not write any further.

I assume you see this value coming from the workloads. This might be
where my opinion differs.

There are multiple advantages to this value coming from the device:
1) We can modulate the write budget depending on the actual
compressibility seen in the device (and hence of the workloads' data).
We don't have to estimate it based on the workloads.
2) We don't have to do capacity modulation, as with ballooning or a
shrinker.
3) Even if the control path is broken, the host can write only until the
available 'write budget' is exhausted, so it won't get into 'bus error'
situations.
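
A minimal user-space sketch of the device-provided budget described
above, assuming the budget is a simple count of write operations. The
struct and function names (cram_write_budget, cram_budget_replenish,
cram_budget_try_consume) are hypothetical and not part of any existing
cram code; C11 atomics stand in for whatever primitives a real back-end
driver would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: budget counted in write operations, not capacity. */
struct cram_write_budget {
	atomic_long tokens;	/* writes the device says it can still absorb */
};

/*
 * Control path (assumed): the back-end driver translates the device's
 * back-pressure indication into a budget value and stores it here.
 */
static void cram_budget_replenish(struct cram_write_budget *b,
				  long device_budget)
{
	assert(device_budget >= 0);
	atomic_store(&b->tokens, device_budget);
}

/*
 * Write-fault path (assumed): take one token if any are left.
 * Returns true if the caller may make the mapping writable, false if it
 * must fall back (e.g. promote the page to normal memory).
 * The counter never drops below zero, so a stalled control path only
 * stops further writes; it cannot over-commit the device.
 */
static bool cram_budget_try_consume(struct cram_write_budget *b)
{
	long cur = atomic_load(&b->tokens);

	while (cur > 0) {
		/* On failure, cur is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&b->tokens, &cur, cur - 1))
			return true;
	}
	return false;
}
```

With the budget set to 2 and no further replenishment, the first two
calls to cram_budget_try_consume() succeed and every later call fails,
which is the "host stops writing" behaviour in point 3.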
~Arun George