On Mon Feb 9, 2026 at 10:42 PM CET, Joel Fernandes wrote:

[...]

> +//! params.size_bytes = SZ_8M as u64;

It looks like there are ~30 occurrences of `as u64` in this example code, which
seems quite inconvenient for drivers.

In nova-core I proposed to have FromSafeCast / IntoSafeCast for usize, u32 and
u64, which would help here as well, once factored out.

But even this seems pretty annoying. I wonder if we should just have separate
64-bit size constants, as they'd be pretty useful in other places as well, e.g.
GPUVM.
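
To illustrate what I mean, here is a minimal sketch in plain Rust. All names in
it (`SZ_4K_U64`, `SZ_8M_U64`, `AllocParams`, `example_params`) are made up for
the example and are not an existing kernel API; `SZ_8M` only stands in for the
existing `usize` constant.

```rust
// Stand-in for the existing `usize` size constant (assumed to be `usize`).
pub const SZ_8M: usize = 0x0080_0000;

// Hypothetical 64-bit counterparts; the names are illustrative only.
pub const SZ_4K_U64: u64 = 1 << 12;
pub const SZ_1M_U64: u64 = 1 << 20;
pub const SZ_8M_U64: u64 = 8 * SZ_1M_U64;

/// Hypothetical stand-in for a driver-facing parameter struct with u64 sizes.
pub struct AllocParams {
    pub size_bytes: u64,
    pub min_block_size_bytes: u64,
}

/// With 64-bit constants, no `as u64` cast is needed at the use site.
pub fn example_params() -> AllocParams {
    AllocParams {
        size_bytes: SZ_8M_U64,
        min_block_size_bytes: SZ_4K_U64,
    }
}
```

The cast then lives in one place (or disappears entirely) instead of being
repeated at every assignment.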

> +/// Inner structure holding the actual buddy allocator.
> +///
> +/// # Synchronization
> +///
> +/// The C `gpu_buddy` API requires synchronization (see `include/linux/gpu_buddy.h`).
> +/// The internal [`GpuBuddyGuard`] ensures that the lock is held for all
> +/// allocator and free operations, preventing races between concurrent allocations
> +/// and the freeing that occurs when [`AllocatedBlocks`] is dropped.
> +///
> +/// # Invariants
> +///
> +/// The inner [`Opaque`] contains a valid, initialized buddy allocator.
> +#[pin_data(PinnedDrop)]
> +struct GpuBuddyInner {
> +    #[pin]
> +    inner: Opaque<bindings::gpu_buddy>,
> +    #[pin]
> +    lock: Mutex<()>,

Why don't we have the mutex around the Opaque<bindings::gpu_buddy>? It's the
only field the mutex protects.

Is it because Mutex does not take an impl PinInit? If so, we should add a
comment with a proper TODO.
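
For reference, the layout I have in mind, as a rough sketch using std's Mutex
rather than the kernel one (so it sidesteps the PinInit question entirely).
`BuddyState`, `Inner` and the trivial `alloc` logic are hypothetical stand-ins,
not the real allocator:

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the state that the C allocator would own.
struct BuddyState {
    free_bytes: u64,
}

struct Inner {
    /// The lock wraps the data it protects, so the state cannot be
    /// reached without holding the lock.
    state: Mutex<BuddyState>,
    /// Does not change after init, hence kept outside the lock.
    chunk_size: u64,
}

impl Inner {
    /// `chunk_size` is assumed to be non-zero.
    fn new(size: u64, chunk_size: u64) -> Self {
        Inner {
            state: Mutex::new(BuddyState { free_bytes: size }),
            chunk_size,
        }
    }

    /// Rounds `bytes` up to a multiple of `chunk_size` and reserves that
    /// amount; returns the reserved size, or `None` if it does not fit.
    fn alloc(&self, bytes: u64) -> Option<u64> {
        let mut state = self.state.lock().unwrap();
        let rounded = (bytes + self.chunk_size - 1) / self.chunk_size * self.chunk_size;
        if rounded > state.free_bytes {
            return None;
        }
        state.free_bytes -= rounded;
        Some(rounded)
    }
}
```

Compared to a separate `Mutex<()>`, the type system enforces that every access
to the protected state goes through the guard.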

> +    /// Base offset for all allocations (does not change after init).
> +    base_offset: u64,
> +    /// Cached chunk size (does not change after init).
> +    chunk_size: u64,
> +    /// Cached total size (does not change after init).
> +    size: u64,
> +}
> +
> +impl GpuBuddyInner {
> +    /// Create a pin-initializer for the buddy allocator.
> +    fn new(params: &GpuBuddyParams) -> impl PinInit<Self, Error> {

I think we can just pass them by value; they shouldn't be needed anymore after
the GpuBuddy instance has been constructed.

> +        let base_offset = params.base_offset_bytes;
> +        let size = params.physical_memory_size_bytes;
> +        let chunk_size = params.chunk_size_bytes;
> +
> +        try_pin_init!(Self {
> +            inner <- Opaque::try_ffi_init(|ptr| {
> +                // SAFETY: ptr points to valid uninitialized memory from the pin-init
> +                // infrastructure. gpu_buddy_init will initialize the structure.
> +                to_result(unsafe { bindings::gpu_buddy_init(ptr, size, chunk_size) })
> +            }),
> +            lock <- new_mutex!(()),
> +            base_offset: base_offset,
> +            chunk_size: chunk_size,
> +            size: size,
> +        })
> +    }

<snip>

> +/// GPU buddy allocator instance.
> +///
> +/// This structure wraps the C `gpu_buddy` allocator using reference counting.
> +/// The allocator is automatically cleaned up when all references are dropped.
> +///
> +/// # Invariants
> +///
> +/// The inner [`Arc`] points to a valid, initialized GPU buddy allocator.
> +pub struct GpuBuddy(Arc<GpuBuddyInner>);
> +
> +impl GpuBuddy {
> +    /// Create a new buddy allocator.
> +    ///
> +    /// Creates a buddy allocator that manages a contiguous address space of the given
> +    /// size, with the specified minimum allocation unit (chunk_size must be at least 4KB).
> +    pub fn new(params: &GpuBuddyParams) -> Result<Self> {

Same here, we should be able to take this by value.

> +        Ok(Self(Arc::pin_init(
> +            GpuBuddyInner::new(params),
> +            GFP_KERNEL,
> +        )?))
> +    }

<snip>

> +    /// Allocate blocks from the buddy allocator.
> +    ///
> +    /// Returns an [`Arc<AllocatedBlocks>`] structure that owns the allocated blocks
> +    /// and automatically frees them when all references are dropped.
> +    ///
> +    /// Takes `&self` instead of `&mut self` because the internal [`Mutex`] provides
> +    /// synchronization - no external `&mut` exclusivity needed.
> +    pub fn alloc_blocks(&self, params: &GpuBuddyAllocParams) -> Result<Arc<AllocatedBlocks>> {

Why do we force a reference count here? I think we should just return
impl PinInit<AllocatedBlocks, Error> and let the driver decide where to
initialize the object, no?

I.e. what if the driver wants to store additional data in a driver private
structure? With the current approach we'd need two allocations and, in the
worst case, another reference count.
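
Roughly, in plain Rust terms (no pin-init here, so returning by value stands in
for returning an initializer; `DriverBuffer`, `shared_buffer` and the dummy
`alloc_blocks` body are hypothetical):

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the allocation handle.
pub struct AllocatedBlocks {
    pub offset: u64,
    pub size: u64,
}

/// Hands the object back without choosing its storage; the caller decides
/// whether it ends up in an `Arc`, a `Box`, or inside another struct.
pub fn alloc_blocks(size: u64) -> Result<AllocatedBlocks, &'static str> {
    if size == 0 {
        return Err("zero-sized allocation");
    }
    Ok(AllocatedBlocks { offset: 0, size })
}

/// Driver private structure embedding the blocks next to its own data.
pub struct DriverBuffer {
    pub blocks: AllocatedBlocks,
    pub label: &'static str,
}

/// One allocation and one reference count cover both the blocks and the
/// driver data.
pub fn shared_buffer(size: u64, label: &'static str) -> Result<Arc<DriverBuffer>, &'static str> {
    Ok(Arc::new(DriverBuffer {
        blocks: alloc_blocks(size)?,
        label,
    }))
}
```

A driver that really wants a standalone refcounted object can still wrap the
result itself; the API just doesn't force it.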

> +        let buddy_arc = Arc::clone(&self.0);
> +
> +        // Create pin-initializer that initializes list and allocates blocks.
> +        let init = try_pin_init!(AllocatedBlocks {
> +            buddy: Arc::clone(&buddy_arc),
> +            list <- CListHead::new(),
> +            flags: params.buddy_flags,
> +            _: {
> +                // Lock while allocating to serialize with concurrent frees.
> +                let guard = buddy.lock();
> +
> +                // SAFETY: `guard` provides exclusive access to the buddy allocator.
> +                to_result(unsafe {
> +                    bindings::gpu_buddy_alloc_blocks(
> +                        guard.as_raw(),
> +                        params.start_range_address,
> +                        params.end_range_address,
> +                        params.size_bytes,
> +                        params.min_block_size_bytes,
> +                        list.as_raw(),
> +                        params.buddy_flags.as_raw(),
> +                    )
> +                })?
> +            }
> +        });
> +
> +        Arc::pin_init(init, GFP_KERNEL)
> +    }
> +}
