On Mon, Apr 06, 2026 at 06:10:07PM -0400, Joel Fernandes wrote:
> 
> 
> On 4/6/2026 5:24 PM, Joel Fernandes wrote:
> > 
> > 
> > On 4/2/2026 1:59 AM, Matthew Brost wrote:
> >> On Tue, Mar 31, 2026 at 05:20:34PM -0400, Joel Fernandes wrote:
> >>> Add TLB (Translation Lookaside Buffer) flush support for GPU MMU.
> >>>
> >>> After modifying page table entries, the GPU's TLB must be invalidated
> >>> to ensure the new mappings take effect. The Tlb struct provides flush
> >>> functionality through BAR0 registers.
> >>>
> >>> The flush operation writes the page directory base address and triggers
> >>> an invalidation, polling for completion with a 2 second timeout matching
> >>> the Nouveau driver.
> >>>
> >>> Cc: Nikola Djukic <[email protected]>
> >>> Signed-off-by: Joel Fernandes <[email protected]>
> >>> ---
> >>>  drivers/gpu/nova-core/mm.rs     |  1 +
> >>>  drivers/gpu/nova-core/mm/tlb.rs | 95 +++++++++++++++++++++++++++++++++
> >>>  drivers/gpu/nova-core/regs.rs   | 42 +++++++++++++++
> >>>  3 files changed, 138 insertions(+)
> >>>  create mode 100644 drivers/gpu/nova-core/mm/tlb.rs
> >>>
> >>> diff --git a/drivers/gpu/nova-core/mm.rs b/drivers/gpu/nova-core/mm.rs
> >>> index 8f3089a5fa88..cfe9cbe11d57 100644
> >>> --- a/drivers/gpu/nova-core/mm.rs
> >>> +++ b/drivers/gpu/nova-core/mm.rs
> >>> @@ -5,6 +5,7 @@
> >>>  #![expect(dead_code)]
> >>>  
> >>>  pub(crate) mod pramin;
> >>> +pub(crate) mod tlb;
> >>>  
> >>>  use kernel::sizes::SZ_4K;
> >>>  
> >>> diff --git a/drivers/gpu/nova-core/mm/tlb.rs b/drivers/gpu/nova-core/mm/tlb.rs
> >>> new file mode 100644
> >>> index 000000000000..cd3cbcf4c739
> >>> --- /dev/null
> >>> +++ b/drivers/gpu/nova-core/mm/tlb.rs
> >>> @@ -0,0 +1,95 @@
> >>> +// SPDX-License-Identifier: GPL-2.0
> >>> +
> >>> +//! TLB (Translation Lookaside Buffer) flush support for GPU MMU.
> >>> +//!
> >>> +//! After modifying page table entries, the GPU's TLB must be flushed to
> >>> +//! ensure the new mappings take effect. This module provides TLB flush
> >>> +//! functionality for virtual memory managers.
> >>> +//!
> >>> +//! # Example
> >>> +//!
> >>> +//! ```ignore
> >>> +//! use crate::mm::tlb::Tlb;
> >>> +//!
> >>> +//! fn page_table_update(tlb: &Tlb, pdb_addr: VramAddress) -> Result<()> {
> >>> +//!     // ... modify page tables ...
> >>> +//!
> >>> +//!     // Flush TLB to make changes visible (polls for completion).
> >>> +//!     tlb.flush(pdb_addr)?;
> >>> +//!
> >>> +//!     Ok(())
> >>> +//! }
> >>> +//! ```
> >>> +
> >>> +use kernel::{
> >>> +    devres::Devres,
> >>> +    io::poll::read_poll_timeout,
> >>> +    io::Io,
> >>> +    new_mutex,
> >>> +    prelude::*,
> >>> +    sync::{
> >>> +        Arc,
> >>> +        Mutex, //
> >>> +    },
> >>> +    time::Delta, //
> >>> +};
> >>> +
> >>> +use crate::{
> >>> +    driver::Bar0,
> >>> +    mm::VramAddress,
> >>> +    regs, //
> >>> +};
> >>> +
> >>> +/// TLB manager for GPU translation buffer operations.
> >>> +#[pin_data]
> >>> +pub(crate) struct Tlb {
> >>> +    bar: Arc<Devres<Bar0>>,
> >>> +    /// TLB flush serialization lock: This lock is acquired during the
> >>> +    /// DMA fence signalling critical path. It must NEVER be held across any
> >>> +    /// reclaimable CPU memory allocations because the memory reclaim path can
> >>> +    /// call `dma_fence_wait()`, which would deadlock with this lock held.
> >>> +    #[pin]
> >>> +    lock: Mutex<()>,
> >>> +}
> >>> +
> >>> +impl Tlb {
> >>> +    /// Create a new TLB manager.
> >>> +    pub(super) fn new(bar: Arc<Devres<Bar0>>) -> impl PinInit<Self> {
> >>> +        pin_init!(Self {
> >>> +            bar,
> >>> +            lock <- new_mutex!((), "tlb_flush"),
> >>> +        })
> >>> +    }
> >>> +
> >>> +    /// Flush the GPU TLB for a specific page directory base.
> >>> +    ///
> >>> +    /// This invalidates all TLB entries associated with the given PDB address.
> >>> +    /// Must be called after modifying page table entries to ensure the GPU sees
> >>> +    /// the updated mappings.
> >>> +    pub(crate) fn flush(&self, pdb_addr: VramAddress) -> Result {
> >>
> >> This landed on my list randomly, so I took a look.
> >>
> >> Wouldn’t you want to virtualize the invalidation based on your device?
> >> For example, what if you need register interface changes on future
> >> hardware?
> > 
> > Good point, for future hardware it indeed makes sense. I will do that.
> Actually, at least for future hardware as far as I can see, the register
> definitions for TLB invalidation are the same, so we are good and I will not be
> making any change in this regard.
> 
> But, thanks for raising the point and forcing me to double check!
> 

Not my driver, but this looks like a classic “works now” change that may
not hold up later, which is why I replied to something that isn’t really
my business.

Again, not my area, but I’ve been through this before. Generally,
getting the abstractions right up front pays off.
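
For what it's worth, here is a rough userspace sketch of the kind of
per-device indirection I have in mind. Every name in it (`TlbHal`,
`FakeBar0`, `Chipset`, the shift amounts and trigger values) is made up
for illustration; none of it is nova-core's actual API or a real
register layout.

```rust
// Userspace sketch only: all types and values below are hypothetical.

/// Stand-in for the BAR0 register window (NOT the real register layout).
#[derive(Default, Debug)]
struct FakeBar0 {
    pdb: u64,
    trigger: u32,
}

/// Per-generation TLB invalidation backend.
trait TlbHal: Sync {
    /// Program the page directory base and trigger an invalidation.
    fn flush(&self, bar: &mut FakeBar0, pdb_addr: u64) -> Result<(), &'static str>;
}

struct CurrentGen;
struct FutureGen;

impl TlbHal for CurrentGen {
    fn flush(&self, bar: &mut FakeBar0, pdb_addr: u64) -> Result<(), &'static str> {
        // Assumed encoding: address in 4K units, simple trigger bit.
        bar.pdb = pdb_addr >> 12;
        bar.trigger = 1;
        Ok(())
    }
}

impl TlbHal for FutureGen {
    fn flush(&self, bar: &mut FakeBar0, pdb_addr: u64) -> Result<(), &'static str> {
        // Pretend a later generation encodes the address and trigger differently.
        bar.pdb = pdb_addr >> 16;
        bar.trigger = 0x8000_0001;
        Ok(())
    }
}

static CURRENT_GEN: CurrentGen = CurrentGen;
static FUTURE_GEN: FutureGen = FutureGen;

#[derive(Clone, Copy, Debug)]
enum Chipset {
    Current,
    Future,
}

/// Pick the backend once, based on the detected chipset.
fn tlb_hal(chipset: Chipset) -> &'static dyn TlbHal {
    match chipset {
        Chipset::Current => &CURRENT_GEN,
        Chipset::Future => &FUTURE_GEN,
    }
}
```

Callers would only ever go through `tlb_hal(chipset).flush(...)`, so a
register interface change on new hardware means adding one more impl
rather than touching every flush call site.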

Matt

> --
> Joel Fernandes
> 
