On Tue, Sep 09, 2025 at 08:40:28PM -0700, Nicolin Chen wrote:
> > + *  level
> > + *     The number of table hops from the lowest leaf. Level 0
> > + *     is always a table of only leaves of the least significant VA bits. 
> > The
> 
> Hmm, I am a bit confused here. I thought "leaf" was meant to be a
> "leaf" table? But here "a table of only leaves" makes it feel like
> a "leaf" table entry?

How aboutL:

 *  level
 *     Level 0 is always a table of only leaves with no futher table pointers.
 *     Increasing levels increase the size of the table items. The least
 *     significant VA bits used to index page tables are used to index the Level
 *     0 table. The various labels for table levels used by HW descriptions are
 *     not used.


> > + *  table
> > + *     A linear array of entries representing the translation items for 
> > that
> > + *     level.
> > + *  index
> > + *     The position in a table of an element: item = table[index]
> > + *  item
> > + *     A single position in a table
> > + *  entry
> > + *     A single logical element in a table. If contiguous pages are not
> > + *     supported then item and entry are the same thing, otherwise entry 
> > refers
> > + *     to the all the items that comprise a single contiguous translation.
> 
> So, an "entry" is a group of "items" if contiguous pages (huge
> page?) are supported. 

Yes

> Then, the "entry" sounds like a physical (v.s. "logical") table
> entry, e.g. a PTE that we usually say?

I choose entry because it is related to PTE and in most cases you want
to work on the entries. The replication of entry to item is somewhat
hidden.

>From a HW perspective the TLB should be loading entries.

> > +#if !IS_ENABLED(CONFIG_GENERIC_ATOMIC64)
> > +static inline bool pt_table_install64(struct pt_state *pts, u64 
> > table_entry)
> > +{
> > +   u64 *entryp = pt_cur_table(pts, u64) + pts->index;
> > +   u64 old_entry = pts->entry;
> > +   bool ret;
> > +
> > +   /*
> > +    * Ensure the zero'd table content itself is visible before its PTE can
> > +    * be. release is a NOP on !SMP, but the HW is still doing an acquire.
> > +    */
> > +   if (!IS_ENABLED(CONFIG_SMP))
> > +           dma_wmb();
> 
> Mind elaborating why SMP doesn't need this?

The command says the "relase is a NOP" it means this:

        ret = try_cmpxchg64_release(entryp, &old_entry, table_entry);

On SMP release does an actual release, on UP it doesn't have a
barrier.

> Otherwise, these validations wouldn't be effective?
> 
> drivers/iommu/generic_pt/pt_iter.h:388: if (PT_WARN_ON(!pts->table_lower))
> drivers/iommu/generic_pt/pt_iter.h-389-         return -EINVAL;

Right, when PT_WARN_ON is disabled it does nothing and never evaluates
its expression. All the branches like the above are removed.

Jason

Reply via email to