On Tue, Sep 09, 2025 at 08:40:28PM -0700, Nicolin Chen wrote:
> > + * level
> > + * The number of table hops from the lowest leaf. Level 0
> > + * is always a table of only leaves of the least significant VA bits.
> > The
>
> Hmm, I am a bit confused here. I thought "leaf" was meant to be a
> "leaf" table? But here "a table of only leaves" makes it feel like
> a "leaf" table entry?
How aboutL:
* level
* Level 0 is always a table of only leaves with no futher table pointers.
* Increasing levels increase the size of the table items. The least
* significant VA bits used to index page tables are used to index the Level
* 0 table. The various labels for table levels used by HW descriptions are
* not used.
> > + * table
> > + * A linear array of entries representing the translation items for
> > that
> > + * level.
> > + * index
> > + * The position in a table of an element: item = table[index]
> > + * item
> > + * A single position in a table
> > + * entry
> > + * A single logical element in a table. If contiguous pages are not
> > + * supported then item and entry are the same thing, otherwise entry
> > refers
> > + * to the all the items that comprise a single contiguous translation.
>
> So, an "entry" is a group of "items" if contiguous pages (huge
> page?) are supported.
Yes
> Then, the "entry" sounds like a physical (v.s. "logical") table
> entry, e.g. a PTE that we usually say?
I choose entry because it is related to PTE and in most cases you want
to work on the entries. The replication of entry to item is somewhat
hidden.
>From a HW perspective the TLB should be loading entries.
> > +#if !IS_ENABLED(CONFIG_GENERIC_ATOMIC64)
> > +static inline bool pt_table_install64(struct pt_state *pts, u64
> > table_entry)
> > +{
> > + u64 *entryp = pt_cur_table(pts, u64) + pts->index;
> > + u64 old_entry = pts->entry;
> > + bool ret;
> > +
> > + /*
> > + * Ensure the zero'd table content itself is visible before its PTE can
> > + * be. release is a NOP on !SMP, but the HW is still doing an acquire.
> > + */
> > + if (!IS_ENABLED(CONFIG_SMP))
> > + dma_wmb();
>
> Mind elaborating why SMP doesn't need this?
The command says the "relase is a NOP" it means this:
ret = try_cmpxchg64_release(entryp, &old_entry, table_entry);
On SMP release does an actual release, on UP it doesn't have a
barrier.
> Otherwise, these validations wouldn't be effective?
>
> drivers/iommu/generic_pt/pt_iter.h:388: if (PT_WARN_ON(!pts->table_lower))
> drivers/iommu/generic_pt/pt_iter.h-389- return -EINVAL;
Right, when PT_WARN_ON is disabled it does nothing and never evaluates
its expression. All the branches like the above are removed.
Jason