Abhishek Bhattacharjee wrote:
> Hi all --
>
> I'm new to Solaris and am trying to understand how the kernel handles
> TLB misses. From reading the manuals, I gather that instruction and
> data TLB misses cause an interrupt to the OS. Thereafter, the miss
> handler uses the MMU Tag Register and TSB pointer register to search
> for the required translation table entry in the TSB.
There is a huge amount of complexity to this code - you've chosen a challenging area in which to begin learning Solaris. If you try to follow the code in this area there are many pitfalls - particularly the hot-patching of trap handler entries to better suit particular systems.

There are three levels of storage for translation table entries - the hardware TLB, the software TSB, and the backend hashes. When a mapping is established it is entered into a giant (really) hash - uhme_hash for all userland entries across all hatids, and khme_hash for kernel mappings.

When a virtual address is presented to the MMU it is looked up in the TLBs (there are several, handling different page sizes). On a hit the MMU returns the translated physical address; otherwise we miss and trap. The trap handler uses the info you quote and a bunch of trickery to check the TSB for a hit - the TSB is a per-hat cache of recently-used translations, and if we can hit here we avoid walking the backend hash. Note that the TLB miss handling code cannot itself incur another TLB miss, which is why it performs lookups using physical addresses.

If we miss in the TSB then we have to go to the backend hash and fill from there. That hash is designed to be highly scalable and to avoid contention. If we miss in the hash then there is no current translation for this VA in the given hat (address space), so we pagefault to create one (or fault the access).

> I'm wondering how TSB accesses are monitored for multithreaded programs
> running on multicore machines. Suppose that in a 2-core system, a
> parallel program's workload is split into two threads, each executing
> on one core. If both cores experience TLB misses around the same time,
> can they access the TSB simultaneously (they would eventually need to
> access the same TSB since their PIDs are the same)? Or is the TSB
> locked by whichever core misses first, serializing lookup?

Updates to the TSB are made with atomic instructions. Thereafter the rule is that if you can see a valid entry in the TSB then you can use it - no need for locking. If we have to go to the backend hash then we hash to a single hash bucket and we lock that for the brief moment it takes to scan down the list of hmeblks on that bucket.

The cost comes in destroying mapping entries - you need to do this on all CPUs that have ever used this address space. So we cross-trap to those CPUs and have them all remove the entry from the TLB and invalidate the TSB entry, in some tricky order dance. There has been a lot of change and optimization around this area since I last knew this code.

To make some of this concrete, a few heavily simplified sketches follow (user-level C, invented names throughout - nothing here is the actual implementation).
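First, the shape of the miss path once the TLB has already missed - probe the TSB, fall back to a locked walk of a single hash bucket, and finally pagefault. The real TSB probe is hand-written assembler working on physical addresses; this just shows the logic:

    #include <pthread.h>
    #include <stddef.h>
    #include <stdint.h>

    #define TSB_ENTRIES  512        /* invented sizes, illustration only */
    #define HASH_BUCKETS 4096
    #define PAGESHIFT    13         /* 8K base pages */

    typedef struct tte {            /* one cached translation */
            uint64_t tag;           /* virtual tag: va >> PAGESHIFT */
            uint64_t pfn;           /* physical frame number */
            int      valid;
    } tte_t;

    struct hmeblk {                 /* element of a backend hash chain */
            uint64_t       tag;
            uint64_t       pfn;
            struct hmeblk *next;
    };

    struct hme_bucket {
            pthread_mutex_t lock;   /* per-bucket lock, held only briefly */
            struct hmeblk  *chain;
    };

    struct hat {                    /* one per address space */
            tte_t              tsb[TSB_ENTRIES];
            struct hme_bucket *hash;    /* uhme_hash/khme_hash stand-in */
    };

    /* TSB probe: one indexed load and a tag compare, no lock at all */
    static tte_t *
    tsb_lookup(struct hat *hat, uintptr_t va)
    {
            tte_t *t = &hat->tsb[(va >> PAGESHIFT) % TSB_ENTRIES];

            return ((t->valid && t->tag == va >> PAGESHIFT) ? t : NULL);
    }

    /* Backend hash: lock exactly one bucket, scan its hmeblk chain */
    static struct hmeblk *
    hash_lookup(struct hat *hat, uintptr_t va)
    {
            struct hme_bucket *b =
                &hat->hash[(va >> PAGESHIFT) % HASH_BUCKETS];
            struct hmeblk *h;

            pthread_mutex_lock(&b->lock);
            for (h = b->chain; h != NULL; h = h->next)
                    if (h->tag == va >> PAGESHIFT)
                            break;
            pthread_mutex_unlock(&b->lock);
            return (h);
    }

    /* The trap handler's job, in miss order: TSB, then hash, then fault */
    static uint64_t
    resolve(struct hat *hat, uintptr_t va)
    {
            tte_t *t = tsb_lookup(hat, va);
            struct hmeblk *h;

            if (t != NULL)
                    return (t->pfn);                /* TSB hit */

            if ((h = hash_lookup(hat, va)) != NULL) {
                    /* hash hit: refill the TSB (really done with the
                     * atomic protocol shown in the next sketch) */
                    t = &hat->tsb[(va >> PAGESHIFT) % TSB_ENTRIES];
                    t->tag = va >> PAGESHIFT;
                    t->pfn = h->pfn;
                    t->valid = 1;
                    return (h->pfn);
            }
            return (UINT64_MAX);    /* no translation: pagefault path */
    }

Note the asymmetry: the TSB probe takes no lock at all, and the hash walk locks only its one bucket, so your two cores missing at the same time contend only if they happen to hash to the same bucket.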
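Second, the "if you can see a valid entry you can use it" rule. C11 atomics stand in here for the SPARC atomic stores and membars; the point is the ordering - the tag is invalidated first and republished last, so a half-written entry simply looks like a miss, which is always safe:

    #include <stdatomic.h>
    #include <stdint.h>

    #define TAG_INVALID UINT64_MAX          /* invented reserved tag */

    struct tsbe {
            _Atomic uint64_t tag;           /* published last by writers */
            _Atomic uint64_t data;          /* the translation itself */
    };

    /* Writer: hide the entry, store the data, publish the tag last */
    static void
    tsbe_update(struct tsbe *e, uint64_t tag, uint64_t data)
    {
            atomic_store(&e->tag, TAG_INVALID);     /* readers miss now */
            atomic_store(&e->data, data);
            atomic_store(&e->tag, tag);             /* entry goes live */
    }

    /* Reader: trust the entry only if the tag matches both before and
     * after reading the data; any mismatch is treated as a TSB miss
     * and we just fall through to the backend hash */
    static int
    tsbe_read(struct tsbe *e, uint64_t tag, uint64_t *data)
    {
            if (atomic_load(&e->tag) != tag)
                    return (0);
            *data = atomic_load(&e->data);
            return (atomic_load(&e->tag) == tag);
    }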
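Finally, the expensive side - teardown. The actual order dance is more subtle than this, but the shape is: make the TSB entry invisible so nobody can refill a TLB from it, cross-trap every CPU that has ever run this hat to flush its TLBs, and only then free the hash entry:

    #include <stdatomic.h>
    #include <stdint.h>

    #define NCPU        64
    #define TAG_INVALID UINT64_MAX

    struct tsbe { _Atomic uint64_t tag; _Atomic uint64_t data; };

    /* Stand-in for the cross-trap primitive: each bit of cpuset names
     * a CPU that must run a demap handler before we may proceed */
    static void
    xtrap_demap(uint64_t cpuset, uintptr_t va)
    {
            for (int cpu = 0; cpu < NCPU; cpu++) {
                    if (cpuset & (1ULL << cpu)) {
                            /* the remote CPU flushes va from its own
                             * TLBs here */
                            (void)va;
                    }
            }
    }

    static void
    demap_page(struct tsbe *e, uintptr_t va, uint64_t cpus_ever_ran)
    {
            /* 1. kill the TSB entry so no CPU can refill a TLB from it */
            atomic_store(&e->tag, TAG_INVALID);

            /* 2. cross-trap every CPU that has ever used this hat */
            xtrap_demap(cpus_ever_ran, va);

            /* 3. only now is it safe to unlink and free the hmeblk from
             * the backend hash, under the bucket lock (not shown) */
    }

Gavin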