On 3/9/06, Eric Lowe <[EMAIL PROTECTED]> wrote:
> Hello,
>
> [...]
> > Comparing SF68k/SF15k with Niagara is problematic. The broken MMU design in
> > the US3/4 CPU models used in these machines is not able to use a
> > significant amount of 64k pages. If you still got a small performance win
> > there then this would prove that an all-64k kernel has significant
> > performance advantages over the stock version with its 8k "dwarf page" size.
>
> Please explain what you mean by "broken".

US3 only has one TLB set with 512 entries, and it handles 8k pages
only. US3+ improved on this by adding another TLB set with 512 entries
for 4M pages. Anything between these points (64k and 512k pages) was
ignored. Today this design shows its drawbacks, as "automatic" MPSS is
of only limited use on such CPU models.

> If you mean US-III, I would
> agree, as this chip only supported 8K pages in the "T8" (the only big 512
> entry) TLB. 64K performance was, well, problematic on that machine because
> all the entries fight for the fully associative T16 TLB which only has
> around 9 available unlocked entries. Not a pretty picture!
>
> For this reason the autoMPSS code is turned off on US-III platforms. We
> tried it, and it was miserable for some apps.
>
> On the other hand, US-III+, US-IIIi, and later chips support programming
> both large 512-entry data TLBs so there is no loss of associativity nor
> capacity, so 64K pages work great within the DTLB.

Ok - this is new information for me (Linux just assumes the 2nd TLB set
with 512 entries is for 4M pages).

> The performance
> analysis was, as I recall, done mostly using US-III+.

Did this analysis cover the case where dwarfpages (8k) are no longer
available to either the kernel or userland applications?

> The instruction TLB on some of these chips only supports 8K, but that has
> a straightforward workaround in the form of TTE synthesis/replication.
> US-IV+ supports 8K or 64K in the ITLB so you're set there.
>
> The fact that one of the dual large TLBs is always "hard wired" to 8K in
> Solaris today on US-III+/IV/IV+ is an artifact of the implementation and
> doesn't reflect any limitation of the hardware.
>
> > Could Sun get the project code released into Opensolaris, please? I agree
> > with both David Miller and Roland Mainz that a kernel which uses 64k pages
> > by default will have significant performance advantages over the kernel
> > which uses "dwarfpages". The 8k page size is a significant limitation.
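
To put rough numbers on the capacity argument, here is a back-of-the-envelope TLB reach calculation. It is only a sketch: it assumes reach is simply entries times page size, it ignores associativity conflicts and the split between kernel and user mappings, and the helper name `tlb_reach` is made up for illustration. The entry counts (512 per large TLB set, around 9 unlocked T16 entries) are the ones quoted in this thread.

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_reach(entries: int, page_size: int) -> int:
    """Bytes of address space that can be mapped without taking a TLB miss."""
    return entries * page_size


# One 512-entry TLB set, at the page sizes discussed in this thread.
reach_8k = tlb_reach(512, 8 * KIB)
reach_64k = tlb_reach(512, 64 * KIB)
reach_4m = tlb_reach(512, 4 * MIB)

# US-III case: 64k pages only fit in the ~9 unlocked T16 entries.
reach_t16_64k = tlb_reach(9, 64 * KIB)

print(f"512 x 8k  : {reach_8k // MIB} MiB")
print(f"512 x 64k : {reach_64k // MIB} MiB")
print(f"512 x 4M  : {reach_4m // MIB} MiB")
print(f"  9 x 64k : {reach_t16_64k // KIB} KiB")
```

Under these assumptions a 512-entry set covers 4 MiB with 8k pages and 32 MiB with 64k pages, while 64k pages confined to about 9 T16 entries cover well under 1 MiB.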
>
> I'm sure the archives are lying around somewhere.

Can Sun release the code? I'd be more than happy to write a project
proposal then. I assume we will receive some help from the HPC
community. The next generation of vector supercomputers may use
similarly large page sizes (>= 64k) as the minimum configurable value,
and Opensolaris may be a good testbed implementation - assuming we can
completely eradicate dwarfpages from kernel and userland.

> As I stated earlier, I'd be interested in a thorough analysis of how 64K
> base page size kernel stacks up against our current autoMPSS code on the
> T2000. If there was a platform where the TLB-vs-.* tradeoffs are obvious,
> Niagara would be it.

In the last twenty years the page size for SPARC CPUs has only doubled
(from 4k to 8k, both of which I call dwarfpages as they are no longer
the optimum - in fact they are far below the optimum), while disk and
memory throughput increased many times over. Even latency was reduced
by a factor larger than 8. One of the remaining factors is the default
page size, which did not increase to reflect the new conditions.

Holger
_______________________________________________
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org