Re: Determining CPU features / cache organization from userland

2003-10-15 Thread Joseph Koshy


Hi Bruce,

A few thoughts on your API:

1) Rather than naming the structs "l1", "l2" etc., it may be more
   orthogonal to use an array of cache entries like so:

   struct entry { ... } entries[MAX_ENTRIES];

   where MAX_ENTRIES would be, say, 8.

2) We could pass information back about whether the cache is write-back or
   write-through and whether it uses write-allocate.  In some CPUs (e.g. 
   the AMD K6-2) this aspect of the cache is programmable at boot time.

3) Have a bit indicating whether the cache is indexed virtually or physically.
   This allows us to describe TLBs and caches using the same descriptor; the
   MIPS R4K used virtually addressed L1 caches, IIRC.

4) For caches and TLBs that support variable line/page sizes, we would 
   be reporting the currently programmed size (the kernel knows this
   information) I guess?

The 'type' field of the cache descriptor could be a `u_int32_t' or
`u_int16_t', allocated out as follows:

kind:           tlb/cache/other                 2 bits
addressing:     virtual/physical/unknown        2 bits
mode:           data/instruction/both/unknown   2 bits
distance:       L0/L1/L2/whatever               3 bits
on-write-hit:   write-back/write-thru/unknown   2 bits
on-write-miss:  write-allocate/unknown          2 bits
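
As a rough illustration of the layout above, the fields could be packed into
a 16-bit word with shift/width pairs. This is only a sketch: the CT_* names,
the bit positions and the numeric value of each enumerator are assumptions
made up for the example and are not part of the proposal. Only the field
widths come from the table (13 bits in total, so 16 bits are enough).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions; only the widths come from the table. */
enum {
	CT_KIND_SHIFT  = 0,  CT_KIND_BITS  = 2,	/* tlb/cache/other */
	CT_ADDR_SHIFT  = 2,  CT_ADDR_BITS  = 2,	/* virtual/physical/unknown */
	CT_MODE_SHIFT  = 4,  CT_MODE_BITS  = 2,	/* data/instruction/both/unknown */
	CT_DIST_SHIFT  = 6,  CT_DIST_BITS  = 3,	/* L0/L1/L2/... */
	CT_WHIT_SHIFT  = 9,  CT_WHIT_BITS  = 2,	/* write-back/write-thru/unknown */
	CT_WMISS_SHIFT = 11, CT_WMISS_BITS = 2	/* write-allocate/unknown */
};

/* Hypothetical encodings for three of the fields. */
enum ct_kind { CT_KIND_TLB, CT_KIND_CACHE, CT_KIND_OTHER };
enum ct_addr { CT_ADDR_VIRTUAL, CT_ADDR_PHYSICAL, CT_ADDR_UNKNOWN };
enum ct_mode { CT_MODE_DATA, CT_MODE_INSN, CT_MODE_BOTH, CT_MODE_UNKNOWN };

/* Return 'type' with the field at (shift, bits) replaced by 'value'. */
static inline uint16_t
ct_put(uint16_t type, int shift, int bits, unsigned value)
{
	uint16_t mask;

	assert(bits > 0 && shift >= 0 && shift + bits <= 16);
	mask = (uint16_t)(((1u << bits) - 1u) << shift);
	return (uint16_t)((type & ~mask) | ((value << shift) & mask));
}

/* Extract the field at (shift, bits) from 'type'. */
static inline unsigned
ct_get(uint16_t type, int shift, int bits)
{
	return (type >> shift) & ((1u << bits) - 1u);
}
```

A unified L2 cache would then be described by setting the kind field to
CT_KIND_CACHE, the mode field to CT_MODE_BOTH and the distance field to 2.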

Another suggestion I have is that the sysctl return:

int n_entries;
struct entry entries[n_entries];

since it isn't clear how many levels of cache and how many kinds
of TLBs are going to be used in the systems of tomorrow.
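
To make the count-plus-array idea concrete, here is a minimal sketch of how
a consumer might validate such a variable-length reply before walking it.
The struct cacheinfo wrapper, the members of struct entry and both helper
names are hypothetical; the thread only specifies "int n_entries" followed
by the entries themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct entry {			/* hypothetical descriptor contents */
	uint16_t type;
	uint32_t linesize;
	uint32_t size;
};

struct cacheinfo {		/* hypothetical sysctl payload */
	int          n_entries;
	struct entry entries[];	/* C99 flexible array member */
};

/* Bytes needed for a payload carrying n entries. */
static inline size_t
cacheinfo_size(int n)
{
	assert(n >= 0);
	return offsetof(struct cacheinfo, entries) +
	    (size_t)n * sizeof(struct entry);
}

/*
 * Return the payload if 'len' bytes at 'buf' are enough to hold the
 * number of entries it claims to carry, otherwise NULL.
 */
static inline const struct cacheinfo *
cacheinfo_check(const void *buf, size_t len)
{
	const struct cacheinfo *ci = buf;

	if (len < offsetof(struct cacheinfo, entries))
		return NULL;
	if (ci->n_entries < 0 || len < cacheinfo_size(ci->n_entries))
		return NULL;
	return ci;
}
```

With this shape a caller does not need a compile-time MAX_ENTRIES; it can
size its buffer from the length the kernel reports and then check it.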

Regards,
Koshy
<[EMAIL PROTECTED]>

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Determining CPU features / cache organization from userland

2003-10-14 Thread John-Mark Gurney
Bruce M Simpson wrote this message on Mon, Oct 13, 2003 at 20:32 +0100:
> i386 pc98 amd64
> ---------------
> Action: Add code to identcpu.c to fill out hw_cacheinfo.
> 
> Cache discovery: Extended CPUID.
>  Static tables if 486-class machine. No cache on 386.
> TLB discovery: Extended CPUID.
>  Static tables if 486-class machine. No cache on 386.

Not to be a stickler, but I do have an Am386DX-40 that has 128KB of cache
on the motherboard.  It's the same style of SRAMs that most 486s have
on their boards.

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 "All that I will do, has been done, All that I have, has not."


Re: Determining CPU features / cache organization from userland

2003-10-13 Thread Bob Bishop
Hi,

ISTR that AMD 486 had different cache arrangements from Intel. Just threw 
one out - I'll see if I can find another around here.

--
Bob Bishop  +44 (0)118 977 4017
[EMAIL PROTECTED]   fax +44 (0)118 989 4254


Re: Determining CPU features / cache organization from userland

2003-10-13 Thread Bruce M Simpson
All,

Here are detailed design documents for determining cache and TLB
geometry across our currently supported processor architectures,
with recommendations outlined for implementation.

What I haven't addressed yet is how indirect consumers of the API might
use it, e.g. mutex consumers vs. UMA, in the context of allocating
cache-aligned mutexes from a mutex pool.

Please let me know your thoughts.

BMS

Detailed design for cache/tlb geometry discovery


all
---
Review NetBSD's uvm_page_recolor() viz applicability to FreeBSD VM/UMA.

alpha
-----
Action: Add code to machdep.c in identifycpu() to fill out hw_cacheinfo.

Cache discovery: Static tables keyed on specific CPU model.
TLB discovery: Static tables keyed on specific CPU model.

Cache heuristic:
 8KB L1 Split Direct Mapped (21064)
 2MB L2 Unified Direct Mapped (21064)
 All CPUs below 21264 have a 32-byte L1 line size.
 21264 (EV6) has a 64-byte L1 line size.
 Optional L3 cache.
TLB heuristic:
 ITLB 8KB page 8 lines, 4MB page 4 lines (21064)
 DTLB 32 lines, all page sizes, fully associative. (21064)

ia64
----
Action: Add code to machdep.c in identifycpu() to fill out hw_cacheinfo.
Review Linux's pal.h and palinfo.c files.

Cache discovery: Call the platform functions PAL_CACHE_SUMMARY and
 PAL_CACHE_INFO to get this information.
TLB discovery: Static tables keyed on specific CPU model.

Cache heuristic:
 L1 typically split 4-way set-associative 16KB,
 L2 256KB unified, L3 3MB-6MB unified.
 Line size isn't defined by the architecture.
TLB heuristic:
 L1 TLB, split, data/instruction 32 entries each, fully associative
 L2 TLB, split, data/instruction 128 entries each, fully associative

i386 pc98 amd64
---------------
Action: Add code to identcpu.c to fill out hw_cacheinfo.

Cache discovery: Extended CPUID.
 Static tables if 486-class machine. No cache on 386.
TLB discovery: Extended CPUID.
 Static tables if 486-class machine. No cache on 386.

Cache heuristic (Intel): L1: 4-way, 32 bytes/line
Cache heuristic (AMD): L2: 8-way, 64 bytes/line
TLB heuristic (Intel):
 4KB Code: 32 entries, 4-way, LRU
 4MB Code: 2 entries, Fully associative, LRU
 4KB Data: 64 entries, 4-way, LRU
 4MB Data: 8 entries, 4-way, LRU
TLB heuristic (AMD):
 4KB L1 Code: 16 entries, Fully associative, LRU
 4MB/2MB L1 Code: 8 entries, Fully associative, LRU
 4KB L1 Data: 24/32 entries, Fully associative, LRU
 4MB/2MB L1 Data: 8 entries, 4-way, LRU
 4KB L2 Code: 256 entries, 4-way, LRU
 4KB L2 Data: 256 entries, 4-way, LRU

(That's 6 distinct TLBs to deal with on AMD-based i386 architectures).

powerpc
-------
Action: Adapt from NetBSD as appropriate.

Cache discovery:
 Open Firmware on CHRP if available.
 Static tables keyed on specific CPU model.
TLB discovery:
 Open Firmware on CHRP if available.
 Static tables keyed on specific CPU model.

Cache heuristic:
  L1 line size: 32 bytes across family.
   Pre-G5: 32KB/32KB Split, 8-way
   G5: 64KB/32KB Split, 1-way
  L2 line size: 32/64/128 bytes,
TLB heuristic:
 PPC 601e:
  4KB Instruction TLB, 4 entries, most recently used translations
  UTLB, 256 entries, 2-way set associative, software selectable block size

OFW properties:
 i-cache-size i-cache-sets i-cache-block-size
 d-cache-size d-cache-sets d-cache-block-size
 tlb-size tlb-sets l2-cache

[*] CHRP only

mips
----
Action: Adapt from NetBSD as appropriate.

Cache discovery: Static tables keyed on specific CPU model.
TLB discovery: MIPS32/MIPS64 Privileged Resource Architecture registers
Cache heuristic: Split/unified L1/L2, unified L3.
TLB heuristic: 16KB page size, 64 entries, fully associative (R1)

sparc64
-------
Action:
 Adapt existing code in cache.c to fill out and use hw_cacheinfo.
 Review assembly code, particularly that which abuses the TLB.
 Work closely with jake@ to avoid code churn.

Cache discovery: Open Firmware.
TLB discovery: Open Firmware.
Cache heuristic: Split L1, Unified L2.
TLB heuristic: Split L1 TLB. Fully Associative. NLU. 64 lines each.

OFW properties:
icache-size icache-line-size icache-associativity
dcache-size dcache-line-size dcache-associativity
ecache-size ecache-line-size ecache-associativity
#dtlb-entries #itlb-entries

Maintain information about cache and TLB geometry in an MI structure.
The abstraction is intended to reflect current and future machine
architectures.

It is expected that the contents of these structures may not change over
the lifetime of the kernel. Keeping this information in a structure doesn't
significantly increase the cost of retrieving it from userland anyway.

Userland consumers such as thread libraries and memory allocators should
take a copy of this structure upon initialization. Kernel consumers
may feel free to cache the information in local variables as they like.

TLBs are 'caches' for virtual address lookups. Like data and instruction
caches, they may employ set associativity to reduce the risk of
unnecessary cache flushes/misses in multiprogramming envir

Re: Determining CPU features / cache organization from userland

2003-10-13 Thread Sean Winn
Peter Jeremy wrote:
> On Sun, Oct 12, 2003 at 08:57:52PM +0100, Bruce M Simpson wrote:
> >[ Andrew: Perhaps you can shed some light on how the necessary information
> >can be gathered on Alpha? My search was incomplete and I could not find
> >a reliable source for DEC's development manuals. ]
>
> L1 cache information is in the CPU datasheets.  I don't know of a
> summary across the whole Alpha family.  The datasheets can be
> (nominally) found at:
> http://h18000.www1.hp.com/products/software/alpha-tools/documentation/current/chip-docs.html
>
> Last time I went digging, some of the links didn't work but if you
> look at the links and rummage around the FTP site, the information was
> all there (and other material that wasn't referenced in the HTML pages).
>
> >sysctl is a good interface for retrieving this information as it doesn't
> >change during the lifetime of the kernel, and it is small. sysctl is already
> >invoked from within libc to retrieve information in this way.
>
> I agree.  sysctl would appear to be the best interface.
>
> >alpha
> >-----
> >Cache discovery? Static.
>
> AFAIK, there's no PALcode interface, unfortunately.
>
> >i386 pc98 amd64
> >---------------
> >Cache discovery? CPUID.
> >Earlier chips which don't support it probably don't have a cache,
> >or aren't worth supporting.
>
> 80386 has no on-chip cache.
> Intel i486 has 8KB _unified_ 4-way, 16 bytes/line L1.  Cache alignment has
> a significant effect and gcc defaults to 16-byte alignment on -m486.

That only covers the DX, SX, DX2, SX2 and GX; the DX4 has a 16KB cache, and
it may be write-through or write-back.

However, I believe the DX4s have CPUID so detecting them should be simple.

> ports/benchmarks/lmbench includes tools that can experimentally
> determine the cache configuration - though not quickly/efficiently
> enough to form part of the boot.
>
> Peter


Re: Determining CPU features / cache organization from userland

2003-10-13 Thread Peter Jeremy
On Sun, Oct 12, 2003 at 08:57:52PM +0100, Bruce M Simpson wrote:
>[ Andrew: Perhaps you can shed some light on how the necessary information
>can be gathered on Alpha? My search was incomplete and I could not find
>a reliable source for DEC's development manuals. ]

L1 cache information is in the CPU datasheets.  I don't know of a
summary across the whole Alpha family.  The datasheets can be
(nominally) found at:
http://h18000.www1.hp.com/products/software/alpha-tools/documentation/current/chip-docs.html

Last time I went digging, some of the links didn't work but if you
look at the links and rummage around the FTP site, the information was
all there (and other material that wasn't referenced in the HTML pages).

>sysctl is a good interface for retrieving this information as it doesn't
>change during the lifetime of the kernel, and it is small. sysctl is already
>invoked from within libc to retrieve information in this way.

I agree.  sysctl would appear to be the best interface.

>alpha
>-----
>Cache discovery? Static.

AFAIK, there's no PALcode interface, unfortunately.

>i386 pc98 amd64
>---------------
>Cache discovery? CPUID.
>Earlier chips which don't support it probably don't have a cache,
>or aren't worth supporting.

80386 has no on-chip cache.
Intel i486 has 8KB _unified_ 4-way, 16 bytes/line L1.  Cache alignment has
a significant effect and gcc defaults to 16-byte alignment on -m486.
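
To put numbers on the i486 figures above: 8KB / (16 bytes * 4 ways) gives
128 sets, and any object that is not 16-byte aligned can end up straddling
two lines. The sketch below works that out; struct geom and the geom_*
helpers are names invented for the example, and the address-to-set mapping
shown is the usual modulo indexing, assumed here rather than taken from a
datasheet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct geom {			/* hypothetical geometry record */
	uint32_t size;		/* total bytes */
	uint32_t linesize;	/* bytes per line */
	uint32_t ways;		/* lines per set */
};

/* Number of sets: size / (linesize * ways). */
static inline uint32_t
geom_sets(const struct geom *g)
{
	assert(g->linesize != 0 && g->ways != 0);
	return g->size / (g->linesize * g->ways);
}

/* Set an address falls into, assuming plain modulo indexing. */
static inline uint32_t
geom_set_index(const struct geom *g, uintptr_t addr)
{
	return (uint32_t)((addr / g->linesize) % geom_sets(g));
}

/* Non-zero if [addr, addr + len) touches more than one cache line. */
static inline int
geom_straddles(const struct geom *g, uintptr_t addr, size_t len)
{
	return len != 0 &&
	    addr / g->linesize != (addr + len - 1) / g->linesize;
}
```

For the i486 geometry { 8192, 16, 4 }, an 8-byte object at an address ending
in 0xc spans two lines, while a 16-byte object on a 16-byte boundary stays
within one.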

ports/benchmarks/lmbench includes tools that can experimentally
determine the cache configuration - though not quickly/efficiently
enough to form part of the boot.

Peter


Re: Determining CPU features / cache organization from userland

2003-10-12 Thread Bruce M Simpson
All,

I came up with the attached text file today to summarize some of my
findings, after looking at various open source trees to see how they
handle run-time cache geometry detection.

Many will find it ironic that i386 is the easiest platform to deal with.

[ Andrew: Perhaps you can shed some light on how the necessary information
can be gathered on Alpha? My search was incomplete and I could not find
a reliable source for DEC's development manuals. ]

Jeff Roberson suggested I adopt NetBSD's API; however, on further
examination it's clear that NetBSD's approach isn't consistent across
all platforms. Darwin takes a similar approach, but it is perhaps too
PowerPC-centric.

sysctl is a good interface for retrieving this information as it doesn't
change during the lifetime of the kernel, and it is small. sysctl is already
invoked from within libc to retrieve information in this way.

glibc's approach to dealing with situations where knowledge of the cache
line size is needed is a bit fractious - it retrieves the information from
an 'aux vector' passed to glibc at startup.

I think threading libraries should seriously consider becoming consumers of
the API once it's finalized. Mutex alignment on cache line boundaries is
desirable for userland applications too. However, phk malloc would need to
be changed in order to support this specific form of aligned allocation.
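
For illustration, a userland library could get a line-aligned mutex today
without touching malloc by going through posix_memalign(3). The sketch
below assumes the line size has already been obtained from somewhere (the
API under discussion does not exist yet), and mutex_alloc_aligned() is a
name made up for the example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate and initialize a mutex that starts on a cache line boundary
 * and occupies whole lines.  Returns NULL on failure.  Release it with
 * pthread_mutex_destroy() followed by free().
 */
static inline pthread_mutex_t *
mutex_alloc_aligned(size_t linesize)
{
	void *p;
	size_t sz;

	/* posix_memalign() wants a power of two >= sizeof(void *). */
	if (linesize < sizeof(void *) || (linesize & (linesize - 1)) != 0)
		return NULL;
	/* Round up so no other allocation shares the mutex's last line. */
	sz = (sizeof(pthread_mutex_t) + linesize - 1) / linesize * linesize;
	if (posix_memalign(&p, linesize, sz) != 0)
		return NULL;
	assert((uintptr_t)p % linesize == 0);
	if (pthread_mutex_init(p, NULL) != 0) {
		free(p);
		return NULL;
	}
	return p;
}
```

Padding each mutex out to a full line trades memory for the guarantee that
two hot locks never share a line, which is why a dedicated pool or zone may
be the better home for such allocations.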

Perhaps a separate pool or zone could be used for this kind of allocation?
This becomes more important and timely when one considers the I/O alignment
restrictions we've encountered. Some applications may need to align their
buffers on arbitrary boundaries to suit devices, too.

BMS

all
---
NetBSD cache information API(s) are not consistent across platforms.

alpha
-----
Cache discovery? Static.
21064, 21064A, 21066, 21066A, 21164 all have line sizes of 32 bytes.
The 21264 has a 64-byte line size.
21364: L1 split, 64KB each, 2-way set-associative, 
Virtual caches can be implemented using PALcode, but this is
probably more of a curiosity than anything else.

ia64
----
Cache discovery? Call PAL_CACHE_INFO, I think.
No documentation on how to do this at this time.
I have emailed [EMAIL PROTECTED] asking for advice.

i386 pc98 amd64
---------------
Cache discovery? CPUID.
Earlier chips which don't support it probably don't have a cache,
or aren't worth supporting.

General rule for x86: split L1, unified L2, optional unified L3.
General rule for Intel P5: 2-way, 32 bytes/line
General rule for Intel MMX and up: 4-way, 32 bytes/line
PPro doesn't have L3.
The newer cores have different cache geometry.

powerpc
-------
Cache line discovery? Static.
Many core variants.
I have not seen any runtime code for this.
The POWER clcs instruction is obsolete.

OpenDarwin assumes 32 bytes. It has hooks for discovering the
cache geometry at runtime but these are not used.

NetBSD statically initializes this information according to the
discovered CPU model in use, which is the way to go.
NetBSD tells uvm to recolor the page queues if required.

Linux uses static #define's from IBM people, except in the case
of ppc64, which is strikingly similar to the OpenDarwin code
except it actually talks to the open firmware.

Open Firmware on CHRP should however provide the following
for each cpu device node configured in the system:
i-cache-size i-cache-sets i-cache-block-size
d-cache-size d-cache-sets d-cache-block-size
tlb-size tlb-sets l2-cache
All are integers except for l2-cache which is the address of an l2-cache
device node if the system found one.

mips
----
The NetBSD MIPS code for dealing with cache geometry
was recently updated.
MIPS caches may be split/unified at L1/L2 and unified at L3.
Cache detection code is quite voluminous. Swipe NetBSD's
if FreeBSD/mips ever kicks off.
Many, many core variants.

sparc64
-------
Cache line discovery? Performed by Open Firmware.

Open Firmware property names used are ever so slightly different from Apple's.
icache-size icache-line-size icache-associativity
dcache-size dcache-line-size dcache-associativity
ecache-size ecache-line-size ecache-associativity

Already handled within cache.c, but assembly stubs *expect* this
information in a certain format.  Specifically they need to see
the data cache/instruction cache sizes and line sizes.

General rule: Split L1, Unified L2.
Cores: Spitfire/Blackbird/Cheetah


Re: Determining CPU features / cache organization from userland

2003-10-11 Thread Bruce M Simpson
On Sat, Oct 11, 2003 at 08:12:31PM +1000, Peter Jeremy wrote:
> Out of interest, do any systems other than the big-iron Alphas use L3
> cache?  A quick look at the code suggests that only L2 is coloured.

L3 cache is present on many MIPS and Pentium Xeon systems, as well as
PowerPC G4.

> Do any systems use split L2 (or L3) caches?  And how do you define the
> weird micro-instruction cache used in the P4?

I believe certain models of MIPS may have split L2. Most L3 caches I
believe will be unified.

> How do you distinguish between a direct-mapped and fully-associative
> cache?  (Do any current CPUs have fully-associative caches?)  For
> set-associative caches, is it worth identifying and reporting the
> replacement algorithm (eg random, LRU or pseudo-LRU)

Add a sysctl type. enum cachetype { notpresent, direct, setassoc,
fullyassoc }.  Only look at sets if cache type set accordingly.
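
A minimal sketch of how that enum might be filled in and consumed. The enum
is the one suggested above; cache_classify() and cache_sets() are
hypothetical helpers, and deriving the type from a (lines, ways) pair is an
assumption about how a backend might do it.

```c
#include <assert.h>
#include <stddef.h>

enum cachetype { notpresent, direct, setassoc, fullyassoc };

/*
 * Classify a cache from its total number of lines and the number of
 * lines per set.  A one-line cache is reported as direct-mapped.
 */
static inline enum cachetype
cache_classify(unsigned lines, unsigned ways)
{
	if (lines == 0 || ways == 0)
		return notpresent;
	if (ways == 1)
		return direct;
	if (ways >= lines)
		return fullyassoc;
	return setassoc;
}

/*
 * The set count is only meaningful for set-associative caches, so
 * refuse to report one for any other type.  Returns 0 on success.
 */
static inline int
cache_sets(enum cachetype t, unsigned lines, unsigned ways, unsigned *sets)
{
	assert(sets != NULL);
	if (t != setassoc || ways == 0)
		return -1;
	*sets = lines / ways;
	return 0;
}
```

An 8KB cache with 16-byte lines has 512 lines; at 4 ways it classifies as
setassoc with 128 sets, while a 32-entry fully associative TLB classifies as
fullyassoc and has no set count to report.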

[TLB]
> This is possibly more useful on the RISC CPUs where the TLB is managed
> in firmware (eg Alpha PALcode) so TLB misses are expensive.  Note that
> at least the Alpha has multiple sets of TLB registers for different
> mapping types and sizes.  The number of registers in each set varies
> between different AXP generations (though I think the sets remain the
> same).

I know a number of individuals and organizations involved with FreeBSD pay
very close attention to this, to the point of doing TLB profiling to ensure
they don't churn too much in time-critical code, particularly on i386 derived
platforms. I think knowledge of TLB geometry is valuable everywhere, but more
so in the cases you point out. sparc64 has software-managed TLB.

[on non-symmetric SMP processor clock-speeds and cache organisation]
> Whether FreeBSD wants to support this market is another issue.

We'll build that bridge when we come to it.

BMS


Re: Determining CPU features / cache organization from userland

2003-10-11 Thread Peter Jeremy
On Sat, Oct 11, 2003 at 09:27:11AM +0100, Bruce M Simpson wrote:
>OS X definitions considered too PowerPC centric. I think the best way
>to handle all cases is thus:-
>
> - Support 3 levels of cache.

Out of interest, do any systems other than the big-iron Alphas use L3
cache?  A quick look at the code suggests that only L2 is coloured.

> - Each level may be unified or split between code and data
>   not-quite-Von-Neumann-style.

Do any systems use split L2 (or L3) caches?  And how do you define the
weird micro-instruction cache used in the P4?

> - Allow explicit retrieval of this info keyed on the cache you're
>   interested in. This means: hw.cache.lN.(linesize|lines|sets)

How do you distinguish between a direct-mapped and fully-associative
cache?  (Do any current CPUs have fully-associative caches?)  For
set-associative caches, is it worth identifying and reporting the
replacement algorithm (eg random, LRU or pseudo-LRU)

> - Do similar for the TLB insofar as we can return information about
>   the chip's TLB. I know for example from talking to peter@ that
>   the Opteron is quite a different beast (ASNs, flush filter, etc).

This is possibly more useful on the RISC CPUs where the TLB is managed
in firmware (eg Alpha PALcode) so TLB misses are expensive.  Note that
at least the Alpha has multiple sets of TLB registers for different
mapping types and sizes.  The number of registers in each set varies
between different AXP generations (though I think the sets remain the
same).

> - Assume that all CPUs have identical characteristics in an SMP system.
>   Trying to assume otherwise is pointless. People should be using matched
>   chips anyway.

HP AlphaServer ES47 (and ES45 from memory) allow different speed CPUs
in an SMP system.  Some of the high-end SPARCservers probably do as
well.  This probably does make sense when you're talking about a
system which might be expanded over its lifetime - and the slow CPUs
that came with the system initially might no longer be available but
you need more CPUs and can't justify replacing the existing ones.

Whether FreeBSD wants to support this market is another issue.

Peter


Re: Determining CPU features / cache organization from userland

2003-10-11 Thread Bruce M Simpson
On Sat, Oct 11, 2003 at 01:58:27PM +1000, Peter Jeremy wrote:
> >If you do this,  it may make sense to use the same names as MacOSX.
> 
> What if your hardware has different linesizes for different caches?

I noticed whilst peering in Apple Developer Notes that G5 has 128 byte
cache line size, and this screws up mutexes bigtime. (!!)

OS X definitions considered too PowerPC centric. I think the best way
to handle all cases is thus:-

 - Support 3 levels of cache.
 - Each level may be unified or split between code and data
   not-quite-Von-Neumann-style.
 - Allow explicit retrieval of this info keyed on the cache you're
   interested in. This means: hw.cache.lN.(linesize|lines|sets)
 - Do similar for the TLB insofar as we can return information about
   the chip's TLB. I know for example from talking to peter@ that
   the Opteron is quite a different beast (ASNs, flush filter, etc).
 - Assume that all CPUs have identical characteristics in an SMP system.
   Trying to assume otherwise is pointless. People should be using matched
   chips anyway.
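
As a sketch of the naming scheme in the list above, a consumer could build
the OID name for a given level and field and then hand it to
sysctlbyname(3). The hw.cache.lN.* names are only proposed in this thread
and are not known to exist in any kernel; cache_oid_name() is a helper
invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "hw.cache.l<level>.<field>" for level 1..3 and a field of
 * "linesize", "lines" or "sets".  Returns 0 on success, -1 if the
 * arguments are out of range or the name does not fit in 'buf'.
 */
static inline int
cache_oid_name(char *buf, size_t buflen, int level, const char *field)
{
	static const char *const fields[] = { "linesize", "lines", "sets" };
	const size_t nfields = sizeof(fields) / sizeof(fields[0]);
	size_t i;
	int n;

	assert(buf != NULL && field != NULL);
	if (level < 1 || level > 3)
		return -1;
	for (i = 0; i < nfields; i++)
		if (strcmp(field, fields[i]) == 0)
			break;
	if (i == nfields)
		return -1;
	n = snprintf(buf, buflen, "hw.cache.l%d.%s", level, field);
	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

On a system that implemented the proposal, the resulting string would be
passed as the first argument of sysctlbyname(name, &value, &len, NULL, 0)
with len preset to sizeof(value).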

BMS


Re: Determining CPU features / cache organization from userland

2003-10-10 Thread Peter Jeremy
On Fri, Oct 10, 2003 at 03:09:47PM -0400, Andrew Gallatin wrote:
>
>Bruce M Simpson writes:
> > I've been thinking we should definitely make the cache organization
> > info available via sysctl. I am thinking we should do this to make
> > the UMA_ALIGN_CACHE definition mean something...
>
>If you do this,  it may make sense to use the same names as MacOSX.
>
>g51% sysctl hw | grep cache
>hw.cachelinesize: 128
>hw.l1icachesize: 65536
>hw.l1dcachesize: 32768
>hw.l2cachesize: 524288

What if your hardware has different linesizes for different caches?

Peter


Re: Determining CPU features / cache organization from userland

2003-10-10 Thread Bruce M Simpson
On Fri, Oct 10, 2003 at 03:09:47PM -0400, Andrew Gallatin wrote:
> Bruce M Simpson writes:
>  > I've been thinking we should definitely make the cache organization
>  > info available via sysctl. I am thinking we should do this to make
>  > the UMA_ALIGN_CACHE definition mean something...
> 
> If you do this,  it may make sense to use the same names as MacOSX.
> 
> Eg: 
> 
> g51% sysctl hw | grep cache
> hw.cachelinesize: 128
> hw.l1icachesize: 65536
> hw.l1dcachesize: 32768
> hw.l2cachesize: 524288

Er, that's weird, considering POWER has the CLCS instruction which is
intended to support variable cache line sizes. Don't POWER4 and POWER5
have a cache which is split in this way?

Also can we assume they are the same for all CPUs in an SMP system? I'd
like to think that that is the case.

BMS


Re: Determining CPU features / cache organization from userland

2003-10-10 Thread Andrew Gallatin

Bruce M Simpson writes:
 > I've been thinking we should definitely make the cache organization
 > info available via sysctl. I am thinking we should do this to make
 > the UMA_ALIGN_CACHE definition mean something...

If you do this,  it may make sense to use the same names as MacOSX.

Eg: 

g51% sysctl hw | grep cache
hw.cachelinesize: 128
hw.l1icachesize: 65536
hw.l1dcachesize: 32768
hw.l2cachesize: 524288


Drew


Re: Determining CPU features / cache organization from userland

2003-10-10 Thread Brian Reichert
On Fri, Oct 10, 2003 at 02:44:00PM +0100, Bruce M Simpson wrote:
> On Fri, Oct 10, 2003 at 03:36:40AM -0700, Joseph Koshy wrote:
> > I'm looking for ways that a userland program can determine the CPU
> > features available on an SMP machine -- processor model, stepping
> > numbers, supported features, cache organization etc.
> 
> "What Silby said" and have a look at the sysutils/x86info port.

Hey, cool, I'd never heard about this.

Just tried this, and got some weirdness.  Can I ask about it here,
or do I poke at the port maintainer?

> I've been thinking we should definitely make the cache organization
> info available via sysctl. I am thinking we should do this to make
> the UMA_ALIGN_CACHE definition mean something...
> 
> I will probably throw diffs Jeff's way soon for this but I'm recovering
> from a bit of a nasty cold right now.
> 
> BMS

-- 
Brian 'you Bastard' Reichert    <[EMAIL PROTECTED]>
37 Crystal Ave. #303            Daytime number: (603) 434-6842
Derry NH 03038-1713 USA         BSD admin/developer at large


Re: Determining CPU features / cache organization from userland

2003-10-10 Thread Bruce M Simpson
On Fri, Oct 10, 2003 at 03:36:40AM -0700, Joseph Koshy wrote:
> I'm looking for ways that a userland program can determine the CPU
> features available on an SMP machine -- processor model, stepping
> numbers, supported features, cache organization etc.

"What Silby said" and have a look at the sysutils/x86info port.

I've been thinking we should definitely make the cache organization
info available via sysctl. I am thinking we should do this to make
the UMA_ALIGN_CACHE definition mean something...

I will probably throw diffs Jeff's way soon for this but I'm recovering
from a bit of a nasty cold right now.

BMS


Re: Determining CPU features / cache organization from userland

2003-10-10 Thread Mike Silbersack

On Fri, 10 Oct 2003, Joseph Koshy wrote:

> Hi -hackers,
>
> I'm looking for ways that a userland program can determine the CPU
> features available on an SMP machine -- processor model, stepping
> numbers, supported features, cache organization etc.
>
> For example, on some x86 processors the CPUID instruction could be
> used to determine some of these parameters, but using this instruction
> in an SMP context is a little tricky since we do not know which CPU
> gets to execute the instruction.

At least in the Intel world, multiprocessor systems are _always_ supposed
to have matching processor steppings, so the reliability of the
information should be very good indeed.

Mike "Silby" Silbersack


Determining CPU features / cache organization from userland

2003-10-10 Thread Joseph Koshy


Hi -hackers,

I'm looking for ways that a userland program can determine the CPU
features available on an SMP machine -- processor model, stepping
numbers, supported features, cache organization etc.

For example, on some x86 processors the CPUID instruction could be
used to determine some of these parameters, but using this instruction
in an SMP context is a little tricky since we do not know which CPU 
gets to execute the instruction.
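
To illustrate the userland CPUID approach, here is a minimal sketch that
reads the leaf-0 vendor string. It assumes a GCC or Clang toolchain, whose
<cpuid.h> header provides __get_cpuid(); cpu_vendor() is a name made up for
the example. As noted above, the answer describes whichever CPU the thread
happens to be running on at that moment.

```c
#include <assert.h>
#include <string.h>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>	/* GCC/Clang helper header (toolchain assumption) */
#endif

/*
 * Copy the 12-character CPUID vendor string, NUL-terminated, into
 * 'vendor'.  Returns 1 on success, 0 if CPUID is not available (for
 * example on a non-x86 build).
 */
static inline int
cpu_vendor(char vendor[13])
{
	assert(vendor != NULL);
#if defined(__i386__) || defined(__x86_64__)
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
		return 0;
	/* The string is laid out across EBX, EDX, ECX in that order. */
	memcpy(vendor + 0, &ebx, 4);
	memcpy(vendor + 4, &edx, 4);
	memcpy(vendor + 8, &ecx, 4);
	vendor[12] = '\0';
	return 1;
#else
	return 0;
#endif
}
```

Higher leaves report model, stepping and feature bits in the same way, but
nothing here pins the thread to a processor, so on an SMP machine two calls
may be answered by different CPUs.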

Would you know of any existing APIs, in use in other OSes, for
retrieving this kind of information?

Regards,
Koshy
<[EMAIL PROTECTED]>