Re: [PATCH] Reading deterministic cache parameters and exporting it in /sysfs

2005-03-18 Thread Dave Jones
On Fri, Mar 18, 2005 at 11:18:47AM -0800, Venkatesh Pallipadi wrote:
 > 
 > Here is the updated patch. 
 > 
 > I have separated out the changes related to 
 > (1) using new method to determine cache size in existing /proc/cpuinfo and
 > kernel boot messages (All but last hunk below)
 > (2) code to look at sharedness of the caches and store these details for
 > future uses inside kernel and also exporting the cache details in /sysfs
 > (last hunk in the patch)
 >   
 > Dave: Do you still feel having the cache details exported in /sysfs is a bad
 > idea? If yes, we can go ahead with the basic part of this patch (1 - above)
 > and look at (2) sometime later, as and when required.

tbh I think it's just bloat, but if no-one else has any objections I won't
oppose it.
The rest of the patch I have no problem with.

Dave

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Reading deterministic cache parameters and exporting it in /sysfs

2005-03-18 Thread Venkatesh Pallipadi

Here is the updated patch. 

I have separated out the changes related to 
(1) using new method to determine cache size in existing /proc/cpuinfo and
kernel boot messages (All but last hunk below)
(2) code to look at sharedness of the caches and store these details for future
uses inside kernel and also exporting the cache details in /sysfs (last
hunk in the patch)
  
Dave: Do you still feel having the cache details exported in /sysfs is a bad
idea? If yes, we can go ahead with the basic part of this patch (1 - above)
and look at (2) sometime later, as and when required.

Thanks,
Venki



The attached patch adds support for using cpuid(4) instead of cpuid(2), to get
CPU cache information in a deterministic way for Intel CPUs, whenever
supported. The details of cpuid(4) can be found here

IA-32 Intel Architecture Software Developer's Manual (vol 2a)
(http://developer.intel.com/design/pentium4/manuals/index_new.htm#sdm_vol2a)
and
Prescott New Instructions (PNI) Technology: Software Developer's Guide
(http://www.intel.com/cd/ids/developer/asmo-na/eng/events/43988.htm)

The advantages of using cpuid(4) ('Deterministic Cache Parameters Leaf') are:
* It provides more information than the descriptors provided by cpuid(2)
* It is not table based, unlike cpuid(2). So, we will not need changes to the
  kernel to support new cache descriptors in the descriptor table (as is the
  case with cpuid(2)).
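
As a rough illustration of what "deterministic" means here, the sketch below
decodes one cpuid(4) subleaf in userspace and computes the cache size directly
from the register fields, with no lookup table. It is not the kernel code from
the patch: struct cache_leaf, decode_cpuid4() and count_cache_leaves() are
hypothetical names, the bit positions follow the union layouts in the patch,
and the enumeration helper assumes a GCC or Clang toolchain that ships
<cpuid.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields of one cpuid(4) subleaf. */
struct cache_leaf {
	unsigned int type;	/* 0 = no more caches, 1 = data, 2 = inst, 3 = unified */
	unsigned int level;
	unsigned int line_size;	/* bytes */
	unsigned int partitions;
	unsigned int ways;
	uint64_t sets;
	uint64_t size;		/* bytes */
};

/* Decode raw EAX/EBX/ECX; returns 0 for the null cache type, 1 otherwise. */
int decode_cpuid4(uint32_t eax, uint32_t ebx, uint32_t ecx,
		  struct cache_leaf *out)
{
	assert(out != NULL);

	out->type = eax & 0x1f;
	if (out->type == 0)
		return 0;
	out->level = (eax >> 5) & 0x7;
	/* Every EBX/ECX field is stored as "value minus one". */
	out->line_size = (ebx & 0xfff) + 1;
	out->partitions = ((ebx >> 12) & 0x3ff) + 1;
	out->ways = ((ebx >> 22) & 0x3ff) + 1;
	out->sets = (uint64_t)ecx + 1;
	out->size = out->sets * out->line_size * out->partitions * out->ways;
	return 1;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Walk the subleaves until the null type; returns the number of caches. */
int count_cache_leaves(void)
{
	unsigned int eax, ebx, ecx, edx;
	struct cache_leaf leaf;
	int n = 0;

	if (__get_cpuid_max(0, NULL) < 4)
		return 0;	/* cpuid(4) is not supported on this CPU */
	while (n < 16) {	/* defensive bound for this sketch */
		__cpuid_count(4, n, eax, ebx, ecx, edx);
		if (!decode_cpuid4(eax, ebx, ecx, &leaf))
			break;
		n++;
	}
	return n;
}
#endif
```

With register values describing an 8-way cache with 64-byte lines and 64 sets,
decode_cpuid4() yields 8 * 1 * 64 * 64 = 32768 bytes.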

The patch also adds a bunch of interfaces under
/sys/devices/system/cpu/cpuX/cache, showing various information about the
caches. The most useful field is shared_cpu_map, which shows which caches are
shared among which logical CPUs.
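
A userspace consumer of shared_cpu_map might look roughly like the sketch
below. It only parses the mask text. The exact file location under
/sys/devices/system/cpu/cpuX/cache and the text format (hex digits, optionally
in comma-separated 32-bit groups) are assumptions here, and parse_cpu_mask()
and cpu_in_mask() are hypothetical helper names. The sketch handles at most 64
logical CPUs.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Parse a hex CPU mask such as "00000003" or "00000001,00000000" into a
 * 64-bit value. Returns 0 on success, -1 on malformed or too-wide input.
 */
int parse_cpu_mask(const char *s, uint64_t *mask)
{
	uint64_t v = 0;
	int digits = 0;

	assert(s != NULL && mask != NULL);

	for (; *s != '\0' && *s != '\n'; s++) {
		unsigned char c = (unsigned char)*s;

		if (c == ',')
			continue;	/* group separator */
		if (!isxdigit(c) || (v >> 60) != 0)
			return -1;	/* not hex, or wider than 64 CPUs */
		v = (v << 4) | (uint64_t)(isdigit(c) ? c - '0'
						     : tolower(c) - 'a' + 10);
		digits++;
	}
	if (digits == 0)
		return -1;
	*mask = v;
	return 0;
}

/* Non-zero if logical CPU 'cpu' is set in the mask. */
int cpu_in_mask(uint64_t mask, unsigned int cpu)
{
	return cpu < 64 && ((mask >> cpu) & 1u) != 0;
}
```

A mask of "00000003" would then mean that logical CPUs 0 and 1 share the cache
in question.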

The patch adds support for both i386 and x86-64.

Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>


--- linux-2.6.11/include/asm-i386/processor.h.org   2005-03-18 12:39:09.0 -0800
+++ linux-2.6.11/include/asm-i386/processor.h   2005-03-18 08:44:56.0 -0800
@@ -147,6 +147,18 @@ static inline void cpuid(int op, int *ea
 		: "0" (op), "c"(0));
 }
 
+/* Some CPUID calls want 'count' to be placed in ecx */
+static inline void cpuid_count(int op, int count, int *eax, int *ebx, int *ecx,
+		int *edx)
+{
+	__asm__("cpuid"
+		: "=a" (*eax),
+		  "=b" (*ebx),
+		  "=c" (*ecx),
+		  "=d" (*edx)
+		: "0" (op), "c" (count));
+}
+
 /*
  * CPUID functions returning a single datum
  */
--- linux-2.6.11/include/asm-x86_64/msr.h.org   2005-03-14 13:27:47.0 -0800
+++ linux-2.6.11/include/asm-x86_64/msr.h   2005-03-18 08:46:22.0 -0800
@@ -78,6 +78,18 @@ extern inline void cpuid(int op, unsigne
 		: "0" (op));
 }
 
+/* Some CPUID calls want 'count' to be placed in ecx */
+static inline void cpuid_count(int op, int count, int *eax, int *ebx, int *ecx,
+		int *edx)
+{
+	__asm__("cpuid"
+		: "=a" (*eax),
+		  "=b" (*ebx),
+		  "=c" (*ecx),
+		  "=d" (*edx)
+		: "0" (op), "c" (count));
+}
+
 /*
  * CPUID functions returning a single datum
  */
--- linux-2.6.11/arch/i386/kernel/cpu/intel_cacheinfo.c.org 2005-03-14 13:27:20.0 -0800
+++ linux-2.6.11/arch/i386/kernel/cpu/intel_cacheinfo.c 2005-03-18 13:46:54.0 -0800
@@ -1,5 +1,18 @@
+/*
+ *  Routines to identify caches on Intel CPU.
+ *
+ *  Changes:
+ *  Venkatesh Pallipadi: Adding cache identification through cpuid(4)
+ */
+
 #include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/device.h>
+#include <linux/compiler.h>
+#include <linux/cpu.h>
+
 #include <asm/processor.h>
+#include <asm/smp.h>
 
 #define LVL_1_INST 1
 #define LVL_1_DATA 2
@@ -58,10 +71,142 @@ static struct _cache_table cache_table[]
 	{ 0x00, 0, 0}
 };
 
+
+enum _cache_type
+{
+	CACHE_TYPE_NULL = 0,
+	CACHE_TYPE_DATA = 1,
+	CACHE_TYPE_INST = 2,
+	CACHE_TYPE_UNIFIED = 3
+};
+
+union _cpuid4_leaf_eax {
+	struct {
+		enum _cache_type	type:5;
+		unsigned int		level:3;
+		unsigned int		is_self_initializing:1;
+		unsigned int		is_fully_associative:1;
+		unsigned int		reserved:4;
+		unsigned int		num_threads_sharing:12;
+		unsigned int		num_cores_on_die:6;
+	} split;
+	u32 full;
+};
+
+union _cpuid4_leaf_ebx {
+	struct {
+		unsigned int		coherency_line_size:12;
+		unsigned int		physical_line_partition:10;
+		unsigned int		ways_of_associativity:10;
+	} split;
+	u32 full;
+};
+
+union _cpuid4_leaf_ecx {
+	struct {
+		unsigned int		number_of_sets:32;
+	} split;
+	u32 full;
+};
+
+struct _cpuid4_info {
+	union _cpuid4_leaf_eax eax;
+	union _cpuid4_leaf_ebx ebx;
+	union _cpuid4_leaf_ecx ecx;
+	unsigned long size;
+	cpumask_t shared_cpu_map;
+};
+


Re: [PATCH] Reading deterministic cache parameters and exporting it in /sysfs

2005-03-16 Thread Daniel Egger
On 16.03.2005, at 00:36, Dave Jones wrote:

> I really want to live to see the death of /proc/cpuinfo one day,

Please don't. cpuinfo contains a vast amount of useful
information for a quick inspection which cannot be determined
usefully from userspace (think embedded devices) for very
little code.

Servus,
  Daniel


Re: [PATCH] Reading deterministic cache parameters and exporting it in /sysfs

2005-03-15 Thread Andrew Morton
Venkatesh Pallipadi <[EMAIL PROTECTED]> wrote:
>
>  The attached patch adds support for using cpuid(4) instead of cpuid(2), to
>  get CPU cache information in a deterministic way for Intel CPUs, whenever
>  supported.

- find_num_cache_leaves can be marked __init

- Please look for other __init opportunities.  That's quite a lot of code.

- Some functions have a space before the ( and some don't:

+static ssize_t show_size (struct _cpuid4_info *this_leaf, char *buf)

  omitting the space is preferred.

- Don't cast the return value of kmalloc:

+	cpuid4_info[cpu] = (struct _cpuid4_info *)kmalloc(
+		sizeof(struct _cpuid4_info) * num_cache_leaves, GFP_KERNEL);

- Sometimes there's a space after an `if', sometimes not.

+	if(cpuid4_info[i])

  a space is preferred.

- kfree(NULL) is permitted:

+	if(cpuid4_info[i])
+		kfree(cpuid4_info[i]);
+	if(cache_kobject[i])
+		kfree(cache_kobject[i]);
+	if(index_kobject[i])
+		kfree(index_kobject[i]);

  (in several places)
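
The last two points can be illustrated with a userspace analogue, since
malloc()/free() follow the same conventions as kmalloc()/kfree() in these
respects: the void * return value needs no cast in C, and freeing a NULL
pointer is a no-op. This is only a sketch, not a rewrite of the patch;
NR_SLOTS, struct leaf_info, alloc_slot() and free_all_slots() are made-up
names.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_SLOTS 4	/* hypothetical stand-in for the number of CPUs */

struct leaf_info {
	unsigned long size;
};

struct leaf_info *info[NR_SLOTS];

/* Allocate 'nleaves' zeroed entries for slot i; returns 0 or -1. */
int alloc_slot(int i, size_t nleaves)
{
	assert(i >= 0 && i < NR_SLOTS);

	/* No cast: void * converts implicitly to any object pointer type. */
	info[i] = calloc(nleaves, sizeof(*info[i]));
	return info[i] != NULL ? 0 : -1;
}

/* Release every slot, including the ones that were never allocated. */
void free_all_slots(void)
{
	int i;

	for (i = 0; i < NR_SLOTS; i++) {
		free(info[i]);	/* free(NULL) does nothing, so no guard */
		info[i] = NULL;
	}
}
```

Resetting each pointer to NULL after freeing also makes a second call to
free_all_slots() harmless.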


Once you've worked through the design issues with davej, please upissue the
patch, thanks.


Re: [PATCH] Reading deterministic cache parameters and exporting it in /sysfs

2005-03-15 Thread Venkatesh Pallipadi
On Tue, Mar 15, 2005 at 06:36:20PM -0500, Dave Jones wrote:
> On Tue, Mar 15, 2005 at 03:24:48PM -0800, Venkatesh Pallipadi wrote:
>  >  
>  > The attached patch adds support for using cpuid(4) instead of cpuid(2), to
>  > get CPU cache information in a deterministic way for Intel CPUs, whenever
>  > supported. The details of cpuid(4) can be found here
>  > 
>  > IA-32 Intel Architecture Software Developer's Manual (vol 2a)
>  > (http://developer.intel.com/design/pentium4/manuals/index_new.htm#sdm_vol2a)
>  > and
>  > Prescott New Instructions (PNI) Technology: Software Developer's Guide
>  > (http://www.intel.com/cd/ids/developer/asmo-na/eng/events/43988.htm)
>  >  
>  > The advantages of using cpuid(4) ('Deterministic Cache Parameters Leaf')
>  > are:
>  > * It provides more information than the descriptors provided by cpuid(2)
>  > * It is not table based, unlike cpuid(2). So, we will not need changes to
>  >   the kernel to support new cache descriptors in the descriptor table (as
>  >   is the case with cpuid(2)).
>  >  
>  > The patch also adds a bunch of interfaces under 
>  > /sys/devices/system/cpu/cpuX/cache, showing various information about the
>  > caches.
> 
> Why does this need to be in kernel-space ? 

Currently, the CPU cache information is printed as a part of kernel bootup
messages and /proc/cpuinfo using cpuid(2). This patch is trying to use cpuid(4)
to print the messages in these places. I think this part of the patch is
required. Otherwise, we may end up printing 0 cache sizes on some CPUs.
It will also reduce the zero_cache_size_complaints on lkml :-).
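
To make the "0 cache sizes" problem concrete: with cpuid(2), each cache is
reported as a one-byte descriptor that has to be looked up in a table, so a
descriptor the table does not know contributes nothing. The sketch below shows
that failure mode in isolation. It is not the kernel's cache_table: the three
entries are an illustrative subset, 0xee in the usage note merely stands in
for an unknown descriptor, and desc_size_kb() and total_cache_kb() are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* One cpuid(2)-style descriptor: a one-byte code mapped to a cache size. */
struct desc_entry {
	unsigned char descriptor;
	unsigned int size_kb;
};

/* Illustrative subset only; terminated by a zero descriptor. */
static const struct desc_entry desc_table[] = {
	{ 0x2c, 32 },	/* L1 data, 32 KB */
	{ 0x43, 512 },	/* L2, 512 KB */
	{ 0x7d, 2048 },	/* L2, 2 MB */
	{ 0x00, 0 }
};

/* Size for a descriptor, or 0 when the table has no entry for it. */
unsigned int desc_size_kb(unsigned char descriptor)
{
	const struct desc_entry *e;

	for (e = desc_table; e->descriptor != 0; e++)
		if (e->descriptor == descriptor)
			return e->size_kb;
	return 0;
}

/* Sum the sizes of all descriptors; unknown ones silently add 0. */
unsigned int total_cache_kb(const unsigned char *descs, size_t n)
{
	unsigned int total = 0;
	size_t i;

	assert(descs != NULL || n == 0);

	for (i = 0; i < n; i++)
		total += desc_size_kb(descs[i]);
	return total;
}
```

A CPU reporting only the stand-in descriptor 0xee would come out as 0 KB until
someone adds the descriptor to the table, which is the maintenance burden
cpuid(4) avoids.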

> Is there some reason that prevents
> you from enhancing x86info for example ?  I really want to live to see the
> death of /proc/cpuinfo one day, and reinventing it in sysfs seems pointless
> if it can all be done in userspace.
> Given that the most useful field is of limited use to a majority of users,
> and those that are interested can read this from userspace, this has me very 
> puzzled.

Agreed. Exporting it in /sysfs is debatable. And some of the information, like
'which CPUs are sharing what caches', may not be useful today. But with CPUs
that have HT, multiple cores, or combinations of both sharing different caches,
having this information will be useful inside the kernel as well, in the
scheduler for example. We can set up some of the scheduler domain parameters
based on whether L2 is shared or not.
Also, we felt that exporting this information to userspace in a consistent way
will help userspace apps do various things, like binding to specific CPUs or
sizing the working set based on cache size, to optimize performance.
Again, this can be done in userspace as well. But if the kernel is already
doing it, it may be better to export it from kernel space.
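
As one example of the kind of userspace use meant here, an application could
combine a cache's size with its sharing mask to estimate how much of that
cache each logical CPU can count on. The sketch below is only a heuristic
under the assumption that sharers split the cache evenly; count_sharers() and
cache_budget_per_cpu() are hypothetical names, and the mask is limited to 64
logical CPUs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of logical CPUs set in a sharing mask. */
unsigned int count_sharers(uint64_t shared_mask)
{
	unsigned int n = 0;

	/* Clear the lowest set bit on each pass. */
	for (; shared_mask != 0; shared_mask &= shared_mask - 1)
		n++;
	return n;
}

/*
 * Even-split estimate of the cache available to one logical CPU.
 * A CPU always shares a cache at least with itself, so an empty mask
 * is treated as a caller bug.
 */
size_t cache_budget_per_cpu(size_t cache_bytes, uint64_t shared_mask)
{
	unsigned int n = count_sharers(shared_mask);

	assert(n > 0);
	return cache_bytes / n;
}
```

For a 2 MB L2 shared by two logical CPUs, this gives a per-CPU working-set
target of 1 MB.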

Thanks,
Venki


Re: [PATCH] Reading deterministic cache parameters and exporting it in /sysfs

2005-03-15 Thread Dave Jones
On Tue, Mar 15, 2005 at 03:24:48PM -0800, Venkatesh Pallipadi wrote:
 >  
 > The attached patch adds support for using cpuid(4) instead of cpuid(2), to
 > get CPU cache information in a deterministic way for Intel CPUs, whenever
 > supported. The details of cpuid(4) can be found here
 > 
 > IA-32 Intel Architecture Software Developer's Manual (vol 2a)
 > (http://developer.intel.com/design/pentium4/manuals/index_new.htm#sdm_vol2a)
 > and
 > Prescott New Instructions (PNI) Technology: Software Developer's Guide
 > (http://www.intel.com/cd/ids/developer/asmo-na/eng/events/43988.htm)
 >  
 > The advantages of using cpuid(4) ('Deterministic Cache Parameters Leaf')
 > are:
 > * It provides more information than the descriptors provided by cpuid(2)
 > * It is not table based, unlike cpuid(2). So, we will not need changes to
 >   the kernel to support new cache descriptors in the descriptor table (as
 >   is the case with cpuid(2)).
 >  
 > The patch also adds a bunch of interfaces under 
 > /sys/devices/system/cpu/cpuX/cache, showing various information about the
 > caches.

Why does this need to be in kernel-space ? Is there some reason that prevents
you from enhancing x86info for example ?  I really want to live to see the
death of /proc/cpuinfo one day, and reinventing it in sysfs seems pointless
if it can all be done in userspace.
 
 > The most useful field is shared_cpu_map, which shows which caches are
 > shared among which logical CPUs.

Given that the most useful field is of limited use to a majority of users,
and those that are interested can read this from userspace, this has me very 
puzzled.

Dave


[PATCH] Reading deterministic cache parameters and exporting it in /sysfs

2005-03-15 Thread Venkatesh Pallipadi
 
The attached patch adds support for using cpuid(4) instead of cpuid(2), to get 
CPU cache information in a deterministic way for Intel CPUs, whenever 
supported. The details of cpuid(4) can be found here

IA-32 Intel Architecture Software Developer's Manual (vol 2a)
(http://developer.intel.com/design/pentium4/manuals/index_new.htm#sdm_vol2a)
and
Prescott New Instructions (PNI) Technology: Software Developer's Guide
(http://www.intel.com/cd/ids/developer/asmo-na/eng/events/43988.htm)
 
The advantages of using cpuid(4) ('Deterministic Cache Parameters Leaf') are:
* It provides more information than the descriptors provided by cpuid(2)
* It is not table based, unlike cpuid(2). So, we will not need changes to the
  kernel to support new cache descriptors in the descriptor table (as is the
  case with cpuid(2)).
 
The patch also adds a bunch of interfaces under 
/sys/devices/system/cpu/cpuX/cache, showing various information about the
caches. The most useful field is shared_cpu_map, which shows which caches are
shared among which logical CPUs.

The patch adds support for both i386 and x86-64.

Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>


--- linux-2.6.11/include/asm-i386/processor.h.org   2005-03-14 13:27:34.0 -0800
+++ linux-2.6.11/include/asm-i386/processor.h   2005-03-14 20:33:39.0 -0800
@@ -147,6 +147,18 @@ static inline void cpuid(int op, int *ea
 		: "0" (op), "c"(0));
 }
 
+/* Some CPUID calls want 'count' to be placed in ecx */
+static inline void cpuid_count(int op, int count, int *eax, int *ebx, int *ecx,
+		int *edx)
+{
+	__asm__("cpuid"
+		: "=a" (*eax),
+		  "=b" (*ebx),
+		  "=c" (*ecx),
+		  "=d" (*edx)
+		: "0" (op), "c" (count));
+}
+
 /*
  * CPUID functions returning a single datum
  */
--- linux-2.6.11/include/asm-x86_64/msr.h.org   2005-03-14 13:27:47.0 -0800
+++ linux-2.6.11/include/asm-x86_64/msr.h   2005-03-14 20:33:39.0 -0800
@@ -78,6 +78,18 @@ extern inline void cpuid(int op, unsigne
 		: "0" (op));
 }
 
+/* Some CPUID calls want 'count' to be placed in ecx */
+static inline void cpuid_count(int op, int count, int *eax, int *ebx, int *ecx,
+		int *edx)
+{
+	__asm__("cpuid"
+		: "=a" (*eax),
+		  "=b" (*ebx),
+		  "=c" (*ecx),
+		  "=d" (*edx)
+		: "0" (op), "c" (count));
+}
+
 /*
  * CPUID functions returning a single datum
  */
--- linux-2.6.11/arch/i386/kernel/cpu/intel_cacheinfo.c.org 2005-03-14 13:27:20.0 -0800
+++ linux-2.6.11/arch/i386/kernel/cpu/intel_cacheinfo.c 2005-03-15 13:57:30.0 -0800
@@ -1,5 +1,17 @@
+/*
+ *  Routines to identify caches on Intel CPU.
+ *
+ *  Changes:
+ *  Venkatesh Pallipadi: Adding cache identification through cpuid(4)
+ */
+
 #include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/device.h>
+#include <linux/compiler.h>
+
 #include <asm/processor.h>
+#include <asm/smp.h>
 
 #define LVL_1_INST 1
 #define LVL_1_DATA 2
@@ -58,10 +70,142 @@ static struct _cache_table cache_table[]
 	{ 0x00, 0, 0}
 };
 
+
+enum _cache_type
+{
+	CACHE_TYPE_NULL = 0,
+	CACHE_TYPE_DATA = 1,
+	CACHE_TYPE_INST = 2,
+	CACHE_TYPE_UNIFIED = 3
+};
+
+union _cpuid4_leaf_eax {
+	struct {
+		enum _cache_type	type:5;
+		unsigned int		level:3;
+		unsigned int		is_self_initializing:1;
+		unsigned int		is_fully_associative:1;
+		unsigned int		reserved:4;
+		unsigned int		num_threads_sharing:12;
+		unsigned int		num_cores_on_die:6;
+	} split;
+	u32 full;
+};
+
+union _cpuid4_leaf_ebx {
+	struct {
+		unsigned int		coherency_line_size:12;
+		unsigned int		physical_line_partition:10;
+		unsigned int		ways_of_associativity:10;
+	} split;
+	u32 full;
+};
+
+union _cpuid4_leaf_ecx {
+	struct {
+		unsigned int		number_of_sets:32;
+	} split;
+	u32 full;
+};
+
+struct _cpuid4_info {
+	union _cpuid4_leaf_eax eax;
+	union _cpuid4_leaf_ebx ebx;
+	union _cpuid4_leaf_ecx ecx;
+	unsigned long size;
+	cpumask_t shared_cpu_map;
+};
+
+#define MAX_CACHE_LEAVES   4
+static unsigned short  num_cache_leaves;
+
+static int cpuid4_cache_lookup(int index, struct _cpuid4_info *this_leaf)
+{
+	unsigned int		eax, ebx, ecx, edx;
+	union _cpuid4_leaf_eax	cache_eax;
+
+	cpuid_count(4, index, &eax, &ebx, &ecx, &edx);
+	cache_eax.full = eax;
+	if (cache_eax.split.type == CACHE_TYPE_NULL)
+		return -1;
+
+	this_leaf->eax.full = eax;
+	this_leaf->ebx.full = ebx;
+	this_leaf->ecx.full = ecx;
+	this_leaf->size = (this_leaf->ecx.split.number_of_sets + 1) *
+
