Re: [Openipmi-developer] ipmi_si appears to be broken on multinode systems in 2.6.18 kernel

2006-11-17 Thread Corey Minyard
Carol Hebert wrote:
> Hi Corey,
>
> I wanted to let you know about some of the testing I've done with some
> of the new 39.1 patches and also to ask you about an issue I found.
>
> First, I wanted to ask you about the ipmi-remove-device-interface-limits
> patch.  It seems that when I have this patch loaded (along with just the
> 3 multinode fix patches listed below), the drivers work fine if ipmi_si
> is loaded last, but if ipmi_si loaded before ipmi_devintf, the system
> oopses (appended below).  Does this patch require one or more of the
> other patches in the 39.1 set to be happy (for instance, the
> allow-hot-smi-remove patch), or am I running into some other issue?

I looked at this yesterday and today, and I cannot figure out what would be
different between the two scenarios.  I could not reproduce this, and
it's probably best to just take all the patches, as that is what I
tested.  I did test different loading orders in my testing.

I'll try to look at this again today.

> Also, I wanted to let you know that I was able to get some time on an
> 8-way node and tested the following 39.1 patches:
> ipmi-fix-device-model-name.patch,
> ipmi-remove-interface-number-limits.patch
> ipmi-handle-sysfs-errors.patch
> ipmi-pass-sysfs-name-from-lower-level-driver.patch
>
> They seemed to work fine (with the drivers loaded in the good order
> described above).  All 8 device nodes were created and seemed to be
> equally usable.

Ok, thanks.

> I got a bit of info about the order in which the SMBIOS table is
> populated and found out that it's currently populated in order of
> increasing KCS I/O address but that this isn't necessarily an ordering
> scheme that can be assumed for the future.  Also, regarding changing the
> BIOS to make the deviceID unique across BMCs, I was told that if these
> changes were made, we would likely be facing many issues such as
> DeviceID mismatches with what's coded up in the SDR data, etc.  So I
> suspect it's something that might not happen anytime soon (if ever).

That really doesn't make any sense.  The only place I could find where 
this Device ID is used is in the type 13 SDR: Management Controller 
Confirmation Record.  This record is used by utility software to record 
that it found a specific management controller in the system.  It seems 
of limited value to me, anyway, and having different device IDs would 
seem to make this easier, not harder, to identify the different 
management controllers.  From what I can tell, the use of this is for 
system software to record the current management controller 
configuration.  Then if system software finds something different, it 
can say "Hey, something changed" and handle it.

Note that the term "Device ID" is heavily overloaded in the IPMI spec.
It also has "FRU Device ID" and "Device ID String", but those are
completely different things.

I see no other reliable mechanism to correlate management controllers 
with nodes, especially if nodes ever become dynamic.  I really doubt you 
will have any issues unless you have software that is hardcoded to 
handle this.  That doesn't seem so, since they are all the same and it 
doesn't provide any real useful information.  Perhaps the group doing 
the work can suggest a reliable way to correlate the nodes and the 
management controllers?

Thanks

-Corey


-
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT  business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.phpp=sourceforgeCID=DEVDEV
___
Openipmi-developer mailing list
Openipmi-developer@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openipmi-developer


Re: [Openipmi-developer] ipmi_si appears to be broken on multinode systems in 2.6.18 kernel

2006-11-17 Thread Carol Hebert
On Fri, 2006-11-17 at 09:53 -0600, Corey Minyard wrote:

>> oopses (appended below).  Does this patch require one or more of the
>> other patches in the 39.1 set to be happy (for instance, the
>> allow-hot-smi-remove patch), or am I running into some other issue?
>
> I looked at this yesterday and today, I cannot figure out what would be
> different between the two scenarios.  I could not reproduce this, and
> it's probably best to just take all the patches as that is what I
> tested.  I did test different loading orders in my testing.
>
> I'll try to look at this again today.

I totally agree that it would be best to use all the patches rather than
to just pull out a few of them and I would always prefer to do that but
in the case of the particular scenario/issue I'm currently working on,
it won't be possible in the near-term. :-(  

I did try loading the ipmi-allow-hot-smi-remove patch along with the
other 4 patches I listed and it did seem to fix the driver-load-order
oops.  Is that a stand-alone patch or are there others in the set that
need to be loaded along with it?  I ran some tests with this new 5-patch
subset and didn't find any problems but my testing wasn't exhaustive so
I'm hoping to verify that grabbing only these 5 alone won't introduce
some other issue in some area I didn't touch in my testing.

>> I got a bit of info about the order in which the SMBIOS table is
>> populated and found out that it's currently populated in order of
>> increasing KCS I/O address but that this isn't necessarily an ordering
>> scheme that can be assumed for the future.  Also, regarding changing the
>> BIOS to make the deviceID unique across BMCs, I was told that if these
>> changes were made, we would likely be facing many issues such as
>> DeviceID mismatches with what's coded up in the SDR data, etc.  So I
>> suspect it's something that might not happen anytime soon (if ever).
>
> That really doesn't make any sense.  The only place I could find where
> this Device ID is used is in the type 13 SDR: Management Controller
> Confirmation Record.  This record is used by utility software to record
> that it found a specific management controller in the system.  It seems
> of limited value to me, anyway, and having different device IDs would
> seem to make this easier, not harder, to identify the different
> management controllers.  From what I can tell, the use of this is for
> system software to record the current management controller
> configuration.  Then if system software finds something different, it
> can say "Hey, something changed" and handle it.
>
> Note that the term "Device ID" is heavily overloaded in the IPMI spec.
> It also has "FRU Device ID" and "Device ID String", but those are
> completely different things.
>
> I see no other reliable mechanism to correlate management controllers
> with nodes, especially if nodes ever become dynamic.  I really doubt you
> will have any issues unless you have software that is hardcoded to
> handle this.  That doesn't seem so, since they are all the same and it
> doesn't provide any real useful information.  Perhaps the group doing
> the work can suggest a reliable way to correlate the nodes and the
> management controllers?

Thanks very much for your input on this.  I'll take what you've said
back to the BIOS folks and re-open the discussion.  :-)

Thank you very much again for your ongoing and excellent help. :-)

Carol




Re: [Openipmi-developer] ipmi_si appears to be broken on multinode systems in 2.6.18 kernel

2006-10-20 Thread Corey Minyard
Carol Hebert wrote:
> On Thu, 2006-10-19 at 21:46 -0500, Corey Minyard wrote:
>> I'm waiting for one more patch to be finished up and tested, and I'm
>> putting out a 2.6.18 patch set.
>
> That's excellent news!  I'll run the patch set on my multi-nodes as soon
> as it's out.  BTW:  I was wondering if it would be much trouble to get
> the table-list patch put into the 2.4 tree as well?  I'd be happy to
> help and would be happy to test it out on a multi-node.  :-)

Hmm, that might be harder on 2.4.  I have to review the set of patches
to see what will go on for  the 2.4 release, so I'll look at it then. 
It seems to me that 2.4 and the multi-node beasts wouldn't be a good
match, but if it's needed...

-Corey



Re: [Openipmi-developer] ipmi_si appears to be broken on multinode systems in 2.6.18 kernel

2006-10-19 Thread Corey Minyard
Ok, patch is attached.

Carol Hebert wrote:
> On Wed, 2006-10-18 at 13:37 -0700, Carol Hebert wrote:
>> Hi Corey,
>>
>> This latest patch worked great on my 2-node system! :-D   I'll try to
>> get some time on a 4-node and 8-node system asap to test it out on them
>> as well.
>
> Oops, I guess I'll probably need that patch you were talking about
> earlier to increase the number of supported nodes to > 4 to test the
> 8-node system properly.  :-}  I think you mentioned changing the table
> to a list to be able to support an arbitrary number of devices?  I was
> wondering if you had any idea when you might be able to get a chance to
> make that change?
>
> Thanks again for all your help,
>
> Carol Hebert


   

This patch removes the arbitrary limit on the number of IPMI interfaces.

Signed-off-by: Corey Minyard [EMAIL PROTECTED]

Index: linux-2.6.18/drivers/char/ipmi/ipmi_msghandler.c
===
--- linux-2.6.18.orig/drivers/char/ipmi/ipmi_msghandler.c
+++ linux-2.6.18/drivers/char/ipmi/ipmi_msghandler.c
@@ -193,6 +193,9 @@ struct ipmi_smi
 
 	struct kref refcount;
 
+	/* Used for a list of interfaces. */
+	struct list_head link;
+
 	/* The list of upper layers that are using me.  seq_lock
 	 * protects this. */
 	struct list_head users;
@@ -338,13 +341,6 @@ struct ipmi_smi
 };
 #define to_si_intf_from_dev(device) container_of(device, struct ipmi_smi, dev)
 
-/* Used to mark an interface entry that cannot be used but is not a
- * free entry, either, primarily used at creation and deletion time so
- * a slot doesn't get reused too quickly. */
-#define IPMI_INVALID_INTERFACE_ENTRY ((ipmi_smi_t) ((long) 1))
-#define IPMI_INVALID_INTERFACE(i) (((i) == NULL) \
-   || (i == IPMI_INVALID_INTERFACE_ENTRY))
-
 /**
  * The driver model view of the IPMI messaging driver.
  */
@@ -354,11 +350,8 @@ static struct device_driver ipmidriver =
 };
 static DEFINE_MUTEX(ipmidriver_mutex);
 
-#define MAX_IPMI_INTERFACES 4
-static ipmi_smi_t ipmi_interfaces[MAX_IPMI_INTERFACES];
-
-/* Directly protects the ipmi_interfaces data structure. */
-static DEFINE_SPINLOCK(interfaces_lock);
+static struct list_head ipmi_interfaces = LIST_HEAD_INIT(ipmi_interfaces);
+static DEFINE_MUTEX(ipmi_interfaces_mutex);
 
 /* List of watchers that want to know when smi's are added and
deleted. */
@@ -413,25 +406,50 @@ static void intf_free(struct kref *ref)
 	kfree(intf);
 }
 
+struct watcher_entry {
+	struct list_head link;
+	int intf_num;
+};
+
 int ipmi_smi_watcher_register(struct ipmi_smi_watcher *watcher)
 {
-	int   i;
-	unsigned long flags;
+	ipmi_smi_t intf;
+	struct list_head to_deliver = LIST_HEAD_INIT(&to_deliver);
+	struct watcher_entry *e, *e2;
+
+	mutex_lock(&ipmi_interfaces_mutex);
+
+	list_for_each_entry_rcu(intf, &ipmi_interfaces, link) {
+		if (intf->intf_num == -1)
+			continue;
+		e = kmalloc(sizeof(*e));
+		if (!e)
+			goto out_err;
+		e->intf_num = intf->intf_num;
+		list_add_tail(&e->link, &to_deliver);
+	}
 
 	down_write(&smi_watchers_sem);
 	list_add(&(watcher->link), &smi_watchers);
 	up_write(&smi_watchers_sem);
-	spin_lock_irqsave(&interfaces_lock, flags);
-	for (i = 0; i < MAX_IPMI_INTERFACES; i++) {
-		ipmi_smi_t intf = ipmi_interfaces[i];
-		if (IPMI_INVALID_INTERFACE(intf))
-			continue;
-		spin_unlock_irqrestore(&interfaces_lock, flags);
-		watcher->new_smi(i, intf->si_dev);
-		spin_lock_irqsave(&interfaces_lock, flags);
+
+	mutex_unlock(&ipmi_interfaces_mutex);
+
+	list_for_each_entry_safe(e, e2, &to_deliver, link) {
+		list_del(&e->link);
+		watcher->new_smi(e->intf_num, intf->si_dev);
+		kfree(e);
 	}
-	spin_unlock_irqrestore(&interfaces_lock, flags);
+
+
 	return 0;
+
+ out_err:
+	list_for_each_entry_safe(e, e2, &to_deliver, link) {
+		list_del(&e->link);
+		kfree(e);
+	}
+	return -ENOMEM;
 }
 
 int ipmi_smi_watcher_unregister(struct ipmi_smi_watcher *watcher)
@@ -766,17 +784,19 @@ int ipmi_create_user(unsigned int   
 	if (!new_user)
 		return -ENOMEM;
 
-	spin_lock_irqsave(&interfaces_lock, flags);
-	intf = ipmi_interfaces[if_num];
-	if ((if_num >= MAX_IPMI_INTERFACES) || IPMI_INVALID_INTERFACE(intf)) {
-		spin_unlock_irqrestore(&interfaces_lock, flags);
-		rv = -EINVAL;
-		goto out_kfree;
+	rcu_read_lock();
+	list_for_each_entry_rcu(intf, &ipmi_interfaces, link) {
+		if (intf->intf_num == if_num)
+			goto found;
 	}
+	rcu_read_unlock();
+	rv = -EINVAL;
+	goto out_kfree;
 
+ found:
 	/* Note that each existing user holds a refcount 

Re: [Openipmi-developer] ipmi_si appears to be broken on multinode systems in 2.6.18 kernel

2006-10-19 Thread Carol Hebert
Hi,

Wow!  I barely hit return on my email and the patch was in my
inbox!! :-)

I made a couple of adjustments to the patch to make my compiler happy.
In the ipmi_smi_watcher_register() routine, I deleted the & on
to_deliver; also, I added GFP_KERNEL as a second arg to kmalloc:

 int ipmi_smi_watcher_register(struct ipmi_smi_watcher *watcher)
 {
        ipmi_smi_t intf;
-       struct list_head to_deliver = LIST_HEAD_INIT(&to_deliver);
+       struct list_head to_deliver = LIST_HEAD_INIT(to_deliver);

        struct watcher_entry *e, *e2;

        mutex_lock(&ipmi_interfaces_mutex);

        list_for_each_entry_rcu(intf, &ipmi_interfaces, link) {
                if (intf->intf_num == -1)
                        continue;

-               e = kmalloc(sizeof(*e));
+               e = kmalloc(sizeof(*e), GFP_KERNEL);

                if (!e)
                        goto out_err;
                e->intf_num = intf->intf_num;
                list_add_tail(&e->link, &to_deliver);
        }



I ran it on my 2-node system and it seemed to work as well as the
previous table-oriented patched version (e.g. great! :-).  I'm still
working on getting an 8-node to test it on -- hopefully I'll get one
next week.
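
[Editorial sketch] The watcher-registration change discussed above appears
to work in two phases: it copies the numbers of the usable interfaces into a
private list while the interface-list lock is held, and only calls back into
the watcher after the lock has been dropped. The following is a minimal
userland illustration of that pattern, not the kernel code. The names
struct iface, struct pending, notify_existing() and sum_cb() are
hypothetical, and the lock calls are left as comments.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userland stand-ins; none of these names come from the kernel. */
struct iface {
	int num;              /* -1 means "not usable yet", as in the patch */
	struct iface *next;
};

struct pending {
	int num;
	struct pending *next;
};

typedef void (*new_iface_cb)(int num, void *ctx);

/* Example callback: adds up the interface numbers it is told about. */
void sum_cb(int num, void *ctx)
{
	*(int *)ctx += num;
}

/*
 * Phase 1 (would run with the list lock held): copy the numbers of all
 * usable interfaces into a private list.
 * Phase 2 (lock released): call cb once for each copied number.
 * Returns 0 on success, -1 if an allocation failed (no callbacks made).
 */
int notify_existing(const struct iface *head, new_iface_cb cb, void *ctx)
{
	struct pending *first = NULL, **tail = &first, *p;

	assert(cb != NULL);

	/* lock(); */
	for (; head; head = head->next) {
		if (head->num == -1)
			continue;
		p = malloc(sizeof(*p));
		if (!p)
			goto out_err;
		p->num = head->num;
		p->next = NULL;
		*tail = p;
		tail = &p->next;
	}
	/* unlock(); */

	while (first) {
		p = first;
		first = p->next;
		cb(p->num, ctx);   /* safe to block here: no lock is held */
		free(p);
	}
	return 0;

out_err:
	/* unlock(); */
	while (first) {
		p = first;
		first = p->next;
		free(p);
	}
	return -1;
}
```

The point of the copy is that the callback may sleep or take other locks, so
it should not run while the list lock is held; the cost is one small
allocation per interface.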

Thanks very much again,  :-)

Carol Hebert

On Thu, 2006-10-19 at 16:23 -0500, Corey Minyard wrote:
> Ok, patch is attached.
>
> Carol Hebert wrote:
>> On Wed, 2006-10-18 at 13:37 -0700, Carol Hebert wrote:
>>> Hi Corey,
>>>
>>> This latest patch worked great on my 2-node system! :-D   I'll try to
>>> get some time on a 4-node and 8-node system asap to test it out on them
>>> as well.
>>
>> Oops, I guess I'll probably need that patch you were talking about
>> earlier to increase the number of supported nodes to > 4 to test the
>> 8-node system properly.  :-}  I think you mentioned changing the table
>> to a list to be able to support an arbitrary number of devices?  I was
>> wondering if you had any idea when you might be able to get a chance to
>> make that change?
>>
>> Thanks again for all your help,
>>
>> Carol Hebert

 

 




Re: [Openipmi-developer] ipmi_si appears to be broken on multinode systems in 2.6.18 kernel

2006-10-19 Thread Corey Minyard
Carol Hebert wrote:
> Hi,
>
> Wow!  I barely hit return on my email and the patch was in my
> inbox!! :-)

Well, I had it sitting there, so it was easy.  Sorry about the compile
errors, those fixes had snuck into a later patch but didn't get put into
the right place.

I'm waiting for one more patch to be finished up and tested, and I'm
putting out a 2.6.18 patch set.

-Corey




Re: [Openipmi-developer] ipmi_si appears to be broken on multinode systems in 2.6.18 kernel

2006-10-18 Thread Carol Hebert

Hi Corey,

This latest patch worked great on my 2-node system! :-D   I'll try to
get some time on a 4-node and 8-node system asap to test it out on them
as well. 

I've listed below how ipmi and the BMCs are now represented in sysfs.
Do you still want me to continue working on trying to get some unique
BMC device ID/GUID change made in the f/w as well (and in the process
find out what we have now ;-}?  I'm also working on finding out whether
or not it's guaranteed that the BMCs are listed in node order in the
SMBIOS table.

Thanks very much for your help and for making my day! :-D

Carol Hebert



/sys/class/ipmi/ipmi1/device
/sys/class/ipmi/ipmi1/dev
/sys/class/ipmi/ipmi1/uevent
/sys/class/ipmi/ipmi1/subsystem
/sys/class/ipmi/ipmi0
/sys/class/ipmi/ipmi0/device
/sys/class/ipmi/ipmi0/dev
/sys/class/ipmi/ipmi0/uevent
/sys/class/ipmi/ipmi0/subsystem
/sys/bus/pci/drivers/ipmi_si
/sys/bus/pci/drivers/ipmi_si/new_id
/sys/bus/pci/drivers/ipmi_si/bind
/sys/bus/pci/drivers/ipmi_si/unbind
/sys/bus/pci/drivers/ipmi_si/module
/sys/bus/platform/drivers/ipmi_si
/sys/bus/platform/drivers/ipmi_si/ipmi_si.1
/sys/bus/platform/drivers/ipmi_si/ipmi_si.0
/sys/bus/platform/drivers/ipmi_si/bind
/sys/bus/platform/drivers/ipmi_si/unbind
/sys/bus/platform/drivers/ipmi
/sys/bus/platform/drivers/ipmi/ipmi_bmc.0007.33
/sys/bus/platform/drivers/ipmi/ipmi_bmc.0007.32
/sys/bus/platform/drivers/ipmi/bind
/sys/bus/platform/drivers/ipmi/unbind
/sys/bus/platform/devices/ipmi_bmc.0007.33
/sys/bus/platform/devices/ipmi_si.1
/sys/bus/platform/devices/ipmi_bmc.0007.32
/sys/bus/platform/devices/ipmi_si.0
/sys/devices/platform/ipmi_bmc.0007.33
/sys/devices/platform/ipmi_bmc.0007.33/ipmi1
/sys/devices/platform/ipmi_bmc.0007.33/guid
/sys/devices/platform/ipmi_bmc.0007.33/aux_firmware_revision
/sys/devices/platform/ipmi_bmc.0007.33/product_id
/sys/devices/platform/ipmi_bmc.0007.33/manufacturer_id
/sys/devices/platform/ipmi_bmc.0007.33/additional_device_support
/sys/devices/platform/ipmi_bmc.0007.33/ipmi_version
/sys/devices/platform/ipmi_bmc.0007.33/firmware_revision
/sys/devices/platform/ipmi_bmc.0007.33/revision
/sys/devices/platform/ipmi_bmc.0007.33/provides_device_sdrs
/sys/devices/platform/ipmi_bmc.0007.33/device_id
/sys/devices/platform/ipmi_bmc.0007.33/driver
/sys/devices/platform/ipmi_bmc.0007.33/bus
/sys/devices/platform/ipmi_bmc.0007.33/subsystem
/sys/devices/platform/ipmi_bmc.0007.33/modalias
/sys/devices/platform/ipmi_bmc.0007.33/power
/sys/devices/platform/ipmi_bmc.0007.33/power/wakeup
/sys/devices/platform/ipmi_bmc.0007.33/power/state
/sys/devices/platform/ipmi_bmc.0007.33/uevent
/sys/devices/platform/ipmi_si.1
/sys/devices/platform/ipmi_si.1/ipmi:ipmi1
/sys/devices/platform/ipmi_si.1/bmc
/sys/devices/platform/ipmi_si.1/driver
/sys/devices/platform/ipmi_si.1/bus
/sys/devices/platform/ipmi_si.1/subsystem
/sys/devices/platform/ipmi_si.1/modalias
/sys/devices/platform/ipmi_si.1/power
/sys/devices/platform/ipmi_si.1/power/wakeup
/sys/devices/platform/ipmi_si.1/power/state
/sys/devices/platform/ipmi_si.1/uevent
/sys/devices/platform/ipmi_bmc.0007.32
/sys/devices/platform/ipmi_bmc.0007.32/ipmi0
/sys/devices/platform/ipmi_bmc.0007.32/guid
/sys/devices/platform/ipmi_bmc.0007.32/aux_firmware_revision
/sys/devices/platform/ipmi_bmc.0007.32/product_id
/sys/devices/platform/ipmi_bmc.0007.32/manufacturer_id
/sys/devices/platform/ipmi_bmc.0007.32/additional_device_support
/sys/devices/platform/ipmi_bmc.0007.32/ipmi_version
/sys/devices/platform/ipmi_bmc.0007.32/firmware_revision
/sys/devices/platform/ipmi_bmc.0007.32/revision
/sys/devices/platform/ipmi_bmc.0007.32/provides_device_sdrs
/sys/devices/platform/ipmi_bmc.0007.32/device_id
/sys/devices/platform/ipmi_bmc.0007.32/driver
/sys/devices/platform/ipmi_bmc.0007.32/bus
/sys/devices/platform/ipmi_bmc.0007.32/subsystem
/sys/devices/platform/ipmi_bmc.0007.32/modalias
/sys/devices/platform/ipmi_bmc.0007.32/power
/sys/devices/platform/ipmi_bmc.0007.32/power/wakeup
/sys/devices/platform/ipmi_bmc.0007.32/power/state
/sys/devices/platform/ipmi_bmc.0007.32/uevent
/sys/devices/platform/ipmi_si.0
/sys/devices/platform/ipmi_si.0/ipmi:ipmi0
/sys/devices/platform/ipmi_si.0/bmc
/sys/devices/platform/ipmi_si.0/driver
/sys/devices/platform/ipmi_si.0/bus
/sys/devices/platform/ipmi_si.0/subsystem
/sys/devices/platform/ipmi_si.0/modalias
/sys/devices/platform/ipmi_si.0/power
/sys/devices/platform/ipmi_si.0/power/wakeup
/sys/devices/platform/ipmi_si.0/power/state
/sys/devices/platform/ipmi_si.0/uevent


# ls -l /dev/ipmi*
crw------- 1 root root 252, 0 Oct 18 11:52  /dev/ipmi0
crw------- 1 root root 252, 1 Oct 18 11:52  /dev/ipmi1


On Tue, 2006-10-17 at 17:22 -0500, Corey Minyard wrote:
> Corey Minyard wrote:
>>> Please let me know what I can do to help.  In the meantime, I'll take a
>>> look at the current code and try to figure out why it's still oopsing.
>>
>> I thought the oops was fixed.  If not, can you send one?
>>
>> As far as things you can do, I'm not really sure.  I don't have enough
>> details on

Re: [Openipmi-developer] ipmi_si appears to be broken on multinode systems in 2.6.18 kernel

2006-10-18 Thread Corey Minyard
Carol Hebert wrote:
> Hi Corey,
>
> This latest patch worked great on my 2-node system! :-D   I'll try to
> get some time on a 4-node and 8-node system asap to test it out on them
> as well.
>
> I've listed below how ipmi and the BMCs are now represented in sysfs.
> Do you still want me to continue working on trying to get some unique
> BMC device ID/GUID change made in the f/w as well (and in the process
> find out what we have now ;-}?  I'm also working on finding out whether
> or not it's guaranteed that the BMCs are listed in node order in the
> SMBIOS table.

It's probably best to get the unique device id in the firmware.  That is
the only sure way to know that a specific IPMI device maps to a specific
node's BMC, and IMHO it's the right way to do things.
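
[Editorial sketch] One way to see what each BMC currently reports is to read
the device_id and guid attributes that appear in the sysfs listing in this
thread (for example under /sys/devices/platform/ipmi_bmc.0007.32). The
helpers below, read_attr() and read_bmc_attr(), are hypothetical names for a
small userland reader; the exact text format of those attributes is not
stated in the thread, so the code only returns the first line as a string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a small text file (such as a sysfs attribute)
 * into buf and strip the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or is empty.
 */
int read_attr(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(path != NULL && buf != NULL && len > 0);

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Build "<dir>/<attr>" and read it. dir would be a BMC directory such as
 * /sys/devices/platform/ipmi_bmc.0007.32 and attr a name such as
 * "device_id" or "guid".
 */
int read_bmc_attr(const char *dir, const char *attr, char *buf, size_t len)
{
	char path[512];
	int n = snprintf(path, sizeof(path), "%s/%s", dir, attr);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;   /* path did not fit */
	return read_attr(path, buf, len);
}
```

Comparing the strings returned for each ipmi_bmc.* directory would show
whether the firmware already gives the BMCs distinct device IDs or GUIDs.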

> Thanks very much for your help and for making my day! :-D

You are welcome.

-Corey
> Carol Hebert

 



> On Tue, 2006-10-17 at

Re: [Openipmi-developer] ipmi_si appears to be broken on multinode systems in 2.6.18 kernel

2006-10-17 Thread Corey Minyard
Corey Minyard wrote:
>> Please let me know what I can do to help.  In the meantime, I'll take a
>> look at the current code and try to figure out why it's still oopsing.
>
> I thought the oops was fixed.  If not, can you send one?
>
> As far as things you can do, I'm not really sure.  I don't have enough
> details on how this hardware works to design a solution.  This is really
> nitty-gritty detail information, like how the nodes map their BMC
> addresses and how the SMBIOS table is populated.  If the BMCs appeared
> in the SMBIOS tables in node order, then the solution is very easy, just
> detect and add 1 for each.  I could just print a warning at startup when
> it detects this and it would probably cover a multitude of future evils :-).

   
I thought about this some more, and I believe it's a good idea to do
this.  The patch was easy, and I have tested it using a simulator.
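
[Editorial sketch] The "detect and add 1 for each" idea quoted above can be
illustrated in userland as follows. This is only a sketch under assumptions:
struct bmc_table, register_bmc() and bmc_name() are hypothetical names, the
table size is arbitrary, and the "ipmi_bmc.%04x.%u" format string is a guess
chosen to resemble the sysfs names shown in this thread (such as
ipmi_bmc.0007.33), not something confirmed by the patch text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_BMCS 16   /* arbitrary bound for this sketch only */

struct bmc_id {
	unsigned int  product_id;   /* 16-bit value held in a wider type */
	unsigned char device_id;
};

struct bmc_table {
	struct bmc_id ids[MAX_BMCS];
	int count;
};

/* Return 1 if the product/device id pair is already registered. */
static int id_in_use(const struct bmc_table *t, struct bmc_id id)
{
	int i;

	for (i = 0; i < t->count; i++)
		if (t->ids[i].product_id == id.product_id &&
		    t->ids[i].device_id == id.device_id)
			return 1;
	return 0;
}

/*
 * Register a BMC. If its id pair is already taken, print a warning and add
 * 1 to the device id until the pair is free. Returns the device id that was
 * finally used, or -1 if the table is full or no free id exists.
 */
int register_bmc(struct bmc_table *t, struct bmc_id id)
{
	int bumps;

	assert(t != NULL);
	if (t->count >= MAX_BMCS)
		return -1;
	for (bumps = 0; id_in_use(t, id); bumps++) {
		if (bumps == 255)
			return -1;   /* every 8-bit device id was tried */
		fprintf(stderr, "warning: duplicate BMC device id %u\n",
			(unsigned int)id.device_id);
		id.device_id++;
	}
	t->ids[t->count++] = id;
	return id.device_id;
}

/* Format a device name; the format string is an assumption (see above). */
int bmc_name(char *buf, size_t len, struct bmc_id id)
{
	return snprintf(buf, len, "ipmi_bmc.%04x.%u",
			id.product_id, (unsigned int)id.device_id);
}
```

With two BMCs that both report product id 0x0007 and device id 32, the
second registration ends up with device id 33. Note that this numbering
only reflects discovery order, which is why the thread keeps returning to
whether the SMBIOS table lists BMCs in node order.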

Note that I found a bug in the product id stuff.  It may be that your
BMCs don't have a *device* GUID or at least a unique device GUID.  (Note
that a device GUID is different than a system GUID, and your system may
only have a system GUID.  The system GUID is supposed to be the same for
the entire system, but each BMC is supposed to have its own unique
device GUID if it supports that).  I was passing a 16-bit value as an
unsigned char in the compare routine, so it never matched based on
product/device id.  So with this patch, either you will get the previous
behavior (if your system supports device GUIDs) or all the BMCs will
appear to be a single BMC with multiple interfaces to it (if device
GUIDs are not supported).

This patch replaces the previous one I sent you.

-Corey

This patch adds the product id to the driver model platform device
name, in addition to the device id.  The IPMI spec does not require
that individual BMCs in a system have unique device IDs, but it
does require that the product id/device id combination be unique.

This also removes a redundant check and cleans up error handling
when the sysfs registration fails.  It also passes in the sysfs
name from the lower-level driver, as the coming IPMI serial driver
will need that to link properly from the serial device sysfs
directory.

Index: linux-2.6.18/drivers/char/ipmi/ipmi_msghandler.c
===
--- linux-2.6.18.orig/drivers/char/ipmi/ipmi_msghandler.c
+++ linux-2.6.18/drivers/char/ipmi/ipmi_msghandler.c
@@ -202,6 +202,7 @@ struct ipmi_smi
 
 	struct bmc_device *bmc;
 	char *my_dev_name;
+	char *sysfs_name;
 
 	/* This is the lower-layer's sender routine. */
 	struct ipmi_smi_handlers *handlers;
@@ -1807,13 +1808,12 @@ static int __find_bmc_prod_dev_id(struct
 	struct bmc_device *bmc = dev_get_drvdata(dev);
 
 	return (bmc->id.product_id == id->product_id
-		&& bmc->id.product_id == id->product_id
 		&& bmc->id.device_id == id->device_id);
 }
 
 static struct bmc_device *ipmi_find_bmc_prod_dev_id(
 	struct device_driver *drv,
-	unsigned char product_id, unsigned char device_id)
+	unsigned int product_id, unsigned char device_id)
 {
 	struct prod_dev_id id = {
 		.product_id = product_id,
@@ -1930,6 +1930,9 @@ static ssize_t guid_show(struct device *
 
 static void remove_files(struct bmc_device *bmc)
 {
+	if (!bmc->dev)
+		return;
+
 	device_remove_file(&bmc->dev->dev,
 			   &bmc->device_id_attr);
 	device_remove_file(&bmc->dev->dev,
@@ -1963,7 +1966,8 @@ cleanup_bmc_device(struct kref *ref)
 	bmc = container_of(ref, struct bmc_device, refcount);
 
 	remove_files(bmc);
-	platform_device_unregister(bmc->dev);
+	if (bmc->dev)
+		platform_device_unregister(bmc->dev);
 	kfree(bmc);
 }
 
@@ -1971,7 +1975,11 @@ static void ipmi_bmc_unregister(ipmi_smi
 {
 	struct bmc_device *bmc = intf-bmc;
 
-	sysfs_remove_link(&intf->si_dev->kobj, "bmc");
+	if (intf->sysfs_name) {
+		sysfs_remove_link(&intf->si_dev->kobj, intf->sysfs_name);
+		kfree(intf->sysfs_name);
+		intf->sysfs_name = NULL;
+	}
 	if (intf->my_dev_name) {
 		sysfs_remove_link(&bmc->dev->dev.kobj, intf->my_dev_name);
 		kfree(intf->my_dev_name);
@@ -1980,6 +1988,7 @@ static void ipmi_bmc_unregister(ipmi_smi
 
 	mutex_lock(&ipmidriver_mutex);
 	kref_put(&bmc->refcount, cleanup_bmc_device);
+	intf->bmc = NULL;
 	mutex_unlock(&ipmidriver_mutex);
 }
 
@@ -1987,6 +1996,56 @@ static int create_files(struct bmc_devic
 {
 	int err;
 
+	bmc->device_id_attr.attr.name = "device_id";
+	bmc->device_id_attr.attr.owner = THIS_MODULE;
+	bmc->device_id_attr.attr.mode = S_IRUGO;
+	bmc->device_id_attr.show = device_id_show;
+
+	bmc->provides_dev_sdrs_attr.attr.name = "provides_device_sdrs";
+	bmc->provides_dev_sdrs_attr.attr.owner = THIS_MODULE;
+	bmc->provides_dev_sdrs_attr.attr.mode = S_IRUGO;
+	bmc->provides_dev_sdrs_attr.show = provides_dev_sdrs_show;
+
+	bmc->revision_attr.attr.name = "revision";
+	bmc->revision_attr.attr.owner = THIS_MODULE;
+	bmc->revision_attr.attr.mode = S_IRUGO;
+	bmc->revision_attr.show = revision_show;
+
+	bmc->firmware_rev_attr.attr.name = "firmware_revision";
+	bmc->firmware_rev_attr.attr.owner = 

Re: [Openipmi-developer] ipmi_si appears to be broken on multinode systems in 2.6.18 kernel

2006-10-11 Thread Carol Hebert

Hi,

I believe your assessment of my x460 dual-node system configuration is
correct, with the exception of maybe changing the word "slot" to
"system", since the nodes are joined by scalability cables rather than
being connected via a common backplane.

Regarding the uniqueness of the Device ID, I think you mentioned in an
earlier email that the spec was a bit contradictory on the topic.  I
took a look at the spec and agree that it is not at all clear whether
the Device ID should be unique for all controllers or only for ones that
support a different set of application commands/OEM fields.  In one
paragraph, it states that:

"Controllers that implement identical sets of applications (sic)
commands can have the same Device ID in a given system. Thus a
'standardized' controller could be produced where multiple instances of
the controller are used in a system, and all have the same Device ID
value.  The controllers would still be differentiable by their
address..."

and in the *immediately following* paragraph, it states:

"A controller can optionally use the Device ID as an 'instance'
identifier if more than one controller of that kind is used in the
system."  (It then goes on to say that the GUID, however, is the
preferred method of uniquely identifying controllers.)

Sheesh.  :-}  

In checking out the dmidecode data, I verified that the addresses of the
controllers on the multi-node system are unique and available there.  So
both the GUID and the address are unique for the controllers on the
multi-node system whereas the Device ID is not.  Can we use the GUID
(maybe in some more easily digestible form) or the address instead of
the Device ID?   It seems like the only thing that's clear from the spec
is that the Device ID's uniqueness is something we can't count on.

Regarding the ipmi device support currently being fixed at a max of 4,
the largest multi-node configuration we currently have is 8 so we would
need to have the table size bumped up to at least 8.  However, for
future support, it might be useful to increase it even more (12, 16?).

Finally, I don't believe dynamic node plugging will generally be an
issue for my system since the nodes are merged at boot time rather than
being dynamically added and/or removed.

Thanks very much,

Carol Hebert

On Wed, 2006-10-11 at 10:25 -0500, Corey Minyard wrote:
 Now the driver is doing exactly what it is supposed to do, but now that
 may not be what we want.  I'm not sure of the configuration of this
 system, but the information below gives me some clues.  Here's my guess
 on the system:
 
 This is a NUMA system with hot-plug CPU boards.  Each board has an IPMI
 controller on it.  The BIOS maps the I/O address and SMBIOS tables for
 the IPMI controller to different I/O locations based upon the slot the
 board is in.  There are a number of problems beyond this one for a
 configuration of this nature.  I'll address those later.
 
 In response to your question, I believe this is exactly what the Device
 ID in IPMI is intended for.  Each board in the system should have a
 unique device id based upon the slot it is in.  Say you have an
 application that monitors the CPU temperature of all the CPUs.  If a
 temperature goes out of range, you want to know which board that CPU is
 on.  And the Device ID can tell you that.  The IPMI device numbers that
 you suggest using are arbitrary, especially in a hot-plug system where
 devices can come and go dynamically.
 
 In addition, you would probably want to be able to do udev mappings so
 that the same slots appear as the same device names (slot 1 is
 /dev/ipmi1, slot 2 is /dev/ipmi2, etc.).  The driver needs to be able to
 give udev information about the devices, and the Product ID/Device ID is
 really all it's got.
 
 Now for the other problems:
 
1. The IPMI driver doesn't currently support an arbitrary number of
   devices.  It has a fixed table of four.  I can fix this fairly
   easily, though.  I wasn't really expecting a system to be designed
   like this.
2. The IPMI driver has no way to handle dynamic node plugging.  I
   don't know of a standard way to tell the IPMI driver: "Hey, you
   have a new controller here."  The driver should support adding new
   devices dynamically, but I need some way to know the device is
   there, or that it is going away.
3. I don't think the IPMI driver provides a way for sysfs to report
   the information that udev needs to do the udev mappings properly.
   As always with sysfs, this is probably easy once you spend 2 days
   figuring out what to do.
 
 Am I on the right track here?




Re: [Openipmi-developer] ipmi_si appears to be broken on multinode systems in 2.6.18 kernel

2006-10-11 Thread Corey Minyard
Carol Hebert wrote:
 Hi,

 I believe your assessment of my x460 dual-node system configuration is
 correct, with the exception of maybe changing the word "slot" to
 "system", since the nodes are joined by scalability cables rather than
 being connected via a common backplane.

 Regarding the uniqueness of the Device ID, I think you mentioned in an
 earlier email that the spec was a bit contradictory on the topic.  I
 took a look at the spec and agree that it is not at all clear whether
 the Device ID should be unique for all controllers or only for ones that
 support a different set of application commands/OEM fields.  In one
 paragraph, it states that:

 "Controllers that implement identical sets of applications (sic)
 commands can have the same Device ID in a given system. Thus a
 'standardized' controller could be produced where multiple instances of
 the controller are used in a system, and all have the same Device ID
 value.  The controllers would still be differentiable by their
 address..."

 and in the *immediately following* paragraph, it states:

 "A controller can optionally use the Device ID as an 'instance'
 identifier if more than one controller of that kind is used in the
 system."  (It then goes on to say that the GUID, however, is the
 preferred method of uniquely identifying controllers.)

 Sheesh.  :-}  

 In checking out the dmidecode data, I verified that the addresses of the
 controllers on the multi-node system are unique and available there.  So
 both the GUID and the address are unique for the controllers on the
 multi-node system whereas the Device ID is not.  Can we use the GUID
 (maybe in some more easily digestible form) or the address instead of
 the Device ID?   It seems like the only thing that's clear from the spec
 is that the Device ID's uniqueness is something we can't count on.
   
The "easily digestible form" part is the problem here.  You need some
method to correlate a GUID to something a human being can use to
identify a system, and it would be nice if it wasn't custom for every
installed system out there.  The address is perhaps better, but I would
need some way to translate the addresses to system numbers, and that
translation would have to be OEM-specific for this type of hardware.
Also, the addresses are not available at the level where this is
happening; this code is generic for all interface types.

So what we can do, in my order of preference :-) :

   1. Modify the IPMI firmware to set the device id to a unique number
      for every BMC in the system.  It would be really nice if this was
      done in a way that the device ids could be correlated with
      physical systems.  This will work with the IPMI driver as-is, and
      I checked and udev translations can be done as-is, too, I believe.
   2. Use some OEM IPMI command that could query the physical system
      number, if something like this exists.
   3. Create an OEM handler to use the GUID to map to physical systems.
      I'm going to need some help with this, I have no idea how to do
      this.  Looking at the GUID format (Table 20-10 in the IPMI 2.0
      spec), I don't see any way to do this.  The node field, BTW, is
      supposed to be the 802.x MAC address.
   4. Use the I/O address.  This introduces a lot of headaches into the
      structure of the IPMI driver as the address has to be propagated
      from the interface-specific handler to the generic code, and it
      introduces an OEM handler.  And I'll need some way to map the I/O
      addresses to physical systems.

Any more ideas?

-Corey
 Regarding the ipmi device support currently being fixed at a max of 4,
 the largest multi-node configuration we currently have is 8 so we would
 need to have the table size bumped up to at least 8.  However, for
 future support, it might be useful to increase it even more (12, 16?).
   
I'll probably just make it a list and get rid of the table so it can
handle an arbitrary number of interfaces.
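
To make the table-versus-list idea concrete, here is a minimal user-space sketch of interfaces kept on a singly linked list instead of in a fixed four-entry table. The names `intf_node`, `intf_list`, `intf_list_add`, `intf_list_count` and `intf_list_free` are hypothetical and chosen only for illustration; the actual driver change may look quite different (for example, it would need locking).

```c
#include <assert.h>
#include <stdlib.h>

/* One registered interface; hypothetical stand-in for the driver's struct. */
struct intf_node {
	int intf_num;
	struct intf_node *next;
};

/* The list head plus the next interface number to hand out. */
struct intf_list {
	struct intf_node *head;
	int next_num;
};

/* Register a new interface; returns its number, or -1 if allocation fails. */
int intf_list_add(struct intf_list *list)
{
	struct intf_node *node = malloc(sizeof(*node));

	if (!node)
		return -1;
	node->intf_num = list->next_num++;
	node->next = list->head;
	list->head = node;
	return node->intf_num;
}

/* Count the interfaces currently on the list. */
int intf_list_count(const struct intf_list *list)
{
	const struct intf_node *node;
	int count = 0;

	for (node = list->head; node; node = node->next)
		count++;
	return count;
}

/* Free every node and leave the list empty. */
void intf_list_free(struct intf_list *list)
{
	while (list->head) {
		struct intf_node *node = list->head;

		list->head = node->next;
		free(node);
	}
}
```

With this shape, 8 or 16 nodes need no compile-time constant; the only limit is available memory.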
 Finally, I don't believe dynamic node plugging will generally be an
 issue for my system since the nodes are merged at boot time rather than
 being dynamically added and/or removed.
   
So the time is not here yet, but I'm sure it's coming someday :)  I can
wait on this one, then, but I decided it would be pretty easy to do
through the hotplug subsystem.

-Corey
 Thanks very much,

 Carol Hebert

 On Wed, 2006-10-11 at 10:25 -0500, Corey Minyard wrote:
   
 Now the driver is doing exactly what it is supposed to do, but now that
 may not be what we want.  I'm not sure of the configuration of this
 system, but the information below gives me some clues.  Here's my guess
 on the system:

 This is a NUMA system with hot-plug CPU boards.  Each board has an IPMI
 controller on it.  The BIOS maps the I/O address and SMBIOS tables for
 the IPMI controller to different I/O locations based upon the slot the
 board is in.  There are a number of problems beyond this one for a
 configuration of this nature.  I'll address those later.

 In response to your 

Re: [Openipmi-developer] ipmi_si appears to be broken on multinode systems in 2.6.18 kernel

2006-10-10 Thread Carol Hebert
Hi Corey,

I'm still having problems with the new patches due to the device ID and
the Product ID being the same on each of the nodes (still have
segfault/oops).  The dual node system is really two separate nodes that
are joined at will (via RSA setup).  Since each began life (and can
resume life at any time) as a standalone system, isn't it reasonable
that they could have the same BMC Product and Device IDs?  If not, do
you think this is something that could/should be changed/set in the BIOS
for each BMC on multi-node systems?

Alternately, would it be possible to differentiate between the two BMCs
for sysfs file naming purposes by using the value of intf->intf_num in
ipmi_bmc_register()?  I believe that's pretty similar to what's
currently done to differentiate between the ipmi.0 and ipmi.1
interfaces.  As an example, I tacked the intf_num onto the product id in
ipmi_bmc_register() (your and Jeff's patched version of the
ipmi_msghandler.c file):

         } else {
-                char name[14];
+                char name[16];
                 snprintf(name, sizeof(name),
-                         "ipmi_bmc.%4.4x", bmc->id.product_id);
+                         "ipmi_bmc.%4.4x%d", bmc->id.product_id,
+                         intf->intf_num);

and the modules loaded fine.  The file names become:  ipmi_bmc.00070.32
and ipmi_bmc.00071.32 (see debug trace below).  I suspect I may be
grossly oversimplifying the feasibility/usability/implementation of this
solution but at first glance/touch test, it appears to work so I thought
it might be good to discuss it.
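
The naming change above can be tried outside the kernel with a small sketch. The helpers `bmc_name_by_product` and `bmc_name_by_product_and_intf` are hypothetical; they assume, based on the names in the debug traces, that the platform device shows up in sysfs as the name followed by a dot and the device id in decimal.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Name built from the product id only, plus the ".<device id>" suffix. */
int bmc_name_by_product(char *buf, size_t len,
			unsigned int product_id, unsigned char device_id)
{
	return snprintf(buf, len, "ipmi_bmc.%4.4x.%d", product_id, device_id);
}

/* Same, but with the interface number appended after the product id. */
int bmc_name_by_product_and_intf(char *buf, size_t len,
				 unsigned int product_id, int intf_num,
				 unsigned char device_id)
{
	return snprintf(buf, len, "ipmi_bmc.%4.4x%d.%d",
			product_id, intf_num, device_id);
}
```

For two BMCs that both report product id 0x0007 and device id 0x20, the first helper produces the same string twice, while the second produces distinct names for interface 0 and interface 1.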

Anyway, thanks again for your help.  Please let me know what you'd like
me to try next.  Also, I can probably get some time on a 4-node and/or
an 8-node system so we can really stress the solution once we've settled
on a fix.

Thanks much,

Carol Hebert  

-

kobject ipmi_msghandler: registering. parent: NULL, set: module
kobject_uevent
fill_kobj_path: path = '/module/ipmi_msghandler'
kobject ipmi: registering. parent: NULL, set: drivers
kobject_uevent
fill_kobj_path: path = '/bus/platform/drivers/ipmi'
ipmi message handler version 39.0
kobject ipmi_devintf: registering. parent: NULL, set: module
kobject_uevent
fill_kobj_path: path = '/module/ipmi_devintf'
ipmi device interface
subsystem ipmi: registering
kobject ipmi: registering. parent: NULL, set: class
kobject ipmi_si: registering. parent: NULL, set: module
kobject_uevent
fill_kobj_path: path = '/module/ipmi_si'
kobject ipmi_si: registering. parent: NULL, set: drivers
kobject_uevent
fill_kobj_path: path = '/bus/platform/drivers/ipmi_si'
IPMI System Interface driver.
ipmi_si: Trying SMBIOS-specified KCS state machine at I/O address
0x90a8, slave address 0x20, irq 0
kobject ipmi_si.0: registering. parent: platform, set: devices
PM: Adding info for platform:ipmi_si.0
kobject_uevent
fill_kobj_path: path = '/devices/platform/ipmi_si.0'
CAH: ipmi: NEW BMC: name =  ipmi_bmc.00070;  intf_num = 0
kobject ipmi_bmc.00070.32: registering. parent: platform, set: devices
PM: Adding info for platform:ipmi_bmc.00070.32
kobject_uevent
fill_kobj_path: path = '/devices/platform/ipmi_bmc.00070.32'
ipmi: Found new BMC (man_id: 0x02,  prod_id: 0x0007, dev_id: 0x20)
kobject ipmi0: registering. parent: ipmi, set: class_obj
kobject_uevent
fill_kobj_path: path = '/class/ipmi/ipmi0'
fill_kobj_path: path = '/devices/platform/ipmi_si.0'
 IPMI KCS interface initialized
ipmi_si: Trying SMBIOS-specified KCS state machine at I/O address 0xca8,
slave address 0x20, irq 0
kobject ipmi_si.1: registering. parent: platform, set: devices
PM: Adding info for platform:ipmi_si.1
kobject_uevent
fill_kobj_path: path = '/devices/platform/ipmi_si.1'
CAH: ipmi: NEW BMC: name =  ipmi_bmc.00071;  intf_num = 1
kobject ipmi_bmc.00071.32: registering. parent: platform, set: devices
PM: Adding info for platform:ipmi_bmc.00071.32
kobject_uevent
fill_kobj_path: path = '/devices/platform/ipmi_bmc.00071.32'
ipmi: Found new BMC (man_id: 0x02,  prod_id: 0x0007, dev_id: 0x20)
kobject ipmi1: registering. parent: ipmi, set: class_obj
kobject_uevent
fill_kobj_path: path = '/class/ipmi/ipmi1'
fill_kobj_path: path = '/devices/platform/ipmi_si.1'
 IPMI KCS interface initialized
kobject ipmi_si: registering. parent: NULL, set: drivers
kobject_uevent
fill_kobj_path: path = '/bus/pci/drivers/ipmi_si'


On Tue, 2006-10-10 at 10:49 -0500, Corey Minyard wrote: 
 Sorry, I messed up the error recovery in the previous patch.  This one
 should fix it; I've simulated this and it works fine.  I've also
 included a patch from Jeff Garzik that does some more cleanup.  Jeff's
 patch must be applied first; it is named ipmi-handle-sysfs-errors.patch.
 
 I'm still not sure what to do about the naming problem, though.  I am
 assuming the two devices have different GUIDs, otherwise they would
 show up as the same BMC.  I'd prefer to not use the GUID, as it is
 huge and meaningless to humans and applications.
 
 I re-read the section in the spec again, and I really believe it is the
 intent that different BMCs 

Re: [Openipmi-developer] ipmi_si appears to be broken on multinode systems in 2.6.18 kernel

2006-10-09 Thread Carol Hebert
Hi Corey,

Thanks very much for the patch.  :-)  I built it and ran it on my system
and it works a bit better than the original but it still has some
problems.  I'm attaching the dmesg output below (with a bit of debug
turned on in it).  

With the patch, the modprobe appears to create one of the two ipmi
device nodes (ipmi0) expected for the dual-node system although modprobe
of ipmi_si appears to hang.  Could you please take a look at the error
messages below and see if you can spot the problem?

Thanks much again,

Carol Hebert

-
kobject ipmi_msghandler: registering. parent: NULL, set: module
kobject_uevent
fill_kobj_path: path = '/module/ipmi_msghandler'
kobject ipmi: registering. parent: NULL, set: drivers
kobject_uevent
fill_kobj_path: path = '/bus/platform/drivers/ipmi'
ipmi message handler version 39.0
kobject ipmi_devintf: registering. parent: NULL, set: module
kobject_uevent
fill_kobj_path: path = '/module/ipmi_devintf'
ipmi device interface
subsystem ipmi: registering
kobject ipmi: registering. parent: NULL, set: class
kobject ipmi_si: registering. parent: NULL, set: module
kobject_uevent
fill_kobj_path: path = '/module/ipmi_si'
kobject ipmi_si: registering. parent: NULL, set: drivers
kobject_uevent
fill_kobj_path: path = '/bus/platform/drivers/ipmi_si'
IPMI System Interface driver.
ipmi_si: Trying SMBIOS-specified KCS state machine at I/O address
0x90a8, slave address 0x20, irq 0
kobject ipmi_si.0: registering. parent: platform, set: devices
PM: Adding info for platform:ipmi_si.0
kobject_uevent
fill_kobj_path: path = '/devices/platform/ipmi_si.0'
kobject ipmi_bmc.0007.32: registering. parent: platform, set: devices
PM: Adding info for platform:ipmi_bmc.0007.32
kobject_uevent
fill_kobj_path: path = '/devices/platform/ipmi_bmc.0007.32'
ipmi: Found new BMC (man_id: 0x02,  prod_id: 0x0007, dev_id: 0x20)
kobject ipmi0: registering. parent: ipmi, set: class_obj
kobject_uevent
fill_kobj_path: path = '/class/ipmi/ipmi0'
fill_kobj_path: path = '/devices/platform/ipmi_si.0'
 IPMI KCS interface initialized
ipmi_si: Trying SMBIOS-specified KCS state machine at I/O address 0xca8,
slave address 0x20, irq 0
kobject ipmi_si.1: registering. parent: platform, set: devices
PM: Adding info for platform:ipmi_si.1
kobject_uevent
fill_kobj_path: path = '/devices/platform/ipmi_si.1'
kobject ipmi_bmc.0007.32: registering. parent: platform, set: devices
kobject_add failed for ipmi_bmc.0007.32 with -EEXIST, don't try to
register things with the same name in the same directory.
 [c04051ed] show_trace_log_lvl+0x58/0x16a
 [c04057fa] show_trace+0xd/0x10
 [c0405913] dump_stack+0x19/0x1b
 [c04e8892] kobject_add+0x186/0x1ac
 [c0552041] device_add+0x7a/0x2de
 [c0554eb3] platform_device_add+0xde/0x10e
 [c0554ef8] platform_device_register+0x15/0x18
 [f9780c28] ipmi_register_smi+0x563/0x987 [ipmi_msghandler]
 [f978ee5e] try_smi_init+0x3ff/0x5a7 [ipmi_si]
 [f978f99e] init_ipmi_si+0x40f/0x6db [ipmi_si]
 [c04423de] sys_init_module+0x16ad/0x1856
 [c0403fb7] syscall_call+0x7/0xb
DWARF2 unwinder stuck at syscall_call+0x7/0xb
Leftover inexact backtrace:
 [c04057fa] show_trace+0xd/0x10
 [c0405913] dump_stack+0x19/0x1b
 [c04e8892] kobject_add+0x186/0x1ac
 [c0552041] device_add+0x7a/0x2de
 [c0554eb3] platform_device_add+0xde/0x10e
 [c0554ef8] platform_device_register+0x15/0x18
 [f9780c28] ipmi_register_smi+0x563/0x987 [ipmi_msghandler]
 [f978ee5e] try_smi_init+0x3ff/0x5a7 [ipmi_si]
 [f978f99e] init_ipmi_si+0x40f/0x6db [ipmi_si]
 [c04423de] sys_init_module+0x16ad/0x1856
 [c0403fb7] syscall_call+0x7/0xb
kobject ipmi_bmc.0007.32: cleaning up
ipmi_msghandler: Unable to register bmc device: -17
ipmi_si: Unable to register device: error -17
BUG: unable to handle kernel paging request at virtual address 6b6b6c73
 printing eip:
c04ab7f4
*pde = 
Oops:  [#1]
SMP
last sysfs file: /class/drm/card0/dev
Modules linked in: ipmi_si(U) ipmi_devintf(U) ipmi_msghandler(U)
radeon(U) drm(U) autofs4(U) hidp(U) rfcomm(U) l2cap(U) bluetooth(U)
sunrpc(U) ipv6(U) acpi_cpufreq(U) video(U) sbs(U) i2c_ec(U) button(U)
battery(U) asus_acpi(U) ac(U) parport_pc(U) lp(U) parport(U) joydev(U)
sg(U) i2c_piix4(U) ide_cd(U) i2c_core(U) aacraid(U) tg3(U) cdrom(U)
serio_raw(U) pcspkr(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_mod(U)
aic94xx(U) libsas(U) scsi_transport_sas(U) sd_mod(U) scsi_mod(U) ext3(U)
jbd(U) ehci_hcd(U) ohci_hcd(U) uhci_hcd(U)
CPU:14
EIP:0060:[c04ab7f4]Not tainted VLI
EFLAGS: 00010212   (2.6.18-ipmipatch #3)
EIP is at sysfs_remove_link+0x1/0xd
eax: 6b6b6c43   ebx: f54c876c   ecx: c042d7c9   edx: f9781b20
esi: 6b6b6b6b   edi: f54c876c   ebp: f4894e58   esp: f4894e48
ds: 007b   es: 007b   ss: 0068
Process modprobe (pid: 5643, ti=f4894000 task=f6c5e030 task.ti=f4894000)
Stack: f4894e58 f977fec3 ffef  f4894e6c f9780564 ffef
dfc0db38
   ffef f4894e84 f978ef34 0118f8be 0ca8 0004 
f4894eac
   f978f99e  0004 

Re: [Openipmi-developer] ipmi_si appears to be broken on multinode systems in 2.6.18 kernel

2006-10-08 Thread Corey Minyard
The basic problem is that platform_device_alloc() is being called with
the device id, but not the product id, as part of the name.  According
to the spec, the combination of the two is required to be unique on a
machine.  But the device id is the same on both BMCs, it appears.
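
The failure mode can be modelled in user space with a tiny name registry that refuses duplicates. This is only a sketch: `name_dir`, `name_dir_add` and `name_from_device_id` are hypothetical names, and the registry merely imitates the way a sysfs directory rejects a second entry with the same name.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAMES 8
#define NAME_LEN  32

/* Hypothetical stand-in for one sysfs directory. */
struct name_dir {
	char names[MAX_NAMES][NAME_LEN];
	int count;
};

/* Add a name; a duplicate is rejected with -EEXIST. */
int name_dir_add(struct name_dir *dir, const char *name)
{
	int i;

	for (i = 0; i < dir->count; i++)
		if (strcmp(dir->names[i], name) == 0)
			return -EEXIST;
	if (dir->count == MAX_NAMES)
		return -ENOSPC;
	snprintf(dir->names[dir->count], NAME_LEN, "%s", name);
	dir->count++;
	return 0;
}

/* Name built from the device id alone, as seen in the failing trace. */
void name_from_device_id(char *buf, size_t len, unsigned char device_id)
{
	snprintf(buf, len, "ipmi_bmc.%d", device_id);
}
```

Two BMCs that both report device id 0x20 produce the same name, so the second registration fails, which corresponds to the "-EEXIST" and "error -17" messages in the reported log.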

Carol, can you confirm that the product IDs are different?  They are
printed at driver load time.

I'll get a patch soon.

-Corey

Yani Ioannou wrote:
 Hi Carol,

 On 10/6/06, Carol Hebert [EMAIL PROTECTED] wrote:
   
 I believe I may have found a problem with the ipmi driver v39 in the
 2.6.18 kernel when loaded on multi-node systems (in my particular case,
 a dual-node x460 with two BMCs).  At first glance, it appears the
 problem may be in the sysfs code added last January -- it looks like it
 may not be handling the multiple BMCs correctly.   The result is that
 the ipmi_si module won't load and the ipmi device nodes don't get
 created.
 

 I guess I shouldn't be surprised - it's very hard to find someone with
 access to a system with multiple BMCs (not just multiple interfaces)
 who is willing to test this out; I only have access to an old
 HP workstation with a rudimentary IPMI 1.0 card myself.

   
 I'm only starting to debug the issue but wanted to let you know what
 I've seen asap in case someone's already spotted this problem but I
 missed seeing a patch and also because I'm not a sysfs expert and I
 don't know what the original intent was for how to present multiple BMCs
 (from multi-node systems) in the sysfs.
 

 I did write the code to handle multiple BMCs, but it looks like I
 overlooked something.  From your backtrace, at first glance it appears
 that some sysfs file is being duplicated in the same directory. Could
 you perhaps turn on sysfs/kobject debugging in the kernel debugging
 options?

 Thanks,
 Yani

 ___
 Openipmi-developer mailing list
 Openipmi-developer@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/openipmi-developer
   




Re: [Openipmi-developer] ipmi_si appears to be broken on multinode systems in 2.6.18 kernel

2006-10-08 Thread Corey Minyard
Hopefully the attached patch will fix the problem and clean up the error
handling in this failure case.
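
For readers following along, here is a user-space sketch of the kind of guarded teardown that avoids touching state that was never set up or has already been released. It is not the attached patch. `fake_intf`, `fake_intf_register` and `fake_intf_unregister` are hypothetical names, and the `links_removed` counter stands in for a call to sysfs_remove_link(). As background, the 0x6b6b6b6b values in the register dump are consistent with the kernel's poison pattern for freed slab memory, which suggests the unregister path was working on an object that had already been freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical interface state; only the fields needed for the sketch. */
struct fake_intf {
	char *sysfs_name;	/* NULL until the link has been created */
	int links_removed;	/* counts simulated link removals */
};

/* Portable string duplicate using malloc. */
static char *dup_name(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}

/* Record the link name; returns 0 on success, -1 on allocation failure. */
int fake_intf_register(struct fake_intf *intf, const char *name)
{
	intf->sysfs_name = dup_name(name);
	return intf->sysfs_name ? 0 : -1;
}

/* Safe to call after a failed registration and safe to call twice. */
void fake_intf_unregister(struct fake_intf *intf)
{
	if (intf->sysfs_name) {
		intf->links_removed++;
		free(intf->sysfs_name);
		intf->sysfs_name = NULL;
	}
}
```

The key point is that the pointer is checked before use and cleared after release, so a second call or a call on a half-initialised interface does nothing.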

-Corey

Carol Hebert wrote:
 Hi Corey,

 I believe I may have found a problem with the ipmi driver v39 in the
 2.6.18 kernel when loaded on multi-node systems (in my particular case,
 a dual-node x460 with two BMCs).  At first glance, it appears the
 problem may be in the sysfs code added last January -- it looks like it
 may not be handling the multiple BMCs correctly.   The result is that
 the ipmi_si module won't load and the ipmi device nodes don't get
 created.

 I'm only starting to debug the issue but wanted to let you know what
 I've seen asap in case someone's already spotted this problem but I
 missed seeing a patch and also because I'm not a sysfs expert and I
 don't know what the original intent was for how to present multiple BMCs
 (from multi-node systems) in the sysfs.

 I'm pasting the stack backtrace below.  Please let me know if you have
 any suggestions or questions.

 Thanks much,

 Carol Hebert


 ipmi message handler version 39.0
 IPMI System Interface driver.
 ipmi_si: Trying SMBIOS-specified KCS state machine at I/O address
 0x90a8, slave address 0x20, irq 0
 PM: Adding info for platform:ipmi_si.0
 PM: Adding info for platform:ipmi_bmc.32
 ipmi: Found new BMC (man_id: 0x02,  prod_id: 0x0007, dev_id: 0x20)
  IPMI KCS interface initialized
 ipmi_si: Trying SMBIOS-specified KCS state machine at I/O address 0xca8,
 slave address 0x20, irq 0
 PM: Adding info for platform:ipmi_si.1
 kobject_add failed for ipmi_bmc.32 with -EEXIST, don't try to register
 things with the same name in the same directory.
  [c04051e3] show_trace_log_lvl+0x58/0x16a
  [c04057f0] show_trace+0xd/0x10
  [c0405900] dump_stack+0x19/0x1b
  [c04e7529] kobject_add+0x14b/0x171
  [c0550ced] device_add+0x7a/0x2de
  [c0553b5f] platform_device_add+0xde/0x10e
  [c0553ba4] platform_device_register+0x15/0x18
  [f8b09bf2] ipmi_register_smi+0x538/0x94a [ipmi_msghandler]
  [f980be5e] try_smi_init+0x3ff/0x5a7 [ipmi_si]
  [f980c99e] init_ipmi_si+0x40f/0x6db [ipmi_si]
  [c04427ee] sys_init_module+0x16ad/0x1856
  [c0403fb7] syscall_call+0x7/0xb
 DWARF2 unwinder stuck at syscall_call+0x7/0xb
 Leftover inexact backtrace:
  [c04057f0] show_trace+0xd/0x10
  [c0405900] dump_stack+0x19/0x1b
  [c04e7529] kobject_add+0x14b/0x171
  [c0550ced] device_add+0x7a/0x2de
  [c0553b5f] platform_device_add+0xde/0x10e
  [c0553ba4] platform_device_register+0x15/0x18
  [f8b09bf2] ipmi_register_smi+0x538/0x94a [ipmi_msghandler]
  [f980be5e] try_smi_init+0x3ff/0x5a7 [ipmi_si]
  [f980c99e] init_ipmi_si+0x40f/0x6db [ipmi_si]
  [c04427ee] sys_init_module+0x16ad/0x1856
  [c0403fb7] syscall_call+0x7/0xb
 ipmi_msghandler: Unable to register bmc device: -17
 ipmi_si: Unable to register device: error -17
 BUG: unable to handle kernel paging request at virtual address 6b6b6c73
  printing eip:
 c04aa1d4
 *pde = 6b6b6b6b
 Oops:  [#1]
 SMP
 last sysfs file: /class/drm/card0/dev
 Modules linked in: ipmi_si ipmi_msghandler radeon drm autofs4 hidp
 rfcomm l2cap bluetooth sunrpc ipv6 acpi_cpufreq video sbs i2c_ec button
 battery asus_acpi ac parport_pc lp parport joydev sg pcspkr tg3 aacraid
 i2c_piix4 i2c_core ide_cd cdrom serio_raw dm_snapshot dm_zero dm_mirror
 dm_mod aic94xx libsas scsi_transport_sas sd_mod scsi_mod ext3 jbd
 ehci_hcd ohci_hcd uhci_hcd
 CPU:8
 EIP:0060:[c04aa1d4]Not tainted VLI
 EFLAGS: 00010212   (2.6.18-1.2702.el5PAE #1)
 EIP is at sysfs_remove_link+0x1/0xd
 eax: 6b6b6c43   ebx: e722ad78   ecx: c042dc05   edx: f8b0aad8
 esi: 6b6b6b6b   edi: e722ad78   ebp: e7152e58   esp: e7152e48
 ds: 007b   es: 007b   ss: 0068
 Process modprobe (pid: 20599, ti=e7152000 task=f72b0030
 task.ti=e7152000)
 Stack: e7152e58 f8b08ebf ffef  e7152e6c f8b09559 ffef
 eeb70248
ffef e7152e84 f980bf34 0118c8be 0ca8 0004 
 e7152eac
f980c99e  0004 d1c2d700 010020ac 0ca8 f9814480
 f9814480
 Call Trace:
  [f8b08ebf] ipmi_bmc_unregister+0x1c/0x63 [ipmi_msghandler]
  [f8b09559] ipmi_unregister_smi+0xf/0xc3 [ipmi_msghandler]
  [f980bf34] try_smi_init+0x4d5/0x5a7 [ipmi_si]
  [f980c99e] init_ipmi_si+0x40f/0x6db [ipmi_si]
  [c04427ee] sys_init_module+0x16ad/0x1856
  [c0403fb7] syscall_call+0x7/0xb
 DWARF2 unwinder stuck at syscall_call+0x7/0xb
 Leftover inexact backtrace:
  [c040537f] show_stack_log_lvl+0x8a/0x95
  [c04054b7] show_registers+0x12d/0x19a
  [c04056b4] die+0x190/0x293
  [c0613331] do_page_fault+0x4e8/0x5ba
  [c0404be9] error_code+0x39/0x40
  [f8b09559] ipmi_unregister_smi+0xf/0xc3 [ipmi_msghandler]
  [f980bf34] try_smi_init+0x4d5/0x5a7 [ipmi_si]
  [f980c99e] init_ipmi_si+0x40f/0x6db [ipmi_si]
  [c04427ee] sys_init_module+0x16ad/0x1856
  [c0403fb7] syscall_call+0x7/0xb
 Code: f1 f8 ff 8b 45 f0 e8 06 d0 03 00 8b 45 ec e8 fe cf 03 00 8b 55 e4
 8b 4d e0 8b 41 1c 89 54 81 20 83 c4 14 31 c0 5b 5e 5f 5d c3 55 8b 40
 30 89 e5 e8 d0 e4 ff ff 5d c3 55 89 e5 57 56 89 ce 53 83
 EIP: [c04aa1d4] 

Re: [Openipmi-developer] ipmi_si appears to be broken on multinode systems in 2.6.18 kernel

2006-10-06 Thread Yani Ioannou
Hi Carol,

On 10/6/06, Carol Hebert [EMAIL PROTECTED] wrote:
 I believe I may have found a problem with the ipmi driver v39 in the
 2.6.18 kernel when loaded on multi-node systems (in my particular case,
 a dual-node x460 with two BMCs).  At first glance, it appears the
 problem may be in the sysfs code added last January -- it looks like it
 may not be handling the multiple BMCs correctly.   The result is that
 the ipmi_si module won't load and the ipmi device nodes don't get
 created.

I guess I shouldn't be surprised - it's very hard to find someone with
access to a system with multiple BMCs (not just multiple interfaces)
who is willing to test this out; I only have access to an old
HP workstation with a rudimentary IPMI 1.0 card myself.

 I'm only starting to debug the issue but wanted to let you know what
 I've seen asap in case someone's already spotted this problem but I
 missed seeing a patch and also because I'm not a sysfs expert and I
 don't know what the original intent was for how to present multiple BMCs
 (from multi-node systems) in the sysfs.

I did write the code to handle multiple BMCs, but it looks like I
overlooked something.  From your backtrace, at first glance it appears
that some sysfs file is being duplicated in the same directory. Could
you perhaps turn on sysfs/kobject debugging in the kernel debugging
options?

Thanks,
Yani
