Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Kenji-san, I have been thinking about this problem for quite a bit, and think that there are no good solutions... * Kenji Kaneshige <[EMAIL PROTECTED]>: > On my system, hotplug slots themselves can be added, removed > and replaced with the ohter type of I/O box. >> Are you talking about some sort of I/O cabinet/chassis that you >> can attach to the actual computer? Can the I/O expander unit be >> hotplugged? Or do you need to power your machine down to attach >> it? >> If you can hotplug it, I'm guessing that is why your firmware >> presents SxFy objects in the namespace with "weird" _SUN values, >> and it's why you have to check _STA to see if the slots are valid >> or not. That means the value returned by _SUN will change too, >> right? What will it turn into? >> > > Currently, it's not hotpluggable (will be hotpluggable in the future). > Here is a sample AML code to explain what my firmware is doing. > > Device (PCI0) { > Device (P2PA) { > Device (P2PB) { // for I/O unit (A) > Name (_ADR, ...) > Method (_STA) { ... } > } > Device (S0F0) { // for I/O unit (B) > Name (_ADR, ...) > Method (_STA) { ... } > Method (_EJx) { ... } > Method (_SUN) { ... } > } > ... > } > ... > } > > If the I/O unit (A) is connected, _STA of P2PB returns as present > and _STA of S0F0 returns as not present. > If the I/O unit (B) is connected, _STA of P2PB returns as not > present and _STA of S0F0 returns as present. If I/O unit A or B can never appear while the system is turned on (aka not hotpluggable), then it is incorrect to present them in the current namespace. >>> In addtion, I think we should not trust the _SUN value of >>> non-existing device because the ACPI spec says in "6.5.1 _INI >>> (Init)" that _INI method is run before _ADR, _CID, _HID, _SUN, and >>> _UID are run. It means _SUN could be initialized in _INI method >>> implecitely. And it also says that "If the _STA method indicates >>> that the device is not present, OSPM will not run the _INI and will >>> not examine the children of the device for _INI methods.". After all, >>> _SUN for non-existing device is not reliable because it might not >>> initialized by _INI method. >> This is true, but HP platforms provide _INI at the root >> device/host bridge level, not on SxFy objects, so it doesn't seem >> that we would need to call _STA before calling _SUN for SxFy. >> Does your firmware provide _INI on SxFy objects? > > No, it doesn't. But what I wanted to say was we should not use _SUN > value of non-existing device object. There is nothing illegal about evaluating _SUN for an object that returns 0x0 for _STA. Also, when you say "non-existing", I think of the ACPI CA exception code AE_NOT_EXIST which means "absent from the namespace", and is the reason why my code works on both HP and IBM machines. It does not mean "_STA == 0x0". >> Our firmware teams seem to think that _STA should give the status >> of the card for hotplug support and general functional state. >> They claim that it doesn't makes much sense to support _STA on >> the slot itself unless you can physically change the slot >> topology on the machine at runtime, which we can't do (although >> maybe you can). >> The section of the spec you quoted is correct as long as we are >> talking ACPI 2.0 or later. My platforms implement ACPI 1.0b for >> legacy reasons. :-/ >> In ACPI 1.0b, _EJx definition says (section 6.3.2): >> For hot removal, the device must be immediately ejected >> when the OS calls the _EJ0 control method. The _EJ0 >> control method does not return until ejection is >> complete. After calling _EJ0, the OS will call _STA to >> determine whether or not the eject succeeded. >> So your firmware implementation does not seem backward compatible >> with the 1.0b spec. The different versions of ACPI is part of the >> reason why my patch is breaking on your machine. > > I think this is the real reason. My platform implements ACPI 2.0 or > later. I didn't notice the chage to_EJx definition. Maybe we need to > check ACPI version in pci_slot driver. I did some experiments on HP low-end ia64 (ACPI 1.0b only) and our mid-range and high-end ia64 platforms (ACPI 2.0c). Checking for _STA before evaluating _SUN leads to the same result for me: we only detect populated slots. I think that the real issue is not 1.0 vs 2.0, but the semantics that our different firmware teams have placed on _STA. Again, - HP firmware thinks _STA should give status of the card - Fujitsu firmware thinks _STA should give status of the slot So we are at an impasse. :( >> But as long as we are quoting the spec... :) >> _SUN evaluates to a DWORD that is the number to be used >> in the user interface. This number is required to be >> unique among
Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Kenji-san, I have been thinking about this problem for quite a bit, and think that there are no good solutions... * Kenji Kaneshige [EMAIL PROTECTED]: On my system, hotplug slots themselves can be added, removed and replaced with the ohter type of I/O box. Are you talking about some sort of I/O cabinet/chassis that you can attach to the actual computer? Can the I/O expander unit be hotplugged? Or do you need to power your machine down to attach it? If you can hotplug it, I'm guessing that is why your firmware presents SxFy objects in the namespace with weird _SUN values, and it's why you have to check _STA to see if the slots are valid or not. That means the value returned by _SUN will change too, right? What will it turn into? Currently, it's not hotpluggable (will be hotpluggable in the future). Here is a sample AML code to explain what my firmware is doing. Device (PCI0) { Device (P2PA) { Device (P2PB) { // for I/O unit (A) Name (_ADR, ...) Method (_STA) { ... } } Device (S0F0) { // for I/O unit (B) Name (_ADR, ...) Method (_STA) { ... } Method (_EJx) { ... } Method (_SUN) { ... } } ... } ... } If the I/O unit (A) is connected, _STA of P2PB returns as present and _STA of S0F0 returns as not present. If the I/O unit (B) is connected, _STA of P2PB returns as not present and _STA of S0F0 returns as present. If I/O unit A or B can never appear while the system is turned on (aka not hotpluggable), then it is incorrect to present them in the current namespace. In addtion, I think we should not trust the _SUN value of non-existing device because the ACPI spec says in 6.5.1 _INI (Init) that _INI method is run before _ADR, _CID, _HID, _SUN, and _UID are run. It means _SUN could be initialized in _INI method implecitely. And it also says that If the _STA method indicates that the device is not present, OSPM will not run the _INI and will not examine the children of the device for _INI methods.. After all, _SUN for non-existing device is not reliable because it might not initialized by _INI method. This is true, but HP platforms provide _INI at the root device/host bridge level, not on SxFy objects, so it doesn't seem that we would need to call _STA before calling _SUN for SxFy. Does your firmware provide _INI on SxFy objects? No, it doesn't. But what I wanted to say was we should not use _SUN value of non-existing device object. There is nothing illegal about evaluating _SUN for an object that returns 0x0 for _STA. Also, when you say non-existing, I think of the ACPI CA exception code AE_NOT_EXIST which means absent from the namespace, and is the reason why my code works on both HP and IBM machines. It does not mean _STA == 0x0. Our firmware teams seem to think that _STA should give the status of the card for hotplug support and general functional state. They claim that it doesn't makes much sense to support _STA on the slot itself unless you can physically change the slot topology on the machine at runtime, which we can't do (although maybe you can). The section of the spec you quoted is correct as long as we are talking ACPI 2.0 or later. My platforms implement ACPI 1.0b for legacy reasons. :-/ In ACPI 1.0b, _EJx definition says (section 6.3.2): For hot removal, the device must be immediately ejected when the OS calls the _EJ0 control method. The _EJ0 control method does not return until ejection is complete. After calling _EJ0, the OS will call _STA to determine whether or not the eject succeeded. So your firmware implementation does not seem backward compatible with the 1.0b spec. The different versions of ACPI is part of the reason why my patch is breaking on your machine. I think this is the real reason. My platform implements ACPI 2.0 or later. I didn't notice the chage to_EJx definition. Maybe we need to check ACPI version in pci_slot driver. I did some experiments on HP low-end ia64 (ACPI 1.0b only) and our mid-range and high-end ia64 platforms (ACPI 2.0c). Checking for _STA before evaluating _SUN leads to the same result for me: we only detect populated slots. I think that the real issue is not 1.0 vs 2.0, but the semantics that our different firmware teams have placed on _STA. Again, - HP firmware thinks _STA should give status of the card - Fujitsu firmware thinks _STA should give status of the slot So we are at an impasse. :( But as long as we are quoting the spec... :) _SUN evaluates to a DWORD that is the number to be used in the user interface. This number is required to be unique among the slots of the same type. It is also recommended that this number match the slot number printed on the physical slot whenever possible.
Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Alex-san, Hi Kenji-san, * Kenji Kaneshige <[EMAIL PROTECTED]>: Hi Alex-san, On my system, hotplug slots themselves can be added, removed and replaced with the ohter type of I/O box. Are you talking about some sort of I/O cabinet/chassis that you can attach to the actual computer? Can the I/O expander unit be hotplugged? Or do you need to power your machine down to attach it? If you can hotplug it, I'm guessing that is why your firmware presents SxFy objects in the namespace with "weird" _SUN values, and it's why you have to check _STA to see if the slots are valid or not. That means the value returned by _SUN will change too, right? What will it turn into? Currently, it's not hotpluggable (will be hotpluggable in the future). Here is a sample AML code to explain what my firmware is doing. Device (PCI0) { Device (P2PA) { Device (P2PB) { // for I/O unit (A) Name (_ADR, ...) Method (_STA) { ... } } Device (S0F0) { // for I/O unit (B) Name (_ADR, ...) Method (_STA) { ... } Method (_EJx) { ... } Method (_SUN) { ... } } ... } ... } If the I/O unit (A) is connected, _STA of P2PB returns as present and _STA of S0F0 returns as not present. If the I/O unit (B) is connected, _STA of P2PB returns as not present and _STA of S0F0 returns as present. In addtion, I think we should not trust the _SUN value of non-existing device because the ACPI spec says in "6.5.1 _INI (Init)" that _INI method is run before _ADR, _CID, _HID, _SUN, and _UID are run. It means _SUN could be initialized in _INI method implecitely. And it also says that "If the _STA method indicates that the device is not present, OSPM will not run the _INI and will not examine the children of the device for _INI methods.". After all, _SUN for non-existing device is not reliable because it might not initialized by _INI method. This is true, but HP platforms provide _INI at the root device/host bridge level, not on SxFy objects, so it doesn't seem that we would need to call _STA before calling _SUN for SxFy. Does your firmware provide _INI on SxFy objects? No, it doesn't. But what I wanted to say was we should not use _SUN value of non-existing device object. Our firmware teams seem to think that _STA should give the status of the card for hotplug support and general functional state. They claim that it doesn't makes much sense to support _STA on the slot itself unless you can physically change the slot topology on the machine at runtime, which we can't do (although maybe you can). The section of the spec you quoted is correct as long as we are talking ACPI 2.0 or later. My platforms implement ACPI 1.0b for legacy reasons. :-/ In ACPI 1.0b, _EJx definition says (section 6.3.2): For hot removal, the device must be immediately ejected when the OS calls the _EJ0 control method. The _EJ0 control method does not return until ejection is complete. After calling _EJ0, the OS will call _STA to determine whether or not the eject succeeded. So your firmware implementation does not seem backward compatible with the 1.0b spec. The different versions of ACPI is part of the reason why my patch is breaking on your machine. I think this is the real reason. My platform implements ACPI 2.0 or later. I didn't notice the chage to_EJx definition. Maybe we need to check ACPI version in pci_slot driver. But as long as we are quoting the spec... :) _SUN evaluates to a DWORD that is the number to be used in the user interface. This number is required to be unique among the slots of the same type. It is also recommended that this number match the slot number printed on the physical slot whenever possible. section 6.1.6 of ACPI 2.0c My question is, why is your firmware returning multiple values of 1023 then? This seems to be the real reason why my patch is breaking on your machine. While depending on ACPI 1.0b behavior might be somewhat risky, returning the same value for _SUN multiple times, for slots of the same type, definitely seems non-compliant. The reason is very simple. The reason is your patch is evaluating invalid _SUN method. We must skip non-existing device object. This is what your patch is already doing for pci root bridges. In addition, even if those _SUN method were changed to return unique number, none of the problems would be solved. Maybe pci_slot driver would detect many unknown slots. Thanks, Kenji Kaneshige -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Alex-san, Hi Kenji-san, * Kenji Kaneshige [EMAIL PROTECTED]: Hi Alex-san, On my system, hotplug slots themselves can be added, removed and replaced with the ohter type of I/O box. Are you talking about some sort of I/O cabinet/chassis that you can attach to the actual computer? Can the I/O expander unit be hotplugged? Or do you need to power your machine down to attach it? If you can hotplug it, I'm guessing that is why your firmware presents SxFy objects in the namespace with weird _SUN values, and it's why you have to check _STA to see if the slots are valid or not. That means the value returned by _SUN will change too, right? What will it turn into? Currently, it's not hotpluggable (will be hotpluggable in the future). Here is a sample AML code to explain what my firmware is doing. Device (PCI0) { Device (P2PA) { Device (P2PB) { // for I/O unit (A) Name (_ADR, ...) Method (_STA) { ... } } Device (S0F0) { // for I/O unit (B) Name (_ADR, ...) Method (_STA) { ... } Method (_EJx) { ... } Method (_SUN) { ... } } ... } ... } If the I/O unit (A) is connected, _STA of P2PB returns as present and _STA of S0F0 returns as not present. If the I/O unit (B) is connected, _STA of P2PB returns as not present and _STA of S0F0 returns as present. In addtion, I think we should not trust the _SUN value of non-existing device because the ACPI spec says in 6.5.1 _INI (Init) that _INI method is run before _ADR, _CID, _HID, _SUN, and _UID are run. It means _SUN could be initialized in _INI method implecitely. And it also says that If the _STA method indicates that the device is not present, OSPM will not run the _INI and will not examine the children of the device for _INI methods.. After all, _SUN for non-existing device is not reliable because it might not initialized by _INI method. This is true, but HP platforms provide _INI at the root device/host bridge level, not on SxFy objects, so it doesn't seem that we would need to call _STA before calling _SUN for SxFy. Does your firmware provide _INI on SxFy objects? No, it doesn't. But what I wanted to say was we should not use _SUN value of non-existing device object. Our firmware teams seem to think that _STA should give the status of the card for hotplug support and general functional state. They claim that it doesn't makes much sense to support _STA on the slot itself unless you can physically change the slot topology on the machine at runtime, which we can't do (although maybe you can). The section of the spec you quoted is correct as long as we are talking ACPI 2.0 or later. My platforms implement ACPI 1.0b for legacy reasons. :-/ In ACPI 1.0b, _EJx definition says (section 6.3.2): For hot removal, the device must be immediately ejected when the OS calls the _EJ0 control method. The _EJ0 control method does not return until ejection is complete. After calling _EJ0, the OS will call _STA to determine whether or not the eject succeeded. So your firmware implementation does not seem backward compatible with the 1.0b spec. The different versions of ACPI is part of the reason why my patch is breaking on your machine. I think this is the real reason. My platform implements ACPI 2.0 or later. I didn't notice the chage to_EJx definition. Maybe we need to check ACPI version in pci_slot driver. But as long as we are quoting the spec... :) _SUN evaluates to a DWORD that is the number to be used in the user interface. This number is required to be unique among the slots of the same type. It is also recommended that this number match the slot number printed on the physical slot whenever possible. section 6.1.6 of ACPI 2.0c My question is, why is your firmware returning multiple values of 1023 then? This seems to be the real reason why my patch is breaking on your machine. While depending on ACPI 1.0b behavior might be somewhat risky, returning the same value for _SUN multiple times, for slots of the same type, definitely seems non-compliant. The reason is very simple. The reason is your patch is evaluating invalid _SUN method. We must skip non-existing device object. This is what your patch is already doing for pci root bridges. In addition, even if those _SUN method were changed to return unique number, none of the problems would be solved. Maybe pci_slot driver would detect many unknown slots. Thanks, Kenji Kaneshige -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Kenji-san, * Kenji Kaneshige <[EMAIL PROTECTED]>: > Hi Alex-san, > >>> On my system, hotplug slots themselves can be added, removed >>> and replaced with the ohter type of I/O box. Are you talking about some sort of I/O cabinet/chassis that you can attach to the actual computer? Can the I/O expander unit be hotplugged? Or do you need to power your machine down to attach it? If you can hotplug it, I'm guessing that is why your firmware presents SxFy objects in the namespace with "weird" _SUN values, and it's why you have to check _STA to see if the slots are valid or not. That means the value returned by _SUN will change too, right? What will it turn into? Is that right? Or am I completely wrong? :) >>> The ACPI firmware tells OS the presence of those slots using >>> _STA method (That is, it doesn't use 'LoadTable()' AML >>> operator). On the other hand, current pci_slot driver doesn't >>> check _STA. As a result, pci_slot driver tryied to register >>> the invalid (non-existing) slots. The ACPI firmware of my >>> system returns '1023' if the invalid slot's _SUN is >>> evaluated. This is the cause of Call Traces mentioned above. >>> To fix this problem, pci_slot driver need to check _STA when >>> scanning ACPI Namespace. >> >> Now this is very curious. The relevant line in pci_slot is: >> check_slot() >> status = acpi_evaluate_integer(handle, "_SUN", NULL, sun); >> if (ACPI_FAILURE(status)) >> return -1; >> Why does your firmware return the error information inside sun, >> instead of returning an error in status? That doesn't seem right >> to me... > > Because ACPI spec doesn't provide any way for firmware (AML) > to return as error. You are right -- I got confused about the interpreter returning AE_NOT_FOUND vs the actual firmware returning an error value. Thank you for this clarification. > In addtion, I think we should not trust the _SUN value of > non-existing device because the ACPI spec says in "6.5.1 _INI > (Init)" that _INI method is run before _ADR, _CID, _HID, _SUN, and > _UID are run. It means _SUN could be initialized in _INI method > implecitely. And it also says that "If the _STA method indicates > that the device is not present, OSPM will not run the _INI and will > not examine the children of the device for _INI methods.". After all, > _SUN for non-existing device is not reliable because it might not > initialized by _INI method. This is true, but HP platforms provide _INI at the root device/host bridge level, not on SxFy objects, so it doesn't seem that we would need to call _STA before calling _SUN for SxFy. Does your firmware provide _INI on SxFy objects? >>> I'm sorry for reporting this so late. I'm attaching the patch >>> to fix the problem. This is against 2.6.24-rc3 with your >>> patches applied. Could you try it? >> Applying this patch causes me to only detect populated slots in >> my system, which isn't what I want -- otherwise, I could have >> just enumerated the PCI bus and found the devices that way. :) >> Maybe on your machine, checking existence of _STA might do the >> right thing, but I don't think we should actually be looking at >> any of the actual bits returned. If we check ACPI_STA_DEVICE_PRESENT, then >> we will not detect >> empty slots on my system. Can you try this patch to see if at >> least the first call to acpi_evaluate_integer helps? If that >> doesn't help, maybe the second block will help you, but it breaks >> my machine... > > Maybe the result is as you guess. > The first block doesn't help me (with the first block, all of the > slot disappeared. Please see the bottom of this mail for details). > The second block helps me. > > There seems a difference of the interpretation about _STA for PCI > hotplug slot between your firmware and my firmware. The difference > is: > > - Your firmware provides the _STA method to represent the presence >of PCI adapter card on the slot. > > - My firmware provides the _STA method to represent the presence >of the slot. Yes, that sounds right... > Providing _STA method to represent the presence of PCI adpater card > on the slot (as your firmware does) doesn't seem right to me because > of the following reasons. > > - ACPI spec says "After calling _EJ0, OSPM verifies the device no >longer exists to determine if the eject succeeded. For _HID devices, >OSPM evaluates the _STA method. For _ADR devices, OSPM checks with >the bus driver for that device." in "6.3.3 _EJx (Eject)". Because >PCI adapter card on the slot is _ADR device, the presence of the >card must be checked with bus driver, not _STA. > > - ACPI spec provides the example AML code which uses _STA to >represent Docking Station (See 6.3.2 _EJD (Ejection Dependent >Device)". The usage of this is same as my firmware. > > What do you think about that? Our firmware teams seem to think that _STA should give the status of the card for hotplug support and general functional state. They claim that it
Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Kenji-san, * Kenji Kaneshige [EMAIL PROTECTED]: Hi Alex-san, On my system, hotplug slots themselves can be added, removed and replaced with the ohter type of I/O box. Are you talking about some sort of I/O cabinet/chassis that you can attach to the actual computer? Can the I/O expander unit be hotplugged? Or do you need to power your machine down to attach it? If you can hotplug it, I'm guessing that is why your firmware presents SxFy objects in the namespace with weird _SUN values, and it's why you have to check _STA to see if the slots are valid or not. That means the value returned by _SUN will change too, right? What will it turn into? Is that right? Or am I completely wrong? :) The ACPI firmware tells OS the presence of those slots using _STA method (That is, it doesn't use 'LoadTable()' AML operator). On the other hand, current pci_slot driver doesn't check _STA. As a result, pci_slot driver tryied to register the invalid (non-existing) slots. The ACPI firmware of my system returns '1023' if the invalid slot's _SUN is evaluated. This is the cause of Call Traces mentioned above. To fix this problem, pci_slot driver need to check _STA when scanning ACPI Namespace. Now this is very curious. The relevant line in pci_slot is: check_slot() status = acpi_evaluate_integer(handle, _SUN, NULL, sun); if (ACPI_FAILURE(status)) return -1; Why does your firmware return the error information inside sun, instead of returning an error in status? That doesn't seem right to me... Because ACPI spec doesn't provide any way for firmware (AML) to return as error. You are right -- I got confused about the interpreter returning AE_NOT_FOUND vs the actual firmware returning an error value. Thank you for this clarification. In addtion, I think we should not trust the _SUN value of non-existing device because the ACPI spec says in 6.5.1 _INI (Init) that _INI method is run before _ADR, _CID, _HID, _SUN, and _UID are run. It means _SUN could be initialized in _INI method implecitely. And it also says that If the _STA method indicates that the device is not present, OSPM will not run the _INI and will not examine the children of the device for _INI methods.. After all, _SUN for non-existing device is not reliable because it might not initialized by _INI method. This is true, but HP platforms provide _INI at the root device/host bridge level, not on SxFy objects, so it doesn't seem that we would need to call _STA before calling _SUN for SxFy. Does your firmware provide _INI on SxFy objects? I'm sorry for reporting this so late. I'm attaching the patch to fix the problem. This is against 2.6.24-rc3 with your patches applied. Could you try it? Applying this patch causes me to only detect populated slots in my system, which isn't what I want -- otherwise, I could have just enumerated the PCI bus and found the devices that way. :) Maybe on your machine, checking existence of _STA might do the right thing, but I don't think we should actually be looking at any of the actual bits returned. If we check ACPI_STA_DEVICE_PRESENT, then we will not detect empty slots on my system. Can you try this patch to see if at least the first call to acpi_evaluate_integer helps? If that doesn't help, maybe the second block will help you, but it breaks my machine... Maybe the result is as you guess. The first block doesn't help me (with the first block, all of the slot disappeared. Please see the bottom of this mail for details). The second block helps me. There seems a difference of the interpretation about _STA for PCI hotplug slot between your firmware and my firmware. The difference is: - Your firmware provides the _STA method to represent the presence of PCI adapter card on the slot. - My firmware provides the _STA method to represent the presence of the slot. Yes, that sounds right... Providing _STA method to represent the presence of PCI adpater card on the slot (as your firmware does) doesn't seem right to me because of the following reasons. - ACPI spec says After calling _EJ0, OSPM verifies the device no longer exists to determine if the eject succeeded. For _HID devices, OSPM evaluates the _STA method. For _ADR devices, OSPM checks with the bus driver for that device. in 6.3.3 _EJx (Eject). Because PCI adapter card on the slot is _ADR device, the presence of the card must be checked with bus driver, not _STA. - ACPI spec provides the example AML code which uses _STA to represent Docking Station (See 6.3.2 _EJD (Ejection Dependent Device). The usage of this is same as my firmware. What do you think about that? Our firmware teams seem to think that _STA should give the status of the card for hotplug support and general functional state. They claim that it doesn't makes much sense to support _STA on the slot itself unless you can physically change the slot topology on the machine at runtime, which
Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Alex-san, On my system, hotplug slots themselves can be added, removed and replaced with the ohter type of I/O box. The ACPI firmware tells OS the presence of those slots using _STA method (That is, it doesn't use 'LoadTable()' AML operator). On the other hand, current pci_slot driver doesn't check _STA. As a result, pci_slot driver tryied to register the invalid (non-existing) slots. The ACPI firmware of my system returns '1023' if the invalid slot's _SUN is evaluated. This is the cause of Call Traces mentioned above. To fix this problem, pci_slot driver need to check _STA when scanning ACPI Namespace. Now this is very curious. The relevant line in pci_slot is: check_slot() status = acpi_evaluate_integer(handle, "_SUN", NULL, sun); if (ACPI_FAILURE(status)) return -1; Why does your firmware return the error information inside sun, instead of returning an error in status? That doesn't seem right to me... Because ACPI spec doesn't provide any way for firmware (AML) to return as error. In addtion, I think we should not trust the _SUN value of non-existing device because the ACPI spec says in "6.5.1 _INI (Init)" that _INI method is run before _ADR, _CID, _HID, _SUN, and _UID are run. It means _SUN could be initialized in _INI method implecitely. And it also says that "If the _STA method indicates that the device is not present, OSPM will not run the _INI and will not examine the children of the device for _INI methods.". After all, _SUN for non-existing device is not reliable because it might not initialized by _INI method. I'm sorry for reporting this so late. I'm attaching the patch to fix the problem. This is against 2.6.24-rc3 with your patches applied. Could you try it? Applying this patch causes me to only detect populated slots in my system, which isn't what I want -- otherwise, I could have just enumerated the PCI bus and found the devices that way. :) Maybe on your machine, checking existence of _STA might do the right thing, but I don't think we should actually be looking at any of the actual bits returned. If we check ACPI_STA_DEVICE_PRESENT, then we will not detect empty slots on my system. Can you try this patch to see if at least the first call to acpi_evaluate_integer helps? If that doesn't help, maybe the second block will help you, but it breaks my machine... Maybe the result is as you guess. The first block doesn't help me (with the first block, all of the slot disappeared. Please see the bottom of this mail for details). The second block helps me. There seems a difference of the interpretation about _STA for PCI hotplug slot between your firmware and my firmware. The difference is: - Your firmware provides the _STA method to represent the presence of PCI adapter card on the slot. - My firmware provides the _STA method to represent the presence of the slot. Providing _STA method to represent the presence of PCI adpater card on the slot (as your firmware does) doesn't seem right to me because of the following reasons. - ACPI spec says "After calling _EJ0, OSPM verifies the device no longer exists to determine if the eject succeeded. For _HID devices, OSPM evaluates the _STA method. For _ADR devices, OSPM checks with the bus driver for that device." in "6.3.3 _EJx (Eject)". Because PCI adapter card on the slot is _ADR device, the presence of the card must be checked with bus driver, not _STA. - ACPI spec provides the example AML code which uses _STA to represent Docking Station (See 6.3.2 _EJD (Ejection Dependent Device)". The usage of this is same as my firmware. What do you think about that? P.S. None of the slots except the strange slot named '1023' were detected with your patch. It would happen on other machines (might including hp machine) too. The reason is _STA evaluation fails on the hotplug slot which doesn't have _STA method. If the device object doesn't have a _STA method, we need to handle it as if it is present. I believe firmware normally doesn't provide _STA method for PCI hotplug slots. Thanks, Kenji Kaneshige -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Alex-san, On my system, hotplug slots themselves can be added, removed and replaced with the ohter type of I/O box. The ACPI firmware tells OS the presence of those slots using _STA method (That is, it doesn't use 'LoadTable()' AML operator). On the other hand, current pci_slot driver doesn't check _STA. As a result, pci_slot driver tryied to register the invalid (non-existing) slots. The ACPI firmware of my system returns '1023' if the invalid slot's _SUN is evaluated. This is the cause of Call Traces mentioned above. To fix this problem, pci_slot driver need to check _STA when scanning ACPI Namespace. Now this is very curious. The relevant line in pci_slot is: check_slot() status = acpi_evaluate_integer(handle, _SUN, NULL, sun); if (ACPI_FAILURE(status)) return -1; Why does your firmware return the error information inside sun, instead of returning an error in status? That doesn't seem right to me... Because ACPI spec doesn't provide any way for firmware (AML) to return as error. In addtion, I think we should not trust the _SUN value of non-existing device because the ACPI spec says in 6.5.1 _INI (Init) that _INI method is run before _ADR, _CID, _HID, _SUN, and _UID are run. It means _SUN could be initialized in _INI method implecitely. And it also says that If the _STA method indicates that the device is not present, OSPM will not run the _INI and will not examine the children of the device for _INI methods.. After all, _SUN for non-existing device is not reliable because it might not initialized by _INI method. I'm sorry for reporting this so late. I'm attaching the patch to fix the problem. This is against 2.6.24-rc3 with your patches applied. Could you try it? Applying this patch causes me to only detect populated slots in my system, which isn't what I want -- otherwise, I could have just enumerated the PCI bus and found the devices that way. :) Maybe on your machine, checking existence of _STA might do the right thing, but I don't think we should actually be looking at any of the actual bits returned. If we check ACPI_STA_DEVICE_PRESENT, then we will not detect empty slots on my system. Can you try this patch to see if at least the first call to acpi_evaluate_integer helps? If that doesn't help, maybe the second block will help you, but it breaks my machine... Maybe the result is as you guess. The first block doesn't help me (with the first block, all of the slot disappeared. Please see the bottom of this mail for details). The second block helps me. There seems a difference of the interpretation about _STA for PCI hotplug slot between your firmware and my firmware. The difference is: - Your firmware provides the _STA method to represent the presence of PCI adapter card on the slot. - My firmware provides the _STA method to represent the presence of the slot. Providing _STA method to represent the presence of PCI adpater card on the slot (as your firmware does) doesn't seem right to me because of the following reasons. - ACPI spec says After calling _EJ0, OSPM verifies the device no longer exists to determine if the eject succeeded. For _HID devices, OSPM evaluates the _STA method. For _ADR devices, OSPM checks with the bus driver for that device. in 6.3.3 _EJx (Eject). Because PCI adapter card on the slot is _ADR device, the presence of the card must be checked with bus driver, not _STA. - ACPI spec provides the example AML code which uses _STA to represent Docking Station (See 6.3.2 _EJD (Ejection Dependent Device). The usage of this is same as my firmware. What do you think about that? P.S. None of the slots except the strange slot named '1023' were detected with your patch. It would happen on other machines (might including hp machine) too. The reason is _STA evaluation fails on the hotplug slot which doesn't have _STA method. If the device object doesn't have a _STA method, we need to handle it as if it is present. I believe firmware normally doesn't provide _STA method for PCI hotplug slots. Thanks, Kenji Kaneshige -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Thu, Nov 29, 2007 at 06:19:41PM -0700, Alex Chiang wrote: < snip > > > [] kthread+0x47/0x73 > > [] child_rip+0xa/0x12 > > [] kthread+0x0/0x73 > > [] child_rip+0x0/0x12 > > Maybe we're trying to kick off a hotplug event on the wrong slot? > I really have no idea... One hotplug event related difference that I believe I noticed between the x3850 were I did not see the problem and the x3950 is the hotplug event arrival location. On the x3850 the handler receives the handle for the transparent p2p bridge above the slot. On the x3950 I believe the handler is receiving the handle for the root bridge (directly above the transparent p2p bridge). acpiphp installs a handler for each of these possible arrival locations. pci_slot installs a handler for neither location which seems like it should be okay. I notice that acpi_pci_register_driver() was only being used by acpiphp before pci_slot. Perhaps your changes are flushing out some sort of ACPI core coexistence issue not seen previously because there was only a single user? My silly idea is probably no better than your no idea. :) > > > Code: ff ff ff ff 40 23 2c 88 ff ff ff ff 00 c8 c6 3b 10 81 ff ff > > RIP [] :pci_slot:__this_module+0x21c4/0xf204 > > RSP > > Can you apply this debug patch on top of your tree, and send me > the output? > > I'd be curious to see the output for your failure case: > > # modprobe pci_slot debug=1 > # modprobe acpiphp debug=1 Someone else is using the system right now so I may not be able to do this until next week. Gary -- Gary Hade System x Enablement IBM Linux Technology Center 503-578-4503 IBM T/L: 775-4503 [EMAIL PROTECTED] http://www.ibm.com/linux/ltc - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Thu, Nov 29, 2007 at 06:19:41PM -0700, Alex Chiang wrote: snip [802499bc] kthread+0x47/0x73 [8020cc98] child_rip+0xa/0x12 [80249975] kthread+0x0/0x73 [8020cc8e] child_rip+0x0/0x12 Maybe we're trying to kick off a hotplug event on the wrong slot? I really have no idea... One hotplug event related difference that I believe I noticed between the x3850 were I did not see the problem and the x3950 is the hotplug event arrival location. On the x3850 the handler receives the handle for the transparent p2p bridge above the slot. On the x3950 I believe the handler is receiving the handle for the root bridge (directly above the transparent p2p bridge). acpiphp installs a handler for each of these possible arrival locations. pci_slot installs a handler for neither location which seems like it should be okay. I notice that acpi_pci_register_driver() was only being used by acpiphp before pci_slot. Perhaps your changes are flushing out some sort of ACPI core coexistence issue not seen previously because there was only a single user? My silly idea is probably no better than your no idea. :) Code: ff ff ff ff 40 23 2c 88 ff ff ff ff 00 c8 c6 3b 10 81 ff ff RIP [882c2344] :pci_slot:__this_module+0x21c4/0xf204 RSP 81103fa43ea8 Can you apply this debug patch on top of your tree, and send me the output? I'd be curious to see the output for your failure case: # modprobe pci_slot debug=1 # modprobe acpiphp debug=1 Someone else is using the system right now so I may not be able to do this until next week. Gary -- Gary Hade System x Enablement IBM Linux Technology Center 503-578-4503 IBM T/L: 775-4503 [EMAIL PROTECTED] http://www.ibm.com/linux/ltc - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Kenji-san, * Kenji Kaneshige <[EMAIL PROTECTED]>: > > Hi Gary, Kenji-san, et. al, > > > > * Gary Hade <[EMAIL PROTECTED]>: > >> Alex, What I was trying to suggest is a boot-time kernel > >> option, not a kernel configuration option. The basic idea is > >> to give the user (with a single binary kernel) the ability to > >> include your ACPI-PCI slot driver feature changes only when > >> they are really needed. In addition to reducing the number of > >> system/PCI hotplug driver combinations where your changes would > >> need to be validated, I believe would also help alleviate other > >> worries (e.g. Andi Kleen's memory consumption concern). I > >> believe this goal could also be achieved with the kernel config > >> option by making the pci_slot module runtime loadable with the > >> PCI hotplug drivers only visiting your new code when the > >> pci_slot driver is loaded, although I think this would be more > >> difficult to implement. > > > > I have modified my patch series so that the final patch that > > introduces my ACPI-PCI slot driver is a full-fledged module, that > > has a tristate Kconfig option. > > > > Thank you for your good job. Thanks for testing. :) > I tested shpchp and pciehp both with and without pci_slot > module. There seems no regression from shpchp and pciehp's > point of view. (I had a little concern about the hotplug > slots' name that vary depending on whether pci_slot > functionality is enabled or disabled. But, now that we can > build pci_slot driver as a kernel module, I don't think it is a > big problem). Hm, you are right. On my machine, if I load pciehp first and acpiphp second (even without loading pci_slot), I will see the following: [EMAIL PROTECTED] slots]# ls 0016_0006 0197_0005 10 3 4 7 8 9 [EMAIL PROTECTED] slots]# lsmod | grep pci_slot [EMAIL PROTECTED] slots]# lsmod | grep hp acpiphp 115984 0 pciehp140616 0 pci_hotplug 123972 2 acpiphp,pciehp On the other hand, if I do load pci_slot first, and then pciehp, you are right, I will see something like this: [EMAIL PROTECTED] slots]# ls 1 10 2 3 4 5 6 7 8 9 [EMAIL PROTECTED] slots]# lsmod | grep pci_slot pci_slot 74436 0 [EMAIL PROTECTED] slots]# lsmod | grep hp pciehp140616 0 pci_hotplug 123972 1 pciehp But I do agree, people don't need to load pci_slot at all if they don't want it, and they won't be bothered. > Only the problems is that I got Call Traces with the following > error messages when pci_slot driver was loaded, and one strange > slot named '1023' was registered (other slots are fine). This > is the same problem I reported before. > > sysfs: duplicate filename '1023' can not be created > WARNING: at fs/sysfs/dir.c:424 sysfs_add_one() > > kobject_add failed for 1023 with -EEXIST, don't try to > register things with the same name in the same directory. > > On my system, hotplug slots themselves can be added, removed > and replaced with the ohter type of I/O box. The ACPI firmware > tells OS the presence of those slots using _STA method (That > is, it doesn't use 'LoadTable()' AML operator). On the other > hand, current pci_slot driver doesn't check _STA. As a result, > pci_slot driver tryied to register the invalid (non-existing) > slots. The ACPI firmware of my system returns '1023' if the > invalid slot's _SUN is evaluated. This is the cause of Call > Traces mentioned above. To fix this problem, pci_slot driver > need to check _STA when scanning ACPI Namespace. Now this is very curious. The relevant line in pci_slot is: check_slot() status = acpi_evaluate_integer(handle, "_SUN", NULL, sun); if (ACPI_FAILURE(status)) return -1; Why does your firmware return the error information inside sun, instead of returning an error in status? That doesn't seem right to me... > I'm sorry for reporting this so late. I'm attaching the patch > to fix the problem. This is against 2.6.24-rc3 with your > patches applied. Could you try it? Applying this patch causes me to only detect populated slots in my system, which isn't what I want -- otherwise, I could have just enumerated the PCI bus and found the devices that way. :) Maybe on your machine, checking existence of _STA might do the right thing, but I don't think we should actually be looking at any of the actual bits returned. If we check ACPI_STA_DEVICE_PRESENT, then we will not detect empty slots on my system. Can you try this patch to see if at least the first call to acpi_evaluate_integer helps? If that doesn't help, maybe the second block will help you, but it breaks my machine... Thanks. /ac diff --git a/drivers/acpi/pci_slot.c b/drivers/acpi/pci_slot.c index 724f4f0..63a4dc8 100644 --- a/drivers/acpi/pci_slot.c +++ b/drivers/acpi/pci_slot.c @@ -55,9 +65,21 @@ static struct acpi_pci_driver acpi_pci_slot_driver = { static int check_slot(acpi_handle handle, int *device, unsigned long
Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Gary, First, thanks for all the help and testing -- I really appreciate it. * Gary Hade <[EMAIL PROTECTED]>: > > I'm getting back to you but unfortunately with not so good > news. Sorry Alex. :-/ > On the x3950 (configured single node) I encountered the below > problem when attempting to hotplug a PCIe adapter when 'pci_slot' > was loaded prior to 'acpiphp'. I did not see the problem when > the drivers were loaded in the opposite order. Very bizarre, especially given the stack trace below, which doesn't really make any sense to me at all. > FYI, the node contains 2 hotpluggable PCIe slots and 5 > non-hotpluggable PCIe slots but 'pci_slot' only exposed > the 2 hotpluggable slots. This does not appear to be due > to a 'pci_slot' driver problem since I looked at the DSDT > and SSDT and found that there are currently no _SUN methods > for the non-hotpluggable slots. Ok, this is not too surprising, but it's a different can o' worms. ;) Let's save this for another day... > invalid opcode: [1] SMP > CPU 1 > Modules linked in: acpiphp pci_slot e1000 aic79xx scsi_transport_spi shpchp > dock pci_hotplug ipt_LOG xt_limit xt_pkttype button battery ac power_supply > ip6t_REJECT xt_tcpudp ipt_REJECT iptable_mangle iptable_filter > ip6table_mangle ip_tables ip6table_filter ip6_tables x_tables ipv6 usbhid > ff_memless ext3 jbd loop dm_mod ehci_hcd uhci_hcd usbcore ide_cd bnx2 cdrom > rng_core reiserfs ata_piix ahci libata thermal processor piix sg megaraid_sas > fan edd sd_mod scsi_mod ide_disk ide_core > Pid: 121, comm: kacpi_notify Not tainted 2.6.24-rc3-gh-smp #1 > RIP: 0010:[] [] > :pci_slot:__this_module+0x21c4/0xf204 > RSP: 0018:81103fa43ea8 EFLAGS: 00010216 > RAX: 81103f944a18 RBX: 81103d4fe910 RCX: 000f > RDX: RSI: RDI: 8110400d13d0 > RBP: 8032d97b R08: 8110400fc7e0 R09: 0002 > R10: R11: 8021d193 R12: 811040105cf0 > R13: R14: 80635820 R15: > FS: () GS:8110400ed8c0() knlGS: > CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b > CR2: 2b266d876471 CR3: 00103c825000 CR4: 06e0 > DR0: DR1: DR2: > DR3: DR6: 0ff0 DR7: 0400 > Process kacpi_notify (pid: 121, threadinfo 81103fa42000, task > 81103f9f8040) > Stack: 809c 81103d119a00 8032d99e 81103f9fc540 > 8024618d 81103f9fc540 81103f9fc540 8024696c > 80246a46 81103f9f8040 80249ada > Call Trace: > [] acpi_ev_notify_dispatch+0x57/0x60 > [] acpi_os_execute_notify+0x23/0x2c > [] run_workqueue+0x7f/0x10b > [] worker_thread+0x0/0xe4 > [] worker_thread+0xda/0xe4 > [] autoremove_wake_function+0x0/0x2e > [] kthread+0x47/0x73 > [] child_rip+0xa/0x12 > [] kthread+0x0/0x73 > [] child_rip+0x0/0x12 Maybe we're trying to kick off a hotplug event on the wrong slot? I really have no idea... > Code: ff ff ff ff 40 23 2c 88 ff ff ff ff 00 c8 c6 3b 10 81 ff ff > RIP [] :pci_slot:__this_module+0x21c4/0xf204 > RSP Can you apply this debug patch on top of your tree, and send me the output? I'd be curious to see the output for your failure case: # modprobe pci_slot debug=1 # modprobe acpiphp debug=1 Thanks. /ac diff --git a/drivers/acpi/pci_slot.c b/drivers/acpi/pci_slot.c index 724f4f0..5a62def 100644 --- a/drivers/acpi/pci_slot.c +++ b/drivers/acpi/pci_slot.c @@ -30,12 +30,16 @@ #include #include +static int debug; + #define DRIVER_VERSION "0.1" #define DRIVER_AUTHOR "Alex Chiang <[EMAIL PROTECTED]>" #define DRIVER_DESC"ACPI PCI Slot Detection Driver" MODULE_AUTHOR(DRIVER_AUTHOR); MODULE_DESCRIPTION(DRIVER_DESC); MODULE_LICENSE("GPL"); +MODULE_PARM_DESC(debug, "Debugging mode enabled or not"); +module_param(debug, bool, 0644); #define _COMPONENT ACPI_PCI_COMPONENT ACPI_MODULE_NAME("pci_slot"); @@ -43,6 +47,12 @@ ACPI_MODULE_NAME("pci_slot"); #define MY_NAME "pci_slot" #define err(format, arg...) printk(KERN_ERR "%s: " format , MY_NAME , ## arg) #define info(format, arg...) printk(KERN_INFO "%s: " format , MY_NAME , ## arg) +#define dbg(format, arg...)\ + do {\ + if (debug) \ + printk(KERN_DEBUG "%s: " format,\ + MY_NAME , ## arg); \ + } while (0) static int acpi_pci_slot_add(acpi_handle handle); static void acpi_pci_slot_remove(acpi_handle handle); @@ -125,6 +135,9 @@ register_slot(acpi_handle handle, u32 lvl, void *context, void **rv) if (IS_ERR(pci_slot)) err("pci_create_slot returned %ld\n", PTR_ERR(pci_slot)); +
Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Gary, First, thanks for all the help and testing -- I really appreciate it. * Gary Hade [EMAIL PROTECTED]: I'm getting back to you but unfortunately with not so good news. Sorry Alex. :-/ On the x3950 (configured single node) I encountered the below problem when attempting to hotplug a PCIe adapter when 'pci_slot' was loaded prior to 'acpiphp'. I did not see the problem when the drivers were loaded in the opposite order. Very bizarre, especially given the stack trace below, which doesn't really make any sense to me at all. FYI, the node contains 2 hotpluggable PCIe slots and 5 non-hotpluggable PCIe slots but 'pci_slot' only exposed the 2 hotpluggable slots. This does not appear to be due to a 'pci_slot' driver problem since I looked at the DSDT and SSDT and found that there are currently no _SUN methods for the non-hotpluggable slots. Ok, this is not too surprising, but it's a different can o' worms. ;) Let's save this for another day... invalid opcode: [1] SMP CPU 1 Modules linked in: acpiphp pci_slot e1000 aic79xx scsi_transport_spi shpchp dock pci_hotplug ipt_LOG xt_limit xt_pkttype button battery ac power_supply ip6t_REJECT xt_tcpudp ipt_REJECT iptable_mangle iptable_filter ip6table_mangle ip_tables ip6table_filter ip6_tables x_tables ipv6 usbhid ff_memless ext3 jbd loop dm_mod ehci_hcd uhci_hcd usbcore ide_cd bnx2 cdrom rng_core reiserfs ata_piix ahci libata thermal processor piix sg megaraid_sas fan edd sd_mod scsi_mod ide_disk ide_core Pid: 121, comm: kacpi_notify Not tainted 2.6.24-rc3-gh-smp #1 RIP: 0010:[882c2344] [882c2344] :pci_slot:__this_module+0x21c4/0xf204 RSP: 0018:81103fa43ea8 EFLAGS: 00010216 RAX: 81103f944a18 RBX: 81103d4fe910 RCX: 000f RDX: RSI: RDI: 8110400d13d0 RBP: 8032d97b R08: 8110400fc7e0 R09: 0002 R10: R11: 8021d193 R12: 811040105cf0 R13: R14: 80635820 R15: FS: () GS:8110400ed8c0() knlGS: CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b CR2: 2b266d876471 CR3: 00103c825000 CR4: 06e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process kacpi_notify (pid: 121, threadinfo 81103fa42000, task 81103f9f8040) Stack: 809c 81103d119a00 8032d99e 81103f9fc540 8024618d 81103f9fc540 81103f9fc540 8024696c 80246a46 81103f9f8040 80249ada Call Trace: [809c] acpi_ev_notify_dispatch+0x57/0x60 [8032d99e] acpi_os_execute_notify+0x23/0x2c [8024618d] run_workqueue+0x7f/0x10b [8024696c] worker_thread+0x0/0xe4 [80246a46] worker_thread+0xda/0xe4 [80249ada] autoremove_wake_function+0x0/0x2e [802499bc] kthread+0x47/0x73 [8020cc98] child_rip+0xa/0x12 [80249975] kthread+0x0/0x73 [8020cc8e] child_rip+0x0/0x12 Maybe we're trying to kick off a hotplug event on the wrong slot? I really have no idea... Code: ff ff ff ff 40 23 2c 88 ff ff ff ff 00 c8 c6 3b 10 81 ff ff RIP [882c2344] :pci_slot:__this_module+0x21c4/0xf204 RSP 81103fa43ea8 Can you apply this debug patch on top of your tree, and send me the output? I'd be curious to see the output for your failure case: # modprobe pci_slot debug=1 # modprobe acpiphp debug=1 Thanks. /ac diff --git a/drivers/acpi/pci_slot.c b/drivers/acpi/pci_slot.c index 724f4f0..5a62def 100644 --- a/drivers/acpi/pci_slot.c +++ b/drivers/acpi/pci_slot.c @@ -30,12 +30,16 @@ #include acpi/acpi_bus.h #include acpi/acpi_drivers.h +static int debug; + #define DRIVER_VERSION 0.1 #define DRIVER_AUTHOR Alex Chiang [EMAIL PROTECTED] #define DRIVER_DESCACPI PCI Slot Detection Driver MODULE_AUTHOR(DRIVER_AUTHOR); MODULE_DESCRIPTION(DRIVER_DESC); MODULE_LICENSE(GPL); +MODULE_PARM_DESC(debug, Debugging mode enabled or not); +module_param(debug, bool, 0644); #define _COMPONENT ACPI_PCI_COMPONENT ACPI_MODULE_NAME(pci_slot); @@ -43,6 +47,12 @@ ACPI_MODULE_NAME(pci_slot); #define MY_NAME pci_slot #define err(format, arg...) printk(KERN_ERR %s: format , MY_NAME , ## arg) #define info(format, arg...) printk(KERN_INFO %s: format , MY_NAME , ## arg) +#define dbg(format, arg...)\ + do {\ + if (debug) \ + printk(KERN_DEBUG %s: format,\ + MY_NAME , ## arg); \ + } while (0) static int acpi_pci_slot_add(acpi_handle handle); static void acpi_pci_slot_remove(acpi_handle handle); @@ -125,6 +135,9 @@
Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Kenji-san, * Kenji Kaneshige [EMAIL PROTECTED]: Hi Gary, Kenji-san, et. al, * Gary Hade [EMAIL PROTECTED]: Alex, What I was trying to suggest is a boot-time kernel option, not a kernel configuration option. The basic idea is to give the user (with a single binary kernel) the ability to include your ACPI-PCI slot driver feature changes only when they are really needed. In addition to reducing the number of system/PCI hotplug driver combinations where your changes would need to be validated, I believe would also help alleviate other worries (e.g. Andi Kleen's memory consumption concern). I believe this goal could also be achieved with the kernel config option by making the pci_slot module runtime loadable with the PCI hotplug drivers only visiting your new code when the pci_slot driver is loaded, although I think this would be more difficult to implement. I have modified my patch series so that the final patch that introduces my ACPI-PCI slot driver is a full-fledged module, that has a tristate Kconfig option. Thank you for your good job. Thanks for testing. :) I tested shpchp and pciehp both with and without pci_slot module. There seems no regression from shpchp and pciehp's point of view. (I had a little concern about the hotplug slots' name that vary depending on whether pci_slot functionality is enabled or disabled. But, now that we can build pci_slot driver as a kernel module, I don't think it is a big problem). Hm, you are right. On my machine, if I load pciehp first and acpiphp second (even without loading pci_slot), I will see the following: [EMAIL PROTECTED] slots]# ls 0016_0006 0197_0005 10 3 4 7 8 9 [EMAIL PROTECTED] slots]# lsmod | grep pci_slot [EMAIL PROTECTED] slots]# lsmod | grep hp acpiphp 115984 0 pciehp140616 0 pci_hotplug 123972 2 acpiphp,pciehp On the other hand, if I do load pci_slot first, and then pciehp, you are right, I will see something like this: [EMAIL PROTECTED] slots]# ls 1 10 2 3 4 5 6 7 8 9 [EMAIL PROTECTED] slots]# lsmod | grep pci_slot pci_slot 74436 0 [EMAIL PROTECTED] slots]# lsmod | grep hp pciehp140616 0 pci_hotplug 123972 1 pciehp But I do agree, people don't need to load pci_slot at all if they don't want it, and they won't be bothered. Only the problems is that I got Call Traces with the following error messages when pci_slot driver was loaded, and one strange slot named '1023' was registered (other slots are fine). This is the same problem I reported before. sysfs: duplicate filename '1023' can not be created WARNING: at fs/sysfs/dir.c:424 sysfs_add_one() kobject_add failed for 1023 with -EEXIST, don't try to register things with the same name in the same directory. On my system, hotplug slots themselves can be added, removed and replaced with the ohter type of I/O box. The ACPI firmware tells OS the presence of those slots using _STA method (That is, it doesn't use 'LoadTable()' AML operator). On the other hand, current pci_slot driver doesn't check _STA. As a result, pci_slot driver tryied to register the invalid (non-existing) slots. The ACPI firmware of my system returns '1023' if the invalid slot's _SUN is evaluated. This is the cause of Call Traces mentioned above. To fix this problem, pci_slot driver need to check _STA when scanning ACPI Namespace. Now this is very curious. The relevant line in pci_slot is: check_slot() status = acpi_evaluate_integer(handle, _SUN, NULL, sun); if (ACPI_FAILURE(status)) return -1; Why does your firmware return the error information inside sun, instead of returning an error in status? That doesn't seem right to me... I'm sorry for reporting this so late. I'm attaching the patch to fix the problem. This is against 2.6.24-rc3 with your patches applied. Could you try it? Applying this patch causes me to only detect populated slots in my system, which isn't what I want -- otherwise, I could have just enumerated the PCI bus and found the devices that way. :) Maybe on your machine, checking existence of _STA might do the right thing, but I don't think we should actually be looking at any of the actual bits returned. If we check ACPI_STA_DEVICE_PRESENT, then we will not detect empty slots on my system. Can you try this patch to see if at least the first call to acpi_evaluate_integer helps? If that doesn't help, maybe the second block will help you, but it breaks my machine... Thanks. /ac diff --git a/drivers/acpi/pci_slot.c b/drivers/acpi/pci_slot.c index 724f4f0..63a4dc8 100644 --- a/drivers/acpi/pci_slot.c +++ b/drivers/acpi/pci_slot.c @@ -55,9 +65,21 @@ static struct acpi_pci_driver acpi_pci_slot_driver = { static int check_slot(acpi_handle handle, int *device, unsigned long *sun) { - unsigned long adr; + unsigned long adr, sta; acpi_status
Re: [PATCH 0/4, v3] Physical PCI slot objects
> Hi Gary, Kenji-san, et. al, > > * Gary Hade <[EMAIL PROTECTED]>: >> Alex, What I was trying to suggest is a boot-time kernel >> option, not a kernel configuration option. The basic idea is >> to give the user (with a single binary kernel) the ability to >> include your ACPI-PCI slot driver feature changes only when >> they are really needed. In addition to reducing the number of >> system/PCI hotplug driver combinations where your changes would >> need to be validated, I believe would also help alleviate other >> worries (e.g. Andi Kleen's memory consumption concern). I >> believe this goal could also be achieved with the kernel config >> option by making the pci_slot module runtime loadable with the >> PCI hotplug drivers only visiting your new code when the >> pci_slot driver is loaded, although I think this would be more >> difficult to implement. > > I have modified my patch series so that the final patch that > introduces my ACPI-PCI slot driver is a full-fledged module, that > has a tristate Kconfig option. > Thank you for your good job. I tested shpchp and pciehp both with and without pci_slot module. There seems no regression from shpchp and pciehp's point of view. (I had a little concern about the hotplug slots' name that vary depending on whether pci_slot functionality is enabled or disabled. But, now that we can build pci_slot driver as a kernel module, I don't think it is a big problem). Only the problems is that I got Call Traces with the following error messages when pci_slot driver was loaded, and one strange slot named '1023' was registered (other slots are fine). This is the same problem I reported before. sysfs: duplicate filename '1023' can not be created WARNING: at fs/sysfs/dir.c:424 sysfs_add_one() kobject_add failed for 1023 with -EEXIST, don't try to register things with the same name in the same directory. On my system, hotplug slots themselves can be added, removed and replaced with the ohter type of I/O box. The ACPI firmware tells OS the presence of those slots using _STA method (That is, it doesn't use 'LoadTable()' AML operator). On the other hand, current pci_slot driver doesn't check _STA. As a result, pci_slot driver tryied to register the invalid (non-existing) slots. The ACPI firmware of my system returns '1023' if the invalid slot's _SUN is evaluated. This is the cause of Call Traces mentioned above. To fix this problem, pci_slot driver need to check _STA when scanning ACPI Namespace. I'm sorry for reporting this so late. I'm attaching the patch to fix the problem. This is against 2.6.24-rc3 with your patches applied. Could you try it? BTW, acpiphp also seems to have the same problem... Thanks, Kenji Kaneshige --- drivers/acpi/pci_slot.c | 13 + 1 file changed, 13 insertions(+) Index: linux-2.6.24-rc3/drivers/acpi/pci_slot.c === --- linux-2.6.24-rc3.orig/drivers/acpi/pci_slot.c +++ linux-2.6.24-rc3/drivers/acpi/pci_slot.c @@ -113,10 +113,17 @@ register_slot(acpi_handle handle, u32 lv int device; unsigned long sun; char name[KOBJ_NAME_LEN]; + acpi_status status; + struct acpi_device *dummy_device; struct pci_slot *pci_slot; struct pci_bus *pci_bus = context; + /* Skip non-existing device object. */ + status = acpi_bus_get_device(handle, _device); + if (ACPI_FAILURE(status)) + return AE_OK; + if (check_slot(handle, , )) return AE_OK; @@ -150,12 +157,18 @@ walk_p2p_bridge(acpi_handle handle, u32 acpi_status status; acpi_handle dummy_handle; acpi_walk_callback user_function; + struct acpi_device *dummy_device; struct pci_dev *dev; struct pci_bus *pci_bus; struct p2p_bridge_context child_context; struct p2p_bridge_context *parent_context = context; + /* Skip non-existing device object. */ + status = acpi_bus_get_device(handle, _device); + if (ACPI_FAILURE(status)) + return AE_OK; + pci_bus = parent_context->pci_bus; user_function = parent_context->user_function; - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Wed, Nov 28, 2007 at 04:02:38PM -0800, Kristen Carlson Accardi wrote: > On Wed, 28 Nov 2007 13:31:47 -0800 > Gary Hade <[EMAIL PROTECTED]> wrote: > > > FYI, the node contains 2 hotpluggable PCIe slots and 5 > > non-hotpluggable PCIe slots but 'pci_slot' only exposed > > the 2 hotpluggable slots. This does not appear to be due > > to a 'pci_slot' driver problem since I looked at the DSDT > > and SSDT and found that there are currently no _SUN methods > > for the non-hotpluggable slots. > > Thanks for testing Gary. I would think this situation would be the > common case, since I doubt most firmware writers would bother to > implement _SUN for non-hotpluggable slots -- at least on other DSDT > I've seen this has been the case as well. Yea, I was also not surprised although features such as Alex working on may provide some motivation to change that. Gary -- Gary Hade System x Enablement IBM Linux Technology Center 503-578-4503 IBM T/L: 775-4503 [EMAIL PROTECTED] http://www.ibm.com/linux/ltc - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Wed, 28 Nov 2007 13:31:47 -0800 Gary Hade <[EMAIL PROTECTED]> wrote: > FYI, the node contains 2 hotpluggable PCIe slots and 5 > non-hotpluggable PCIe slots but 'pci_slot' only exposed > the 2 hotpluggable slots. This does not appear to be due > to a 'pci_slot' driver problem since I looked at the DSDT > and SSDT and found that there are currently no _SUN methods > for the non-hotpluggable slots. Thanks for testing Gary. I would think this situation would be the common case, since I doubt most firmware writers would bother to implement _SUN for non-hotpluggable slots -- at least on other DSDT I've seen this has been the case as well. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Tue, Nov 27, 2007 at 11:11:36AM -0800, Kristen Carlson Accardi wrote: > On Mon, 26 Nov 2007 19:04:54 -0800 > Gary Hade <[EMAIL PROTECTED]> wrote: > > > > Is this patchset appropriate for the -mm tree yet? > > > > I would defer to our illustrious maintainers on this one. :) > > > > > Or do you think it still needs more work? > > > > I am now much more comfortable with your changes with respect > > to acpiphp on the systems I worry about but others may have > > concerns with respect to the other hotplug drivers, or even > > acpiphp, on other systems. > > I'm ok with adding it to my quilt tree (and into -mm) after Gary gets > back to us later this week. I'm getting back to you but unfortunately with not so good news. Sorry Alex. On the x3950 (configured single node) I encountered the below problem when attempting to hotplug a PCIe adapter when 'pci_slot' was loaded prior to 'acpiphp'. I did not see the problem when the drivers were loaded in the opposite order. After the messages appeared, the hotplug operation did not complete and I was unable to recover hotplug functionality without a reboot. I attempted recovery by unloading both drivers and then reloading only 'acpiphp' but even though an LED for the slot that had received the problem triggering adapter indicated that the slot was receiving power, the sysfs slot power file indicated otherwise (contained 0). So, I was unable to power off the slot to allow the removal and reinsertion of the adapter. FYI, the node contains 2 hotpluggable PCIe slots and 5 non-hotpluggable PCIe slots but 'pci_slot' only exposed the 2 hotpluggable slots. This does not appear to be due to a 'pci_slot' driver problem since I looked at the DSDT and SSDT and found that there are currently no _SUN methods for the non-hotpluggable slots. Gary -- Gary Hade System x Enablement IBM Linux Technology Center 503-578-4503 IBM T/L: 775-4503 [EMAIL PROTECTED] http://www.ibm.com/linux/ltc invalid opcode: [1] SMP CPU 1 Modules linked in: acpiphp pci_slot e1000 aic79xx scsi_transport_spi shpchp dock pci_hotplug ipt_LOG xt_limit xt_pkttype button battery ac power_supply ip6t_REJECT xt_tcpudp ipt_REJECT iptable_mangle iptable_filter ip6table_mangle ip_tables ip6table_filter ip6_tables x_tables ipv6 usbhid ff_memless ext3 jbd loop dm_mod ehci_hcd uhci_hcd usbcore ide_cd bnx2 cdrom rng_core reiserfs ata_piix ahci libata thermal processor piix sg megaraid_sas fan edd sd_mod scsi_mod ide_disk ide_core Pid: 121, comm: kacpi_notify Not tainted 2.6.24-rc3-gh-smp #1 RIP: 0010:[] [] :pci_slot:__this_module+0x21c4/0xf204 RSP: 0018:81103fa43ea8 EFLAGS: 00010216 RAX: 81103f944a18 RBX: 81103d4fe910 RCX: 000f RDX: RSI: RDI: 8110400d13d0 RBP: 8032d97b R08: 8110400fc7e0 R09: 0002 R10: R11: 8021d193 R12: 811040105cf0 R13: R14: 80635820 R15: FS: () GS:8110400ed8c0() knlGS: CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b CR2: 2b266d876471 CR3: 00103c825000 CR4: 06e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process kacpi_notify (pid: 121, threadinfo 81103fa42000, task 81103f9f8040) Stack: 809c 81103d119a00 8032d99e 81103f9fc540 8024618d 81103f9fc540 81103f9fc540 8024696c 80246a46 81103f9f8040 80249ada Call Trace: [] acpi_ev_notify_dispatch+0x57/0x60 [] acpi_os_execute_notify+0x23/0x2c [] run_workqueue+0x7f/0x10b [] worker_thread+0x0/0xe4 [] worker_thread+0xda/0xe4 [] autoremove_wake_function+0x0/0x2e [] kthread+0x47/0x73 [] child_rip+0xa/0x12 [] kthread+0x0/0x73 [] child_rip+0x0/0x12 Code: ff ff ff ff 40 23 2c 88 ff ff ff ff 00 c8 c6 3b 10 81 ff ff RIP [] :pci_slot:__this_module+0x21c4/0xf204 RSP - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Tue, Nov 27, 2007 at 11:11:36AM -0800, Kristen Carlson Accardi wrote: On Mon, 26 Nov 2007 19:04:54 -0800 Gary Hade [EMAIL PROTECTED] wrote: Is this patchset appropriate for the -mm tree yet? I would defer to our illustrious maintainers on this one. :) Or do you think it still needs more work? I am now much more comfortable with your changes with respect to acpiphp on the systems I worry about but others may have concerns with respect to the other hotplug drivers, or even acpiphp, on other systems. I'm ok with adding it to my quilt tree (and into -mm) after Gary gets back to us later this week. I'm getting back to you but unfortunately with not so good news. Sorry Alex. On the x3950 (configured single node) I encountered the below problem when attempting to hotplug a PCIe adapter when 'pci_slot' was loaded prior to 'acpiphp'. I did not see the problem when the drivers were loaded in the opposite order. After the messages appeared, the hotplug operation did not complete and I was unable to recover hotplug functionality without a reboot. I attempted recovery by unloading both drivers and then reloading only 'acpiphp' but even though an LED for the slot that had received the problem triggering adapter indicated that the slot was receiving power, the sysfs slot power file indicated otherwise (contained 0). So, I was unable to power off the slot to allow the removal and reinsertion of the adapter. FYI, the node contains 2 hotpluggable PCIe slots and 5 non-hotpluggable PCIe slots but 'pci_slot' only exposed the 2 hotpluggable slots. This does not appear to be due to a 'pci_slot' driver problem since I looked at the DSDT and SSDT and found that there are currently no _SUN methods for the non-hotpluggable slots. Gary -- Gary Hade System x Enablement IBM Linux Technology Center 503-578-4503 IBM T/L: 775-4503 [EMAIL PROTECTED] http://www.ibm.com/linux/ltc invalid opcode: [1] SMP CPU 1 Modules linked in: acpiphp pci_slot e1000 aic79xx scsi_transport_spi shpchp dock pci_hotplug ipt_LOG xt_limit xt_pkttype button battery ac power_supply ip6t_REJECT xt_tcpudp ipt_REJECT iptable_mangle iptable_filter ip6table_mangle ip_tables ip6table_filter ip6_tables x_tables ipv6 usbhid ff_memless ext3 jbd loop dm_mod ehci_hcd uhci_hcd usbcore ide_cd bnx2 cdrom rng_core reiserfs ata_piix ahci libata thermal processor piix sg megaraid_sas fan edd sd_mod scsi_mod ide_disk ide_core Pid: 121, comm: kacpi_notify Not tainted 2.6.24-rc3-gh-smp #1 RIP: 0010:[882c2344] [882c2344] :pci_slot:__this_module+0x21c4/0xf204 RSP: 0018:81103fa43ea8 EFLAGS: 00010216 RAX: 81103f944a18 RBX: 81103d4fe910 RCX: 000f RDX: RSI: RDI: 8110400d13d0 RBP: 8032d97b R08: 8110400fc7e0 R09: 0002 R10: R11: 8021d193 R12: 811040105cf0 R13: R14: 80635820 R15: FS: () GS:8110400ed8c0() knlGS: CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b CR2: 2b266d876471 CR3: 00103c825000 CR4: 06e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process kacpi_notify (pid: 121, threadinfo 81103fa42000, task 81103f9f8040) Stack: 809c 81103d119a00 8032d99e 81103f9fc540 8024618d 81103f9fc540 81103f9fc540 8024696c 80246a46 81103f9f8040 80249ada Call Trace: [809c] acpi_ev_notify_dispatch+0x57/0x60 [8032d99e] acpi_os_execute_notify+0x23/0x2c [8024618d] run_workqueue+0x7f/0x10b [8024696c] worker_thread+0x0/0xe4 [80246a46] worker_thread+0xda/0xe4 [80249ada] autoremove_wake_function+0x0/0x2e [802499bc] kthread+0x47/0x73 [8020cc98] child_rip+0xa/0x12 [80249975] kthread+0x0/0x73 [8020cc8e] child_rip+0x0/0x12 Code: ff ff ff ff 40 23 2c 88 ff ff ff ff 00 c8 c6 3b 10 81 ff ff RIP [882c2344] :pci_slot:__this_module+0x21c4/0xf204 RSP 81103fa43ea8 - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Wed, 28 Nov 2007 13:31:47 -0800 Gary Hade [EMAIL PROTECTED] wrote: FYI, the node contains 2 hotpluggable PCIe slots and 5 non-hotpluggable PCIe slots but 'pci_slot' only exposed the 2 hotpluggable slots. This does not appear to be due to a 'pci_slot' driver problem since I looked at the DSDT and SSDT and found that there are currently no _SUN methods for the non-hotpluggable slots. Thanks for testing Gary. I would think this situation would be the common case, since I doubt most firmware writers would bother to implement _SUN for non-hotpluggable slots -- at least on other DSDT I've seen this has been the case as well. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Wed, Nov 28, 2007 at 04:02:38PM -0800, Kristen Carlson Accardi wrote: On Wed, 28 Nov 2007 13:31:47 -0800 Gary Hade [EMAIL PROTECTED] wrote: FYI, the node contains 2 hotpluggable PCIe slots and 5 non-hotpluggable PCIe slots but 'pci_slot' only exposed the 2 hotpluggable slots. This does not appear to be due to a 'pci_slot' driver problem since I looked at the DSDT and SSDT and found that there are currently no _SUN methods for the non-hotpluggable slots. Thanks for testing Gary. I would think this situation would be the common case, since I doubt most firmware writers would bother to implement _SUN for non-hotpluggable slots -- at least on other DSDT I've seen this has been the case as well. Yea, I was also not surprised although features such as Alex working on may provide some motivation to change that. Gary -- Gary Hade System x Enablement IBM Linux Technology Center 503-578-4503 IBM T/L: 775-4503 [EMAIL PROTECTED] http://www.ibm.com/linux/ltc - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Gary, Kenji-san, et. al, * Gary Hade [EMAIL PROTECTED]: Alex, What I was trying to suggest is a boot-time kernel option, not a kernel configuration option. The basic idea is to give the user (with a single binary kernel) the ability to include your ACPI-PCI slot driver feature changes only when they are really needed. In addition to reducing the number of system/PCI hotplug driver combinations where your changes would need to be validated, I believe would also help alleviate other worries (e.g. Andi Kleen's memory consumption concern). I believe this goal could also be achieved with the kernel config option by making the pci_slot module runtime loadable with the PCI hotplug drivers only visiting your new code when the pci_slot driver is loaded, although I think this would be more difficult to implement. I have modified my patch series so that the final patch that introduces my ACPI-PCI slot driver is a full-fledged module, that has a tristate Kconfig option. Thank you for your good job. I tested shpchp and pciehp both with and without pci_slot module. There seems no regression from shpchp and pciehp's point of view. (I had a little concern about the hotplug slots' name that vary depending on whether pci_slot functionality is enabled or disabled. But, now that we can build pci_slot driver as a kernel module, I don't think it is a big problem). Only the problems is that I got Call Traces with the following error messages when pci_slot driver was loaded, and one strange slot named '1023' was registered (other slots are fine). This is the same problem I reported before. sysfs: duplicate filename '1023' can not be created WARNING: at fs/sysfs/dir.c:424 sysfs_add_one() kobject_add failed for 1023 with -EEXIST, don't try to register things with the same name in the same directory. On my system, hotplug slots themselves can be added, removed and replaced with the ohter type of I/O box. The ACPI firmware tells OS the presence of those slots using _STA method (That is, it doesn't use 'LoadTable()' AML operator). On the other hand, current pci_slot driver doesn't check _STA. As a result, pci_slot driver tryied to register the invalid (non-existing) slots. The ACPI firmware of my system returns '1023' if the invalid slot's _SUN is evaluated. This is the cause of Call Traces mentioned above. To fix this problem, pci_slot driver need to check _STA when scanning ACPI Namespace. I'm sorry for reporting this so late. I'm attaching the patch to fix the problem. This is against 2.6.24-rc3 with your patches applied. Could you try it? BTW, acpiphp also seems to have the same problem... Thanks, Kenji Kaneshige --- drivers/acpi/pci_slot.c | 13 + 1 file changed, 13 insertions(+) Index: linux-2.6.24-rc3/drivers/acpi/pci_slot.c === --- linux-2.6.24-rc3.orig/drivers/acpi/pci_slot.c +++ linux-2.6.24-rc3/drivers/acpi/pci_slot.c @@ -113,10 +113,17 @@ register_slot(acpi_handle handle, u32 lv int device; unsigned long sun; char name[KOBJ_NAME_LEN]; + acpi_status status; + struct acpi_device *dummy_device; struct pci_slot *pci_slot; struct pci_bus *pci_bus = context; + /* Skip non-existing device object. */ + status = acpi_bus_get_device(handle, dummy_device); + if (ACPI_FAILURE(status)) + return AE_OK; + if (check_slot(handle, device, sun)) return AE_OK; @@ -150,12 +157,18 @@ walk_p2p_bridge(acpi_handle handle, u32 acpi_status status; acpi_handle dummy_handle; acpi_walk_callback user_function; + struct acpi_device *dummy_device; struct pci_dev *dev; struct pci_bus *pci_bus; struct p2p_bridge_context child_context; struct p2p_bridge_context *parent_context = context; + /* Skip non-existing device object. */ + status = acpi_bus_get_device(handle, dummy_device); + if (ACPI_FAILURE(status)) + return AE_OK; + pci_bus = parent_context-pci_bus; user_function = parent_context-user_function; - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Mon, 26 Nov 2007 19:04:54 -0800 Gary Hade <[EMAIL PROTECTED]> wrote: > > Is this patchset appropriate for the -mm tree yet? > > I would defer to our illustrious maintainers on this one. :) > > > Or do you think it still needs more work? > > I am now much more comfortable with your changes with respect > to acpiphp on the systems I worry about but others may have > concerns with respect to the other hotplug drivers, or even > acpiphp, on other systems. I'm ok with adding it to my quilt tree (and into -mm) after Gary gets back to us later this week. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Mon, 26 Nov 2007 19:04:54 -0800 Gary Hade [EMAIL PROTECTED] wrote: Is this patchset appropriate for the -mm tree yet? I would defer to our illustrious maintainers on this one. :) Or do you think it still needs more work? I am now much more comfortable with your changes with respect to acpiphp on the systems I worry about but others may have concerns with respect to the other hotplug drivers, or even acpiphp, on other systems. I'm ok with adding it to my quilt tree (and into -mm) after Gary gets back to us later this week. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Mon, Nov 26, 2007 at 03:22:53PM -0700, Alex Chiang wrote: > Hi Gary, Kenji-san, et. al, > > * Gary Hade <[EMAIL PROTECTED]>: > > > > Alex, What I was trying to suggest is a boot-time kernel > > option, not a kernel configuration option. The basic idea is > > to give the user (with a single binary kernel) the ability to > > include your ACPI-PCI slot driver feature changes only when > > they are really needed. In addition to reducing the number of > > system/PCI hotplug driver combinations where your changes would > > need to be validated, I believe would also help alleviate other > > worries (e.g. Andi Kleen's memory consumption concern). I > > believe this goal could also be achieved with the kernel config > > option by making the pci_slot module runtime loadable with the > > PCI hotplug drivers only visiting your new code when the > > pci_slot driver is loaded, although I think this would be more > > difficult to implement. > > I have modified my patch series so that the final patch that > introduces my ACPI-PCI slot driver is a full-fledged module, that > has a tristate Kconfig option. > > It can be modprobe'd/rmmod'ed in any combination, and in any > order with other PCI hotplug modules. There is no ordering > dependency, even at module unload time, so you can safely rmmod > pci_slot, and safely continue using features provided by the PCI > hotplug drivers (acpiphp, pciehp, etc.). The opposite works too. Nice! I like the loadable module approach much better than my boot-time kernel option suggestion. > > The one limitation is that two separate hotplug drivers cannot > both claim the same device (2nd module loaded will get -EBUSY > errors), but I do not believe that is a regression from current > behavior. I cannot confirm this since the systems I am using only support a single hotplug driver (acpiphp). > > I have only tested with acpiphp and pciehp, as that's the only > hardware I have, but I believe my code will play nicely with the > other PCI hp drivers as well. I have only tested your changes with acpiphp. > > The patch series is fully bisectable, and the correct behavior > occurs no matter which patch you happen to have applied. Based on my testing (see below) this appears to be true. > > I'll be sending v5 of patches 3 and 4 shortly (patches 1 and 2 > did not change). It is still based on 2.6.24-rc2, because I was > too scared to do another git rebase while using stgit. :-/ I have been using 2.6.24-rc3 source for my testing. > > > Also, I notice that even with your current CONFIG_ACPI_PCI_SLOT > > implementation your numerous PCI hotplug driver changes (except > > for only two places in pci_hotplug_core.c where there is > > `#ifndef CONFIG_ACPI_PCI_SLOT` and `#ifdef CONFIG_ACPI_PCI_SLOT`) > > are _always_ exposed. So, even with CONFIG_ACPI_PCI_SLOT disabled > > there is IMO a need for testing of the affected PCI hotplug drivers > > on more than a small number of isolated systems. > > You are, of course, correct. > > In my opinion, though, I would say most of the changes to the PCI > hotplug drivers themselves are pretty straightforward, as in > removing the different ways of getting the PCI address. > > The scary part of the changes (aside from the ACPI-PCI slot > driver) revolve around the new struct pci_slot, which is > relatively self-contained, and only expose themselves via the > pci_create_slot/pci_destroy_slot interfaces which only the PCI > hotplug corecares about. I think this sounds like a reasonable argument for not doing what I was trying to suggest. > > Regardless, your point stands. How do you suggest I get more > testing time? I am only able to test with acpiphp. In addition to the testing on the x3850 described below I would also like to do some testing on an x3950 which has a mix of hotplug and non-hotplug slots. If this testing which I hope to complete this week goes well, I will be satisfied. I will let others speak for the other hotplug drivers and platforms. > Is this patchset appropriate for the -mm tree yet? I would defer to our illustrious maintainers on this one. :) > Or do you think it still needs more work? I am now much more comfortable with your changes with respect to acpiphp on the systems I worry about but others may have concerns with respect to the other hotplug drivers, or even acpiphp, on other systems. > > > The good news is that I was able to test your v3 changes > > (w/2.6.24-rc3 source) on our x3850 today with 'acpiphp' and, > > except for the above mentioned inability to run-time > > include/exclude them, they seemed to work fine. The previous > > boot-time ACPI error messages are gone and I was able to > > successfully hot-remove and hot-add both PCI-X and PCIe > > adapters. > > Thanks for testing. Please let me know how v5 works for you too. I just tried your v5 (1/4 v3, 2/4 v3, 3/4 v5, 4/4 v5) applied to 2.6.24-rc3 source with acpiphp on the x3850 and found nothing to complain about. About time, eh? :)
Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Gary, Kenji-san, et. al, * Gary Hade <[EMAIL PROTECTED]>: > > Alex, What I was trying to suggest is a boot-time kernel > option, not a kernel configuration option. The basic idea is > to give the user (with a single binary kernel) the ability to > include your ACPI-PCI slot driver feature changes only when > they are really needed. In addition to reducing the number of > system/PCI hotplug driver combinations where your changes would > need to be validated, I believe would also help alleviate other > worries (e.g. Andi Kleen's memory consumption concern). I > believe this goal could also be achieved with the kernel config > option by making the pci_slot module runtime loadable with the > PCI hotplug drivers only visiting your new code when the > pci_slot driver is loaded, although I think this would be more > difficult to implement. I have modified my patch series so that the final patch that introduces my ACPI-PCI slot driver is a full-fledged module, that has a tristate Kconfig option. It can be modprobe'd/rmmod'ed in any combination, and in any order with other PCI hotplug modules. There is no ordering dependency, even at module unload time, so you can safely rmmod pci_slot, and safely continue using features provided by the PCI hotplug drivers (acpiphp, pciehp, etc.). The opposite works too. The one limitation is that two separate hotplug drivers cannot both claim the same device (2nd module loaded will get -EBUSY errors), but I do not believe that is a regression from current behavior. I have only tested with acpiphp and pciehp, as that's the only hardware I have, but I believe my code will play nicely with the other PCI hp drivers as well. The patch series is fully bisectable, and the correct behavior occurs no matter which patch you happen to have applied. I'll be sending v5 of patches 3 and 4 shortly (patches 1 and 2 did not change). It is still based on 2.6.24-rc2, because I was too scared to do another git rebase while using stgit. :-/ > Also, I notice that even with your current CONFIG_ACPI_PCI_SLOT > implementation your numerous PCI hotplug driver changes (except > for only two places in pci_hotplug_core.c where there is > `#ifndef CONFIG_ACPI_PCI_SLOT` and `#ifdef CONFIG_ACPI_PCI_SLOT`) > are _always_ exposed. So, even with CONFIG_ACPI_PCI_SLOT disabled > there is IMO a need for testing of the affected PCI hotplug drivers > on more than a small number of isolated systems. You are, of course, correct. In my opinion, though, I would say most of the changes to the PCI hotplug drivers themselves are pretty straightforward, as in removing the different ways of getting the PCI address. The scary part of the changes (aside from the ACPI-PCI slot driver) revolve around the new struct pci_slot, which is relatively self-contained, and only expose themselves via the pci_create_slot/pci_destroy_slot interfaces which only the PCI hotplug corecares about. Regardless, your point stands. How do you suggest I get more testing time? Is this patchset appropriate for the -mm tree yet? Or do you think it still needs more work? > The good news is that I was able to test your v3 changes > (w/2.6.24-rc3 source) on our x3850 today with 'acpiphp' and, > except for the above mentioned inability to run-time > include/exclude them, they seemed to work fine. The previous > boot-time ACPI error messages are gone and I was able to > successfully hot-remove and hot-add both PCI-X and PCIe > adapters. Thanks for testing. Please let me know how v5 works for you too. chers, /ac - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Gary, Kenji-san, et. al, * Gary Hade [EMAIL PROTECTED]: Alex, What I was trying to suggest is a boot-time kernel option, not a kernel configuration option. The basic idea is to give the user (with a single binary kernel) the ability to include your ACPI-PCI slot driver feature changes only when they are really needed. In addition to reducing the number of system/PCI hotplug driver combinations where your changes would need to be validated, I believe would also help alleviate other worries (e.g. Andi Kleen's memory consumption concern). I believe this goal could also be achieved with the kernel config option by making the pci_slot module runtime loadable with the PCI hotplug drivers only visiting your new code when the pci_slot driver is loaded, although I think this would be more difficult to implement. I have modified my patch series so that the final patch that introduces my ACPI-PCI slot driver is a full-fledged module, that has a tristate Kconfig option. It can be modprobe'd/rmmod'ed in any combination, and in any order with other PCI hotplug modules. There is no ordering dependency, even at module unload time, so you can safely rmmod pci_slot, and safely continue using features provided by the PCI hotplug drivers (acpiphp, pciehp, etc.). The opposite works too. The one limitation is that two separate hotplug drivers cannot both claim the same device (2nd module loaded will get -EBUSY errors), but I do not believe that is a regression from current behavior. I have only tested with acpiphp and pciehp, as that's the only hardware I have, but I believe my code will play nicely with the other PCI hp drivers as well. The patch series is fully bisectable, and the correct behavior occurs no matter which patch you happen to have applied. I'll be sending v5 of patches 3 and 4 shortly (patches 1 and 2 did not change). It is still based on 2.6.24-rc2, because I was too scared to do another git rebase while using stgit. :-/ Also, I notice that even with your current CONFIG_ACPI_PCI_SLOT implementation your numerous PCI hotplug driver changes (except for only two places in pci_hotplug_core.c where there is `#ifndef CONFIG_ACPI_PCI_SLOT` and `#ifdef CONFIG_ACPI_PCI_SLOT`) are _always_ exposed. So, even with CONFIG_ACPI_PCI_SLOT disabled there is IMO a need for testing of the affected PCI hotplug drivers on more than a small number of isolated systems. You are, of course, correct. In my opinion, though, I would say most of the changes to the PCI hotplug drivers themselves are pretty straightforward, as in removing the different ways of getting the PCI address. The scary part of the changes (aside from the ACPI-PCI slot driver) revolve around the new struct pci_slot, which is relatively self-contained, and only expose themselves via the pci_create_slot/pci_destroy_slot interfaces which only the PCI hotplug corecares about. Regardless, your point stands. How do you suggest I get more testing time? Is this patchset appropriate for the -mm tree yet? Or do you think it still needs more work? The good news is that I was able to test your v3 changes (w/2.6.24-rc3 source) on our x3850 today with 'acpiphp' and, except for the above mentioned inability to run-time include/exclude them, they seemed to work fine. The previous boot-time ACPI error messages are gone and I was able to successfully hot-remove and hot-add both PCI-X and PCIe adapters. Thanks for testing. Please let me know how v5 works for you too. chers, /ac - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Mon, Nov 26, 2007 at 03:22:53PM -0700, Alex Chiang wrote: Hi Gary, Kenji-san, et. al, * Gary Hade [EMAIL PROTECTED]: Alex, What I was trying to suggest is a boot-time kernel option, not a kernel configuration option. The basic idea is to give the user (with a single binary kernel) the ability to include your ACPI-PCI slot driver feature changes only when they are really needed. In addition to reducing the number of system/PCI hotplug driver combinations where your changes would need to be validated, I believe would also help alleviate other worries (e.g. Andi Kleen's memory consumption concern). I believe this goal could also be achieved with the kernel config option by making the pci_slot module runtime loadable with the PCI hotplug drivers only visiting your new code when the pci_slot driver is loaded, although I think this would be more difficult to implement. I have modified my patch series so that the final patch that introduces my ACPI-PCI slot driver is a full-fledged module, that has a tristate Kconfig option. It can be modprobe'd/rmmod'ed in any combination, and in any order with other PCI hotplug modules. There is no ordering dependency, even at module unload time, so you can safely rmmod pci_slot, and safely continue using features provided by the PCI hotplug drivers (acpiphp, pciehp, etc.). The opposite works too. Nice! I like the loadable module approach much better than my boot-time kernel option suggestion. The one limitation is that two separate hotplug drivers cannot both claim the same device (2nd module loaded will get -EBUSY errors), but I do not believe that is a regression from current behavior. I cannot confirm this since the systems I am using only support a single hotplug driver (acpiphp). I have only tested with acpiphp and pciehp, as that's the only hardware I have, but I believe my code will play nicely with the other PCI hp drivers as well. I have only tested your changes with acpiphp. The patch series is fully bisectable, and the correct behavior occurs no matter which patch you happen to have applied. Based on my testing (see below) this appears to be true. I'll be sending v5 of patches 3 and 4 shortly (patches 1 and 2 did not change). It is still based on 2.6.24-rc2, because I was too scared to do another git rebase while using stgit. :-/ I have been using 2.6.24-rc3 source for my testing. Also, I notice that even with your current CONFIG_ACPI_PCI_SLOT implementation your numerous PCI hotplug driver changes (except for only two places in pci_hotplug_core.c where there is `#ifndef CONFIG_ACPI_PCI_SLOT` and `#ifdef CONFIG_ACPI_PCI_SLOT`) are _always_ exposed. So, even with CONFIG_ACPI_PCI_SLOT disabled there is IMO a need for testing of the affected PCI hotplug drivers on more than a small number of isolated systems. You are, of course, correct. In my opinion, though, I would say most of the changes to the PCI hotplug drivers themselves are pretty straightforward, as in removing the different ways of getting the PCI address. The scary part of the changes (aside from the ACPI-PCI slot driver) revolve around the new struct pci_slot, which is relatively self-contained, and only expose themselves via the pci_create_slot/pci_destroy_slot interfaces which only the PCI hotplug corecares about. I think this sounds like a reasonable argument for not doing what I was trying to suggest. Regardless, your point stands. How do you suggest I get more testing time? I am only able to test with acpiphp. In addition to the testing on the x3850 described below I would also like to do some testing on an x3950 which has a mix of hotplug and non-hotplug slots. If this testing which I hope to complete this week goes well, I will be satisfied. I will let others speak for the other hotplug drivers and platforms. Is this patchset appropriate for the -mm tree yet? I would defer to our illustrious maintainers on this one. :) Or do you think it still needs more work? I am now much more comfortable with your changes with respect to acpiphp on the systems I worry about but others may have concerns with respect to the other hotplug drivers, or even acpiphp, on other systems. The good news is that I was able to test your v3 changes (w/2.6.24-rc3 source) on our x3850 today with 'acpiphp' and, except for the above mentioned inability to run-time include/exclude them, they seemed to work fine. The previous boot-time ACPI error messages are gone and I was able to successfully hot-remove and hot-add both PCI-X and PCIe adapters. Thanks for testing. Please let me know how v5 works for you too. I just tried your v5 (1/4 v3, 2/4 v3, 3/4 v5, 4/4 v5) applied to 2.6.24-rc3 source with acpiphp on the x3850 and found nothing to complain about. About time, eh? :) Following boot, 'pci_slot' was already loaded and slot directories (each containing an address file) for
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Tue, Nov 20, 2007 at 02:04:02AM +, Matthew Garrett wrote: > On Mon, Nov 19, 2007 at 03:32:25PM -0800, Gary Hade wrote: > > > Alex, What I was trying to suggest is a boot-time kernel option, > > not a kernel configuration option. The basic idea is to give > > the user (with a single binary kernel) the ability to include > > your ACPI-PCI slot driver feature changes only when they are > > really needed. In addition to reducing the number of > > system/PCI hotplug driver combinations where your changes > > would need to be validated, I believe would also help > > alleviate other worries (e.g. Andi Kleen's memory consumption > > concern). I believe this goal could also be achieved with the > > kernel config option by making the pci_slot module runtime > > loadable with the PCI hotplug drivers only visiting your new > > code when the pci_slot driver is loaded, although I think this > > would be more difficult to implement. > > If we're compiling something into the kernel, the default behaviour > should be for the functionality to be turned on unless the user > overrides it. It seems like others could have a problem with this but as long as there is a way to exclude the functionality in the event of problems without a kernel rebuild, "on" by default would work for me. Thanks, Gary -- Gary Hade System x Enablement IBM Linux Technology Center 503-578-4503 IBM T/L: 775-4503 [EMAIL PROTECTED] http://www.ibm.com/linux/ltc - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Tue, Nov 20, 2007 at 02:04:02AM +, Matthew Garrett wrote: On Mon, Nov 19, 2007 at 03:32:25PM -0800, Gary Hade wrote: Alex, What I was trying to suggest is a boot-time kernel option, not a kernel configuration option. The basic idea is to give the user (with a single binary kernel) the ability to include your ACPI-PCI slot driver feature changes only when they are really needed. In addition to reducing the number of system/PCI hotplug driver combinations where your changes would need to be validated, I believe would also help alleviate other worries (e.g. Andi Kleen's memory consumption concern). I believe this goal could also be achieved with the kernel config option by making the pci_slot module runtime loadable with the PCI hotplug drivers only visiting your new code when the pci_slot driver is loaded, although I think this would be more difficult to implement. If we're compiling something into the kernel, the default behaviour should be for the functionality to be turned on unless the user overrides it. It seems like others could have a problem with this but as long as there is a way to exclude the functionality in the event of problems without a kernel rebuild, on by default would work for me. Thanks, Gary -- Gary Hade System x Enablement IBM Linux Technology Center 503-578-4503 IBM T/L: 775-4503 [EMAIL PROTECTED] http://www.ibm.com/linux/ltc - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Mon, Nov 19, 2007 at 03:32:25PM -0800, Gary Hade wrote: > Alex, What I was trying to suggest is a boot-time kernel option, > not a kernel configuration option. The basic idea is to give > the user (with a single binary kernel) the ability to include > your ACPI-PCI slot driver feature changes only when they are > really needed. In addition to reducing the number of > system/PCI hotplug driver combinations where your changes > would need to be validated, I believe would also help > alleviate other worries (e.g. Andi Kleen's memory consumption > concern). I believe this goal could also be achieved with the > kernel config option by making the pci_slot module runtime > loadable with the PCI hotplug drivers only visiting your new > code when the pci_slot driver is loaded, although I think this > would be more difficult to implement. If we're compiling something into the kernel, the default behaviour should be for the functionality to be turned on unless the user overrides it. -- Matthew Garrett | [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
Gary Hade : On Sat, Nov 17, 2007 at 11:29:54AM -0700, Alex Chiang wrote: Hi all, This is v3 of the pci_slot patch series. The major change is making the ACPI-PCI slot driver a Kconfig option, as per the recommendations of others (Gary, Kenji-san). Alex, What I was trying to suggest is a boot-time kernel option, not a kernel configuration option. The basic idea is to give the user (with a single binary kernel) the ability to include your ACPI-PCI slot driver feature changes only when they are really needed. In addition to reducing the number of system/PCI hotplug driver combinations where your changes would need to be validated, I believe would also help alleviate other worries (e.g. Andi Kleen's memory consumption concern). I believe this goal could also be achieved with the kernel config option by making the pci_slot module runtime loadable with the PCI hotplug drivers only visiting your new code when the pci_slot driver is loaded, although I think this would be more difficult to implement. I agree to Gary very much. Thanks, Kenji Kaneshige - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Sat, Nov 17, 2007 at 11:29:54AM -0700, Alex Chiang wrote: > Hi all, > > This is v3 of the pci_slot patch series. > > The major change is making the ACPI-PCI slot driver a Kconfig > option, as per the recommendations of others (Gary, Kenji-san). Alex, What I was trying to suggest is a boot-time kernel option, not a kernel configuration option. The basic idea is to give the user (with a single binary kernel) the ability to include your ACPI-PCI slot driver feature changes only when they are really needed. In addition to reducing the number of system/PCI hotplug driver combinations where your changes would need to be validated, I believe would also help alleviate other worries (e.g. Andi Kleen's memory consumption concern). I believe this goal could also be achieved with the kernel config option by making the pci_slot module runtime loadable with the PCI hotplug drivers only visiting your new code when the pci_slot driver is loaded, although I think this would be more difficult to implement. Also, I notice that even with your current CONFIG_ACPI_PCI_SLOT implementation your numerous PCI hotplug driver changes (except for only two places in pci_hotplug_core.c where there is `#ifndef CONFIG_ACPI_PCI_SLOT` and `#ifdef CONFIG_ACPI_PCI_SLOT`) are _always_ exposed. So, even with CONFIG_ACPI_PCI_SLOT disabled there is IMO a need for testing of the affected PCI hotplug drivers on more than a small number of isolated systems. The good news is that I was able to test your v3 changes (w/2.6.24-rc3 source) on our x3850 today with 'acpiphp' and, except for the above mentioned inability to run-time include/exclude them, they seemed to work fine. The previous boot-time ACPI error messages are gone and I was able to successfully hot-remove and hot-add both PCI-X and PCIe adapters. Thanks, Gary -- Gary Hade System x Enablement IBM Linux Technology Center 503-578-4503 IBM T/L: 775-4503 [EMAIL PROTECTED] http://www.ibm.com/linux/ltc - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Kristin, * Kristen Carlson Accardi <[EMAIL PROTECTED]>: > Alex Chiang <[EMAIL PROTECTED]> wrote: > > > I have done quite a bit more testing, and verified that this > > series plays nicely with acpiphp during all stages of the > > series. Notably, you can modprobe/rmmod acpiphp repeatedly > > no matter where you are in the series, and no matter whether > > you have CONFIG_ACPI_PCI_SLOT turned on. The correct entries > > in /sys/bus/pci/slots/ will appear and disappear, and we > > correctly register/deregister ACPI slots with the pci_hp > > core. > > How does this patch play with non-acpi based hotplug such as > the pciehp driver or the shpchp driver for example? Thanks for asking these questions -- I fixed some bugs in patches 3/4 and 4/4 that should lead to a much better experience. First, it turns out I did not modify the pciehp driver correctly when using the new pci_hp_register interface. I fixed this bug, and noticed a problem in the rpaphp driver (which I fixed as well). I visually inspected the shpchp driver, and it *seems* to be correct, so no change there. I will send these fixes as Patch 3/4, v4. Second, I resolved the issue of what happens when two different hp drivers try to claim the same PCI slot. Basically, whoever registered the slot first wins, and second place gets a -EBUSY return value. I *think* that is the correct behavior, as Willy informs me that having two drivers try to claim the same slot is badness. These fixes will be sent as Patch 4/4, v4. I tested by modprobe/rmmod both acpiphp and pciehp multiple times, and in differing orders. I also tested both CONFIG_ACPI_PCI_SLOT turned on and off. In all cases, at least what I intended to happen did happen. Now whether my intentions were correct or misguided might be a different story... ;) I'm wondering most about the -EBUSY thing, but I don't see a better option. Patches 1/4 and 2/4 had no changes so I will not resend them. Thanks. /ac - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
* Kristen Carlson Accardi <[EMAIL PROTECTED]>: > On Sat, 17 Nov 2007 11:29:54 -0700 > Alex Chiang <[EMAIL PROTECTED]> wrote: > > > I have done quite a bit more testing, and verified that this > > series plays nicely with acpiphp during all stages of the series. > > Notably, you can modprobe/rmmod acpiphp repeatedly no matter > > where you are in the series, and no matter whether you have > > CONFIG_ACPI_PCI_SLOT turned on. The correct entries in > > /sys/bus/pci/slots/ will appear and disappear, and we correctly > > register/deregister ACPI slots with the pci_hp core. > > How does this patch play with non-acpi based hotplug such as the pciehp > driver Not well. :( pciehp: HPC vendor_id 103c device_id 403b ss_vid 0 ss_did 0 pciehp: pci_hp_register failed with error -17 pciehp: pciehp: slot initialization failed pciehp: HPC vendor_id 111d device_id 801c ss_vid 0 ss_did 0 pciehp: Can't get irq 0 for the hotplug controller pciehp: HPC vendor_id 111d device_id 801c ss_vid 0 ss_did 0 pciehp: Can't get irq 0 for the hotplug controller pciehp: HPC vendor_id 103c device_id 403b ss_vid 0 ss_did 0 pciehp: pci_hp_register failed with error -17 pciehp: pciehp: slot initialization failed I'll take a look at this today. Thanks for pointing it out. > or the shpchp driver for example? This one, I'm not sure on, as I don't have any shpc hardware. I'll do my best by just looking at the code, but maybe if someone out there has that hardware, they could let me know what breaks? Thanks. /ac - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Sat, 17 Nov 2007 11:29:54 -0700 Alex Chiang <[EMAIL PROTECTED]> wrote: > I have done quite a bit more testing, and verified that this > series plays nicely with acpiphp during all stages of the series. > Notably, you can modprobe/rmmod acpiphp repeatedly no matter > where you are in the series, and no matter whether you have > CONFIG_ACPI_PCI_SLOT turned on. The correct entries in > /sys/bus/pci/slots/ will appear and disappear, and we correctly > register/deregister ACPI slots with the pci_hp core. Hi Alex, How does this patch play with non-acpi based hotplug such as the pciehp driver or the shpchp driver for example? Thanks, Kristen - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Sat, 17 Nov 2007 11:29:54 -0700 Alex Chiang [EMAIL PROTECTED] wrote: I have done quite a bit more testing, and verified that this series plays nicely with acpiphp during all stages of the series. Notably, you can modprobe/rmmod acpiphp repeatedly no matter where you are in the series, and no matter whether you have CONFIG_ACPI_PCI_SLOT turned on. The correct entries in /sys/bus/pci/slots/ will appear and disappear, and we correctly register/deregister ACPI slots with the pci_hp core. Hi Alex, How does this patch play with non-acpi based hotplug such as the pciehp driver or the shpchp driver for example? Thanks, Kristen - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
* Kristen Carlson Accardi [EMAIL PROTECTED]: On Sat, 17 Nov 2007 11:29:54 -0700 Alex Chiang [EMAIL PROTECTED] wrote: I have done quite a bit more testing, and verified that this series plays nicely with acpiphp during all stages of the series. Notably, you can modprobe/rmmod acpiphp repeatedly no matter where you are in the series, and no matter whether you have CONFIG_ACPI_PCI_SLOT turned on. The correct entries in /sys/bus/pci/slots/ will appear and disappear, and we correctly register/deregister ACPI slots with the pci_hp core. How does this patch play with non-acpi based hotplug such as the pciehp driver Not well. :( pciehp: HPC vendor_id 103c device_id 403b ss_vid 0 ss_did 0 pciehp: pci_hp_register failed with error -17 pciehp: pciehp: slot initialization failed pciehp: HPC vendor_id 111d device_id 801c ss_vid 0 ss_did 0 pciehp: Can't get irq 0 for the hotplug controller pciehp: HPC vendor_id 111d device_id 801c ss_vid 0 ss_did 0 pciehp: Can't get irq 0 for the hotplug controller pciehp: HPC vendor_id 103c device_id 403b ss_vid 0 ss_did 0 pciehp: pci_hp_register failed with error -17 pciehp: pciehp: slot initialization failed I'll take a look at this today. Thanks for pointing it out. or the shpchp driver for example? This one, I'm not sure on, as I don't have any shpc hardware. I'll do my best by just looking at the code, but maybe if someone out there has that hardware, they could let me know what breaks? Thanks. /ac - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
Hi Kristin, * Kristen Carlson Accardi [EMAIL PROTECTED]: Alex Chiang [EMAIL PROTECTED] wrote: I have done quite a bit more testing, and verified that this series plays nicely with acpiphp during all stages of the series. Notably, you can modprobe/rmmod acpiphp repeatedly no matter where you are in the series, and no matter whether you have CONFIG_ACPI_PCI_SLOT turned on. The correct entries in /sys/bus/pci/slots/ will appear and disappear, and we correctly register/deregister ACPI slots with the pci_hp core. How does this patch play with non-acpi based hotplug such as the pciehp driver or the shpchp driver for example? Thanks for asking these questions -- I fixed some bugs in patches 3/4 and 4/4 that should lead to a much better experience. First, it turns out I did not modify the pciehp driver correctly when using the new pci_hp_register interface. I fixed this bug, and noticed a problem in the rpaphp driver (which I fixed as well). I visually inspected the shpchp driver, and it *seems* to be correct, so no change there. I will send these fixes as Patch 3/4, v4. Second, I resolved the issue of what happens when two different hp drivers try to claim the same PCI slot. Basically, whoever registered the slot first wins, and second place gets a -EBUSY return value. I *think* that is the correct behavior, as Willy informs me that having two drivers try to claim the same slot is badness. These fixes will be sent as Patch 4/4, v4. I tested by modprobe/rmmod both acpiphp and pciehp multiple times, and in differing orders. I also tested both CONFIG_ACPI_PCI_SLOT turned on and off. In all cases, at least what I intended to happen did happen. Now whether my intentions were correct or misguided might be a different story... ;) I'm wondering most about the -EBUSY thing, but I don't see a better option. Patches 1/4 and 2/4 had no changes so I will not resend them. Thanks. /ac - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Sat, Nov 17, 2007 at 11:29:54AM -0700, Alex Chiang wrote: Hi all, This is v3 of the pci_slot patch series. The major change is making the ACPI-PCI slot driver a Kconfig option, as per the recommendations of others (Gary, Kenji-san). Alex, What I was trying to suggest is a boot-time kernel option, not a kernel configuration option. The basic idea is to give the user (with a single binary kernel) the ability to include your ACPI-PCI slot driver feature changes only when they are really needed. In addition to reducing the number of system/PCI hotplug driver combinations where your changes would need to be validated, I believe would also help alleviate other worries (e.g. Andi Kleen's memory consumption concern). I believe this goal could also be achieved with the kernel config option by making the pci_slot module runtime loadable with the PCI hotplug drivers only visiting your new code when the pci_slot driver is loaded, although I think this would be more difficult to implement. Also, I notice that even with your current CONFIG_ACPI_PCI_SLOT implementation your numerous PCI hotplug driver changes (except for only two places in pci_hotplug_core.c where there is `#ifndef CONFIG_ACPI_PCI_SLOT` and `#ifdef CONFIG_ACPI_PCI_SLOT`) are _always_ exposed. So, even with CONFIG_ACPI_PCI_SLOT disabled there is IMO a need for testing of the affected PCI hotplug drivers on more than a small number of isolated systems. The good news is that I was able to test your v3 changes (w/2.6.24-rc3 source) on our x3850 today with 'acpiphp' and, except for the above mentioned inability to run-time include/exclude them, they seemed to work fine. The previous boot-time ACPI error messages are gone and I was able to successfully hot-remove and hot-add both PCI-X and PCIe adapters. Thanks, Gary -- Gary Hade System x Enablement IBM Linux Technology Center 503-578-4503 IBM T/L: 775-4503 [EMAIL PROTECTED] http://www.ibm.com/linux/ltc - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
Gary Hade : On Sat, Nov 17, 2007 at 11:29:54AM -0700, Alex Chiang wrote: Hi all, This is v3 of the pci_slot patch series. The major change is making the ACPI-PCI slot driver a Kconfig option, as per the recommendations of others (Gary, Kenji-san). Alex, What I was trying to suggest is a boot-time kernel option, not a kernel configuration option. The basic idea is to give the user (with a single binary kernel) the ability to include your ACPI-PCI slot driver feature changes only when they are really needed. In addition to reducing the number of system/PCI hotplug driver combinations where your changes would need to be validated, I believe would also help alleviate other worries (e.g. Andi Kleen's memory consumption concern). I believe this goal could also be achieved with the kernel config option by making the pci_slot module runtime loadable with the PCI hotplug drivers only visiting your new code when the pci_slot driver is loaded, although I think this would be more difficult to implement. I agree to Gary very much. Thanks, Kenji Kaneshige - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4, v3] Physical PCI slot objects
On Mon, Nov 19, 2007 at 03:32:25PM -0800, Gary Hade wrote: Alex, What I was trying to suggest is a boot-time kernel option, not a kernel configuration option. The basic idea is to give the user (with a single binary kernel) the ability to include your ACPI-PCI slot driver feature changes only when they are really needed. In addition to reducing the number of system/PCI hotplug driver combinations where your changes would need to be validated, I believe would also help alleviate other worries (e.g. Andi Kleen's memory consumption concern). I believe this goal could also be achieved with the kernel config option by making the pci_slot module runtime loadable with the PCI hotplug drivers only visiting your new code when the pci_slot driver is loaded, although I think this would be more difficult to implement. If we're compiling something into the kernel, the default behaviour should be for the functionality to be turned on unless the user overrides it. -- Matthew Garrett | [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 0/4, v3] Physical PCI slot objects
Hi all, This is v3 of the pci_slot patch series. The major change is making the ACPI-PCI slot driver a Kconfig option, as per the recommendations of others (Gary, Kenji-san). In the process of doing so, it made sense to collapse the former 3/5 and 4/5 patches into a single 3/4 patch. There really wasn't a reason to introduce a pci_slot patch, and then immediately follow it with another patch modifying its interface; logically, the changes should have been in the same patch. Combining the patches also has the nice side benefit of keeping the tree fully buildable and bisectable at all stages of series. I have done quite a bit more testing, and verified that this series plays nicely with acpiphp during all stages of the series. Notably, you can modprobe/rmmod acpiphp repeatedly no matter where you are in the series, and no matter whether you have CONFIG_ACPI_PCI_SLOT turned on. The correct entries in /sys/bus/pci/slots/ will appear and disappear, and we correctly register/deregister ACPI slots with the pci_hp core. Of course, if you *do* have the ACPI-PCI slot driver configured, the slots/ entries in sysfs will be persistent. What you will see is the hotplug attributes appear/disappear, depending on whether you have acpiphp loaded or not. Thanks for your consideration and all the feedback comments. They're appreciated. /ac v2 -> v3: Patch 1/4 - no change Patch 2/4 - incorporate Eike's comments around snprintf Patch 3/4 - Separated slot creation and slot hotplug ability into two interfaces. Fixed bugs in pci_destroy_slot(), and now properly calling from pci_hp_deregister. Patch 4/4 - Add Kconfig option to driver, allowing users to [de]config this driver. If configured, take slightly different code paths in pci_hp_register and pci_hp_deregister. v1 -> v2: Patch 1/5 - reworked to fix stupid compile bug Patch 2/5 - incorporate Eike, Linas, and Willy's comments Patch 3/5 - no change Patch 4/5 - was acpi-pci-slot-driver patch, now modifies pci_add_hotplug(). I changed the ordering on this so the tree doesn't break at this point in the series Patch 5/5 - now is acpi-pci-slot-driver patch, cleaned up implementation so our slot detection is a little better - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 0/4, v3] Physical PCI slot objects
Hi all, This is v3 of the pci_slot patch series. The major change is making the ACPI-PCI slot driver a Kconfig option, as per the recommendations of others (Gary, Kenji-san). In the process of doing so, it made sense to collapse the former 3/5 and 4/5 patches into a single 3/4 patch. There really wasn't a reason to introduce a pci_slot patch, and then immediately follow it with another patch modifying its interface; logically, the changes should have been in the same patch. Combining the patches also has the nice side benefit of keeping the tree fully buildable and bisectable at all stages of series. I have done quite a bit more testing, and verified that this series plays nicely with acpiphp during all stages of the series. Notably, you can modprobe/rmmod acpiphp repeatedly no matter where you are in the series, and no matter whether you have CONFIG_ACPI_PCI_SLOT turned on. The correct entries in /sys/bus/pci/slots/ will appear and disappear, and we correctly register/deregister ACPI slots with the pci_hp core. Of course, if you *do* have the ACPI-PCI slot driver configured, the slots/ entries in sysfs will be persistent. What you will see is the hotplug attributes appear/disappear, depending on whether you have acpiphp loaded or not. Thanks for your consideration and all the feedback comments. They're appreciated. /ac v2 - v3: Patch 1/4 - no change Patch 2/4 - incorporate Eike's comments around snprintf Patch 3/4 - Separated slot creation and slot hotplug ability into two interfaces. Fixed bugs in pci_destroy_slot(), and now properly calling from pci_hp_deregister. Patch 4/4 - Add Kconfig option to driver, allowing users to [de]config this driver. If configured, take slightly different code paths in pci_hp_register and pci_hp_deregister. v1 - v2: Patch 1/5 - reworked to fix stupid compile bug Patch 2/5 - incorporate Eike, Linas, and Willy's comments Patch 3/5 - no change Patch 4/5 - was acpi-pci-slot-driver patch, now modifies pci_add_hotplug(). I changed the ordering on this so the tree doesn't break at this point in the series Patch 5/5 - now is acpi-pci-slot-driver patch, cleaned up implementation so our slot detection is a little better - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/