Hi,

>>>> Even if the total lock time can be reduced, it's possible that interrupt
>>>> handler is executed while the interrupted code is still holding the 
>>>> semaphore.
>>>> I think your method only decrease the frequency of this problem.
>>>> Why does reducing the lock time solve this problem?
>>> there are several problems here that need addressing. It's not acceptable 
>>> for our driver to wait up to 15 seconds, and we can (presumably) reduce it 
>>> to milliseconds, so that would help a lot. We should in no case at all hold 
>>> it for any period longer than (give or take) half a second, so working 
>>> towards that is a very good step in the right direction.
>>>
>>> Adding the timer task back may also help, as we are no longer trying to 
>>> aqcuire the sw_fw_semaphore in interrupt context, but we removed it for a 
>>> reason, and I need to dig up what reason this exactly was before we can 
>>> revert it. Jesse might know, so I'll talk to him. But this will not fix the 
>>> fact that the semaphore is held for a long time :)
[...]
> I think this problem occurs because interrupt handler is executed in same
> CPU as process that acquires semaphore.
> How about disabling interrupt while the process is holding the semaphore?
> I think this is possible, if the total lock time has been reduced.

I created the attached patch based on the method described above.
This patch disables interrupt while the process is holding the semaphore.

I measured how long interrupts are being disabled 10,000 times using the
following method. TSC was read by rdtscll when interrupt was disabled
and interrupt was enabled again, then I subtract these two value.

The longest period interrupt was disabled is under 10usec, which seems
acceptable. The evaluation environment is;
  CPU    : Intel Xeon CPU 3.73GHz
  kernel : 2.6.19-rc5

I also ran the reproduction TP I sent previously and confirmed that the
system didn't panic.
  http://marc.theaimsgroup.com/?l=linux-netdev&m=116125332319850&w=2

How about this method?

I welcome any comments.
-- 
  Kenzo Iwami ([EMAIL PROTECTED])

Signed-off-by: Kenzo Iwami <[EMAIL PROTECTED]>

diff -urpN linux-2.6.19-rc5_org/drivers/net/e1000/e1000_hw.c
linux-2.6.19-rc5/drivers/net/e1000/e1000_hw.c
--- linux-2.6.19-rc5_org/drivers/net/e1000/e1000_hw.c   2006-11-14
17:47:28.000000000 +0900
+++ linux-2.6.19-rc5/drivers/net/e1000/e1000_hw.c       2006-11-15 
17:49:39.000000000
+0900
@@ -3379,6 +3379,7 @@ e1000_swfw_sync_acquire(struct e1000_hw
     uint32_t swmask = mask;
     uint32_t fwmask = mask << 16;
     int32_t timeout = 200;
+    unsigned long flags;

     DEBUGFUNC("e1000_swfw_sync_acquire");

@@ -3389,8 +3390,11 @@ e1000_swfw_sync_acquire(struct e1000_hw
         return e1000_get_hw_eeprom_semaphore(hw);

     while (timeout) {
-            if (e1000_get_hw_eeprom_semaphore(hw))
+            local_irq_save(flags);
+            if (e1000_get_hw_eeprom_semaphore(hw)) {
+                local_irq_restore(flags);
                 return -E1000_ERR_SWFW_SYNC;
+            }

             swfw_sync = E1000_READ_REG(hw, SW_FW_SYNC);
             if (!(swfw_sync & (fwmask | swmask))) {
@@ -3400,6 +3404,7 @@ e1000_swfw_sync_acquire(struct e1000_hw
             /* firmware currently using resource (fwmask) */
             /* or other software thread currently using resource (swmask) */
             e1000_put_hw_eeprom_semaphore(hw);
+            local_irq_restore(flags);
             mdelay(5);
             timeout--;
     }
@@ -3413,6 +3418,7 @@ e1000_swfw_sync_acquire(struct e1000_hw
     E1000_WRITE_REG(hw, SW_FW_SYNC, swfw_sync);

     e1000_put_hw_eeprom_semaphore(hw);
+    local_irq_restore(flags);
     return E1000_SUCCESS;
 }

@@ -3421,6 +3427,7 @@ e1000_swfw_sync_release(struct e1000_hw
 {
     uint32_t swfw_sync;
     uint32_t swmask = mask;
+    unsigned long flags;

     DEBUGFUNC("e1000_swfw_sync_release");

@@ -3434,6 +3441,7 @@ e1000_swfw_sync_release(struct e1000_hw
         return;
     }

+    local_irq_save(flags);
     /* if (e1000_get_hw_eeprom_semaphore(hw))
      *    return -E1000_ERR_SWFW_SYNC; */
     while (e1000_get_hw_eeprom_semaphore(hw) != E1000_SUCCESS);
@@ -3444,6 +3452,7 @@ e1000_swfw_sync_release(struct e1000_hw
     E1000_WRITE_REG(hw, SW_FW_SYNC, swfw_sync);

     e1000_put_hw_eeprom_semaphore(hw);
+    local_irq_restore(flags);
 }

 /*****************************************************************************
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to