----- Original Message -----
>Date: Fri, 04 Sep 2009 23:04:01 -0500
>From: Steven Stallion <[email protected]>
>To: Masa Murayama <[email protected]>
>CC: [email protected]
>Subject: Re: [driver-discuss] Hard hang during interrupt processing
>
>
>Masa Murayama wrote:
>> How did you duplicate the hang on your pc?
>> 
>> If I can, I'd like to see your source code and test it on my pc.
>
>That would be very helpful, thank you!
>
>You can clone it from the emancipation gate:
>% hg clone ssh://[email protected]/hg/emancipation/driver-gate
>
>Setup your ON environment like normal (i.e. via bldenv), change to the
>driver-gate/re/intel directory, and issue a 'dmake all'. Install the
>device and issue: add_drv -i '"pci10ec,8029"' re
>
>Mind you, the current version in the gate is very simple; there is only
>a single tx buffer in place (I took out the tx ring during debugging),
>and a few areas of the code still need to be optimized.
>
>To get the driver to hang, generate non-trivial traffic through the
>interface - for my tests, I attempt to FTP a large file.


Steve,

I could reproduce the issue with netperf. The system hung easily.
I think I found the root cause.

Because I saw many receive errors when the system hung, I added the
following code before receiving packets in re_recv().
The rx interrupt bit may be set again while you are still receiving
packets. In this case, there are no valid packets left in the rx buffer
when the next interrupt happens.

$diff -c re.c re.c.new
*** re.c        Sat Sep  5 18:51:42 2009
--- re.c.new    Sat Sep  5 18:46:11 2009
***************
*** 1376,1381 ****
--- 1376,1387 ----

        ASSERT(mutex_owned(&rep->re_txlock));

+       /* Test if received packets are available */
+       re_setpage(rep, REG_PAGE_1);
+       if (REG_READ(rep, CURR_REG) == rep->re_rx_next) {
+               return (NULL);
+       }
+
        re_hdr_t hdr;
        do {
                uint16_t len;
***************
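
To illustrate the check above, here is a minimal sketch of the empty-ring
test in plain C. It is a model, not the driver code: rx_ring_t,
ring_advance(), nic_store_packet() and recv_all() are hypothetical names,
every packet is assumed to occupy exactly one page, and ring overflow is
not handled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the receive ring; not the driver's real soft state. */
typedef struct {
	uint8_t start;    /* first page of the ring */
	uint8_t stop;     /* one past the last page of the ring */
	uint8_t curr;     /* models CURR_REG: next page the NIC will fill */
	uint8_t rx_next;  /* models rep->re_rx_next: next page the driver reads */
} rx_ring_t;

/* Advance a page pointer by one, wrapping at the end of the ring. */
static uint8_t
ring_advance(const rx_ring_t *r, uint8_t page)
{
	assert(page >= r->start && page < r->stop);
	page++;
	return (page == r->stop ? r->start : page);
}

/* The NIC stores one packet (assumed to be a single page long). */
static void
nic_store_packet(rx_ring_t *r)
{
	r->curr = ring_advance(r, r->curr);
}

/* Same test as the patch: nothing to read when CURR equals our pointer. */
static int
ring_is_empty(const rx_ring_t *r)
{
	return (r->curr == r->rx_next);
}

/* Models one call to the receive routine; returns packets consumed. */
static int
recv_all(rx_ring_t *r)
{
	int n = 0;

	if (ring_is_empty(r))
		return (0);	/* interrupt with nothing to read: bail out */

	do {
		r->rx_next = ring_advance(r, r->rx_next);
		n++;
	} while (!ring_is_empty(r));

	return (n);
}
```

If a second packet lands while the first is being drained, the first call
to recv_all() consumes both, and the call triggered by the second interrupt
returns 0 instead of parsing a stale header.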

I prefer msgdsize(), which counts only M_DATA blocks and so ignores control messages.
*** 1498,1504 ****

        ASSERT(mutex_owned(&rep->re_txlock));

!       len = msgsize(mp);

        if (len > ETHERMAX + VLAN_TAGSZ) {
                rep->re_oerrors++;
--- 1504,1510 ----

        ASSERT(mutex_owned(&rep->re_txlock));

!       len = msgdsize(mp);

        if (len > ETHERMAX + VLAN_TAGSZ) {
                rep->re_oerrors++;
***************
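
The difference can be sketched with a mock message chain. This is not
the STREAMS implementation: mock_mblk_t, mock_msgsize() and
mock_msgdsize() are hypothetical stand-ins, and the sketch assumes that
msgsize() sums every block in the chain while msgdsize() sums only the
M_DATA blocks.

```c
#include <stddef.h>

/* Hypothetical stand-ins for STREAMS types (the real ones are in sys/stream.h). */
enum mock_type { MOCK_M_DATA, MOCK_M_PROTO };

typedef struct mock_mblk {
	struct mock_mblk *b_cont;      /* next block of the same message */
	const unsigned char *b_rptr;   /* first valid byte in this block */
	const unsigned char *b_wptr;   /* one past the last valid byte */
	enum mock_type type;           /* block type */
} mock_mblk_t;

/* Assumed msgsize() behaviour: count bytes in every block. */
static size_t
mock_msgsize(const mock_mblk_t *mp)
{
	size_t n = 0;

	for (; mp != NULL; mp = mp->b_cont)
		n += (size_t)(mp->b_wptr - mp->b_rptr);
	return (n);
}

/* msgdsize()-style count: only data blocks contribute. */
static size_t
mock_msgdsize(const mock_mblk_t *mp)
{
	size_t n = 0;

	for (; mp != NULL; mp = mp->b_cont) {
		if (mp->type == MOCK_M_DATA)
			n += (size_t)(mp->b_wptr - mp->b_rptr);
	}
	return (n);
}
```

For a chain made of one control block followed by data blocks, only the
second function returns the number of bytes that would go on the wire.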

It is not an error if tx is busy.
*** 1508,1514 ****
        }

        if (rep->re_tx_busy) {
-               rep->re_oerrors++;
                rep->re_noxmtbuf++;
                return (DDI_FAILURE);
        }
--- 1514,1519 ----
***************

The actual tx packet size shouldn't be padded to the next word boundary.
This caused oversized packets.
*** 1530,1535 ****
--- 1535,1541 ----
            rep->re_tx_len)) != DDI_SUCCESS) {
                return (error);
        }
+       rep->re_tx_len = max(ETHERMIN, len);

        rep->re_opackets++;
        rep->re_obytes += rep->re_tx_len;
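
A small sketch of why the padded length must not be used as the tx
length. The helper names and the 4-byte word size in the usage note are
assumptions for illustration only; the constants mirror the usual
Ethernet minimum and maximum frame sizes without the FCS.

```c
#include <stddef.h>

enum {
	SKETCH_ETHERMIN = 60,	/* minimum frame size without FCS */
	SKETCH_ETHERMAX = 1514	/* maximum frame size without FCS */
};

/* Length rounded up to a word boundary, as a buffer copy might need. */
static size_t
padded_copy_len(size_t len, size_t word)
{
	return ((len + word - 1) / word * word);
}

/* Length to program for transmission: pad short frames, never round up. */
static size_t
tx_wire_len(size_t len)
{
	return (len < SKETCH_ETHERMIN ? SKETCH_ETHERMIN : len);
}
```

With an assumed 4-byte word, padded_copy_len(1514, 4) gives 1516, which
exceeds the maximum, while tx_wire_len(1514) stays at 1514. Short frames
are still padded: tx_wire_len(42) gives 60.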


I put the entire re.c on my web page.
http://homepage2.nifty.com/mrym3/taiyodo/re.c.new

-masa

>TIA,
>
>Steve

_______________________________________________
driver-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/driver-discuss
