Okay, I've debugged it and fixed the issues.

The new webrev is at http://cr.opensolaris.org/~gdamore/dnet-suspend/

I would definitely appreciate code review feedback, as I'd like to file 
an RTI for build 80.

This webrev also includes a fix for an MII link problem I found while 
debugging, AMD64 support, and some housekeeping (cstyle fixes, removal 
of dead macros and structure members, etc.)

Using uadmin 3 25 35  (where 35 is the major number of dnet on my 
system, as found in /etc/name_to_major), dnet suspends and resumes 
properly.  I still haven't got a full S3 suspend going, but dnet is no 
longer the problem.
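
For anyone who wants to find the major number on their own box 
programmatically, here is a minimal sketch in C.  It assumes the file 
consists of "<name> <number>" lines; lookup_major() is a hypothetical 
helper written for this illustration, not an existing system call or 
library function.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return the major number listed for `driver` in a
 * name_to_major-style stream, or -1 if there is no matching entry.
 * Assumes each entry looks like "<name> <number>"; lines that do not
 * parse that way are skipped.
 */
int
lookup_major(FILE *fp, const char *driver)
{
    char line[256];
    char name[128];
    int major;

    while (fgets(line, sizeof (line), fp) != NULL) {
        if (sscanf(line, "%127s %d", name, &major) == 2 &&
            strcmp(name, driver) == 0)
            return (major);
    }
    return (-1);
}
```

Called on a stream opened from /etc/name_to_major with "dnet" as the 
driver name, the result is the value that takes the place of 35 in the 
uadmin command above.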

One thing I've noticed is that on the particular card I have, dnet needs 
a few seconds to get its brains in order.  I don't understand what is 
going on during this time, but don't be surprised if it takes up to 10 
seconds for packets to resume flowing.  The dnet driver is *really* 
crufty, and desperately needs some love.  I am not sure how much I want 
to invest in it.  (I am considering a GLDv3 port, but it's not a very 
high priority right now.)

In case you were curious, the problems I found were:

    *) I forgot to set dnet->suspended = B_FALSE in DDI_RESUME!

    *) There was a mutex problem surrounding dnet_set_addr().  (I've 
restructured this part of the code a bit more to make more sense.)

    *) Returning GLD_NORESOURCES from dnet_send() was probably a 
mistake, and was also probably the cause of the problems (hangs, panics) 
in IP, when combined with the dnet->suspended problem above.  
(Basically, certain DLPI control messages were getting stuck behind 
these "deferred" Ethernet packets.)  I changed the code to just drop the 
packets on the floor, which is more consistent with what I've done in 
other drivers.
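
To make the three fixes above concrete, here is a small userland model 
in C.  It is only a sketch and not the actual dnet code: struct 
model_dnet, the model_* functions, and the MODEL_* return codes are all 
invented for illustration, and the "mutex" is a plain flag so that lock 
ownership can be asserted in the same spirit as the mutex_owned() 
assertion quoted further down the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the send return codes (values are made up). */
enum { MODEL_SUCCESS = 0, MODEL_NORESOURCES = 1 };

/* Hypothetical, single-threaded model of the driver soft state. */
struct model_dnet {
    bool     lock_held;  /* models ownership of dnetp->intrlock */
    bool     suspended;  /* models dnet->suspended */
    unsigned sent;       /* packets handed to the "hardware" */
    unsigned dropped;    /* packets discarded while suspended */
};

static void
model_lock(struct model_dnet *dp)
{
    assert(!dp->lock_held);     /* no recursive acquire */
    dp->lock_held = true;
}

static void
model_unlock(struct model_dnet *dp)
{
    assert(dp->lock_held);
    dp->lock_held = false;
}

/* Board init insists the lock is held, like the assertion in the thread. */
static void
model_init_board(struct model_dnet *dp)
{
    assert(dp->lock_held);
}

void
model_init(struct model_dnet *dp)
{
    dp->lock_held = false;
    dp->suspended = false;
    dp->sent = 0;
    dp->dropped = 0;
}

void
model_suspend(struct model_dnet *dp)
{
    model_lock(dp);
    dp->suspended = true;
    model_unlock(dp);
}

void
model_resume(struct model_dnet *dp)
{
    model_lock(dp);             /* take the lock first ... */
    model_init_board(dp);       /* ... then reinitialize the board */
    dp->suspended = false;      /* the step that was originally forgotten */
    model_unlock(dp);
}

/*
 * While suspended, drop the packet and report success instead of
 * returning MODEL_NORESOURCES, so nothing is left queued in front of
 * later control messages.
 */
int
model_send(struct model_dnet *dp)
{
    model_lock(dp);
    if (dp->suspended)
        dp->dropped++;
    else
        dp->sent++;
    model_unlock(dp);
    return (MODEL_SUCCESS);
}
```

The point of the model is the ordering in model_resume() and the fact 
that model_send() never asks the caller to retry: a send during suspend 
is counted as dropped, and sends flow again once resume has cleared the 
flag.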

Anyway, feel free to give it a whirl.  I can supply binaries on demand 
as well.

Thanks!

    -- Garrett


Garrett D'Amore wrote:
> FYI, I've reproduced the problem, and have begun debugging it.  (I'm 
> also doing this in a 64-bit kernel, as I added support for 64-bit to 
> dnet. :-)
>
> I should have something for you in the next day or two.
>
>    -- Garrett
>
> Juergen Keil wrote:
>> Btw. "ifconfig -a" or "ifconfig dnet" hangs, and when I interrupt
>> the hanging ifconfig with Ctrl-C, the machine panics.
>>
>>
>>  
>>> You're brave. :-)  I hadn't even tested the code myself yet.
>>>
>>> I'll have a look at this later today.
>>>
>>>     -- Garrett
>>>
>>> Juergen Keil wrote:
>>>    
>>>> Garrett wrote:
>>>>
>>>>        
>>>>> For those who've wanted it, I posted the webrev from the dnet 
>>>>> conversion that I did at OpenSolaris Developer Summit 2007 at 
>>>>> http://cr.opensolaris.org/~gdamore/dnet-suspend/
>>>>>             
>>>> Just tried that dnet patch on my ASUS P2B-LS "S3 suspend-to-ram" test box,
>>>> but it's crashing the box with a failed assertion panic during driver resume:
>>>>
>>>>   assertion failed: mutex_owned((&dnetp->intrlock)), in file dnet.c line 2308
>>>>
>>>>       stack backtrace shows
>>>>
>>>>       :panic...
>>>>   dnet:set_sia+12c(...)
>>>>   dnet:dnet_init_board+2a(...)
>>>>   dnet:dnetattach+5bd(..., 1)
>>>>   ...
>>>>   (messages manually copied, disk device is suspended, so no crash dump :-)
>>>>
>>>>
>>>> I think the "dnet_init_board(macinfo);" call in the dnetattach()
>>>> DDI_RESUME case should be moved after the mutex_enter() calls.
>>>>
>>>>
>>>> With that change, suspend / resume does not panic any more.
>>>>
>>>> But the dnet device seems to be dead after a resume.  And after maybe
>>>> a minute,  I got another panic when I tried an "ifconfig -a" after the
>>>> resume:
>>>>
>>>>
>>>> > ::status
>>>> debugging crash dump vmcore.3 (32-bit) from elise
>>>> operating system: 5.11 wos_b78_debug (i86pc)
>>>> panic message: assertion failed: connp->conn_oper_pending_ill == 0, file: ../../common/inet/ip/ipclassifier.c, line: 2219
>>>> dump content: kernel pages only
>>>>
>>>> > $C
>>>> ccdd6c58 vpanic(feae72dc, fa2a1fc4, fa2a2b18, 8ab)
>>>> ccdd6c78 assfail+0x5a(fa2a1fc4, fa2a2b18, 8ab)
>>>> ccdd6c98 ipcl_conn_cleanup+0x209(cde9f540)
>>>> ccdd6cc8 ipcl_conn_destroy+0x151()
>>>> ccdd6ce0 udp_close+0xb8(cdbd3ca8, 3, cd552690)
>>>> ccdd6d08 qdetach+0x9b(cdbd3ca8, 1, 3, cd552690, 0)
>>>> ccdd6d50 strclose+0x392(cde09580, 3, cd552690)
>>>> ccdd6d80 socktpi_close+0x1cc(cde09580, 3, 1, 0, 0, cd552690)
>>>> ccdd6dcc fop_close+0x51(cde09580, 3, 1, 0, 0, cd552690)
>>>> ccdd6e18 closef+0x88(cd3bd700)
>>>> ccdd6e48 closeall+0x58(cb6bf9a4)
>>>> ccdd6e90 proc_exit+0x40a(2, 2)
>>>> ccdd6ea4 exit+0x11(2, 2)
>>>> ccdd6efc psig+0x4c3()
>>>> ccdd6f7c post_syscall+0x3f6(4, cc6f3880)
>>>> ccdd6f90 syscall_exit+0x48(cc6f3880, 4, cc6f3880)
>>>> ccdd6fa4 sys_call+0x21a()
>>>>
>>>>
>>>>
>>>> > ::msgbuf
>>>> ...
>>>> System is being suspended
>>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED], ohci:0
>>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],1, ohci:1
>>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],2, ohci:2
>>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],3, ehci:0
>>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],3, ehci:0
>>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],2, ohci:2
>>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],1, ohci:1
>>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED], ohci:0
>>  
>>>> The system is back where you left!
>>>> System has been resumed.
>>>>
>>>> panic[cpu0]/thread=cc6f3880: assertion failed: connp->conn_oper_pending_ill == 0, file: ../../common/inet/ip/ipclassifier.c, line: 2219
>>>>
>>>>
>>>> ccdd6c78 genunix:assfail+5a (fa2a1fc4, fa2a2b18,)
>>>> ccdd6c98 ip:ipcl_conn_cleanup+209 (cde9f540)
>>>> ccdd6cc8 ip:ipcl_conn_destroy+151 (cde9f540, cd5cf27c,)
>>>> ccdd6ce0 ip:udp_close+b8 (cdbd3ca8, 3, cd5526)
>>>> ccdd6d08 genunix:qdetach+9b (cdbd3ca8, 1, 3, cd5)
>>>> ccdd6d50 genunix:strclose+392 (cde09580, 3, cd5526)
>>>> ccdd6d80 sockfs:socktpi_close+1cc (cde09580, 3, 1, 0, )
>>>> ccdd6dcc genunix:fop_close+51 (cde09580, 3, 1, 0, )
>>>> ccdd6e18 genunix:closef+88 (cd3bd700)
>>>> ccdd6e48 genunix:closeall+58 (cb6bf9a4)
>>>> ccdd6e90 genunix:proc_exit+40a (2, 2)
>>>> ccdd6ea4 genunix:exit+11 (2, 2)
>>>> ccdd6efc genunix:psig+4c3 (ccdd6f90, 8047a54, )
>>>> ccdd6f7c genunix:post_syscall+3f6 (4, cc6f3880)
>>>> ccdd6f90 genunix:syscall_exit+48 (cc6f3880, 4, cc6f38)
>>>>
>>>> syncing file systems...
>>>>  done
>>>> dumping to /dev/dsk/c0d0s1, offset 83951616, content: kernel
>>>>
>>>> > ::cpuinfo -v
>>>>  ID ADDR     FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD   PROC
>>>>   0 fec24b98  1b    1    0  59   no    no t-0    cc6f3880 ifconfig
>>>>                |    |
>>>>     RUNNING <--+    +-->  PRI THREAD   PROC
>>>>       READY                60 ca94cde0 sched
>>>>      EXISTS              ENABLE        
>>>>
>>>>         
>>
>> Juergen Keil                  [EMAIL PROTECTED]
>> Tools GmbH            +49 (228) 9858011
>> Vorgebirgsstraße 37-39        http://www.tools.de
>> 53119 BONN
>> Sitz- und Registergericht       HRB Bonn 4026
>> Geschäftsführung        Wolfgang Franke & Wolfgang Solfrank
>>
>>   
>
>

_______________________________________________
driver-discuss mailing list
driver-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/driver-discuss
