Yep, this version works better.  I'm able to start a STR, dnet is suspended,
STR fails (because box is old and does not support STR), and all drivers
including dnet are suspended.

Minor nit: in dnetattach() you've added a new local variable
"boolean_t wantsched;", but it's not used anywhere, so it should
be removed.


> Okay, I've debugged it and fixed the issues.
> 
> The new webrev is at http://cr.opensolaris.org/~gdamore/dnet-suspend/
> 
> I would definitely appreciate code review feedback, as I'd like to file 
> an RTI for build 80.
> 
> This webrev also includes a fix for an MII link problem I found during 
> debug, as well as AMD64 support, and a little housekeeping stuff 
> (cstyle, remove some dead macros and structure members, etc.)
> 
> Using uadmin 3 25 35  (where 35 is the major number of dnet on my 
> system, as found in /etc/name_to_major), dnet suspends and resumes 
> properly.  I still haven't got a full S3 suspend going, but dnet is no 
> longer the problem.
> 
> One thing I've noticed is that on the particular card I have, dnet needs 
> a few seconds to get its brains in order.  I don't understand what is 
> going on during this time, but don't be surprised if it takes up to 10 
> seconds for packets to resume flowing.   The dnet driver is *really* 
> crufty, and desperately needs some love.  I am not sure how much I want 
> to invest in it.  (I am considering a GLDv3 port, but it's not very high 
> priority right now.)
> 
> In case you were curious, the problems I found were:
> 
>     *) I forgot to set dnet->suspended = B_FALSE in DDI_RESUME!
> 
>     *) There was a mutex problem surrounding dnet_set_addr().  (I've 
> restructured this part of the code a bit more to make more sense.)
> 
>     *) Using GLD_NO_RESOURCES to return from dnet_send() was probably a 
> mistake, and was also probably the cause of the problems (hangs, panics) 
> in IP, when combined with the dnet->suspended problem above.  
> (Basically, certain DLPI control messages were getting stuck behind 
> these "deferred" ethernet packets.)  I changed the code to just drop the 
> packets on the floor, which is more consistent with what I've done in 
> other drivers.
> 
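To illustrate the first and third items in the list above, here is a minimal
user-space sketch in C. It is not the dnet code from the webrev. Every name in
it (sketch_dnet, sketch_send, SKETCH_OK and so on) is a hypothetical stand-in,
and it assumes a single "suspended" flag that the send path checks. It shows
the flag being cleared again on resume, and packets being dropped (rather than
deferred with a "no resources" return) while the device is suspended.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return codes; not the real GLD constants. */
enum sketch_result {
	SKETCH_OK,		/* packet was consumed (sent or dropped) */
	SKETCH_NO_RESOURCES	/* "retry later"; deliberately unused below */
};

/* Hypothetical per-device state; not the real dnet soft state. */
struct sketch_dnet {
	bool suspended;		/* plays the role of dnet->suspended */
	unsigned sent;		/* packets handed to the pretend hardware */
	unsigned dropped;	/* packets discarded while suspended */
};

/* Suspend: mark the device so the send path stops touching hardware. */
static void
sketch_suspend(struct sketch_dnet *d)
{
	assert(d != NULL);
	d->suspended = true;
}

/* Resume: clear the flag again. Forgetting this leaves the device dead. */
static void
sketch_resume(struct sketch_dnet *d)
{
	assert(d != NULL);
	d->suspended = false;
}

/*
 * Send path: while suspended, drop the packet and report it as consumed,
 * so the caller never queues it for a retry and nothing gets stuck
 * behind it.
 */
static enum sketch_result
sketch_send(struct sketch_dnet *d, const void *pkt, size_t len)
{
	(void) pkt;
	(void) len;
	assert(d != NULL);
	if (d->suspended) {
		d->dropped++;
		return (SKETCH_OK);
	}
	d->sent++;
	return (SKETCH_OK);
}
```

In this sketch a send issued between sketch_suspend() and sketch_resume() only
bumps the dropped counter, and sends after sketch_resume() go through again.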
> Anyway, feel free to give it a whirl.  I can supply binaries on demand 
> as well.
> 
> Thanks!
> 
>     -- Garrett
> 
> 
> Garrett D'Amore wrote:
> > FYI, I've reproduced the problem, and have begun debugging it.  (I'm 
> > also doing this in a 64-bit kernel, as I added support for 64-bit to 
> > dnet. :-)
> >
> > I should have something for you in the next day or two.
> >
> >    -- Garrett
> >
> > Juergen Keil wrote:
> >> Btw. "ifconfig -a" or "ifconfig dnet" hangs, and when I interrupt
> >> the hanging ifconfig with Ctrl-C, the machine panics.
> >>
> >>
> >>  
> >>> You're brave. :-)  I hadn't even tested the code myself yet.
> >>>
> >>> I'll have a look at this later today.
> >>>
> >>>     -- Garrett
> >>>
> >>> Juergen Keil wrote:
> >>>    
> >>>> Garrett wrote:
> >>>>
> >>>>        
> >>>>> For those who've wanted it, I posted the webrev from the dnet 
> >>>>> conversion that I did at OpenSolaris Developer Summit 2007 at 
> >>>>> http://cr.opensolaris.org/~gdamore/dnet-suspend/
> >>>>>             
> >>>> Just tried that dnet patch on my ASUS P2B-LS "S3 suspend-to-ram"
> >>>> test box, but it's crashing the box with a failed assertion panic
> >>>> during driver resume:
> >>>>
> >>>>   assertion failed: mutex_owned((&dnetp->intrlock)), in file dnet.c line 2308
> >>>>
> >>>>       stack backtrace shows
> >>>>
> >>>>       :panic...
> >>>>   dnet:set_sia+12c(...)
> >>>>   dnet:dnet_init_board+2a(...)
> >>>>   dnet:dnetattach+5bd(..., 1)
> >>>>   ...
> >>>>   (messages manually copied, disk device is suspended, so no crash 
> >>>> dump :-)
> >>>>
> >>>>
> >>>> I think the "dnet_init_board(macinfo);" call in the dnetattach()
> >>>> DDI_RESUME case should be moved after the mutex_enter() calls.
> >>>>
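The ordering suggested above can be sketched in plain user-space C. This is
only an illustration under stated assumptions: the "mutex" is a hypothetical
single-threaded stand-in that just tracks whether it is held, and none of the
names (sketch_mutex, resume_dev, resume_device, ...) come from dnet.c. The
point is that the board init routine asserts the lock is held, so the resume
path has to take the lock before calling it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernel mutex; tracks ownership only. */
struct sketch_mutex {
	bool held;
};

static void
sketch_mutex_enter(struct sketch_mutex *m)
{
	assert(!m->held);
	m->held = true;
}

static void
sketch_mutex_exit(struct sketch_mutex *m)
{
	assert(m->held);
	m->held = false;
}

static bool
sketch_mutex_owned(const struct sketch_mutex *m)
{
	return (m->held);
}

/* Hypothetical device state; intrlock plays the role of dnetp->intrlock. */
struct resume_dev {
	struct sketch_mutex intrlock;
	bool board_ready;
};

/* Like the set_sia frame in the trace: legal only with intrlock held. */
static void
sketch_set_sia(struct resume_dev *d)
{
	assert(sketch_mutex_owned(&d->intrlock));
}

/* Board init reaches sketch_set_sia(), so it inherits the requirement. */
static void
sketch_init_board(struct resume_dev *d)
{
	sketch_set_sia(d);
	d->board_ready = true;
}

/* Resume path with the suggested ordering: take the lock, then init. */
static void
resume_device(struct resume_dev *d)
{
	sketch_mutex_enter(&d->intrlock);
	sketch_init_board(d);
	sketch_mutex_exit(&d->intrlock);
}
```

Calling sketch_init_board() before sketch_mutex_enter() would trip the
assertion in sketch_set_sia(), which is the same shape as the panic above.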
> >>>>
> >>>> With that change, suspend / resume does not panic any more.
> >>>>
> >>>> But the dnet device seems to be dead after a resume.  And after maybe
> >>>> a minute,  I got another panic when I tried an "ifconfig -a" after the
> >>>> resume:
> >>>>
> >>>>        
> >>>>> ::status             
> >>>> debugging crash dump vmcore.3 (32-bit) from elise
> >>>> operating system: 5.11 wos_b78_debug (i86pc)
> >>>> panic message: assertion failed: connp->conn_oper_pending_ill == 0,
> >>>> file: ../../common/inet/ip/ipclassifier.c, line: 2219
> >>>> dump content: kernel pages only
> >>>>        
> >>>>> $C
> >>>>>             
> >>>> ccdd6c58 vpanic(feae72dc, fa2a1fc4, fa2a2b18, 8ab)
> >>>> ccdd6c78 assfail+0x5a(fa2a1fc4, fa2a2b18, 8ab)
> >>>> ccdd6c98 ipcl_conn_cleanup+0x209(cde9f540)
> >>>> ccdd6cc8 ipcl_conn_destroy+0x151()
> >>>> ccdd6ce0 udp_close+0xb8(cdbd3ca8, 3, cd552690)
> >>>> ccdd6d08 qdetach+0x9b(cdbd3ca8, 1, 3, cd552690, 0)
> >>>> ccdd6d50 strclose+0x392(cde09580, 3, cd552690)
> >>>> ccdd6d80 socktpi_close+0x1cc(cde09580, 3, 1, 0, 0, cd552690)
> >>>> ccdd6dcc fop_close+0x51(cde09580, 3, 1, 0, 0, cd552690)
> >>>> ccdd6e18 closef+0x88(cd3bd700)
> >>>> ccdd6e48 closeall+0x58(cb6bf9a4)
> >>>> ccdd6e90 proc_exit+0x40a(2, 2)
> >>>> ccdd6ea4 exit+0x11(2, 2)
> >>>> ccdd6efc psig+0x4c3()
> >>>> ccdd6f7c post_syscall+0x3f6(4, cc6f3880)
> >>>> ccdd6f90 syscall_exit+0x48(cc6f3880, 4, cc6f3880)
> >>>> ccdd6fa4 sys_call+0x21a()
> >>>>
> >>>>
> >>>>        
> >>>>> ::msgbuf
> >>>>>             
> >>>> ...
> >>>> System is being suspended
> >>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED], ohci:0
> >>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],1, ohci:1
> >>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],2, ohci:2
> >>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],3, ehci:0
> >>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],3, ehci:0
> >>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],2, ohci:2
> >>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],1, ohci:1
> >>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED], ohci:0
> >>>> The system is back where you left!
> >>>> System has been resumed.
> >>>>
> >>>> panic[cpu0]/thread=cc6f3880: assertion failed:
> >>>> connp->conn_oper_pending_ill == 0, file:
> >>>> ../../common/inet/ip/ipclassifier.c, line: 2219
> >>>>
> >>>>
> >>>> ccdd6c78 genunix:assfail+5a (fa2a1fc4, fa2a2b18,)
> >>>> ccdd6c98 ip:ipcl_conn_cleanup+209 (cde9f540)
> >>>> ccdd6cc8 ip:ipcl_conn_destroy+151 (cde9f540, cd5cf27c,)
> >>>> ccdd6ce0 ip:udp_close+b8 (cdbd3ca8, 3, cd5526)
> >>>> ccdd6d08 genunix:qdetach+9b (cdbd3ca8, 1, 3, cd5)
> >>>> ccdd6d50 genunix:strclose+392 (cde09580, 3, cd5526)
> >>>> ccdd6d80 sockfs:socktpi_close+1cc (cde09580, 3, 1, 0, )
> >>>> ccdd6dcc genunix:fop_close+51 (cde09580, 3, 1, 0, )
> >>>> ccdd6e18 genunix:closef+88 (cd3bd700)
> >>>> ccdd6e48 genunix:closeall+58 (cb6bf9a4)
> >>>> ccdd6e90 genunix:proc_exit+40a (2, 2)
> >>>> ccdd6ea4 genunix:exit+11 (2, 2)
> >>>> ccdd6efc genunix:psig+4c3 (ccdd6f90, 8047a54, )
> >>>> ccdd6f7c genunix:post_syscall+3f6 (4, cc6f3880)
> >>>> ccdd6f90 genunix:syscall_exit+48 (cc6f3880, 4, cc6f38)
> >>>>
> >>>> syncing file systems...
> >>>>  done
> >>>> dumping to /dev/dsk/c0d0s1, offset 83951616, content: kernel
> >>>>        
> >>>>> ::cpuinfo -v
> >>>>>             
> >>>>  ID ADDR     FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD   PROC
> >>>>   0 fec24b98  1b    1    0  59   no    no t-0    cc6f3880 ifconfig
> >>>>                |    |
> >>>>     RUNNING <--+    +-->  PRI THREAD   PROC
> >>>>       READY                60 ca94cde0 sched
> >>>>      EXISTS              ENABLE        

_______________________________________________
driver-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/driver-discuss
