Yep, this version works better. I'm able to start an STR: dnet is suspended, the STR fails (because the box is old and does not support STR), and all drivers including dnet are suspended.
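For anyone following along, here is a rough user-space model of the three fixes described in the quoted message below: clear the suspended flag on resume, take the lock before reinitializing the board, and drop packets while suspended instead of deferring them. This is only a sketch. Every model_* name and the reduced state struct are made up for illustration and are not taken from dnet.c, and the toy mutex only tracks ownership so the lock assertion can be modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernel mutex; it only tracks ownership. */
typedef struct {
    bool held;
} model_mutex_t;

static void model_mutex_enter(model_mutex_t *m)
{
    assert(!m->held);
    m->held = true;
}

static void model_mutex_exit(model_mutex_t *m)
{
    assert(m->held);
    m->held = false;
}

static bool model_mutex_owned(const model_mutex_t *m)
{
    return m->held;
}

/* Hypothetical, heavily reduced device state (not the real dnet struct). */
typedef struct {
    model_mutex_t intrlock;
    bool suspended;
    bool board_ready;
    unsigned sent;
    unsigned dropped;
} model_dev_t;

/* Board init requires the lock, mirroring the assertion seen in the thread. */
static void model_init_board(model_dev_t *dev)
{
    assert(model_mutex_owned(&dev->intrlock));
    dev->board_ready = true;
}

static void model_suspend(model_dev_t *dev)
{
    model_mutex_enter(&dev->intrlock);
    dev->suspended = true;
    dev->board_ready = false;
    model_mutex_exit(&dev->intrlock);
}

static void model_resume(model_dev_t *dev)
{
    model_mutex_enter(&dev->intrlock);  /* take the lock first ... */
    model_init_board(dev);              /* ... then reinitialize the board */
    dev->suspended = false;             /* the step that was forgotten */
    model_mutex_exit(&dev->intrlock);
}

/* Returns true if the packet was transmitted, false if it was dropped. */
static bool model_send(model_dev_t *dev)
{
    bool ok;

    model_mutex_enter(&dev->intrlock);
    if (dev->suspended) {
        dev->dropped++;                 /* drop on the floor, do not defer */
        ok = false;
    } else {
        dev->sent++;
        ok = true;
    }
    model_mutex_exit(&dev->intrlock);
    return ok;
}
```

Calling model_init_board() before model_mutex_enter() in model_resume() trips the ownership assertion, which is the same shape as the first panic in the thread.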
Minor nit: in dnetattach() you've added a new local variable "boolean_t wantsched;", but it's not used anywhere, so it should be removed.

> Okay, I've debugged it and fixed the issues.
>
> The new webrev is at http://cr.opensolaris.org/~gdamore/dnet-suspend/
>
> I would definitely appreciate code review feedback, as I'd like to file
> an RTI for build 80.
>
> This webrev also includes a fix for an MII link problem I found during
> debug, as well as AMD64 support, and a little housekeeping stuff
> (cstyle, remove some dead macros and structure members, etc.)
>
> Using uadmin 3 25 35 (where 35 is the major number of dnet on my
> system, as found in /etc/name_to_major), dnet suspends and resumes
> properly. I still haven't got a full S3 suspend going, but dnet is no
> longer the problem.
>
> One thing I've noticed is that on the particular card I have, dnet needs
> a few seconds to get its brains in order. I don't understand what is
> going on during this time, but don't be surprised if it takes up to 10
> seconds for packets to resume flowing. The dnet driver is *really*
> crufty, and desperately needs some love. I am not sure how much I want
> to invest in it. (I am considering a GLDv3 port, but its not very high
> priority right now.)
>
> In case you were curious, the problems I found were:
>
> *) I forgot to set dnet->suspended = B_FALSE in DDI_RESUME!
>
> *) There was a mutex problem surrounding dnet_set_addr(). (I've
> restructured this part of the code a bit more to make more sense.)
>
> *) Using GLD_NO_RESOURCES to return from dnet_send() was probably a
> mistake, and was also probably the cause of the problems (hangs, panics)
> in IP, when combined with the dnet->suspended problem above.
> (Basically, certain DLPI control messages were getting stuck behind
> these "deferred" ethernet packets.) I changed the code to just drop the
> packets on the floor, which is more consistent with what I've done in
> other drivers.
>
> Anyway, feel free to give it a whirl.
> I can supply binaries on demand as well.
>
> Thanks!
>
> -- Garrett
>
>
> Garrett D'Amore wrote:
> > FYI, I've reproduced the problem, and have begun debugging it. (I'm
> > also doing this in a 64-bit kernel, as I added support for 64-bit to
> > dnet. :-)
> >
> > I should have something for you in the next day or two.
> >
> > -- Garrett
> >
> > Juergen Keil wrote:
> >> Btw. "ifconfig -a" or "ifconfig dnet" hangs, and when I interrupt
> >> the hanging ifconfig with Ctrl-C, the machine panics.
> >>
> >>> You're brave. :-) I hadn't even tested the code myself yet.
> >>>
> >>> I'll have a look at this later today.
> >>>
> >>> -- Garrett
> >>>
> >>> Juergen Keil wrote:
> >>>
> >>>> Garrett wrote:
> >>>>
> >>>>> For those who've wanted it, I posted the webrev from the dnet
> >>>>> conversion that I did at OpenSolaris Developer Summit 2007 at
> >>>>> http://cr.opensolaris.org/~gdamore/dnet-suspend/
> >>>>>
> >>>> Just tried that dnet patch on my ASUS P2B-LS "S3 suspend-to-ram"
> >>>> test box, but it's crashing the box with a failed assertion panic
> >>>> during driver resume:
> >>>>
> >>>> assertion failed: mutex_owned((&dnetp->intrlock)), in file dnet.c
> >>>> line 2308
> >>>>
> >>>> stack backtrace shows
> >>>>
> >>>> :panic...
> >>>> dnet:set_sia+12c(...)
> >>>> dnet:dnet_init_board+2a(...)
> >>>> dnet:dnetattach+5bd(..., 1)
> >>>> ...
> >>>> (messages manually copied, disk device is suspended, so no crash
> >>>> dump :-)
> >>>>
> >>>>
> >>>> I think the "dnet_init_board(macinfo);" call in the dnetattach()
> >>>> DDI_RESUME case should be moved after the mutex_enter() calls.
> >>>>
> >>>>
> >>>> With that change, suspend / resume does not panic any more.
> >>>>
> >>>> But the dnet device seems to be dead after a resume.
> >>>> And after maybe a minute, I got another panic when I tried an
> >>>> "ifconfig -a" after the resume:
> >>>>
> >>>>> ::status
> >>>> debugging crash dump vmcore.3 (32-bit) from elise
> >>>> operating system: 5.11 wos_b78_debug (i86pc)
> >>>> panic message: assertion failed: connp->conn_oper_pending_ill == 0,
> >>>> file: ../../common/inet/ip/ipclassifier.c, line: 2219
> >>>> dump content: kernel pages only
> >>>>
> >>>>> $C
> >>>> ccdd6c58 vpanic(feae72dc, fa2a1fc4, fa2a2b18, 8ab)
> >>>> ccdd6c78 assfail+0x5a(fa2a1fc4, fa2a2b18, 8ab)
> >>>> ccdd6c98 ipcl_conn_cleanup+0x209(cde9f540)
> >>>> ccdd6cc8 ipcl_conn_destroy+0x151()
> >>>> ccdd6ce0 udp_close+0xb8(cdbd3ca8, 3, cd552690)
> >>>> ccdd6d08 qdetach+0x9b(cdbd3ca8, 1, 3, cd552690, 0)
> >>>> ccdd6d50 strclose+0x392(cde09580, 3, cd552690)
> >>>> ccdd6d80 socktpi_close+0x1cc(cde09580, 3, 1, 0, 0, cd552690)
> >>>> ccdd6dcc fop_close+0x51(cde09580, 3, 1, 0, 0, cd552690)
> >>>> ccdd6e18 closef+0x88(cd3bd700)
> >>>> ccdd6e48 closeall+0x58(cb6bf9a4)
> >>>> ccdd6e90 proc_exit+0x40a(2, 2)
> >>>> ccdd6ea4 exit+0x11(2, 2)
> >>>> ccdd6efc psig+0x4c3()
> >>>> ccdd6f7c post_syscall+0x3f6(4, cc6f3880)
> >>>> ccdd6f90 syscall_exit+0x48(cc6f3880, 4, cc6f3880)
> >>>> ccdd6fa4 sys_call+0x21a()
> >>>>
> >>>>> ::msgbuf
> >>>> ...
> >>>> System is being suspended
> >>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED], ohci:0
> >>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],1, ohci:1
> >>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],2, ohci:2
> >>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],3, ehci:0
> >>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],3, ehci:0
> >>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],2, ohci:2
> >>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],1, ohci:1
> >>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED], ohci:0
> >>>> The system is back where you left!
> >>>> System has been resumed.
> >>>>
> >>>> panic[cpu0]/thread=cc6f3880: assertion failed:
> >>>> connp->conn_oper_pending_ill == 0, file:
> >>>> ../../common/inet/ip/ipclassifier.c, line: 2219
> >>>>
> >>>> ccdd6c78 genunix:assfail+5a (fa2a1fc4, fa2a2b18,)
> >>>> ccdd6c98 ip:ipcl_conn_cleanup+209 (cde9f540)
> >>>> ccdd6cc8 ip:ipcl_conn_destroy+151 (cde9f540, cd5cf27c,)
> >>>> ccdd6ce0 ip:udp_close+b8 (cdbd3ca8, 3, cd5526)
> >>>> ccdd6d08 genunix:qdetach+9b (cdbd3ca8, 1, 3, cd5)
> >>>> ccdd6d50 genunix:strclose+392 (cde09580, 3, cd5526)
> >>>> ccdd6d80 sockfs:socktpi_close+1cc (cde09580, 3, 1, 0, )
> >>>> ccdd6dcc genunix:fop_close+51 (cde09580, 3, 1, 0, )
> >>>> ccdd6e18 genunix:closef+88 (cd3bd700)
> >>>> ccdd6e48 genunix:closeall+58 (cb6bf9a4)
> >>>> ccdd6e90 genunix:proc_exit+40a (2, 2)
> >>>> ccdd6ea4 genunix:exit+11 (2, 2)
> >>>> ccdd6efc genunix:psig+4c3 (ccdd6f90, 8047a54, )
> >>>> ccdd6f7c genunix:post_syscall+3f6 (4, cc6f3880)
> >>>> ccdd6f90 genunix:syscall_exit+48 (cc6f3880, 4, cc6f38)
> >>>>
> >>>> syncing file systems...
> >>>> done
> >>>> dumping to /dev/dsk/c0d0s1, offset 83951616, content: kernel
> >>>>
> >>>>> ::cpuinfo -v
> >>>>  ID ADDR     FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD   PROC
> >>>>   0 fec24b98  1b    1    0  59   no    no    t-0 cc6f3880 ifconfig
> >>>>                   |    |
> >>>>        RUNNING <--+    +-->  PRI THREAD   PROC
> >>>>          READY                60 ca94cde0 sched
> >>>>         EXISTS
> >>>>         ENABLE

_______________________________________________
driver-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/driver-discuss
