Okay, I've debugged it and fixed the issues. The new webrev is at http://cr.opensolaris.org/~gdamore/dnet-suspend/
I would definitely appreciate code review feedback, as I'd like to file an RTI for build 80.

This webrev also includes a fix for an MII link problem I found while debugging, as well as AMD64 support and a little housekeeping (cstyle, removal of some dead macros and structure members, etc.).

Using uadmin 3 25 35 (where 35 is the major number of dnet on my system, as found in /etc/name_to_major), dnet suspends and resumes properly. I still haven't got a full S3 suspend going, but dnet is no longer the problem.

One thing I've noticed is that on the particular card I have, dnet needs a few seconds to get its brains in order. I don't understand what is going on during this time, but don't be surprised if it takes up to 10 seconds for packets to resume flowing.

The dnet driver is *really* crufty, and desperately needs some love. I am not sure how much I want to invest in it. (I am considering a GLDv3 port, but it's not very high priority right now.)

In case you were curious, the problems I found were:

*) I forgot to set dnet->suspended = B_FALSE in DDI_RESUME!

*) There was a mutex problem surrounding dnet_set_addr(). (I've restructured this part of the code a bit more so that it makes more sense.)

*) Using GLD_NO_RESOURCES to return from dnet_send() was probably a mistake, and was also probably the cause of the problems (hangs, panics) in IP, when combined with the dnet->suspended problem above. (Basically, certain DLPI control messages were getting stuck behind these "deferred" ethernet packets.) I changed the code to just drop the packets on the floor, which is more consistent with what I've done in other drivers.

Anyway, feel free to give it a whirl. I can supply binaries on demand as well.

Thanks!

-- Garrett

Garrett D'Amore wrote:
> FYI, I've reproduced the problem, and have begun debugging it. (I'm
> also doing this in a 64-bit kernel, as I added support for 64-bit to
> dnet. :-)
>
> I should have something for you in the next day or two.
>
> -- Garrett
>
> Juergen Keil wrote:
>> Btw. "ifconfig -a" or "ifconfig dnet" hangs, and when I interrupt
>> the hanging ifconfig with Ctrl-C, the machine panics.
>>
>>> You're brave. :-) I hadn't even tested the code myself yet.
>>>
>>> I'll have a look at this later today.
>>>
>>> -- Garrett
>>>
>>> Juergen Keil wrote:
>>>
>>>> Garrett wrote:
>>>>
>>>>> For those who've wanted it, I posted the webrev from the dnet
>>>>> conversion that I did at OpenSolaris Developer Summit 2007 at
>>>>> http://cr.opensolaris.org/~gdamore/dnet-suspend/
>>>>>
>>>> Just tried that dnet patch on my ASUS P2B-LS "S3 suspend-to-ram"
>>>> test box, but it's crashing the box with a failed assertion panic
>>>> during driver resume:
>>>>
>>>> assertion failed: mutex_owned((&dnetp->intrlock)), in file dnet.c line 2308
>>>>
>>>> stack backtrace shows
>>>>
>>>> :panic...
>>>> dnet:set_sia+12c(...)
>>>> dnet:dnet_init_board+2a(...)
>>>> dnet:dnetattach+5bd(..., 1)
>>>> ...
>>>> (messages manually copied, disk device is suspended, so no crash
>>>> dump :-)
>>>>
>>>>
>>>> I think the "dnet_init_board(macinfo);" call in the dnetattach()
>>>> DDI_RESUME case should be moved after the mutex_enter() calls.
>>>>
>>>>
>>>> With that change, suspend / resume does not panic any more.
>>>>
>>>> But the dnet device seems to be dead after a resume.
>>>> And after maybe a minute, I got another panic when I tried an
>>>> "ifconfig -a" after the resume:
>>>>
>>>> > ::status
>>>> debugging crash dump vmcore.3 (32-bit) from elise
>>>> operating system: 5.11 wos_b78_debug (i86pc)
>>>> panic message: assertion failed: connp->conn_oper_pending_ill == 0, file: ../../common/inet/ip/ipclassifier.c, line: 2219
>>>> dump content: kernel pages only
>>>>
>>>> > $C
>>>> ccdd6c58 vpanic(feae72dc, fa2a1fc4, fa2a2b18, 8ab)
>>>> ccdd6c78 assfail+0x5a(fa2a1fc4, fa2a2b18, 8ab)
>>>> ccdd6c98 ipcl_conn_cleanup+0x209(cde9f540)
>>>> ccdd6cc8 ipcl_conn_destroy+0x151()
>>>> ccdd6ce0 udp_close+0xb8(cdbd3ca8, 3, cd552690)
>>>> ccdd6d08 qdetach+0x9b(cdbd3ca8, 1, 3, cd552690, 0)
>>>> ccdd6d50 strclose+0x392(cde09580, 3, cd552690)
>>>> ccdd6d80 socktpi_close+0x1cc(cde09580, 3, 1, 0, 0, cd552690)
>>>> ccdd6dcc fop_close+0x51(cde09580, 3, 1, 0, 0, cd552690)
>>>> ccdd6e18 closef+0x88(cd3bd700)
>>>> ccdd6e48 closeall+0x58(cb6bf9a4)
>>>> ccdd6e90 proc_exit+0x40a(2, 2)
>>>> ccdd6ea4 exit+0x11(2, 2)
>>>> ccdd6efc psig+0x4c3()
>>>> ccdd6f7c post_syscall+0x3f6(4, cc6f3880)
>>>> ccdd6f90 syscall_exit+0x48(cc6f3880, 4, cc6f3880)
>>>> ccdd6fa4 sys_call+0x21a()
>>>>
>>>> > ::msgbuf
>>>> ...
>>>> System is being suspended
>>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED], ohci:0
>>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],1, ohci:1
>>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],2, ohci:2
>>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],3, ehci:0
>>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],3, ehci:0
>>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],2, ohci:2
>>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED],1, ohci:1
>>>> NOTICE: acpica_ddi_setwake: could not get handle for /[EMAIL PROTECTED],0/pci10b9,[EMAIL PROTECTED], ohci:0
>>>> The system is back where you left!
>>>> System has been resumed.
>>>>
>>>> panic[cpu0]/thread=cc6f3880: assertion failed: connp->conn_oper_pending_ill == 0, file: ../../common/inet/ip/ipclassifier.c, line: 2219
>>>>
>>>>
>>>> ccdd6c78 genunix:assfail+5a (fa2a1fc4, fa2a2b18,)
>>>> ccdd6c98 ip:ipcl_conn_cleanup+209 (cde9f540)
>>>> ccdd6cc8 ip:ipcl_conn_destroy+151 (cde9f540, cd5cf27c,)
>>>> ccdd6ce0 ip:udp_close+b8 (cdbd3ca8, 3, cd5526)
>>>> ccdd6d08 genunix:qdetach+9b (cdbd3ca8, 1, 3, cd5)
>>>> ccdd6d50 genunix:strclose+392 (cde09580, 3, cd5526)
>>>> ccdd6d80 sockfs:socktpi_close+1cc (cde09580, 3, 1, 0, )
>>>> ccdd6dcc genunix:fop_close+51 (cde09580, 3, 1, 0, )
>>>> ccdd6e18 genunix:closef+88 (cd3bd700)
>>>> ccdd6e48 genunix:closeall+58 (cb6bf9a4)
>>>> ccdd6e90 genunix:proc_exit+40a (2, 2)
>>>> ccdd6ea4 genunix:exit+11 (2, 2)
>>>> ccdd6efc genunix:psig+4c3 (ccdd6f90, 8047a54, )
>>>> ccdd6f7c genunix:post_syscall+3f6 (4, cc6f3880)
>>>> ccdd6f90 genunix:syscall_exit+48 (cc6f3880, 4, cc6f38)
>>>>
>>>> syncing file systems...
>>>> done
>>>> dumping to /dev/dsk/c0d0s1, offset 83951616, content: kernel
>>>>
>>>> > ::cpuinfo -v
>>>>  ID ADDR     FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD   PROC
>>>>   0 fec24b98  1b    1    0  59   no    no t-0    cc6f3880 ifconfig
>>>>                   |    |
>>>>        RUNNING <--+    +-->  PRI THREAD   PROC
>>>>          READY                60 ca94cde0 sched
>>>>         EXISTS
>>>>         ENABLE
>>>>
>>
>> Juergen Keil [EMAIL PROTECTED]
>> Tools GmbH +49 (228) 9858011
>> Vorgebirgsstraße 37-39 http://www.tools.de
>> 53119 BONN
>> Sitz- und Registergericht HRB Bonn 4026
>> Geschäftsführung Wolfgang Franke & Wolfgang Solfrank
>>
>
_______________________________________________
driver-discuss mailing list
driver-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/driver-discuss
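
P.S. For anyone following the thread without the webrev open, here is a rough user-space sketch of the suspend/resume pattern discussed above. It is not the dnet code. Every name in it (struct sketch_dnet, sketch_resume(), sketch_send(), and so on) is made up for illustration, and the lock is modelled by a plain ownership flag in a single thread rather than a real kernel mutex. It only tries to show three things from the thread: the board init runs with the lock held (the mutex_owned() assertion Juergen hit), resume clears the suspended flag, and send drops packets while suspended instead of asking the caller to retry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-instance soft state. */
struct sketch_dnet {
	bool lock_held;		/* models mutex_owned(&dnetp->intrlock) */
	bool suspended;		/* set on suspend, cleared on resume */
	bool board_ready;	/* set once the chip has been programmed */
	unsigned long tx_sent;
	unsigned long tx_dropped;
};

/* Single-threaded model of mutex_enter(); not a real lock. */
void
sketch_lock(struct sketch_dnet *dp)
{
	assert(!dp->lock_held);
	dp->lock_held = true;
}

/* Single-threaded model of mutex_exit(). */
void
sketch_unlock(struct sketch_dnet *dp)
{
	assert(dp->lock_held);
	dp->lock_held = false;
}

void
sketch_init(struct sketch_dnet *dp)
{
	dp->lock_held = false;
	dp->suspended = false;
	dp->board_ready = true;
	dp->tx_sent = 0;
	dp->tx_dropped = 0;
}

/* Programs the (imaginary) chip; the caller must hold the lock. */
void
sketch_init_board(struct sketch_dnet *dp)
{
	assert(dp->lock_held);
	dp->board_ready = true;
}

void
sketch_suspend(struct sketch_dnet *dp)
{
	sketch_lock(dp);
	dp->suspended = true;
	dp->board_ready = false;
	sketch_unlock(dp);
}

void
sketch_resume(struct sketch_dnet *dp)
{
	sketch_lock(dp);
	sketch_init_board(dp);		/* after taking the lock, not before */
	dp->suspended = false;		/* the step that is easy to forget */
	sketch_unlock(dp);
}

/*
 * Returns true if the packet was handed to the hardware, false if it
 * was dropped. Either way the caller is finished with the packet;
 * nothing is queued for a later retry.
 */
bool
sketch_send(struct sketch_dnet *dp, const void *pkt, size_t len)
{
	bool sent;

	(void) pkt;
	(void) len;

	sketch_lock(dp);
	if (dp->suspended) {
		dp->tx_dropped++;
		sent = false;
	} else {
		dp->tx_sent++;
		sent = true;
	}
	sketch_unlock(dp);
	return (sent);
}
```

In this model, a send between sketch_suspend() and sketch_resume() is counted in tx_dropped and returns false, and sends succeed again after sketch_resume(). Moving the sketch_init_board() call ahead of sketch_lock() trips the assertion, which is the user-space analogue of the first panic reported in the thread.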