buildworld errors at outset on fresh svn checkout
I'm running into a problem in updating my 10-STABLE system from source. A "make buildworld" quits immediately. I tried a fresh svn checkout for base/stable/10 and then tried to run buildworld again, but got the same error. I've been scratching my head over this for hours, but must be missing something simple.

I have ccache installed and have been using it for a fairly long time now. My /etc/src.conf contains just two lines:

PORTS_MODULES=multimedia/cuse4bsd-kmod sysutils/pefs-kmod # emulators/virtualbox-ose-kmod
WITH_LLDB=yes

My /etc/make.conf is rather longer, so I'll append it following .sig below. Here's what happens.

Script started on Thu Oct 6 23:31:47 2016
hellas# cd /usr/src
hellas# nice make buildworld
Unknown modifier '['
"/usr/src/Makefile.inc1", line 1113: Malformed conditional (${BUILDKERNELS:[)
Unknown modifier '['
"/usr/src/Makefile.inc1", line 1122: if-less endif
Unknown modifier '['
"/usr/src/Makefile.inc1", line 1144: Malformed conditional (${BUILDKERNELS:[)
Unknown modifier '['
"/usr/src/Makefile.inc1", line 1161: if-less endif
Unknown modifier '['
"/usr/src/Makefile.inc1", line 1183: Malformed conditional (${BUILDKERNELS:[)
Unknown modifier '['
"/usr/src/Makefile.inc1", line 1190: if-less endif
bmake: fatal errors encountered -- cannot continue
*** Error code 1

Stop.
make: stopped in /usr/src
hellas# exit
exit

Script done on Thu Oct 6 23:37:00 2016

This just started happening after my machine had been down for a couple of days after a hang that damaged stuff in /usr/home. I had already restored /usr/local from backups before narrowing down the weird behavior I was seeing in wmaker to /usr/home corruption. So /usr/home has now been restored to good condition, too, but perhaps I need to restore something else as well. This mess was part of my justification to myself for the fresh checkout of /usr/src, but that doesn't seem to have made any difference in the buildworld failure.

If anyone else can see what's wrong and clue me in, I'd be grateful.
I'm subscribed to the digest for this list, so please Cc: me directly, so I'll get replies right away. Thanks in advance!

Scott Bennett, Comm. ASMELG, CFIAG
**
* Internet: bennett at sdf.org *xor* bennett at freeshell.org *
**
* "A well regulated and disciplined militia, is at all times a good *
* objection to the introduction of that bane of all free governments *
* -- a standing army." *
* -- Gov. John Hancock, New York Journal, 28 January 1790 *
**

/etc/make.conf contains:

CPUTYPE?=core2
CFLAGS+="-mtune=core2"
SVNFLAGS?="-r RELENG_10"
# build ports with clang stack protector
WITH_SSP=yes
SSP_CFLAGS=-fstack-protector-all
# added for ports system use to avoid dialogs by SJB 4 May 2007
BATCH=YES
# added for new pkg system --SJB 10 December 2014
WITH_PKGNG=yes
# build ports using ccache --SJB 19 January 2015
WITH_CCACHE_BUILD=yes
## buildworld and buildkernel using ccache --SJB 26 January 2015
.if (!empty(.CURDIR:M/usr/src*) || !empty(.CURDIR:M/usr/obj*))
.if !defined(NOCCACHE) && exists(/usr/local/libexec/ccache/world/cc)
CC:=${CC:C,^cc,/usr/local/libexec/ccache/world/cc,1}
CXX:=${CXX:C,^c\+\+,/usr/local/libexec/ccache/world/c++,1}
CCACHE_COMPILERCHECK=content
CCACHE_DIR=/buildwork/ccache.freebsd
.endif
.else
CFLAGS+="-mssse3"
#CFLAGS+="-mssse3 -msse4.1"
.endif
# added to deal with ccache bug 8460 --SJB 2 November 2013
# bug has been reported fixed, so try without this workaround
#CCACHE_CPP2=1
# added as a better specification of -j by SJB 17 November 2009
MAKE_JOBS_NUMBER=4
# put build tree where there is plenty of temporary workspace
WRKDIRPREFIX=/buildwork/ports
DEFAULT_VERSIONS+= ssl=openssl
# Allow updating of Mesa3D from 7.4.4 to 7.6.1 and libdrm from 2.4.12 to 2.4.17
WITHOUT_NOUVEAU=yes
# Use ATLAS libraries in ports that use BLAS libraries
OPTIONS_SET=ATLAS
# Tell gnustep-related ports to use base system's compiler
GNUSTEP_WITH_BASE_GCC=yes
GNUSTEP_WITHOUT_LIBOBJC=yes
QT4_OPTIONS= CUPS NAS QGTKSTYLE
# Begin portconf settings
# Do not touch these lines
.if !empty(.CURDIR:M/usr/ports*) && exists(/usr/local/libexec/portconf)
_PORTCONF!=/usr/local/libexec/portconf
.if ${_PORTCONF} != "|"
.for i in ${_PORTCONF:S/^|//:S/|/ /g}
${i:C/^([^=]*)=.*/\1/}=${i:C/^[^=]*=//:S/%/ /g}
.endfor
.endif
.endif
# End portconf settings
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Reproducible panic - Going nowhere without my init!
Let me preface this by saying that I know nothing about this particular bit of code, but...

As a general rule, I would question the use of gettimeofday() while panicking. At that stage, everything could have already gone down the plug hole.

That said, it already calls sleep(), so maybe that uses the same gettimeofday() call internally. In which case, please ignore this comment.

Graham

On 7/10/2016 9:32 AM, Andy Farkas wrote:
> With your latest patch applied, I ran through my procedure more than a
> dozen times and no panics!
>
> Any explanation why sleep(STALL_TIMEOUT) as opposed to a bunch of
> sleep(1)'s tickles the panic?
>
> Also, it is definitely not sleeping for 30 seconds. I guess some event
> interrupts the sleep loop?
>
> Thanks heaps for your time and effort,
>
> -andyf
>
> %%%
>
> Please try the following patch.
>
> diff --git a/sbin/init/init.c b/sbin/init/init.c
> index bda86b5..25ac2bd 100644
> --- a/sbin/init/init.c
> +++ b/sbin/init/init.c
> @@ -870,6 +870,7 @@ single_user(void)
>      sigset_t mask;
>      const char *shell;
>      char *argv[2];
> +    struct timeval tv, tn;
>  #ifdef SECURE
>      struct ttyent *typ;
>      struct passwd *pp;
> @@ -884,8 +885,13 @@ single_user(void)
>      if (Reboot) {
>          /* Instead of going single user, let's reboot the machine */
>          sync();
> -        reboot(howto);
> -        _exit(0);
> +        if (reboot(howto) == -1) {
> +            emergency("reboot(%#x) failed, %s", howto,
> +                strerror(errno));
> +            _exit(1); /* panic and reboot */
> +        }
> +        warning("reboot(%#x) returned", howto);
> +        _exit(0); /* panic as well */
>      }
>
>      shell = get_shell();
> @@ -1002,7 +1008,14 @@ single_user(void)
>               * reboot(8) killed shell?
>               */
>              warning("single user shell terminated.");
> -            sleep(STALL_TIMEOUT);
> +            gettimeofday(&tv, NULL);
> +            tn = tv;
> +            tv.tv_sec += STALL_TIMEOUT;
> +            while (tv.tv_sec > tn.tv_sec || (tv.tv_sec ==
> +                tn.tv_sec && tv.tv_usec > tn.tv_usec)) {
> +                sleep(1);
> +                gettimeofday(&tn, NULL);
> +            }
>              _exit(0);
>          } else {
>              warning("single user shell terminated, restarting");
Re: Reproducible panic - Going nowhere without my init!
With your latest patch applied, I ran through my procedure more than a dozen times and no panics!

Any explanation why sleep(STALL_TIMEOUT) as opposed to a bunch of sleep(1)'s tickles the panic?

Also, it is definitely not sleeping for 30 seconds. I guess some event interrupts the sleep loop?

Thanks heaps for your time and effort,

-andyf

%%%

Please try the following patch.

diff --git a/sbin/init/init.c b/sbin/init/init.c
index bda86b5..25ac2bd 100644
--- a/sbin/init/init.c
+++ b/sbin/init/init.c
@@ -870,6 +870,7 @@ single_user(void)
     sigset_t mask;
     const char *shell;
     char *argv[2];
+    struct timeval tv, tn;
 #ifdef SECURE
     struct ttyent *typ;
     struct passwd *pp;
@@ -884,8 +885,13 @@ single_user(void)
     if (Reboot) {
         /* Instead of going single user, let's reboot the machine */
         sync();
-        reboot(howto);
-        _exit(0);
+        if (reboot(howto) == -1) {
+            emergency("reboot(%#x) failed, %s", howto,
+                strerror(errno));
+            _exit(1); /* panic and reboot */
+        }
+        warning("reboot(%#x) returned", howto);
+        _exit(0); /* panic as well */
     }

     shell = get_shell();
@@ -1002,7 +1008,14 @@ single_user(void)
              * reboot(8) killed shell?
              */
             warning("single user shell terminated.");
-            sleep(STALL_TIMEOUT);
+            gettimeofday(&tv, NULL);
+            tn = tv;
+            tv.tv_sec += STALL_TIMEOUT;
+            while (tv.tv_sec > tn.tv_sec || (tv.tv_sec ==
+                tn.tv_sec && tv.tv_usec > tn.tv_usec)) {
+                sleep(1);
+                gettimeofday(&tn, NULL);
+            }
             _exit(0);
         } else {
             warning("single user shell terminated, restarting");
11.0-RELEASE Status Update
As many of you are aware, 11.0-RELEASE needed to be rebuilt to address several issues that were discovered after the release was built.

Extra caution is being taken in testing the rebuilt releases, so at present, the final release announcement is planned for Monday, October 10.

Thank you for your patience in waiting for 11.0-RELEASE.

Glen
On behalf of: re@
Re: Reproducible panic - Going nowhere without my init!
On Thu, Oct 06, 2016 at 06:31:59PM +1000, Andy Farkas wrote:
> Reverted your patch then changed line 1011 of init.c to _exit(97):
>
> --- init.c-orig 2016-10-05 18:52:24.02291 +1000
> +++ init.c 2016-10-06 17:02:33.714624000 +1000
> @@ -1008,7 +1008,7 @@
>               */
>              warning("single user shell terminated.");
>              sleep(STALL_TIMEOUT);
> -            _exit(0);
> +            _exit(97);
>          } else {
>              warning("single user shell terminated, restarting");
>              return (state_func_t) single_user;
>
> ...and got a panic that showed "exit 97": http://imgur.com/xonPwxR
>
> I think that kern_reboot() is not being called somehow.
> kern_reboot() is the only place rebooting = 1; is executed.
>
> "init died (signal 0, exit 97)
> panic: Going nowhere without my init!"
>
> can only happen if rebooting = 0 in kern_exit.c exit1().
>
> Another tell that kern_reboot() has not been called is "cpuid = 3"
> because the first thing kern_reboot() does is bind to CPU 0.
>
> Why is kern_reboot() being skipped? I have no idea.
>
> Anything more I can do to help? Do you want a core dump?
>

Please try the following patch.

diff --git a/sbin/init/init.c b/sbin/init/init.c
index bda86b5..25ac2bd 100644
--- a/sbin/init/init.c
+++ b/sbin/init/init.c
@@ -870,6 +870,7 @@ single_user(void)
     sigset_t mask;
     const char *shell;
     char *argv[2];
+    struct timeval tv, tn;
 #ifdef SECURE
     struct ttyent *typ;
     struct passwd *pp;
@@ -884,8 +885,13 @@ single_user(void)
     if (Reboot) {
         /* Instead of going single user, let's reboot the machine */
         sync();
-        reboot(howto);
-        _exit(0);
+        if (reboot(howto) == -1) {
+            emergency("reboot(%#x) failed, %s", howto,
+                strerror(errno));
+            _exit(1); /* panic and reboot */
+        }
+        warning("reboot(%#x) returned", howto);
+        _exit(0); /* panic as well */
     }

     shell = get_shell();
@@ -1002,7 +1008,14 @@ single_user(void)
              * reboot(8) killed shell?
              */
             warning("single user shell terminated.");
-            sleep(STALL_TIMEOUT);
+            gettimeofday(&tv, NULL);
+            tn = tv;
+            tv.tv_sec += STALL_TIMEOUT;
+            while (tv.tv_sec > tn.tv_sec || (tv.tv_sec ==
+                tn.tv_sec && tv.tv_usec > tn.tv_usec)) {
+                sleep(1);
+                gettimeofday(&tn, NULL);
+            }
             _exit(0);
         } else {
             warning("single user shell terminated, restarting");
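The last hunk of the patch replaces one long sleep with a loop that naps one second at a time and re-reads the clock until a deadline has passed. Pulled out of init.c, the same idea can be sketched as a standalone helper. The names `tv_after` and `stall_for` are invented here, and the `seconds` parameter stands in for STALL_TIMEOUT; this is a reading of the hunk under those assumptions, not code from the tree.

```c
#include <stddef.h>
#include <sys/time.h>
#include <unistd.h>

/* Nonzero when timestamp a is strictly later than timestamp b. */
static int
tv_after(const struct timeval *a, const struct timeval *b)
{
	return (a->tv_sec > b->tv_sec ||
	    (a->tv_sec == b->tv_sec && a->tv_usec > b->tv_usec));
}

/*
 * Stall until at least `seconds` have passed according to
 * gettimeofday(), napping one second at a time.  A nap that is cut
 * short only costs an extra trip around the loop, because the exit
 * condition is the clock and not the number of naps.
 * Returns the number of sleep(1) calls made.
 */
static unsigned int
stall_for(unsigned int seconds)
{
	struct timeval deadline, now;
	unsigned int naps = 0;

	gettimeofday(&deadline, NULL);
	now = deadline;
	deadline.tv_sec += seconds;
	while (tv_after(&deadline, &now)) {
		sleep(1);
		naps++;
		gettimeofday(&now, NULL);
	}
	return (naps);
}
```

Because gettimeofday() reports wall-clock time, a clock step during the wait would lengthen or shorten the stall; the sketch keeps the patch's choice of clock rather than swapping in a monotonic one.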
Re: iicsmb
On Thu, Oct 6, 2016 at 5:39 AM, Mark Dixon wrote:
> If I load the module on my laptop (Lenovo Thinkpad X1 Carbon), I get:
>
> iicsmb0: on iicbus0
> iicsmb1: on iicbus1
> iicsmb2: on iicbus2
> iicsmb3: on iicbus3
> iicsmb4: on iicbus4
> iicsmb5: on iicbus5
> iicsmb6: on iicbus6
> iicsmb7: on iicbus7
> iicsmb8: on iicbus8
> iicsmb9: on iicbus9
> iicsmb10: on iicbus10
> iicsmb11: on iicbus11
> smbus1: on iicsmb0
> smbus2: on iicsmb1
> smbus3: on iicsmb2
> smbus4: on iicsmb3
> smbus5: on iicsmb4
> smbus6: on iicsmb5
> smbus7: on iicsmb6
> smbus8: on iicsmb7
> smbus9: on iicsmb8
> smbus10: on iicsmb9
> smbus11: on iicsmb10
> smbus12: on iicsmb11
>
> I have no idea what this means though.
>
> Regards,
>
> Mark

Andriy

Likewise, I have devices that appear, but I am not sure what they are. I am on

smbios.planar.maker="BIOSTAR Group"
smbios.planar.product="A68I-350 DELUXE"

iicsmb0: on iicbus0
iicsmb1: on iicbus1
iicsmb2: on iicbus2
iicsmb3: on iicbus3
iicsmb4: on iicbus4
iicsmb5: on iicbus5
iicsmb6: on iicbus6
iicsmb7: on iicbus7
smbus0: on iicsmb0
smbus1: on iicsmb1
smbus2: on iicsmb2
smbus3: on iicsmb3
smbus4: on iicsmb4
smbus5: on iicsmb5
smbus6: on iicsmb6
smbus7: on iicsmb7

root@ostrich:~ # ls -l /dev/iic*
crw--- 1 root wheel 0x6d Sep 26 16:21 /dev/iic0
crw--- 1 root wheel 0x6e Sep 26 16:21 /dev/iic1
crw--- 1 root wheel 0x6f Sep 26 16:21 /dev/iic2
crw--- 1 root wheel 0x70 Sep 26 16:21 /dev/iic3
crw--- 1 root wheel 0x71 Sep 26 16:21 /dev/iic4
crw--- 1 root wheel 0x72 Sep 26 16:21 /dev/iic5
crw--- 1 root wheel 0x73 Sep 26 16:21 /dev/iic6
crw--- 1 root wheel 0x77 Sep 26 16:21 /dev/iic7

Probing them with smbmsg doesn't return any data.

--
mark saad | nones...@longcount.org
Re: 11.0 stuck on high network load
Hi,

On 9/28/16 1:59 PM, Slawa Olhovchenkov wrote:
> On Wed, Sep 28, 2016 at 12:06:47PM +0200, Julien Charbon wrote:
>>
>> I am still trying to reproduce your issue, without success so far.

Thanks to Slawa's efforts and multiple debug reports, we are starting to see the bottom of this issue, and it seems to be a generic one. The most useful report being:

panic: tcp_detach: INP_TIMEWAIT && INP_DROPPED && tp != NULL
cpuid = 4
KDB: stack backtrace:
db_trace_self_wrapper() at 0x8032467b = db_trace_self_wrapper+0x2b/frame 0xfe1f9e1f8730
vpanic() at 0x804b5672 = vpanic+0x182/frame 0xfe1f9e1f87b0
kassert_panic() at 0x804b54e6 = kassert_panic+0x126/frame 0xfe1f9e1f8820
tcp_usr_detach() at 0x806564dc = tcp_usr_detach+0x1bc/frame 0xfe1f9e1f8850
sofree() at 0x8053de66 = sofree+0x1a6/frame 0xfe1f9e1f8880
tcp_close() at 0x8064dd8e = tcp_close+0x11e/frame 0xfe1f9e1f88b0
tcp_timer_2msl() at 0x80653c28 = tcp_timer_2msl+0x278/frame 0xfe1f9e1f88e0
softclock_call_cc() at 0x804cbacc = softclock_call_cc+0x19c/frame 0xfe1f9e1f89c0
softclock() at 0x804cbec7 = softclock+0x47/frame 0xfe1f9e1f89e0
intr_event_execute_handlers() at 0x8047aa86 = intr_event_execute_handlers+0x96/frame 0xfe1f9e1f8a20
ithread_loop() at 0x8047b106 = ithread_loop+0xa6/frame 0xfe1f9e1f8a70
fork_exit() at 0x804781b4 = fork_exit+0x84/frame 0xfe1f9e1f8ab0
fork_trampoline() at 0x80713fce = fork_trampoline+0xe/frame 0xfe1f9e1f8ab0

The scenario:

1. thread1: tcp_timer_2msl() expires and tcp_close() is called to clean up this TCP connection.

2. thread1: In tcp_close() the inp is marked with the INP_DROPPED flag, the process continues and calls INP_WUNLOCK() here:

https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568

3. thread2: Now that INP_WLOCK is released, the inp can transition to the INP_TIMEWAIT state, and nothing prevents it.

4. thread2: During the INP_TIMEWAIT state transition, the inp is marked with the INP_TIMEWAIT flag.

5. thread1: Back in business, the tcp_close() call continues with sofree() -> tcp_usr_detach() -> tcp_detach(). Then, as the inp is marked with the INP_DROPPED|INP_TIMEWAIT flags, in_pcbfree() is called. With INVARIANTS you hit an assertion here; without INVARIANTS the process continues.

6. Later: tcp_twclose() cleans up this INP_TIMEWAIT inp and calls in_pcbfree() again, achieving a fancy inp double-free.

This issue is a tricky one and seems to have been here for quite a while. It has been witnessed at least once in 10.1 and by two different people in 11.0.

Astute questions:

o Why is the INP_DROPPED flag not tested in tcp_input() in the first place? When you are marked as INP_DROPPED, you are almost dead; you should not be allowed to transition to a different state!

Good point, and tcp_input() relies on the fact that INP_DROPPED inps are no longer in the TCP hash table. But tcp_input() in some cases does relock the inp (see the relocked: label), and while it checks a lot of things after having relocked the inp, it does not check for a recently added INP_DROPPED flag.

o Why does tcp_detach() do an unconditional in_pcbfree() for inps in TIMEWAIT state?

This is because inps in TIMEWAIT state have only one exit: being freed. And it is the duty of tcp_detach() to free all inps with INP_DROPPED|INP_TIMEWAIT.

o Why is this issue so rare?

Good question. I can see how specific TCP traffic could make it more frequent, but I have no definitive answer yet.

Fix proposal:

This issue description is still a bit fresh, but I would enforce that an inp with the INP_DROPPED flag should not be allowed to change state.

Thing learned:

When re-locking an inp, it might have changed a lot, and you might not like what it became.

Thanks again to Slawa for his numerous debug reports and for always questioning my explanations. His last question directly led to this finding. He is testing a quick workaround patch to check if there is more.

I will create a review with a fix proposal, and don't hesitate if you have other comments on this issue.

--
Julien
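The six-step scenario and the proposed rule ("a dropped inp must not change state") can be replayed in a toy user-space model. Everything below is hypothetical: the `MODEL_*` flags, `struct model_inp` and the helper functions are stand-ins invented for the illustration, locking is reduced to a comment, and the model assumes that detach frees any dropped inp. It is not kernel code and not the fix under review.

```c
#include <stdbool.h>

/* Stand-ins for the flags named in the thread, not the kernel's values. */
enum {
	MODEL_DROPPED  = 1u << 0,
	MODEL_TIMEWAIT = 1u << 1
};

struct model_inp {
	unsigned int flags;
	int free_calls;		/* how often the model "freed" this inp */
};

/* Steps 3-4 as described: the transition happens unconditionally. */
static void
enter_timewait_unguarded(struct model_inp *inp)
{
	inp->flags |= MODEL_TIMEWAIT;
}

/* Proposed rule: a dropped inp may not change state any more. */
static bool
enter_timewait_guarded(struct model_inp *inp)
{
	if (inp->flags & MODEL_DROPPED)
		return (false);
	inp->flags |= MODEL_TIMEWAIT;
	return (true);
}

/* Step 5: detach frees a dropped inp (assumption of this model). */
static void
model_detach(struct model_inp *inp)
{
	if (inp->flags & MODEL_DROPPED)
		inp->free_calls++;
}

/* Step 6: the time-wait cleanup frees whatever is in TIMEWAIT. */
static void
model_twclose(struct model_inp *inp)
{
	if (inp->flags & MODEL_TIMEWAIT)
		inp->free_calls++;
}

/* Replay steps 2-6 in order and report how many frees happened. */
static int
replay(bool guarded)
{
	struct model_inp inp = { 0, 0 };

	inp.flags |= MODEL_DROPPED;	/* step 2, then the lock is dropped */
	if (guarded)
		(void)enter_timewait_guarded(&inp);
	else
		enter_timewait_unguarded(&inp);
	model_detach(&inp);
	model_twclose(&inp);
	return (inp.free_calls);
}
```

In this model `replay(false)` counts two frees, which is the double-free from step 6, while `replay(true)` counts one, because the guarded transition refuses to touch an inp that is already marked dropped.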
Re: 11.0 stuck on high network load
On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote:

> 2. thread1: In tcp_close() the inp is marked with INP_DROPPED flag, the
> process continues and calls INP_WUNLOCK() here:
>
> https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568

Look also at sys/netinet/tcp_timewait.c:488

And check other locks from r160549
Re: iicsmb
If I load the module on my laptop (Lenovo Thinkpad X1 Carbon), I get:

iicsmb0: on iicbus0
iicsmb1: on iicbus1
iicsmb2: on iicbus2
iicsmb3: on iicbus3
iicsmb4: on iicbus4
iicsmb5: on iicbus5
iicsmb6: on iicbus6
iicsmb7: on iicbus7
iicsmb8: on iicbus8
iicsmb9: on iicbus9
iicsmb10: on iicbus10
iicsmb11: on iicbus11
smbus1: on iicsmb0
smbus2: on iicsmb1
smbus3: on iicsmb2
smbus4: on iicsmb3
smbus5: on iicsmb4
smbus6: on iicsmb5
smbus7: on iicsmb6
smbus8: on iicsmb7
smbus9: on iicsmb8
smbus10: on iicsmb9
smbus11: on iicsmb10
smbus12: on iicsmb11

I have no idea what this means though.

Regards,

Mark
Re: 11.0 stuck on high network load
Hi Hiren,

On 10/6/16 9:44 AM, hiren panchasara wrote:
> On 10/06/16 at 09:28P, Julien Charbon wrote:
>> On 9/28/16 1:59 PM, Slawa Olhovchenkov wrote:
>>> On Wed, Sep 28, 2016 at 12:06:47PM +0200, Julien Charbon wrote:
>>>>
>>>> I am still trying to reproduce your issue, without success so far.
>>
>> Thanks to Slawa's efforts and multiple debug reports, we are starting
>> to see the bottom of this issue, and it seems to be a generic one.
>> The most useful report being:
>>
>> panic: tcp_detach: INP_TIMEWAIT && INP_DROPPED && tp != NULL
>
> I know there are multiple and probably related problems being
> discussed here but what about the one mentioned in the subject of this
> thread?
> Apologies if I've missed something conclusive in one of the replies of
> this thread about that issue.

This issue can lead to the machine being stuck on high network load: by double-freeing an inp, you can corrupt/leak an inp lock, and the network stack can wait indefinitely for this inp lock to be released. You get this assert only with INVARIANTS defined.

As usual, we can have more than one issue here, but this INP_TIMEWAIT|INP_DROPPED issue needs to be fixed anyway.

--
Julien
Re: Reproducible panic - Going nowhere without my init!
On 2016-Oct-04 11:14:38 +1000, Andy Farkas wrote:
>Is it just me or
>
>Step 1: boot
>Step 2: login as root
>Step 3: type "w" *
>Step 4: type "shutdown now; logout"
>Step 5: press at the 'Enter full pathname of shell or RETURN for
>/bin/sh:' prompt
>Step 6: type "reboot"
>Step 7: get a Panic: "Going nowhere without my init!"
>
>* The panic will not happen if you skip step 3.
>
>The panic will not happen if you type "sync; sync; sync" after step 5.
>
>The panic will not happen if you wait (an unknown amount of) some time
>after step 5.

I can reproduce this on the console of my GCE instance but the timing seems important. It doesn't seem to fail if I ssh in or if I pause between any of the commands.

...
gce1# w
 7:47PM up 38 secs, 1 users, load averages: 0.69, 0.22, 0.08
USER TTY FROM LOGIN@ IDLE WHAT
root u0 - 7:47PM - w
gce1# shutdown now;logout
Shutdown NOW!
shutdown: [pid 1071]
Stopping cron.
Stopping sshd.
Stopping ntpd.
Stopping local_unbound.
Stopping devd.
Writing entropy file:.
Writing early boot entropy file:.
Terminated
.
Oct 6 19:47:09 pflog0: promiscuous mode disabled
Enter full pathname of shell or RETURN for /bin/sh:
gce1# reboot
Oct 6 19:47:17 init: single user shell terminated.
init died (signal 0, exit 0)
panic: Going nowhere without my init!
Uptime: 55s
Changing serial settings was 0/0 now 3/0
Start bios (version 1.7.2-20150226_170051-google)

gce1$ uname -a
FreeBSD gce1.rulingia.com 11.0-PRERELEASE FreeBSD 11.0-PRERELEASE #83 r306704M: Thu Oct 6 13:22:27 AEDT 2016 r...@gce1.rulingia.com:/usr/obj/usr/src/sys/GCE amd64

I haven't investigated the cause yet.

--
Peter Jeremy
Re: Reproducible panic - Going nowhere without my init!
Reverted your patch then changed line 1011 of init.c to _exit(97):

--- init.c-orig 2016-10-05 18:52:24.02291 +1000
+++ init.c 2016-10-06 17:02:33.714624000 +1000
@@ -1008,7 +1008,7 @@
              */
             warning("single user shell terminated.");
             sleep(STALL_TIMEOUT);
-            _exit(0);
+            _exit(97);
         } else {
             warning("single user shell terminated, restarting");
             return (state_func_t) single_user;

...and got a panic that showed "exit 97": http://imgur.com/xonPwxR

I think that kern_reboot() is not being called somehow. kern_reboot() is the only place rebooting = 1; is executed.

"init died (signal 0, exit 97)
panic: Going nowhere without my init!"

can only happen if rebooting = 0 in kern_exit.c exit1().

Another tell that kern_reboot() has not been called is "cpuid = 3" because the first thing kern_reboot() does is bind to CPU 0.

Why is kern_reboot() being skipped? I have no idea.

Anything more I can do to help? Do you want a core dump?

-andyf
Re: 11.0 stuck on high network load
On 10/06/16 at 09:51P, Julien Charbon wrote:
>
> Hi Hiren,
>
> On 10/6/16 9:44 AM, hiren panchasara wrote:
> > On 10/06/16 at 09:28P, Julien Charbon wrote:
> >> On 9/28/16 1:59 PM, Slawa Olhovchenkov wrote:
> >>> On Wed, Sep 28, 2016 at 12:06:47PM +0200, Julien Charbon wrote:
> >>>>
> >>>> I am still trying to reproduce your issue, without success so far.
> >>
> >> Thanks to Slawa's efforts and multiple debug reports, we are starting
> >> to see the bottom of this issue, and it seems to be a generic one.
> >> The most useful report being:
> >>
> >> panic: tcp_detach: INP_TIMEWAIT && INP_DROPPED && tp != NULL
> >
> > I know there are multiple and probably related problems being
> > discussed here but what about the one mentioned in the subject of this
> > thread?
> > Apologies if I've missed something conclusive in one of the replies of
> > this thread about that issue.
>
> This issue can lead to the machine being stuck on high network load: by
> double-freeing an inp, you can corrupt/leak an inp lock, and the network
> stack can wait indefinitely for this inp lock to be released. You get
> this assert only with INVARIANTS defined.
>
> As usual, we can have more than one issue here, but this
> INP_TIMEWAIT|INP_DROPPED issue needs to be fixed anyway.

Thanks for the explanation, Julien.

Cheers,
Hiren
Re: 11.0 stuck on high network load
On 10/06/16 at 09:28P, Julien Charbon wrote:
>
> Hi,
>
> On 9/28/16 1:59 PM, Slawa Olhovchenkov wrote:
> > On Wed, Sep 28, 2016 at 12:06:47PM +0200, Julien Charbon wrote:
> >>
> >> I am still trying to reproduce your issue, without success so far.
>
> Thanks to Slawa's efforts and multiple debug reports, we are starting
> to see the bottom of this issue, and it seems to be a generic one.
> The most useful report being:
>
> panic: tcp_detach: INP_TIMEWAIT && INP_DROPPED && tp != NULL

I know there are multiple and probably related problems being discussed here, but what about the one mentioned in the subject of this thread?

Apologies if I've missed something conclusive in one of the replies of this thread about that issue.

Cheers,
Hiren