Re: ATF t_mlock() babylon5 kernel panics
> On Mar 12, 2019, at 9:09 PM, Robert Elz wrote:
>
> The first issue I noticed, is that t_mlock() apparently believes
> the malloc(3) man page, which states:
>
> The malloc() function allocates size bytes of uninitialized memory. The
> allocated space is suitably aligned (after possible pointer coercion) for
> storage of any type of object.
>
> and in particular, those last few words. The "any type of object" that
> t_mlock wants to store is a "page" - that is a hardware page.

The test employs a bogus understanding of how malloc() is specified. On x86, malloc() should return memory that is 16-byte aligned because that is the maximum alignment requirement of the fundamental types used by the compiler.

> It obtains
> the size of that using:
>
> page = sysconf(_SC_PAGESIZE);
>
> and then does
>
> buf = malloc(page);
>
> and if buf is not NULL (which it does check) assumes that it now
> has a correctly page aligned page sized block of memory, in which
> it can run mlock() related tests.
>
> Something tells me that the "any type of object" does not include this
> one, and that t_mlock should be using posix_memalign() instead to allocate
> its page, so it can specify that it needs a page aligned page.

Correct. Or mmap() (which always returns page-aligned pointers).

> Again, I am not proposing fixing the test until the kernel issues
> are corrected, but it would be good for someone who knows what alignment
> malloc() really promises to return (across all NetBSD architectures)
> to rewrite the man page to say something more specific than "any type of
> object" !

I've also seen the term "fundamental object" used.
One has to remember that malloc() is specified by the C standard, and C has no notion of "pages" or any other such silliness that we Unix people assume are fundamental :-)

> NetBSD's mlock() rounds down, so regardless of the alignment of the
> space allocated, the mlock() tests should be working (the locking might
> not be exactly what the test is expecting, but all it is doing is keeping
> pages locked in memory - which pages exactly this test does not really
> care).

POSIX specifically states that mlock() //may// require that the address is page-aligned ... Our implementation does not require this:

    /*
     * align the address to a page boundary and adjust the size accordingly
     */
    pageoff = (addr & PAGE_MASK);
    addr -= pageoff;
    size += pageoff;
    size = (vsize_t)round_page(size);

That is to say, the intent of our implementation is to wire the page where the range begins through the page where the range ends. Note that internally, UVM represents all ranges as [start, start+size) (assuming start and size are page aligned / rounded).

> On my test setup, the kernel did not panic. It does however experience
> some other weirdness, some of which is also apparent in the babylon5
> tests, and others which might be.
>
> My test system is an amd64 XEN DomU - configured with no swap space, and
> just 1GB RAM. It typically has a tmpfs mounted limited to 1/2 GB
> (actually slightly more than that - not sure where I got the number from,
> there may have been a typo... the -s param is -s=537255936 in fstab).
> That oddity should be irrelevant.
>
> The first thing I noticed was that when I run the t_mlock test in this
> environment, it ends up failing when /tmp has run out of space. And I
> mean really run out of space, in that it is gone forever, and nothing I
> have thought of so far to try gets any of that space back again.

And there are no files there? Even an open-unlinked file should disappear when the offending process exits.
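To make the address rounding quoted earlier in this message concrete, here is a userland sketch of the same arithmetic. It is an illustration only: struct wire_range and wire_range_for() are hypothetical names, and (page - 1) stands in for the kernel's PAGE_MASK on the assumption that the page size is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userland stand-in for the kernel's address/size pair. */
struct wire_range {
	uintptr_t start;	/* start address, rounded down to a page */
	size_t size;		/* length, rounded up to whole pages */
};

/*
 * Mirror the quoted kernel arithmetic: round the address down, grow the
 * size by the same amount so the end does not move, then round the size
 * up to a page multiple.
 */
static struct wire_range
wire_range_for(uintptr_t addr, size_t size, size_t page)
{
	struct wire_range r;
	uintptr_t mask = (uintptr_t)page - 1;
	uintptr_t pageoff = addr & mask;

	assert(page != 0 && (page & (page - 1)) == 0);

	r.start = addr - pageoff;		/* addr -= pageoff */
	size += pageoff;			/* size += pageoff */
	r.size = (size + mask) & ~mask;		/* round_page(size) */
	return r;
}
```

With 4096-byte pages, a page-sized buffer at the (made-up) address 0x1010 comes out as start 0x1000 and size 8192, i.e. two pages get wired, whereas the same buffer at 0x2000 covers exactly one page. That is the practical difference between a malloc()ed buffer and a page-aligned one in this test.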
> I assume that jemalloc() (aka malloc() in the test) is doing some kind
> of mmap() that is backed by space on /tmp and continually grabbing more
> until it eventually runs out, and that the kernel never releases that
> space (even after the program that mapped it has exited). That seems
> sub-optimal, and needs fixing in the kernel, anonymous mmap's (or whatever
> kind jemalloc() is doing) need to be released when there are no more
> processes that can possibly use them.

Well, note that tmpfs also uses anonymous memory. Is it that "df" on the tmpfs is really showing a bunch of space allocated to the tmpfs?

> I did not try umount -f (easier to just reboot...) but a regular umount
> failed (EBUSY) even though there was nothing visibly using anything on
> /tmp (and I killed every possible program, leaving only init - and yes,
> that did include the console shell I use to test things).
>
> Umounting the tmpfs before running the t_mlock test worked fine (which also
> illustrates that none of the very few daemon processes, nor the shell, etc,
> from my login, are just happening to be using /tmp - and that it is the
> results of the malloc() calls from t_mlock that must be the culprit.
> (While ATF is running, it would be using /tmp as both its working
> directory,
Re: ATF t_mlock() babylon5 kernel panics
Date: Wed, 13 Mar 2019 11:09:09 +0700
From: Robert Elz
Message-ID: <27829.1552450...@jinx.noi.kre.to>

A few corrections/additions to my message:

| "page" is the page size.(4KB, 8KB or 16KB or ...)

Looks to be 4K. Is that correct?

| From the number of kernel messages, I'm assuming one from each of
| those mlock/munlock calls

Now I think more than one, with a 4K page size, that's just 4 iterations of the loop, and there are more than 8 printfs. So now I suspect 2 printf's per mlock() and munlock() (16 printfs) which looks about right - though there might be 19 printfs, which is a bit weird (but no-one ever claimed I can count).

| (possibly excepting the first, but because
| a similar buffer was malloc()/free() 'd in the previous sub-test,

That's not possible, there was no previous sub-test. So:

| it is conceivable that the page started out locked - that is, if malloc()
| returned the same page as last time.

isn't possible.

| The buffer is free()'d after that loop ends (of course).

In this test that always happens, as the only way the test fails is if the page size < 1024 (and not "fails" there but skips) in which case the test never bothers freeing the buffer (like I said before, this test has bugs ... it should do the page size test before the malloc()) or if the malloc() fails (when there is nothing to free). But in other tests, the buffer does not get free'd if the test fails (and I think all of the others fail right now).

| While I will not fix the bugs in the test that might alter the way
| it interacts with the kernel, I will update the test to get more
| information from the tests that fail (and check results that are
| currently not checked and make them available - but so as not to
| alter what happens, merely by output to stderr).

I have done that now, and will commit after a test build and run to make sure I made no stupid mistakes (a simple compile test and run on my development system, which is older, and where the test works, reveals no issues).
The commit of this meaningless diagnostic-only change is so we can see more of what is happening in the b5 test runs. This actually reduces the info in a sense, as some ATF macro calls which would have printed useful info about what failed now will just say "err == 0 failed" or something ... but the extra printf to stderr which will be immediately before that should provide the missing info, and more.

kre
ATF t_mlock() babylon5 kernel panics
Apologies for the multi-list posting, but I think this needs a wide audience - please respect the Reply-To and send replies only to current-users@

I have been looking into this, a little.

First, while the t_mlock() test is most likely broken, it should never cause a kernel panic (or even a kernel diagnostic printf, which it is also doing) - so I do not propose fixing it until the kernel gets fixed. I would ask that no-one else fix it either.

It also seems to have changed behaviour since jemalloc() was updated - which is also revealing, but again, that's a good thing, so it should keep using the updated jemalloc().

The first issue I noticed, is that t_mlock() apparently believes the malloc(3) man page, which states:

    The malloc() function allocates size bytes of uninitialized memory. The allocated space is suitably aligned (after possible pointer coercion) for storage of any type of object.

and in particular, those last few words. The "any type of object" that t_mlock wants to store is a "page" - that is a hardware page. It obtains the size of that using:

    page = sysconf(_SC_PAGESIZE);

and then does

    buf = malloc(page);

and if buf is not NULL (which it does check) assumes that it now has a correctly page aligned page sized block of memory, in which it can run mlock() related tests.

Something tells me that the "any type of object" does not include this one, and that t_mlock should be using posix_memalign() instead to allocate its page, so it can specify that it needs a page aligned page.

Again, I am not proposing fixing the test until the kernel issues are corrected, but it would be good for someone who knows what alignment malloc() really promises to return (across all NetBSD architectures) to rewrite the man page to say something more specific than "any type of object" !
NetBSD's mlock() rounds down, so regardless of the alignment of the space allocated, the mlock() tests should be working (the locking might not be exactly what the test is expecting, but all it is doing is keeping pages locked in memory - which pages exactly this test does not really care).

On my test setup, the kernel did not panic. It does however experience some other weirdness, some of which is also apparent in the babylon5 tests, and others which might be.

My test system is an amd64 XEN DomU - configured with no swap space, and just 1GB RAM. It typically has a tmpfs mounted limited to 1/2 GB (actually slightly more than that - not sure where I got the number from, there may have been a typo... the -s param is -s=537255936 in fstab). That oddity should be irrelevant.

The first thing I noticed was that when I run the t_mlock test in this environment, it ends up failing when /tmp has run out of space. And I mean really run out of space, in that it is gone forever, and nothing I have thought of so far to try gets any of that space back again.

I assume that jemalloc() (aka malloc() in the test) is doing some kind of mmap() that is backed by space on /tmp and continually grabbing more until it eventually runs out, and that the kernel never releases that space (even after the program that mapped it has exited). That seems sub-optimal, and needs fixing in the kernel, anonymous mmap's (or whatever kind jemalloc() is doing) need to be released when there are no more processes that can possibly use them.

I did not try umount -f (easier to just reboot...) but a regular umount failed (EBUSY) even though there was nothing visibly using anything on /tmp (and I killed every possible program, leaving only init - and yes, that did include the console shell I use to test things).
Umounting the tmpfs before running the t_mlock test worked fine (which also illustrates that none of the very few daemon processes, nor the shell, etc, from my login, are just happening to be using /tmp - and that it is the results of the malloc() calls from t_mlock that must be the culprit. (While ATF is running, it would be using /tmp as both its working directory, and for output file storage, but after the test fails, all of those processes are gone).

With an unmounted /tmp that issue (overflowed /tmp) was gone - but that might just be because the (single) filesystem that everything lives on in this test environment is fairly large, and very little of it is used by the fairly bare-bones NetBSD that is installed in it (it is plenty big enough to store all of the temporaries and results of a release build when I use it to run that kind of test. That hasn't been needed (by me) for a while, and I clean it out and do whole new fresh builds from time to time, so right now it is only about 3% occupied)

Next issue: the kernel prints lots of

    [ 41.2801543] pmap_unwire: wiring for pmap 0xc98002f6b5c0 va 0x7f7ff7ef1000did not change!

(note I do not yet have gson's recent change to insert the missing space...)

That happens during the mlock_clip subtest, which passes. After allocating a buffer (as above), this tes
daily CVS update output
Updating src tree: cvs update: `src/crypto/external/bsd/openssl/dist/.gitattributes' is no longer in the repository cvs update: `src/crypto/external/bsd/openssl/dist/.gitignore' is no longer in the repository cvs update: `src/crypto/external/bsd/openssl/dist/.gitmodules' is no longer in the repository cvs update: `src/crypto/external/bsd/openssl/dist/.travis-apt-pin.preferences' is no longer in the repository cvs update: `src/crypto/external/bsd/openssl/dist/.travis-create-release.sh' is no longer in the repository cvs update: `src/crypto/external/bsd/openssl/dist/.travis.yml' is no longer in the repository P src/crypto/external/bsd/openssl/dist/CHANGES P src/crypto/external/bsd/openssl/dist/CONTRIBUTING P src/crypto/external/bsd/openssl/dist/Configure P src/crypto/external/bsd/openssl/dist/INSTALL P src/crypto/external/bsd/openssl/dist/LICENSE P src/crypto/external/bsd/openssl/dist/NEWS P src/crypto/external/bsd/openssl/dist/NOTES.ANDROID P src/crypto/external/bsd/openssl/dist/NOTES.DJGPP P src/crypto/external/bsd/openssl/dist/NOTES.VMS P src/crypto/external/bsd/openssl/dist/README P src/crypto/external/bsd/openssl/dist/config P src/crypto/external/bsd/openssl/dist/e_os.h cvs update: `src/crypto/external/bsd/openssl/dist/.github/PULL_REQUEST_TEMPLATE.md' is no longer in the repository P src/crypto/external/bsd/openssl/dist/Configurations/00-base-templates.conf P src/crypto/external/bsd/openssl/dist/Configurations/10-main.conf P src/crypto/external/bsd/openssl/dist/Configurations/15-android.conf P src/crypto/external/bsd/openssl/dist/Configurations/50-win-onecore.conf P src/crypto/external/bsd/openssl/dist/Configurations/README P src/crypto/external/bsd/openssl/dist/Configurations/README.design P src/crypto/external/bsd/openssl/dist/Configurations/descrip.mms.tmpl cvs update: `src/crypto/external/bsd/openssl/dist/Configurations/dist.conf' is no longer in the repository P src/crypto/external/bsd/openssl/dist/Configurations/unix-Makefile.tmpl P 
src/crypto/external/bsd/openssl/dist/apps/apps.c P src/crypto/external/bsd/openssl/dist/apps/ct_log_list.cnf P src/crypto/external/bsd/openssl/dist/apps/dh1024.pem P src/crypto/external/bsd/openssl/dist/apps/dh2048.pem P src/crypto/external/bsd/openssl/dist/apps/dh4096.pem P src/crypto/external/bsd/openssl/dist/apps/ocsp.c P src/crypto/external/bsd/openssl/dist/apps/openssl-vms.cnf P src/crypto/external/bsd/openssl/dist/apps/openssl.cnf P src/crypto/external/bsd/openssl/dist/apps/pkcs12.c P src/crypto/external/bsd/openssl/dist/apps/rehash.c P src/crypto/external/bsd/openssl/dist/apps/s_cb.c P src/crypto/external/bsd/openssl/dist/apps/s_client.c P src/crypto/external/bsd/openssl/dist/apps/s_server.c P src/crypto/external/bsd/openssl/dist/apps/speed.c P src/crypto/external/bsd/openssl/dist/apps/verify.c P src/crypto/external/bsd/openssl/dist/apps/demoSRP/srp_verifier.txt P src/crypto/external/bsd/openssl/dist/crypto/armcap.c P src/crypto/external/bsd/openssl/dist/crypto/cryptlib.c P src/crypto/external/bsd/openssl/dist/crypto/init.c P src/crypto/external/bsd/openssl/dist/crypto/ppc_arch.h P src/crypto/external/bsd/openssl/dist/crypto/ppccap.c P src/crypto/external/bsd/openssl/dist/crypto/ppccpuid.pl P src/crypto/external/bsd/openssl/dist/crypto/uid.c P src/crypto/external/bsd/openssl/dist/crypto/aes/asm/aes-x86_64.pl P src/crypto/external/bsd/openssl/dist/crypto/aes/asm/aesni-x86_64.pl P src/crypto/external/bsd/openssl/dist/crypto/aes/asm/aesv8-armx.pl P src/crypto/external/bsd/openssl/dist/crypto/aes/asm/bsaes-x86_64.pl P src/crypto/external/bsd/openssl/dist/crypto/aes/asm/vpaes-armv8.pl P src/crypto/external/bsd/openssl/dist/crypto/aes/asm/vpaes-x86_64.pl P src/crypto/external/bsd/openssl/dist/crypto/asn1/a_digest.c P src/crypto/external/bsd/openssl/dist/crypto/asn1/a_sign.c P src/crypto/external/bsd/openssl/dist/crypto/asn1/a_verify.c P src/crypto/external/bsd/openssl/dist/crypto/asn1/ameth_lib.c P src/crypto/external/bsd/openssl/dist/crypto/asn1/charmap.h P 
src/crypto/external/bsd/openssl/dist/crypto/asn1/charmap.pl P src/crypto/external/bsd/openssl/dist/crypto/asn1/d2i_pu.c P src/crypto/external/bsd/openssl/dist/crypto/bio/b_addr.c P src/crypto/external/bsd/openssl/dist/crypto/bio/bss_file.c P src/crypto/external/bsd/openssl/dist/crypto/bio/bss_mem.c P src/crypto/external/bsd/openssl/dist/crypto/bn/bn_ctx.c P src/crypto/external/bsd/openssl/dist/crypto/bn/bn_depr.c P src/crypto/external/bsd/openssl/dist/crypto/bn/bn_div.c P src/crypto/external/bsd/openssl/dist/crypto/bn/bn_exp.c P src/crypto/external/bsd/openssl/dist/crypto/bn/bn_lib.c P src/crypto/external/bsd/openssl/dist/crypto/bn/bn_prime.h P src/crypto/external/bsd/openssl/dist/crypto/bn/bn_prime.pl P src/crypto/external/bsd/openssl/dist/crypto/bn/bn_shift.c P src/crypto/external/bsd/openssl/dist/crypto/bn/asm/armv8-mont.pl P src/crypto/external/bsd/openssl/dist/crypto/bn/asm/ia64.S P src/crypto/external/bsd/openssl/dist/crypto/bn/asm/mips.pl P src/crypto/external/bsd/openssl/dist/crypto/bn/asm/rsaz
Re: xdm receives no input
/etc/ttys ? On Tue, 12 Mar 2019 at 16:32, Patrick Welche wrote: > Had a go with the shiny new X (thanks!) on the sandy bridge laptop > which no longer likes SNA but works with UX, and xdm seems to sit > at the prompt waiting for something: > > #0 0x7f7ff344285a in poll () from /usr/lib/libc.so.12 > #1 0x7f7ff6031f3d in IoWait (wt=0x7f7fcec0, wf= pointer>, > wf=) > at /usr/xsrc/external/mit/libXt/dist/src/NextEvent.c:356 > #2 _XtWaitForSomething (app=app@entry=0x7f7ff7e6, > ignoreEvents=ignoreEvents@entry=0 '\000', > ignoreTimers=ignoreTimers@entry=0 '\000', > ignoreInputs=ignoreInputs@entry=0 '\000', > ignoreSignals=ignoreSignals@entry=0 '\000', block=block@entry=1 > '\001', > drop_lock=0 '\000', drop_lock@entry=1 '\001', howlong=howlong@entry > =0x0) > at /usr/xsrc/external/mit/libXt/dist/src/NextEvent.c:624 > > The cursor is visible in the Login: prompt, but apparently Something never > happens... (Switching to SNA, doesn't change anything.) > > Thoughts on how to debug? > > > Cheers, > > Patrick > --
Re: zsh crash in recent -current
On Tue, Mar 12, 2019 at 03:33:26PM +, Chavdar Ivanov wrote:
> On amd64 -current from yesterday (and a couple of days earlier) I
> started to get zsh crashes when tab-completing (files, directories,
> packages), similar to

I see lots of crashes with zsh too. Some happen in completion, sometimes when I press enter on a command line, sometimes the history gets trashed (lots of weird characters turn up when I press 'up') and the shell dies soon after.

I think there are at least two different bugs in zsh here.

Thomas
re: xdm receives no input
Patrick Welche writes:
> Had a go with the shiny new X (thanks!) on the sandy bridge laptop
> which no longer likes SNA but works with UX, and xdm seems to sit
> at the prompt waiting for something:
>
> #0 0x7f7ff344285a in poll () from /usr/lib/libc.so.12
> #1 0x7f7ff6031f3d in IoWait (wt=0x7f7fcec0, wf=,
> wf=)
> at /usr/xsrc/external/mit/libXt/dist/src/NextEvent.c:356
> #2 _XtWaitForSomething (app=app@entry=0x7f7ff7e6,
> ignoreEvents=ignoreEvents@entry=0 '\000',
> ignoreTimers=ignoreTimers@entry=0 '\000',
> ignoreInputs=ignoreInputs@entry=0 '\000',
> ignoreSignals=ignoreSignals@entry=0 '\000', block=block@entry=1 '\001',
> drop_lock=0 '\000', drop_lock@entry=1 '\001', howlong=howlong@entry=0x0)
> at /usr/xsrc/external/mit/libXt/dist/src/NextEvent.c:624
>
> The cursor is visible in the Login: prompt, but apparently Something never
> happens... (Switching to SNA, doesn't change anything.)
>
> Thoughts on how to debug?

is it a black input bar that doesn't appear to do anything? (it probably does work -- did you try typing blind?)

the fix is to update /usr/X11/xdm/Xresources file -- it has new required entries. i'm going to work on making this less awful when it isn't updated, but i've been busy working on the Mesa18 update :-)

postinstall is not yet capable of properly updating it or really suggesting what you need to do, so that would be the first step, but perhaps also making /etc/rc.d/xdm fail to run if the new entries are missing would be good too.

thanks.

.mrg.
Automated report: NetBSD-current/i386 build success
The NetBSD-current/i386 build is working again. The following commits were made between the last failed build and the successful build: 2019.03.12.16.44.12 christos src/crypto/external/bsd/openssl/dist/test/bio_memleak_test.c,v 1.1 2019.03.12.16.44.16 christos src/crypto/external/bsd/openssl/dist/test/certs/root-cert-rsa2.pem,v 1.1 2019.03.12.16.44.16 christos src/crypto/external/bsd/openssl/dist/test/ec_internal_test.c,v 1.1 2019.03.12.16.44.16 christos src/crypto/external/bsd/openssl/dist/test/recipes/02-test_errstr.t,v 1.1 2019.03.12.16.44.16 christos src/crypto/external/bsd/openssl/dist/test/recipes/03-test_internal_ec.t,v 1.1 2019.03.12.16.44.17 christos src/crypto/external/bsd/openssl/dist/test/recipes/90-test_bio_memleak.t,v 1.1 2019.03.12.16.44.17 christos src/crypto/external/bsd/openssl/dist/test/recipes/90-test_includes_data/includes-eq-ws.cnf,v 1.1 2019.03.12.16.44.17 christos src/crypto/external/bsd/openssl/dist/test/recipes/90-test_includes_data/includes-eq.cnf,v 1.1 2019.03.12.16.44.18 christos src/crypto/external/bsd/openssl/dist/test/ssl-tests/29-dtls-sctp-label-bug.conf,v 1.1 2019.03.12.16.44.18 christos src/crypto/external/bsd/openssl/dist/test/ssl-tests/29-dtls-sctp-label-bug.conf.in,v 1.1 2019.03.12.16.50.35 christos src/distrib/sets/lists/base/shl.mi,v 1.860 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/.github/Attic/PULL_REQUEST_TEMPLATE.md,v 1.2 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/Attic/.gitattributes,v 1.2 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/Attic/.gitignore,v 1.2 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/Attic/.gitmodules,v 1.2 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/Attic/.travis-apt-pin.preferences,v 1.2 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/Attic/.travis-create-release.sh,v 1.2 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/Attic/.travis.yml,v 1.2 
2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/CHANGES,v 1.19 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/Configurations/Attic/dist.conf,v 1.2 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/Configure,v 1.23 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/NEWS,v 1.19 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/README,v 1.19 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/apps/ocsp.c,v 1.18 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/apps/openssl.cnf,v 1.8 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/apps/s_client.c,v 1.18 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/apps/s_server.c,v 1.19 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/apps/speed.c,v 1.17 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/crypto/aes/asm/aes-x86_64.pl,v 1.6 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/crypto/aes/asm/aesni-x86_64.pl,v 1.6 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/crypto/armcap.c,v 1.9 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/crypto/bio/bss_file.c,v 1.11 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/crypto/cryptlib.c,v 1.14 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/crypto/ppccap.c,v 1.10 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/crypto/ppccpuid.pl,v 1.8 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/crypto/uid.c,v 1.6 2019.03.12.16.58.12 christos src/crypto/external/bsd/openssl/dist/e_os.h,v 1.13 2019.03.12.16.58.13 christos src/crypto/external/bsd/openssl/dist/crypto/bn/asm/mips.pl,v 1.4 2019.03.12.16.58.13 christos src/crypto/external/bsd/openssl/dist/crypto/bn/bn_exp.c,v 1.19 2019.03.12.16.58.13 christos src/crypto/external/bsd/openssl/dist/crypto/bn/bn_lib.c,v 1.11 2019.03.12.16.58.13 christos 
src/crypto/external/bsd/openssl/dist/crypto/cms/cms_pwri.c,v 1.11 2019.03.12.16.58.13 christos src/crypto/external/bsd/openssl/dist/crypto/conf/conf_def.c,v 1.10 2019.03.12.16.58.13 christos src/crypto/external/bsd/openssl/dist/crypto/dso/dso_dlfcn.c,v 1.14 2019.03.12.16.58.13 christos src/crypto/external/bsd/openssl/dist/crypto/ec/ec2_smpl.c,v 1.8 2019.03.12.16.58.13 christos src/crypto/external/bsd/openssl/dist/crypto/ec/ec_ameth.c,v 1.9 2019.03.12.16.58.13 christos src/crypto/external/bsd/openssl/dist/crypto/ec/ec_lcl.h,v 1.6 2019.03.12.16.58.13 christos src/crypto/external/bsd/openssl/dist/crypto/ec/ecp_mont.c,v 1.6 2019.03.12.16.58.13 christos src/crypto/external/bsd/openssl/dist/crypto/ec/ecp_nist.c,v 1.6 2019.03.12.16.58.13 christos src/crypto/external/bs
Automated report: NetBSD-current/i386 build failure
This is an automatically generated notice of a NetBSD-current/i386 build failure. The failure occurred on babylon5.netbsd.org, a NetBSD/amd64 host, using sources from CVS date 2019.03.12.15.14.02. An extract from the build.sh output follows: obsolete_stand fix: postinstall fixes passed: obsolete_stand postinstall fixes failed: === checkflist ===> distrib/sets --- check_DESTDIR --- --- checkflist --- cd /tmp/bracket/build/2019.03.12.15.14.02-i386/src/distrib/sets && DESTDIR=/tmp/bracket/build/2019.03.12.15.14.02-i386/destdir MACHINE=i386 MACHINE_ARCH=i386 AWK=/tmp/bracket/build/2019.03.12.15.14.02-i386/tools/bin/nbawk CKSUM=/tmp/bracket/build/2019.03.12.15.14.02-i386/tools/bin/nbcksum DB=/tmp/bracket/build/2019.03.12.15.14.02-i386/tools/bin/nbdb EGREP=/tmp/bracket/build/2019.03.12.15.14.02-i386/tools/bin/nbgrep\ -E HOST_SH=/bin/sh MAKE=/tmp/bracket/build/2019.03.12.15.14.02-i386/tools/bin/nbmake MKTEMP=/tmp/bracket/build/2019.03.12.15.14.02-i386/tools/bin/nbmktemp MTREE=/tmp/bracket/build/2019.03.12.15.14.02-i386/tools/bin/nbmtree PAX=/tmp/bracket/build/2019.03.12.15.14.02-i386/tools/bin/nbpax COMPRESS_PROGRAM=gzip GZIP=-n XZ_OPT=-9 TAR_SUFF=tgz PKG_CREATE=/tmp/bracket/build/2019.03.12.15.14.02-i386/tools/bin/nbpkg_create SED=/tmp/bracket/build/2019.03.12.15.14.02-i386/tools/bin/nbsed TSORT=/tmp/bracket/build/2019.03.12.15.14.02-i386/tools/bin/nbtsort\ -q /bin/sh /tmp/bracket/build/2019.03.12.15.14.02-i386/src/distrib/sets/checkflist -L base -M /tmp/bracket/build/2019.03.12.15.14.02-i386/destdir/METALOG.sanitised === 1 extra files in DESTDIR = Files in DESTDIR but missing from flist. File is obsolete or flist is out of date ? -- ./usr/lib/libgnumalloc.so.1.0 = end of 1 extra files === == 1 missing files in DESTDIR Files in flist but missing from DESTDIR. File wasn't installed ? 
-- ./usr/lib/libjemalloc.so.1.0 end of 1 missing files == *** [checkflist] Error code 1 nbmake[2]: stopped in /tmp/bracket/build/2019.03.12.15.14.02-i386/src/distrib/sets 1 error The following commits were made between the last successful build and the failed build: 2019.03.12.15.10.43 christos src/distrib/sets/lists/base/shl.mi,v 1.859 2019.03.12.15.10.43 christos src/distrib/sets/lists/comp/mi,v 1.2262 2019.03.12.15.10.44 christos src/distrib/sets/lists/comp/shl.mi,v 1.325 2019.03.12.15.10.44 christos src/distrib/sets/lists/debug/mi,v 1.279 2019.03.12.15.10.44 christos src/distrib/sets/lists/debug/shl.mi,v 1.219 2019.03.12.15.11.13 christos src/include/malloc.h,v 1.8 2019.03.12.15.13.25 christos src/external/bsd/jemalloc/Makefile,v 1.1 2019.03.12.15.13.25 christos src/external/bsd/jemalloc/dist/src/jemalloc.c,v 1.6 2019.03.12.15.13.25 christos src/external/bsd/jemalloc/include/jemalloc/jemalloc.h,v 1.6 2019.03.12.15.13.25 christos src/external/bsd/jemalloc/lib/Makefile,v 1.3 2019.03.12.15.13.25 christos src/external/bsd/jemalloc/lib/Makefile.inc,v 1.6 2019.03.12.15.13.25 christos src/external/bsd/jemalloc/lib/jemalloc_stub.c,v 1.1 2019.03.12.15.14.02 christos src/lib/Makefile,v 1.269 Log files can be found at: http://releng.NetBSD.org/b5reports/i386/commits-2019.03.html#2019.03.12.15.14.02
xdm receives no input
Had a go with the shiny new X (thanks!) on the sandy bridge laptop which no longer likes SNA but works with UX, and xdm seems to sit at the prompt waiting for something:

#0 0x7f7ff344285a in poll () from /usr/lib/libc.so.12
#1 0x7f7ff6031f3d in IoWait (wt=0x7f7fcec0, wf=,
   wf=)
   at /usr/xsrc/external/mit/libXt/dist/src/NextEvent.c:356
#2 _XtWaitForSomething (app=app@entry=0x7f7ff7e6,
   ignoreEvents=ignoreEvents@entry=0 '\000',
   ignoreTimers=ignoreTimers@entry=0 '\000',
   ignoreInputs=ignoreInputs@entry=0 '\000',
   ignoreSignals=ignoreSignals@entry=0 '\000', block=block@entry=1 '\001',
   drop_lock=0 '\000', drop_lock@entry=1 '\001', howlong=howlong@entry=0x0)
   at /usr/xsrc/external/mit/libXt/dist/src/NextEvent.c:624

The cursor is visible in the Login: prompt, but apparently Something never happens... (Switching to SNA, doesn't change anything.)

Thoughts on how to debug?

Cheers,

Patrick
Re: zsh crash in recent -current
Pretty sure this is the case. This has happened so far perhaps four times, on the first occasion it definitely mentioned jemalloc.

I'll try building zsh with debug info.

On Tue, 12 Mar 2019 at 15:38, Christos Zoulas wrote:
>
> In article
> ,
> Chavdar Ivanov wrote:
> >Hi,
> >
> >On amd64 -current from yesterday (and a couple of days earlier) I
> >started to get zsh crashes when tab-completing (files, directories,
> >packages), similar to
> >.
> >Core was generated by `zsh'.
> >Program terminated with signal SIGSEGV, Segmentation fault.
> >#0 0x7cf050211540 in permmatches () from
> >/usr/pkg/lib/zsh/5.7/zsh/complete.so
> >(gdb) bt
> >#0 0x7cf050211540 in permmatches () from
> >/usr/pkg/lib/zsh/5.7/zsh/complete.so
> >#1 0x7cf050208b7c in ?? () from /usr/pkg/lib/zsh/5.7/zsh/complete.so
> >#2 0x004550d5 in getstrvalue ()
> >#3 0x00456085 in ?? ()
> >#4 0x00456f67 in getindex ()
> >#5 0x00457557 in fetchvalue ()
> >#6 0x00471da6 in ?? ()
> >#7 0x00476ab7 in prefork ()
> >#8 0x00476f42 in singsub ()
> >#9 0x0042057c in evalcond ()
> >#10 0x00421f49 in ?? ()
> >#11 0x0042bbb8 in ?? ()
> >#12 0x00429a88 in execlist ()
> > (several repeats of the above snippet).
> >
> >Rebuilding zsh did not solve it. I have been using the latest version
> >- 5.7 - since it became available in pkgsrc, under -current, and have
> >never had problems before; the user zsh environment hasn't changed for
> >a while. It is not impossible that some of the files in this
> >environment are damaged, so I will recreate it from scratch and retry,
> >but wanted to mention it in case someone else has seen similar.
>
> You've been probably bitten by jemalloc, and I am guessing it is probably
> some memory corruption... You could try linking with -lgnumalloc and see
> if you can get it working again.
>
> christos
>

--
Re: zsh crash in recent -current
On Tue, Mar 12, 2019 at 03:33:26PM +, Chavdar Ivanov wrote:
> Hi,
>
> On amd64 -current from yesterday (and a couple of days earlier) I
> started to get zsh crashes when tab-completing (files, directories,
> packages), similar to
> .
> Core was generated by `zsh'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 0x7cf050211540 in permmatches () from
> /usr/pkg/lib/zsh/5.7/zsh/complete.so
> (gdb) bt
> #0 0x7cf050211540 in permmatches () from
> /usr/pkg/lib/zsh/5.7/zsh/complete.so
> #1 0x7cf050208b7c in ?? () from /usr/pkg/lib/zsh/5.7/zsh/complete.so
> #2 0x004550d5 in getstrvalue ()
> #3 0x00456085 in ?? ()
> #4 0x00456f67 in getindex ()
> #5 0x00457557 in fetchvalue ()
> #6 0x00471da6 in ?? ()
> #7 0x00476ab7 in prefork ()
> #8 0x00476f42 in singsub ()
> #9 0x0042057c in evalcond ()
> #10 0x00421f49 in ?? ()
> #11 0x0042bbb8 in ?? ()
> #12 0x00429a88 in execlist ()
> (several repeats of the above snippet).

This is likely the more strict jemalloc debug code in libc finding a memory corruption issue.

Try building zsh with debug info (and maybe also install debug sets to get symbol information for libc).

Martin
Re: zsh crash in recent -current
In article , Chavdar Ivanov wrote:
>Hi,
>
>On amd64 -current from yesterday (and a couple of days earlier) I
>started to get zsh crashes when tab-completing (files, directories,
>packages), similar to
>.
>Core was generated by `zsh'.
>Program terminated with signal SIGSEGV, Segmentation fault.
>#0 0x7cf050211540 in permmatches () from
>/usr/pkg/lib/zsh/5.7/zsh/complete.so
>(gdb) bt
>#0 0x7cf050211540 in permmatches () from
>/usr/pkg/lib/zsh/5.7/zsh/complete.so
>#1 0x7cf050208b7c in ?? () from /usr/pkg/lib/zsh/5.7/zsh/complete.so
>#2 0x004550d5 in getstrvalue ()
>#3 0x00456085 in ?? ()
>#4 0x00456f67 in getindex ()
>#5 0x00457557 in fetchvalue ()
>#6 0x00471da6 in ?? ()
>#7 0x00476ab7 in prefork ()
>#8 0x00476f42 in singsub ()
>#9 0x0042057c in evalcond ()
>#10 0x00421f49 in ?? ()
>#11 0x0042bbb8 in ?? ()
>#12 0x00429a88 in execlist ()
> (several repeats of the above snippet).
>
>Rebuilding zsh did not solve it. I have been using the latest version
>- 5.7 - since it became available in pkgsrc, under -current, and have
>never had problems before; the user zsh environment hasn't changed for
>a while. It is not impossible that some of the files in this
>environment are damaged, so I will recreate it from scratch and retry,
>but wanted to mention it in case someone else has seen similar.

You've been probably bitten by jemalloc, and I am guessing it is probably some memory corruption... You could try linking with -lgnumalloc and see if you can get it working again.

christos
zsh crash in recent -current
Hi,

On amd64 -current from yesterday (and a couple of days earlier) I started to get zsh crashes when tab-completing (files, directories, packages), similar to .

Core was generated by `zsh'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x7cf050211540 in permmatches () from /usr/pkg/lib/zsh/5.7/zsh/complete.so
(gdb) bt
#0 0x7cf050211540 in permmatches () from /usr/pkg/lib/zsh/5.7/zsh/complete.so
#1 0x7cf050208b7c in ?? () from /usr/pkg/lib/zsh/5.7/zsh/complete.so
#2 0x004550d5 in getstrvalue ()
#3 0x00456085 in ?? ()
#4 0x00456f67 in getindex ()
#5 0x00457557 in fetchvalue ()
#6 0x00471da6 in ?? ()
#7 0x00476ab7 in prefork ()
#8 0x00476f42 in singsub ()
#9 0x0042057c in evalcond ()
#10 0x00421f49 in ?? ()
#11 0x0042bbb8 in ?? ()
#12 0x00429a88 in execlist ()
(several repeats of the above snippet).

Rebuilding zsh did not solve it. I have been using the latest version - 5.7 - since it became available in pkgsrc, under -current, and have never had problems before; the user zsh environment hasn't changed for a while. It is not impossible that some of the files in this environment are damaged, so I will recreate it from scratch and retry, but wanted to mention it in case someone else has seen similar.

--