Andriy,

Can you give me some details about how you're able to reproduce this panic? I would like to help debug this. I'm also looking into the range_tree() panic, so any details you can provide would be very helpful.
If you can publish the crash dumps, I can also download them and take a look.

Thanks,
George

On Wed, Jun 22, 2016 at 4:53 PM, Andriy Gapon <a...@freebsd.org> wrote:
>
> Igor,
>
> your suggestion was certainly a good one, however I took a path of a
> lesser effort and tested my workload on the latest illumos kernel:
>
> panic[cpu3]/thread=ffffff000bc56c40: assertion failed:
> ba.ba_phys->bt_bytes == 0 (0x400 == 0x0), file:
> ../../common/fs/zfs/bptree.c, line: 293
>
> ffffff000bc56890 genunix:process_type+164b75 ()
> ffffff000bc56a20 zfs:bptree_iterate+4bf ()
> ffffff000bc56a90 zfs:dsl_scan_sync+17c ()
> ffffff000bc56b50 zfs:spa_sync+2bb ()
> ffffff000bc56c20 zfs:txg_sync_thread+260 ()
> ffffff000bc56c30 unix:thread_start+8 ()
>
> syncing file systems... done
> dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
> dumping: 0:34 100% done
> 100% done: 339495 pages dumped, dump succeeded
> rebooting...
>
> So, if anyone is interested I can provide any requested information from
> the crash dump or try your debugging suggestions.
>
> On 22/06/2016 17:45, Igor Kozhukhov wrote:
> > based on your changeset number - it is an old update:
> > https://github.com/illumos/illumos-gate/commit/26455f9efcf9b1e44937d4d86d1ce37b006f25a9
> > 6052 decouple lzc_create() from the implementation details
> >
> > we have a lot of other changes in the illumos tree and i can say - i have
> > no panic on my system with a gcc48 build - i have tested with the zfs tests.
> >
> > Maybe, as a solution, you can try to merge to the latest changes and
> > check it again?
> > i had a panic with a gcc48 build, but Matt pointed to some delphix update
> > and we have upstreamed it and i have no panics any more with the full list
> > of zfs tests that are available in the illumos tree.
> >
> > best regards,
> > -Igor
> >
> >
> >> On Jun 22, 2016, at 5:17 PM, Andriy Gapon <a...@freebsd.org> wrote:
> >>
> >>
> >> I am not yet convinced that the problem has anything to do with
> >> miscompiled code. I am using exactly the same optimizations and exactly
> >> the same compiler as the official FreeBSD builds.
> >>
> >> On 22/06/2016 17:03, Igor Kozhukhov wrote:
> >>> Hi Andriy,
> >>>
> >>> i have DilOS with gcc-4.8.5 (+ special patches) for illumos builds.
> >>> i had some problems with zdb - found them with the zfs tests.
> >>>
> >>> the problem was fixed by disabling an optimization:
> >>> -fno-aggressive-loop-optimizations
> >>>
> >>> also, i have added:
> >>> -fno-ipa-sra
> >>>
> >>> but i don't remember the story of why i added it ;)
> >>> probably it was added for another illumos component and the new gcc-4.8
> >>>
> >>> As you know, illumos is still using gcc-4.4.4 and some newer compilers
> >>> can produce new issues with older code :)
> >>>
> >>> I think you can try to play with your clang optimization flags too.
> >>> i have no experience with clang.
> >>>
> >>> best regards,
> >>> -Igor
> >>>
> >>>
> >>>> On Jun 22, 2016, at 4:21 PM, Andriy Gapon <a...@freebsd.org> wrote:
> >>>>
> >>>>
> >>>> I am getting the following panic using the latest FreeBSD head that is
> >>>> synchronized with OpenZFS code as of
> >>>> illumos/illumos-gate@26455f9efcf9b1e44937d4d86d1ce37b006f25a9.
> >>>>
> >>>> panic: solaris assert: ba.ba_phys->bt_bytes == 0 (0x400 == 0x0), file:
> >>>> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c,
> >>>> line: 292
> >>>> cpuid = 1
> >>>> KDB: stack backtrace:
> >>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe004db9d310
> >>>> vpanic() at vpanic+0x182/frame 0xfffffe004db9d390
> >>>> panic() at panic+0x43/frame 0xfffffe004db9d3f0
> >>>> assfail3() at assfail3+0x2c/frame 0xfffffe004db9d410
> >>>> bptree_iterate() at bptree_iterate+0x35e/frame 0xfffffe004db9d540
> >>>> dsl_scan_sync() at dsl_scan_sync+0x24f/frame 0xfffffe004db9d890
> >>>> spa_sync() at spa_sync+0x897/frame 0xfffffe004db9dad0
> >>>> txg_sync_thread() at txg_sync_thread+0x383/frame 0xfffffe004db9dbb0
> >>>> fork_exit() at fork_exit+0x84/frame 0xfffffe004db9dbf0
> >>>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe004db9dbf0
> >>>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> >>>>
> >>>> I have a crash dump, but unfortunately it's hard to work with it,
> >>>> because a lot of useful information got "optimized out" by clang.
> >>>>
> >>>> I can reproduce the panic using a synthetic workload, but I do not have
> >>>> a concise reproduction scenario. Every time the panic happens bt_bytes
> >>>> is 0x400, I haven't seen any other number there.
> >>>>
> >>>> Does anyone have an idea what could be causing this?
> >>>> I can try any diagnostic code that might shed more light.
> >>>> Thank you!
> >>>>
> >>>> --
> >>>> Andriy Gapon
> >>>
> >>
> >> --
> >> Andriy Gapon
> >
>
> --
> Andriy Gapon

-------------------------------------------
openzfs-developer
Archives: https://www.listbox.com/member/archive/274414/=now
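
For anyone following the thread, here is a toy model of the kind of invariant the failed assertion appears to check: once every queued entry has been freed, no bytes should remain charged to the tree. This is only a sketch under that assumption. `ToyBptree`, its fields, and its methods are hypothetical names made up for illustration, not the real structures or functions in bptree.c, and the sketch says nothing about what actually causes the 0x400 discrepancy.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBptree:
    """Hypothetical stand-in for a bptree header (not the real ZFS struct)."""
    begin: int = 0        # index of the first entry not yet fully freed
    end: int = 0          # index one past the last queued entry
    bytes_used: int = 0   # space still charged to queued entries
    entries: list = field(default_factory=list)

    def add(self, block_sizes, charged_bytes):
        """Queue one entry: its block sizes plus the space charged for it."""
        self.entries.append(list(block_sizes))
        self.end += 1
        self.bytes_used += charged_bytes

    def iterate_and_free(self):
        """Free every queued block, un-charging its size as it goes."""
        while self.begin < self.end:
            for size in self.entries[self.begin]:
                self.bytes_used -= size
            self.entries[self.begin] = []
            self.begin += 1
        # All entries are gone, so nothing should remain charged.
        if self.bytes_used != 0:
            raise AssertionError(
                f"bt_bytes == 0 ({self.bytes_used:#x} == 0x0)")


if __name__ == "__main__":
    ok = ToyBptree()
    ok.add([0x200, 0x200], charged_bytes=0x400)
    ok.iterate_and_free()  # accounting balances, no error

    bad = ToyBptree()
    bad.add([0x200], charged_bytes=0x200 + 0x400)  # 0x400 over-charged
    try:
        bad.iterate_and_free()
    except AssertionError as err:
        print("assertion failed:", err)
```

In this model the check only fires when the space charged at queue time and the space un-charged during freeing disagree, so a constant leftover like 0x400 would point at one consistently mis-accounted amount rather than random corruption. Whether that holds for the real code is exactly what the crash dump would need to confirm.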