Re: CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ?
On Sun, May 10, 2020 at 12:02:19AM +0300, Andriy Gapon wrote: > On 09/05/2020 23:47, Konstantin Belousov wrote: > > Might be not, might be it would help due to pmap_delayed_invl_genp(). > > But I would more worry about this 'already started' issue, because > > this must not happen. Can you remove the assert from the macro and > > provide backtrace of 'DI already started' panic ? > > Oh, now that you asked for it, I see that it was a secondary panic (through > vt, > fb, drm code path). > The first panic was still the same "address %lx beyond the last segment". > I'll test your suggestion tomorrow. Yes, the backtrace is reasonable in the sense that VM was recursed due to panic while already in DI section. So pmap_remove() from inside panic handler indeed triggered the right assert. > > > #10 0x8080340e in vpanic (fmt=, ap=) at > /usr/devel/git/motil/sys/kern/kern_shutdown.c:902 > #11 0x808031a3 in panic (fmt=0x8119a998 > "\265\001ʀ\377\377\377\377") at > /usr/devel/git/motil/sys/kern/kern_shutdown.c:839 > #12 0x80bb4c05 in pmap_delayed_invl_start_u () at > /usr/devel/git/motil/sys/amd64/amd64/pmap.c:783 > #13 0x80bb8ede in pmap_remove (pmap=0x812ee930 > , sva=18446741877558251520, eva=) at > /usr/devel/git/motil/sys/amd64/amd64/pmap.c:5418 > #14 0x80b2b6ad in _kmem_unback (object=, > addr=18446741877558251520, size=102400) at > /usr/devel/git/motil/sys/vm/vm_kern.c:574 > #15 0x80b2b7dd in kmem_free (addr=18446741877558251520, size=102400) > at > /usr/devel/git/motil/sys/vm/vm_kern.c:614 > #16 0x807db77b in free_large (addr=0xfe00ab2e9000, size=102400) at > /usr/devel/git/motil/sys/kern/kern_malloc.c:599 > #17 free (addr=0xfe00ab2e9000, mtp=0x825f90c0 ) at > /usr/devel/git/motil/sys/kern/kern_malloc.c:818 > #18 0x82444922 in dc_gamma_release (gamma=) at > /usr/home/avg/devel/kms-drm/drivers/gpu/drm/amd/display/dc/core/dc_surface.c:162 > #19 destruct (plane_state=0xf800080ef800) at > /usr/home/avg/devel/kms-drm/drivers/gpu/drm/amd/display/dc/core/dc_surface.c:53 > #20 
dc_plane_state_free (kref=) at > /usr/home/avg/devel/kms-drm/drivers/gpu/drm/amd/display/dc/core/dc_surface.c:140 > #21 kref_put (kref=, rel=) at > /usr/devel/git/motil/sys/compat/linuxkpi/common/include/linux/kref.h:74 > #22 dc_plane_state_release (plane_state=) at > /usr/home/avg/devel/kms-drm/drivers/gpu/drm/amd/display/dc/core/dc_surface.c:146 > #23 0x82442de9 in dc_resource_state_destruct > (context=0xfe00a2af) at > /usr/home/avg/devel/kms-drm/drivers/gpu/drm/amd/display/dc/core/dc_resource.c:2295 > #24 0x824355d2 in dc_state_free (kref=) at > /usr/home/avg/devel/kms-drm/drivers/gpu/drm/amd/display/dc/core/dc.c:1152 > #25 kref_put (kref=, rel=) at > /usr/devel/git/motil/sys/compat/linuxkpi/common/include/linux/kref.h:74 > #26 dc_release_state (context=) at > /usr/home/avg/devel/kms-drm/drivers/gpu/drm/amd/display/dc/core/dc.c:1158 > #27 0x8241f6cc in dm_atomic_destroy_state (obj=, > state=0xf80020465550) at > /usr/home/avg/devel/kms-drm/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:1667 > #28 0x82569734 in drm_atomic_state_default_clear > (state=0xf80008274a00) at > /usr/home/avg/devel/kms-drm/drivers/gpu/drm/drm_atomic.c:202 > #29 0x82569827 in drm_atomic_state_clear (state=) at > /usr/home/avg/devel/kms-drm/drivers/gpu/drm/drm_atomic.c:240 > #30 __drm_atomic_state_free (ref=0xf80008274a00) at > /usr/home/avg/devel/kms-drm/drivers/gpu/drm/drm_atomic.c:256 > #31 0x825998d8 in kref_put (kref=0xf80008274a00, rel= out>) at > /usr/devel/git/motil/sys/compat/linuxkpi/common/include/linux/kref.h:74 > #32 drm_atomic_state_put (state=0xf80008274a00) at > /usr/home/avg/devel/kms-drm/include/drm/drm_atomic.h:385 > #33 restore_fbdev_mode_atomic (fb_helper=, active=true) at > /usr/home/avg/devel/kms-drm/drivers/gpu/drm/drm_fb_helper.c:461 > #34 0x8259567a in drm_fb_helper_restore_fbdev_mode_unlocked > (fb_helper=0xf8002096d800) at > /usr/home/avg/devel/kms-drm/drivers/gpu/drm/drm_fb_helper.c:549 > #35 0x825bcc8a in vt_kms_postswitch (arg=0xf800027c52c0) at > 
/usr/home/avg/devel/kms-drm/drivers/gpu/drm/linux_fb.c:97 > #36 0x806b04b2 in vt_window_switch (vw=0x80e999a8 > ) at /usr/devel/git/motil/sys/dev/vt/vt_core.c:603 > #37 0x806ada0f in vtterm_cngrab (tm=) at > /usr/devel/git/motil/sys/dev/vt/vt_core.c:1612 > #38 0x8079f776 in cngrab () at > /usr/devel/git/motil/sys/kern/kern_cons.c:397 > #39 0x8080335c in vpanic (fmt=0x80cc257f "address %lx beyond > the > last segment", ap=0xfe009e18c890) at > /usr/devel/git/motil/sys/kern/kern_shutdown.c:887 > #40 0x808031a3 in panic (fmt=0x8119a998 > "\265\001ʀ\377\377\377\377") at > /usr/devel/git/motil/sys/kern/kern_shutdown.c:839 > #41 0x80bc2ac3 in pmap_remove_pte (pmap=0xfe00a4cdbb08, > ptq=0xf800cd2b4000, va=345
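The recursion described above (a second pmap_remove() entered from the panic path while the thread was still inside a DI section) can be modelled in user space. This is only a sketch: struct di_thread, di_start(), di_finish() and remove_model() are hypothetical names, not the kernel's, and nesting is reported through a return value instead of a panic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-thread delayed-invalidation state. */
struct di_thread {
	bool started;
};

enum di_result { DI_OK = 0, DI_NESTED = -1 };

/* Enter a DI section; report nesting instead of panicking as the kernel does. */
static bool
di_start(struct di_thread *td)
{
	if (td->started)
		return (false);	/* the kernel would panic: "DI already started" */
	td->started = true;
	return (true);
}

/* Leave the DI section; only legal after a successful di_start(). */
static void
di_finish(struct di_thread *td)
{
	assert(td->started);
	td->started = false;
}

/*
 * Model of a pmap_remove()-like operation. With panic_midway set, the first
 * failure happens inside the DI section and the panic path (console grab,
 * driver cleanup, free) calls the same operation again on the same thread.
 */
static enum di_result
remove_model(struct di_thread *td, bool panic_midway)
{
	if (!di_start(td))
		return (DI_NESTED);
	if (panic_midway)
		return (remove_model(td, false));	/* di_finish() never runs */
	di_finish(td);
	return (DI_OK);
}
```

A normal call returns DI_OK and leaves the flag clear. A call with panic_midway set returns DI_NESTED and leaves the flag set, which matches the reading that the "DI already started" assert was a consequence of the first panic and not an independent bug.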
Re: CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ?
On 09/05/2020 23:47, Konstantin Belousov wrote:
> Might be not, might be it would help due to pmap_delayed_invl_genp().
> But I would more worry about this 'already started' issue, because
> this must not happen. Can you remove the assert from the macro and
> provide backtrace of 'DI already started' panic ?

Oh, now that you asked for it, I see that it was a secondary panic (through vt,
fb, drm code path).
The first panic was still the same "address %lx beyond the last segment".
I'll test your suggestion tomorrow.

#10 0x8080340e in vpanic (fmt=, ap=) at /usr/devel/git/motil/sys/kern/kern_shutdown.c:902
#11 0x808031a3 in panic (fmt=0x8119a998 "\265\001ʀ\377\377\377\377") at /usr/devel/git/motil/sys/kern/kern_shutdown.c:839
#12 0x80bb4c05 in pmap_delayed_invl_start_u () at /usr/devel/git/motil/sys/amd64/amd64/pmap.c:783
#13 0x80bb8ede in pmap_remove (pmap=0x812ee930 , sva=18446741877558251520, eva=) at /usr/devel/git/motil/sys/amd64/amd64/pmap.c:5418
#14 0x80b2b6ad in _kmem_unback (object=, addr=18446741877558251520, size=102400) at /usr/devel/git/motil/sys/vm/vm_kern.c:574
#15 0x80b2b7dd in kmem_free (addr=18446741877558251520, size=102400) at /usr/devel/git/motil/sys/vm/vm_kern.c:614
#16 0x807db77b in free_large (addr=0xfe00ab2e9000, size=102400) at /usr/devel/git/motil/sys/kern/kern_malloc.c:599
#17 free (addr=0xfe00ab2e9000, mtp=0x825f90c0 ) at /usr/devel/git/motil/sys/kern/kern_malloc.c:818
#18 0x82444922 in dc_gamma_release (gamma=) at /usr/home/avg/devel/kms-drm/drivers/gpu/drm/amd/display/dc/core/dc_surface.c:162
#19 destruct (plane_state=0xf800080ef800) at /usr/home/avg/devel/kms-drm/drivers/gpu/drm/amd/display/dc/core/dc_surface.c:53
#20 dc_plane_state_free (kref=) at /usr/home/avg/devel/kms-drm/drivers/gpu/drm/amd/display/dc/core/dc_surface.c:140
#21 kref_put (kref=, rel=) at /usr/devel/git/motil/sys/compat/linuxkpi/common/include/linux/kref.h:74
#22 dc_plane_state_release (plane_state=) at /usr/home/avg/devel/kms-drm/drivers/gpu/drm/amd/display/dc/core/dc_surface.c:146
#23 0x82442de9 in dc_resource_state_destruct (context=0xfe00a2af) at /usr/home/avg/devel/kms-drm/drivers/gpu/drm/amd/display/dc/core/dc_resource.c:2295
#24 0x824355d2 in dc_state_free (kref=) at /usr/home/avg/devel/kms-drm/drivers/gpu/drm/amd/display/dc/core/dc.c:1152
#25 kref_put (kref=, rel=) at /usr/devel/git/motil/sys/compat/linuxkpi/common/include/linux/kref.h:74
#26 dc_release_state (context=) at /usr/home/avg/devel/kms-drm/drivers/gpu/drm/amd/display/dc/core/dc.c:1158
#27 0x8241f6cc in dm_atomic_destroy_state (obj=, state=0xf80020465550) at /usr/home/avg/devel/kms-drm/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:1667
#28 0x82569734 in drm_atomic_state_default_clear (state=0xf80008274a00) at /usr/home/avg/devel/kms-drm/drivers/gpu/drm/drm_atomic.c:202
#29 0x82569827 in drm_atomic_state_clear (state=) at /usr/home/avg/devel/kms-drm/drivers/gpu/drm/drm_atomic.c:240
#30 __drm_atomic_state_free (ref=0xf80008274a00) at /usr/home/avg/devel/kms-drm/drivers/gpu/drm/drm_atomic.c:256
#31 0x825998d8 in kref_put (kref=0xf80008274a00, rel=) at /usr/devel/git/motil/sys/compat/linuxkpi/common/include/linux/kref.h:74
#32 drm_atomic_state_put (state=0xf80008274a00) at /usr/home/avg/devel/kms-drm/include/drm/drm_atomic.h:385
#33 restore_fbdev_mode_atomic (fb_helper=, active=true) at /usr/home/avg/devel/kms-drm/drivers/gpu/drm/drm_fb_helper.c:461
#34 0x8259567a in drm_fb_helper_restore_fbdev_mode_unlocked (fb_helper=0xf8002096d800) at /usr/home/avg/devel/kms-drm/drivers/gpu/drm/drm_fb_helper.c:549
#35 0x825bcc8a in vt_kms_postswitch (arg=0xf800027c52c0) at /usr/home/avg/devel/kms-drm/drivers/gpu/drm/linux_fb.c:97
#36 0x806b04b2 in vt_window_switch (vw=0x80e999a8 ) at /usr/devel/git/motil/sys/dev/vt/vt_core.c:603
#37 0x806ada0f in vtterm_cngrab (tm=) at /usr/devel/git/motil/sys/dev/vt/vt_core.c:1612
#38 0x8079f776 in cngrab () at /usr/devel/git/motil/sys/kern/kern_cons.c:397
#39 0x8080335c in vpanic (fmt=0x80cc257f "address %lx beyond the last segment", ap=0xfe009e18c890) at /usr/devel/git/motil/sys/kern/kern_shutdown.c:887
#40 0x808031a3 in panic (fmt=0x8119a998 "\265\001ʀ\377\377\377\377") at /usr/devel/git/motil/sys/kern/kern_shutdown.c:839
#41 0x80bc2ac3 in pmap_remove_pte (pmap=0xfe00a4cdbb08, ptq=0xf800cd2b4000, va=34523316224, ptepde=3442163815, free=0xfe009e18c9a0, lockp=0xfe009e18c9b8) at /usr/devel/git/motil/sys/amd64/amd64/pmap.c:3599
#42 0x80bba98c in pmap_remove_ptes (pmap=0xfe00a4cdbb08, sva=34523316224, eva=34525413376, pde=0xf800b2515270, free=0xfe009e18c9a0, lockp=0xfe009e18c9b8) at /usr/devel/git/motil/sys/amd64/amd64/pmap.c:5378
#43 0x80bb921c in pmap_remove (pmap=, sva=34523316224, eva=) at /u
Re: CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ?
On Sat, May 09, 2020 at 11:33:40PM +0300, Andriy Gapon wrote: > On 09/05/2020 19:50, Konstantin Belousov wrote: > > On Sat, May 09, 2020 at 07:16:27PM +0300, Andriy Gapon wrote: > >> On 09/05/2020 19:13, Konstantin Belousov wrote: > >>> On Sat, May 09, 2020 at 06:52:24PM +0300, Andriy Gapon wrote: > On 08/05/2020 19:15, Konstantin Belousov wrote: > > On Fri, May 08, 2020 at 06:53:24PM +0300, Andriy Gapon wrote: > >> > >> I have a reproducible panic with a custom kernel without option NUMA > >> while using > >> amdgpu driver from linuxkpi-based drm: > >> > >> panic: address 41ec0 beyond the last segment > >> > >> I did some quick debugging and the panic happens when Xorg server > >> tries to > >> access a frame buffer (or something like that). There is a page fault > >> that gets > >> satisfied by ttm with a fictitious page. > >> > >> The stack trace is: > >> #11 0x808031a3 in panic (fmt=0x8119a998 > >> "5\003ʀ\377\377\377\377") at > >> /usr/devel/git/motil/sys/kern/kern_shutdown.c:839 > >> #12 0x80bbc552 in pmap_enter (pmap=, > >> va=34504441856, > >> m=, prot=, flags=, > >> psind= >> out>) at /usr/devel/git/motil/sys/amd64/amd64/pmap.c:6035 > >> #13 0x80b288be in vm_fault_populate (fs=) at > >> /usr/devel/git/motil/sys/vm/vm_fault.c:519 > >> #14 vm_fault_allocate (fs=) at > >> /usr/devel/git/motil/sys/vm/vm_fault.c:1032 > >> #15 vm_fault (map=, vaddr=, > >> fault_type= >> out>, fault_flags=, m_hold=) at > >> /usr/devel/git/motil/sys/vm/vm_fault.c:1342 > >> #16 0x80b26e7e in vm_fault_trap (map=0xfe0017cd39e8, > >> vaddr=, fault_type=, fault_flags=0, > >> signo=0xfe00a810dbc4, ucode=0xfe00a810dbc0) at > >> /usr/devel/git/motil/sys/vm/vm_fault.c:589 > >> #17 0x80bcf89c in trap_pfault (frame=0xfe00a810dc00, > >> usermode=, signo=, > >> ucode=0x80853250 > >> ) at /usr/devel/git/motil/sys/amd64/amd64/trap.c:821 > >> #18 0x80bceeec in trap (frame=0xfe00a810dc00) at > >> /usr/devel/git/motil/sys/amd64/amd64/trap.c:34 > >> > >> > >> The line number in pmap_enter() is 
incorrect, I guess because of > >> optimizations. > >> The assert seems to be reached via pmap_enter -> > >> CHANGE_PV_LIST_LOCK_TO_PHYS -> > >> PHYS_TO_PV_LIST_LOCK -> pa_index(). > >> > >> The panic in correct in that the page is fictitious and its physical > >> address is > >> beyond the end of real physical memory. > >> It seems that NUMA PHYS_TO_PV_LIST_LOCK() is aware of such pages, but > >> !NUMA one > >> is not. > > > > I think you can remove this assert. pa_index() is always taken by > > % NVP_LIST_LOCKS, because fictitious mappings are not promoted. > > > > Try that and commit if it works for you. > > I tried this change: > diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c > index 4deed86a76d1a..b834b7f0388b7 100644 > --- a/sys/amd64/amd64/pmap.c > +++ b/sys/amd64/amd64/pmap.c > @@ -345,7 +345,7 @@ pmap_pku_mask_bit(pmap_t pmap) > #define NPV_LIST_LOCKS MAXCPU > > #define PHYS_TO_PV_LIST_LOCK(pa)\ > -(&pv_list_locks[pa_index(pa) % NPV_LIST_LOCKS]) > +(&pv_list_locks[((pa) >> PDRSHIFT) % > NPV_LIST_LOCKS]) > #endif > > #define CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa) do {\ > > It fixed the original problem, but I got a new panic. > "DI already started" in pmap_remove() -> pmap_delayed_invl_start_u(). > I guess that !NUMA variant does not get much testing, so I'll probably > just > stick with the default. > >>> Why didn't you just removed the KASSERT from pa_index ? > >> > >> Well, I thought it might be useful in the NUMA case. > >> pa_index() definition is shared between both cases. > > Might be define the macro two times, for NUMA/non-NUMA. non-NUMA case > > does not need the assert, because users take it mod NPV_LIST_LOCKS. > > > > I still don't see how that could help with "DI already started" panic. Might be not, might be it would help due to pmap_delayed_invl_genp(). But I would more worry about this 'already started' issue, because this must not happen. 
Can you remove the assert from the macro and provide backtrace of 'DI already started' panic ? ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ?
On 09/05/2020 19:50, Konstantin Belousov wrote: > On Sat, May 09, 2020 at 07:16:27PM +0300, Andriy Gapon wrote: >> On 09/05/2020 19:13, Konstantin Belousov wrote: >>> On Sat, May 09, 2020 at 06:52:24PM +0300, Andriy Gapon wrote: On 08/05/2020 19:15, Konstantin Belousov wrote: > On Fri, May 08, 2020 at 06:53:24PM +0300, Andriy Gapon wrote: >> >> I have a reproducible panic with a custom kernel without option NUMA >> while using >> amdgpu driver from linuxkpi-based drm: >> >> panic: address 41ec0 beyond the last segment >> >> I did some quick debugging and the panic happens when Xorg server tries >> to >> access a frame buffer (or something like that). There is a page fault >> that gets >> satisfied by ttm with a fictitious page. >> >> The stack trace is: >> #11 0x808031a3 in panic (fmt=0x8119a998 >> "5\003ʀ\377\377\377\377") at >> /usr/devel/git/motil/sys/kern/kern_shutdown.c:839 >> #12 0x80bbc552 in pmap_enter (pmap=, >> va=34504441856, >> m=, prot=, flags=, >> psind=> out>) at /usr/devel/git/motil/sys/amd64/amd64/pmap.c:6035 >> #13 0x80b288be in vm_fault_populate (fs=) at >> /usr/devel/git/motil/sys/vm/vm_fault.c:519 >> #14 vm_fault_allocate (fs=) at >> /usr/devel/git/motil/sys/vm/vm_fault.c:1032 >> #15 vm_fault (map=, vaddr=, >> fault_type=> out>, fault_flags=, m_hold=) at >> /usr/devel/git/motil/sys/vm/vm_fault.c:1342 >> #16 0x80b26e7e in vm_fault_trap (map=0xfe0017cd39e8, >> vaddr=, fault_type=, fault_flags=0, >> signo=0xfe00a810dbc4, ucode=0xfe00a810dbc0) at >> /usr/devel/git/motil/sys/vm/vm_fault.c:589 >> #17 0x80bcf89c in trap_pfault (frame=0xfe00a810dc00, >> usermode=, signo=, ucode=0x80853250 >> ) at /usr/devel/git/motil/sys/amd64/amd64/trap.c:821 >> #18 0x80bceeec in trap (frame=0xfe00a810dc00) at >> /usr/devel/git/motil/sys/amd64/amd64/trap.c:34 >> >> >> The line number in pmap_enter() is incorrect, I guess because of >> optimizations. 
>> The assert seems to be reached via pmap_enter -> >> CHANGE_PV_LIST_LOCK_TO_PHYS -> >> PHYS_TO_PV_LIST_LOCK -> pa_index(). >> >> The panic in correct in that the page is fictitious and its physical >> address is >> beyond the end of real physical memory. >> It seems that NUMA PHYS_TO_PV_LIST_LOCK() is aware of such pages, but >> !NUMA one >> is not. > > I think you can remove this assert. pa_index() is always taken by > % NVP_LIST_LOCKS, because fictitious mappings are not promoted. > > Try that and commit if it works for you. I tried this change: diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c index 4deed86a76d1a..b834b7f0388b7 100644 --- a/sys/amd64/amd64/pmap.c +++ b/sys/amd64/amd64/pmap.c @@ -345,7 +345,7 @@ pmap_pku_mask_bit(pmap_t pmap) #define NPV_LIST_LOCKS MAXCPU #define PHYS_TO_PV_LIST_LOCK(pa)\ - (&pv_list_locks[pa_index(pa) % NPV_LIST_LOCKS]) + (&pv_list_locks[((pa) >> PDRSHIFT) % NPV_LIST_LOCKS]) #endif #define CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa) do {\ It fixed the original problem, but I got a new panic. "DI already started" in pmap_remove() -> pmap_delayed_invl_start_u(). I guess that !NUMA variant does not get much testing, so I'll probably just stick with the default. >>> Why didn't you just removed the KASSERT from pa_index ? >> >> Well, I thought it might be useful in the NUMA case. >> pa_index() definition is shared between both cases. > Might be define the macro two times, for NUMA/non-NUMA. non-NUMA case > does not need the assert, because users take it mod NPV_LIST_LOCKS. > I still don't see how that could help with "DI already started" panic. -- Andriy Gapon ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Error loading tcp_bbr kernel module
Hi Michael,

On Sat, May 09, 2020 at 05:42:55PM +0200, Michael Tuexen wrote:
> > On 9. May 2020, at 16:25, Gordon Bergling wrote:
> > I tried tcp_rack and tcp_bbr, since both are separate TCP stacks. I just
> > posted the wrong error message. Neither TCP stack was loadable as a
> > kernel module with just the previously mentioned build option.
> >
> > I currently have a build running with both kernel options you mentioned.
> >
> > If the build is successful and I can change the default TCP stack to RACK
> > and BBR, I'll let you know.
> That would be great. I have them running on my machines, but I might have
> missed something.
> >
> > Furthermore, I didn't find any documentation within tcp(4) regarding RACK and
> > BBR. Since I am about to enhance the manpages, I'll extend tcp(4) with
> > information about RACK and BBR, but this is a different topic.
> >
> Yes it is. And I would suggest using separate man pages, a single one for
> each stack.
> Then the generic man page might refer to them...

My first thought on this topic was to extend tcp(4) and create links to
tcp_rack(4) and tcp_bbr(4), but separate manpages may be the way to go. I just
have to investigate the respective details. I was once very deep into TCP/IP,
while building perimeter firewalls with FreeBSD, but this was 20 years ago.

I'll add you as a reviewer for the differential once I have a rough cut
of the manpages ready.

Best regards,

Gordon

> >> On 09.05.2020 at 14:37, Michael Tuexen wrote:
> >>> On 9. May 2020, at 14:18, Gordon Bergling wrote:
> >>>
> >>> Greetings,
> >>>
> >>> I built -CURRENT with WITH_EXTRA_TCP_STACKS=1, but I got the following
> >>> error when I try to load, for example, tcp_bbr.ko.
> >>>
> >>> kldload: an error occurred while loading module tcp_rack.ko. Please check
> >>> dmesg(8) for more details.
> >> This indicates that you want to load the RACK stack.
> >>
> >> Please note that you need for BBR and RACK:
> >> options TCPHPTS
> >> in the kernel config and in addition to that for RACK
> >> options RATELIMIT
> >>
> >>> dmesg shows:
> >>>
> >>> KLD tcp_bbr.ko: depends on tcphpts - not available or version mismatch
> >>> linker_load_file: /boot/kernel/tcp_bbr.ko - unsupported file type
> >>>
> >>> Any hints on solving the problem?
> >>>
> >>> The kernel config is GENERIC.
> >>>
> >>> Best regards,
> >>>
> >>> Gordon
Re: Error loading tcp_bbr kernel module
Hi Michael,

with a kernel config which includes

include GENERIC
options RATELIMIT
options TCPHPTS

applied, I could successfully use

net.inet.tcp.functions_default=bbr

to switch the TCP stack.

Thanks for the fast help,

Gordon

> On 09.05.2020 at 16:25, Gordon Bergling wrote:
>
> Hi Michael,
>
> thanks for your reply.
>
> I tried tcp_rack and tcp_bbr, since both are separate TCP stacks. I just
> posted the wrong error message. Neither TCP stack was loadable as a kernel
> module with just the previously mentioned build option.
>
> I currently have a build running with both kernel options you mentioned.
>
> If the build is successful and I can change the default TCP stack to RACK and
> BBR, I'll let you know.
>
> Furthermore, I didn't find any documentation within tcp(4) regarding RACK and BBR.
> Since I am about to enhance the manpages, I'll extend tcp(4) with
> information about RACK and BBR, but this is a different topic.
>
> Best regards,
>
> Gordon
>
>> On 09.05.2020 at 14:37, Michael Tuexen wrote:
>>
>>> On 9. May 2020, at 14:18, Gordon Bergling wrote:
>>>
>>> Greetings,
>>>
>>> I built -CURRENT with WITH_EXTRA_TCP_STACKS=1, but I got the following error
>>> when I try to load, for example, tcp_bbr.ko.
>>>
>>> kldload: an error occurred while loading module tcp_rack.ko. Please check
>>> dmesg(8) for more details.
>> This indicates that you want to load the RACK stack.
>>
>> Please note that you need for BBR and RACK:
>> options TCPHPTS
>> in the kernel config and in addition to that for RACK
>> options RATELIMIT
>>
>> Best regards
>> Michael
>>>
>>> dmesg shows:
>>>
>>> KLD tcp_bbr.ko: depends on tcphpts - not available or version mismatch
>>> linker_load_file: /boot/kernel/tcp_bbr.ko - unsupported file type
>>>
>>> Any hints on solving the problem?
>>>
>>> The kernel config is GENERIC.
>>>
>>> Best regards,
>>>
>>> Gordon
Re: Error loading tcp_bbr kernel module
> On 9. May 2020, at 18:07, Gordon Bergling wrote:
>
> Hi Michael,
>
> On Sat, May 09, 2020 at 05:42:55PM +0200, Michael Tuexen wrote:
>>> On 9. May 2020, at 16:25, Gordon Bergling wrote:
>>> I tried tcp_rack and tcp_bbr, since both are separate TCP stacks. I just
>>> posted the wrong error message. Neither TCP stack was loadable as a
>>> kernel module with just the previously mentioned build option.
>>>
>>> I currently have a build running with both kernel options you mentioned.
>>>
>>> If the build is successful and I can change the default TCP stack to RACK
>>> and BBR, I'll let you know.
>> That would be great. I have them running on my machines, but I might have
>> missed something.
>>>
>>> Furthermore, I didn't find any documentation within tcp(4) regarding RACK and
>>> BBR. Since I am about to enhance the manpages, I'll extend tcp(4) with
>>> information about RACK and BBR, but this is a different topic.
>>>
>> Yes it is. And I would suggest using separate man pages, a single one for
>> each stack.
>> Then the generic man page might refer to them...
>
> My first thought on this topic was to extend tcp(4) and create links to
> tcp_rack(4) and tcp_bbr(4), but separate manpages may be the way to go. I just
> have to investigate the respective details. I was once very deep into TCP/IP,
> while building perimeter firewalls with FreeBSD, but this was 20 years ago.
>
> I'll add you as a reviewer for the differential once I have a rough cut
> of the manpages ready.

Hi Gordon,

please do so. Don't forget to add rrs@, since he wrote both stacks.

Best regards
Michael
>
> Best regards,
>
> Gordon
>
>>>> On 09.05.2020 at 14:37, Michael Tuexen wrote:
>>>>> On 9. May 2020, at 14:18, Gordon Bergling wrote:
>>>>>
>>>>> Greetings,
>>>>>
>>>>> I built -CURRENT with WITH_EXTRA_TCP_STACKS=1, but I got the following
>>>>> error when I try to load, for example, tcp_bbr.ko.
>>>>>
>>>>> kldload: an error occurred while loading module tcp_rack.ko. Please check
>>>>> dmesg(8) for more details.
>>>> This indicates that you want to load the RACK stack.
>>>>
>>>> Please note that you need for BBR and RACK:
>>>> options TCPHPTS
>>>> in the kernel config and in addition to that for RACK
>>>> options RATELIMIT
>>>>
>>>>> dmesg shows:
>>>>>
>>>>> KLD tcp_bbr.ko: depends on tcphpts - not available or version mismatch
>>>>> linker_load_file: /boot/kernel/tcp_bbr.ko - unsupported file type
>>>>>
>>>>> Any hints on solving the problem?
>>>>>
>>>>> The kernel config is GENERIC.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Gordon
Re: Error loading tcp_bbr kernel module
Hi Michael,

thanks for your reply.

I tried tcp_rack and tcp_bbr, since both are separate TCP stacks. I just
posted the wrong error message. Neither TCP stack was loadable as a kernel
module with just the previously mentioned build option.

I currently have a build running with both kernel options you mentioned.

If the build is successful and I can change the default TCP stack to RACK and
BBR, I'll let you know.

Furthermore, I didn't find any documentation within tcp(4) regarding RACK and BBR.
Since I am about to enhance the manpages, I'll extend tcp(4) with
information about RACK and BBR, but this is a different topic.

Best regards,

Gordon

> On 09.05.2020 at 14:37, Michael Tuexen wrote:
>
>> On 9. May 2020, at 14:18, Gordon Bergling wrote:
>>
>> Greetings,
>>
>> I built -CURRENT with WITH_EXTRA_TCP_STACKS=1, but I got the following error
>> when I try to load, for example, tcp_bbr.ko.
>>
>> kldload: an error occurred while loading module tcp_rack.ko. Please check
>> dmesg(8) for more details.
> This indicates that you want to load the RACK stack.
>
> Please note that you need for BBR and RACK:
> options TCPHPTS
> in the kernel config and in addition to that for RACK
> options RATELIMIT
>
> Best regards
> Michael
>>
>> dmesg shows:
>>
>> KLD tcp_bbr.ko: depends on tcphpts - not available or version mismatch
>> linker_load_file: /boot/kernel/tcp_bbr.ko - unsupported file type
>>
>> Any hints on solving the problem?
>>
>> The kernel config is GENERIC.
>>
>> Best regards,
>>
>> Gordon
Re: CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ?
On Sat, May 09, 2020 at 07:16:27PM +0300, Andriy Gapon wrote: > On 09/05/2020 19:13, Konstantin Belousov wrote: > > On Sat, May 09, 2020 at 06:52:24PM +0300, Andriy Gapon wrote: > >> On 08/05/2020 19:15, Konstantin Belousov wrote: > >>> On Fri, May 08, 2020 at 06:53:24PM +0300, Andriy Gapon wrote: > > I have a reproducible panic with a custom kernel without option NUMA > while using > amdgpu driver from linuxkpi-based drm: > > panic: address 41ec0 beyond the last segment > > I did some quick debugging and the panic happens when Xorg server tries > to > access a frame buffer (or something like that). There is a page fault > that gets > satisfied by ttm with a fictitious page. > > The stack trace is: > #11 0x808031a3 in panic (fmt=0x8119a998 > "5\003ʀ\377\377\377\377") at > /usr/devel/git/motil/sys/kern/kern_shutdown.c:839 > #12 0x80bbc552 in pmap_enter (pmap=, > va=34504441856, > m=, prot=, flags=, > psind= out>) at /usr/devel/git/motil/sys/amd64/amd64/pmap.c:6035 > #13 0x80b288be in vm_fault_populate (fs=) at > /usr/devel/git/motil/sys/vm/vm_fault.c:519 > #14 vm_fault_allocate (fs=) at > /usr/devel/git/motil/sys/vm/vm_fault.c:1032 > #15 vm_fault (map=, vaddr=, > fault_type= out>, fault_flags=, m_hold=) at > /usr/devel/git/motil/sys/vm/vm_fault.c:1342 > #16 0x80b26e7e in vm_fault_trap (map=0xfe0017cd39e8, > vaddr=, fault_type=, fault_flags=0, > signo=0xfe00a810dbc4, ucode=0xfe00a810dbc0) at > /usr/devel/git/motil/sys/vm/vm_fault.c:589 > #17 0x80bcf89c in trap_pfault (frame=0xfe00a810dc00, > usermode=, signo=, ucode=0x80853250 > ) at /usr/devel/git/motil/sys/amd64/amd64/trap.c:821 > #18 0x80bceeec in trap (frame=0xfe00a810dc00) at > /usr/devel/git/motil/sys/amd64/amd64/trap.c:34 > > > The line number in pmap_enter() is incorrect, I guess because of > optimizations. > The assert seems to be reached via pmap_enter -> > CHANGE_PV_LIST_LOCK_TO_PHYS -> > PHYS_TO_PV_LIST_LOCK -> pa_index(). 
> > The panic in correct in that the page is fictitious and its physical > address is > beyond the end of real physical memory. > It seems that NUMA PHYS_TO_PV_LIST_LOCK() is aware of such pages, but > !NUMA one > is not. > >>> > >>> I think you can remove this assert. pa_index() is always taken by > >>> % NVP_LIST_LOCKS, because fictitious mappings are not promoted. > >>> > >>> Try that and commit if it works for you. > >> > >> I tried this change: > >> diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c > >> index 4deed86a76d1a..b834b7f0388b7 100644 > >> --- a/sys/amd64/amd64/pmap.c > >> +++ b/sys/amd64/amd64/pmap.c > >> @@ -345,7 +345,7 @@ pmap_pku_mask_bit(pmap_t pmap) > >> #define NPV_LIST_LOCKS MAXCPU > >> > >> #define PHYS_TO_PV_LIST_LOCK(pa)\ > >> - (&pv_list_locks[pa_index(pa) % NPV_LIST_LOCKS]) > >> + (&pv_list_locks[((pa) >> PDRSHIFT) % NPV_LIST_LOCKS]) > >> #endif > >> > >> #define CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa) do {\ > >> > >> It fixed the original problem, but I got a new panic. > >> "DI already started" in pmap_remove() -> pmap_delayed_invl_start_u(). > >> I guess that !NUMA variant does not get much testing, so I'll probably just > >> stick with the default. > > Why didn't you just removed the KASSERT from pa_index ? > > Well, I thought it might be useful in the NUMA case. > pa_index() definition is shared between both cases. Might be define the macro two times, for NUMA/non-NUMA. non-NUMA case does not need the assert, because users take it mod NPV_LIST_LOCKS. ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
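The point that the non-NUMA lock lookup does not need the range check can be illustrated outside the kernel. The helpers below are a minimal sketch under stated assumptions: pa_index_checked() and pv_list_lock_index() are hypothetical names, NPV_LIST_LOCKS is set to 256 as an assumed stand-in for MAXCPU, and last_segment_end is an invented 16 GiB limit. Only PDRSHIFT (21, the 2 MiB superpage shift on amd64) and the shift-then-modulo expression come from the diff in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PDRSHIFT       21	/* 2 MiB superpage shift on amd64 */
#define NPV_LIST_LOCKS 256	/* assumed stand-in for MAXCPU */

/* Hypothetical end of the last physical memory segment (16 GiB). */
static const uint64_t last_segment_end = 16ULL << 30;

/*
 * Checked index, as needed when the result indexes a table sized for real
 * memory only. Returns false where the kernel assert would fire.
 */
static bool
pa_index_checked(uint64_t pa, uint64_t *idx)
{
	assert(idx != NULL);
	if (pa > last_segment_end)
		return (false);	/* "address %lx beyond the last segment" */
	*idx = pa >> PDRSHIFT;
	return (true);
}

/*
 * Lock selection for the non-NUMA case: the modulo keeps the result inside
 * the lock array for any physical address, including fictitious pages.
 */
static unsigned
pv_list_lock_index(uint64_t pa)
{
	return ((unsigned)((pa >> PDRSHIFT) % NPV_LIST_LOCKS));
}
```

With these definitions, an address of 1 TiB (well beyond the assumed last segment) is rejected by pa_index_checked() but still maps to a valid slot through pv_list_lock_index(), which is why the assert is only useful where the raw index is used directly.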
Error loading tcp_bbr kernel module
Greetings,

I built -CURRENT with WITH_EXTRA_TCP_STACKS=1, but I got the following error
when I try to load, for example, tcp_bbr.ko.

kldload: an error occurred while loading module tcp_rack.ko. Please check
dmesg(8) for more details.

dmesg shows:

KLD tcp_bbr.ko: depends on tcphpts - not available or version mismatch
linker_load_file: /boot/kernel/tcp_bbr.ko - unsupported file type

Any hints on solving the problem?

The kernel config is GENERIC.

Best regards,

Gordon
Re: CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ?
On 09/05/2020 19:13, Konstantin Belousov wrote: > On Sat, May 09, 2020 at 06:52:24PM +0300, Andriy Gapon wrote: >> On 08/05/2020 19:15, Konstantin Belousov wrote: >>> On Fri, May 08, 2020 at 06:53:24PM +0300, Andriy Gapon wrote: I have a reproducible panic with a custom kernel without option NUMA while using amdgpu driver from linuxkpi-based drm: panic: address 41ec0 beyond the last segment I did some quick debugging and the panic happens when Xorg server tries to access a frame buffer (or something like that). There is a page fault that gets satisfied by ttm with a fictitious page. The stack trace is: #11 0x808031a3 in panic (fmt=0x8119a998 "5\003ʀ\377\377\377\377") at /usr/devel/git/motil/sys/kern/kern_shutdown.c:839 #12 0x80bbc552 in pmap_enter (pmap=, va=34504441856, m=, prot=, flags=, psind=>>> out>) at /usr/devel/git/motil/sys/amd64/amd64/pmap.c:6035 #13 0x80b288be in vm_fault_populate (fs=) at /usr/devel/git/motil/sys/vm/vm_fault.c:519 #14 vm_fault_allocate (fs=) at /usr/devel/git/motil/sys/vm/vm_fault.c:1032 #15 vm_fault (map=, vaddr=, fault_type=>>> out>, fault_flags=, m_hold=) at /usr/devel/git/motil/sys/vm/vm_fault.c:1342 #16 0x80b26e7e in vm_fault_trap (map=0xfe0017cd39e8, vaddr=, fault_type=, fault_flags=0, signo=0xfe00a810dbc4, ucode=0xfe00a810dbc0) at /usr/devel/git/motil/sys/vm/vm_fault.c:589 #17 0x80bcf89c in trap_pfault (frame=0xfe00a810dc00, usermode=, signo=, ucode=0x80853250 ) at /usr/devel/git/motil/sys/amd64/amd64/trap.c:821 #18 0x80bceeec in trap (frame=0xfe00a810dc00) at /usr/devel/git/motil/sys/amd64/amd64/trap.c:34 The line number in pmap_enter() is incorrect, I guess because of optimizations. The assert seems to be reached via pmap_enter -> CHANGE_PV_LIST_LOCK_TO_PHYS -> PHYS_TO_PV_LIST_LOCK -> pa_index(). The panic in correct in that the page is fictitious and its physical address is beyond the end of real physical memory. It seems that NUMA PHYS_TO_PV_LIST_LOCK() is aware of such pages, but !NUMA one is not. 
>>>
>>> I think you can remove this assert. pa_index() is always taken by
>>> % NPV_LIST_LOCKS, because fictitious mappings are not promoted.
>>>
>>> Try that and commit if it works for you.
>>
>> I tried this change:
>> diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
>> index 4deed86a76d1a..b834b7f0388b7 100644
>> --- a/sys/amd64/amd64/pmap.c
>> +++ b/sys/amd64/amd64/pmap.c
>> @@ -345,7 +345,7 @@ pmap_pku_mask_bit(pmap_t pmap)
>>  #define NPV_LIST_LOCKS MAXCPU
>>
>>  #define PHYS_TO_PV_LIST_LOCK(pa)\
>> -    (&pv_list_locks[pa_index(pa) % NPV_LIST_LOCKS])
>> +    (&pv_list_locks[((pa) >> PDRSHIFT) % NPV_LIST_LOCKS])
>>  #endif
>>
>>  #define CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa) do {\
>>
>> It fixed the original problem, but I got a new panic.
>> "DI already started" in pmap_remove() -> pmap_delayed_invl_start_u().
>> I guess that !NUMA variant does not get much testing, so I'll probably just
>> stick with the default.
> Why didn't you just remove the KASSERT from pa_index ?

Well, I thought it might be useful in the NUMA case.
pa_index() definition is shared between both cases.

--
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ?
On Sat, May 09, 2020 at 06:52:24PM +0300, Andriy Gapon wrote: > On 08/05/2020 19:15, Konstantin Belousov wrote: > > On Fri, May 08, 2020 at 06:53:24PM +0300, Andriy Gapon wrote: > >> > >> I have a reproducible panic with a custom kernel without option NUMA while > >> using > >> amdgpu driver from linuxkpi-based drm: > >> > >> panic: address 41ec0 beyond the last segment > >> > >> I did some quick debugging and the panic happens when Xorg server tries to > >> access a frame buffer (or something like that). There is a page fault > >> that gets > >> satisfied by ttm with a fictitious page. > >> > >> The stack trace is: > >> #11 0x808031a3 in panic (fmt=0x8119a998 > >> "5\003ʀ\377\377\377\377") at > >> /usr/devel/git/motil/sys/kern/kern_shutdown.c:839 > >> #12 0x80bbc552 in pmap_enter (pmap=, va=34504441856, > >> m=, prot=, flags=, > >> psind= >> out>) at /usr/devel/git/motil/sys/amd64/amd64/pmap.c:6035 > >> #13 0x80b288be in vm_fault_populate (fs=) at > >> /usr/devel/git/motil/sys/vm/vm_fault.c:519 > >> #14 vm_fault_allocate (fs=) at > >> /usr/devel/git/motil/sys/vm/vm_fault.c:1032 > >> #15 vm_fault (map=, vaddr=, > >> fault_type= >> out>, fault_flags=, m_hold=) at > >> /usr/devel/git/motil/sys/vm/vm_fault.c:1342 > >> #16 0x80b26e7e in vm_fault_trap (map=0xfe0017cd39e8, > >> vaddr=, fault_type=, fault_flags=0, > >> signo=0xfe00a810dbc4, ucode=0xfe00a810dbc0) at > >> /usr/devel/git/motil/sys/vm/vm_fault.c:589 > >> #17 0x80bcf89c in trap_pfault (frame=0xfe00a810dc00, > >> usermode=, signo=, ucode=0x80853250 > >> ) at /usr/devel/git/motil/sys/amd64/amd64/trap.c:821 > >> #18 0x80bceeec in trap (frame=0xfe00a810dc00) at > >> /usr/devel/git/motil/sys/amd64/amd64/trap.c:34 > >> > >> > >> The line number in pmap_enter() is incorrect, I guess because of > >> optimizations. > >> The assert seems to be reached via pmap_enter -> > >> CHANGE_PV_LIST_LOCK_TO_PHYS -> > >> PHYS_TO_PV_LIST_LOCK -> pa_index(). 
> >>
> >> The panic is correct in that the page is fictitious and its physical
> >> address is beyond the end of real physical memory.
> >> It seems that NUMA PHYS_TO_PV_LIST_LOCK() is aware of such pages, but
> >> !NUMA one is not.
> >
> > I think you can remove this assert. pa_index() is always taken by
> > % NPV_LIST_LOCKS, because fictitious mappings are not promoted.
> >
> > Try that and commit if it works for you.
>
> I tried this change:
> diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
> index 4deed86a76d1a..b834b7f0388b7 100644
> --- a/sys/amd64/amd64/pmap.c
> +++ b/sys/amd64/amd64/pmap.c
> @@ -345,7 +345,7 @@ pmap_pku_mask_bit(pmap_t pmap)
>  #define NPV_LIST_LOCKS MAXCPU
>
>  #define PHYS_TO_PV_LIST_LOCK(pa)\
> -    (&pv_list_locks[pa_index(pa) % NPV_LIST_LOCKS])
> +    (&pv_list_locks[((pa) >> PDRSHIFT) % NPV_LIST_LOCKS])
>  #endif
>
>  #define CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa) do {\
>
> It fixed the original problem, but I got a new panic.
> "DI already started" in pmap_remove() -> pmap_delayed_invl_start_u().
> I guess that !NUMA variant does not get much testing, so I'll probably just
> stick with the default.

Why didn't you just remove the KASSERT from pa_index ?
Re: CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ?
On 08/05/2020 19:15, Konstantin Belousov wrote: > On Fri, May 08, 2020 at 06:53:24PM +0300, Andriy Gapon wrote: >> >> I have a reproducible panic with a custom kernel without option NUMA while >> using >> amdgpu driver from linuxkpi-based drm: >> >> panic: address 41ec0 beyond the last segment >> >> I did some quick debugging and the panic happens when Xorg server tries to >> access a frame buffer (or something like that). There is a page fault that >> gets >> satisfied by ttm with a fictitious page. >> >> The stack trace is: >> #11 0x808031a3 in panic (fmt=0x8119a998 >> "5\003ʀ\377\377\377\377") at >> /usr/devel/git/motil/sys/kern/kern_shutdown.c:839 >> #12 0x80bbc552 in pmap_enter (pmap=, va=34504441856, >> m=, prot=, flags=, >> psind=> out>) at /usr/devel/git/motil/sys/amd64/amd64/pmap.c:6035 >> #13 0x80b288be in vm_fault_populate (fs=) at >> /usr/devel/git/motil/sys/vm/vm_fault.c:519 >> #14 vm_fault_allocate (fs=) at >> /usr/devel/git/motil/sys/vm/vm_fault.c:1032 >> #15 vm_fault (map=, vaddr=, >> fault_type=> out>, fault_flags=, m_hold=) at >> /usr/devel/git/motil/sys/vm/vm_fault.c:1342 >> #16 0x80b26e7e in vm_fault_trap (map=0xfe0017cd39e8, >> vaddr=, fault_type=, fault_flags=0, >> signo=0xfe00a810dbc4, ucode=0xfe00a810dbc0) at >> /usr/devel/git/motil/sys/vm/vm_fault.c:589 >> #17 0x80bcf89c in trap_pfault (frame=0xfe00a810dc00, >> usermode=, signo=, ucode=0x80853250 >> ) at /usr/devel/git/motil/sys/amd64/amd64/trap.c:821 >> #18 0x80bceeec in trap (frame=0xfe00a810dc00) at >> /usr/devel/git/motil/sys/amd64/amd64/trap.c:34 >> >> >> The line number in pmap_enter() is incorrect, I guess because of >> optimizations. >> The assert seems to be reached via pmap_enter -> CHANGE_PV_LIST_LOCK_TO_PHYS >> -> >> PHYS_TO_PV_LIST_LOCK -> pa_index(). >> >> The panic in correct in that the page is fictitious and its physical address >> is >> beyond the end of real physical memory. >> It seems that NUMA PHYS_TO_PV_LIST_LOCK() is aware of such pages, but !NUMA >> one >> is not. 
>
> I think you can remove this assert. pa_index() is always taken by
> % NPV_LIST_LOCKS, because fictitious mappings are not promoted.
>
> Try that and commit if it works for you.

I tried this change:
diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
index 4deed86a76d1a..b834b7f0388b7 100644
--- a/sys/amd64/amd64/pmap.c
+++ b/sys/amd64/amd64/pmap.c
@@ -345,7 +345,7 @@ pmap_pku_mask_bit(pmap_t pmap)
 #define NPV_LIST_LOCKS MAXCPU

 #define PHYS_TO_PV_LIST_LOCK(pa)\
-    (&pv_list_locks[pa_index(pa) % NPV_LIST_LOCKS])
+    (&pv_list_locks[((pa) >> PDRSHIFT) % NPV_LIST_LOCKS])
 #endif

 #define CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa) do {\

It fixed the original problem, but I got a new panic.
"DI already started" in pmap_remove() -> pmap_delayed_invl_start_u().
I guess that !NUMA variant does not get much testing, so I'll probably just
stick with the default.

--
Andriy Gapon
Re: Error loading tcp_bbr kernel module
> On 9. May 2020, at 16:25, Gordon Bergling wrote:
>
> Hi Michael,
>
> thanks for your reply.
>
> I tried tcp_rack and tcp_bbr, since both are separate TCP stacks. I just
> posted the wrong error message. Both TCP stacks weren’t loadable as a kernel
> module with just the former mentioned build option.
>
> I currently have a build running with both kernel options you mentioned.
>
> If the build is successful and I can change the default TCP stack to RACK and
> BBR, I'll let you know.
That would be great. I have them running on my machines, but I might have
missed something.
>
> Further I didn’t find any documentation within tcp(4) regarding RACK and BBR.
> Since I am about to enhance the manpages, I’ll extend tcp(4) with
> information about RACK and BBR, but this is a different topic.
>
Yes it is. And I would suggest using separate man pages, a single one for
each stack. Then the generic man page might refer to them...

Best regards
Michael
> Best regards,
>
> Gordon
>
>> On 09.05.2020 at 14:37, Michael Tuexen wrote:
>>
>>> On 9. May 2020, at 14:18, Gordon Bergling wrote:
>>>
>>> Greetings,
>>>
>>> I build -CURRENT with WITH_EXTRA_TCP_STACKS=1, but I got the following error
>>> when I try to load for example tcp_bbr.ko.
>>>
>>> kldload: an error occurred while loading module tcp_rack.ko. Please check
>>> dmesg(8) for more details.
>> This indicates that you want to load the RACK stack.
>>
>> Please note that you need for BBR and RACK:
>> options TCPHPTS
>> in the kernel config and in addition to that for RACK
>> options RATELIMIT
>>
>> Best regards
>> Michael
>>>
>>> dmesg shows:
>>>
>>> KLD tcp_bbr.ko: depends on tcphpts - not available or version mismatch
>>> linker_load_file: /boot/kernel/tcp_bbr.ko - unsupported file type
>>>
>>> Any hints on solving the problem?
>>>
>>> The kernel config is GENERIC.
>>>
>>> Best regards,
>>>
>>> Gordon
Xorg question
I run the latest current and I have the following packages installed:

xf86-input-keyboard-1.9.0_4
xf86-input-libinput-0.28.2_1
xf86-input-mouse-1.9.3_3

Should I keep all of them, or may I keep only xf86-input-libinput?

Thank you
Filippo
Re: Error loading tcp_bbr kernel module
> On 9. May 2020, at 14:18, Gordon Bergling wrote:
>
> Greetings,
>
> I build -CURRENT with WITH_EXTRA_TCP_STACKS=1, but I got the following error
> when I try to load for example tcp_bbr.ko.
>
> kldload: an error occurred while loading module tcp_rack.ko. Please check
> dmesg(8) for more details.
This indicates that you want to load the RACK stack.

Please note that you need for BBR and RACK:
options TCPHPTS
in the kernel config and in addition to that for RACK
options RATELIMIT

Best regards
Michael
>
> dmesg shows:
>
> KLD tcp_bbr.ko: depends on tcphpts - not available or version mismatch
> linker_load_file: /boot/kernel/tcp_bbr.ko - unsupported file type
>
> Any hints on solving the problem?
>
> The kernel config is GENERIC.
>
> Best regards,
>
> Gordon
Re: Error loading tcp_bbr kernel module
On Sat, May 09, 2020 at 02:18:51PM +0200, Gordon Bergling wrote:
> Greetings,
>
> I build -CURRENT with WITH_EXTRA_TCP_STACKS=1, but I got the following error
> when I try to load for example tcp_bbr.ko.
>
> kldload: an error occurred while loading module tcp_rack.ko. Please check
> dmesg(8) for more details.
>
> dmesg shows:
>
> KLD tcp_bbr.ko: depends on tcphpts - not available or version mismatch
> linker_load_file: /boot/kernel/tcp_bbr.ko - unsupported file type
>
> Any hints on solving the problem?
>
> The kernel config is GENERIC.
>
> Best regards,
>
> Gordon
>

Looks as if option TCPHPTS isn't in GENERIC, and it's a requisite for BBR.

I'd probably create a custom kernel config that amounted to:

include GENERIC
options TCPHPTS

Peace,
david
--
David H. Wolfskill da...@catwhisker.org
Donald Trump had 3 years to replenish the US stockpile of PPE -- and failed.
See http://www.catwhisker.org/~david/publickey.gpg for my public key.
Re: "make buildworld" fails for r360785?
Dimitry Andric writes:

> /usr/src/contrib/llvm-project/clang/lib/Basic/SourceManager.cpp:1228:10:
> fatal error: 'emmintrin.h' file not found
> #include <emmintrin.h>
>          ^
> ...
> > In file included from
> /usr/src/contrib/llvm-project/clang/lib/Basic/SourceManager.cpp:1228:
> > In file included from /usr/include/emmintrin.h:13:
> > /usr/include/xmmintrin.h:27:10: fatal error: 'mm_malloc.h' file not found
> > #include <mm_malloc.h>
> >          ^
> > 1 error generated.
> > *** Error code 1
>
> During which stage of buildworld is this? If it is during the
> cross-tools stage, your host environment is busted somehow.

That is the latest stage listed by the complete build log. (Which I can
make available if it's useful.)
The only complaint I see is two lines at the top of the log:

make[1]: "/usr/src/Makefile.inc1" line 325: SYSTEM_COMPILER: libclang will be built for bootstrapping a cross-compiler.
make[1]: "/usr/src/Makefile.inc1" line 330: SYSTEM_LINKER: libclang will be built for bootstrapping a cross-linker.

Makefile.inc1 was downloaded with the fresh source tree, and I have
already posted make.conf and src.conf. I am not sure what else could be
broken, nor how to diagnose it.

> This appears to happen to some people on this list, for unknown reasons.
> My guess is they either run "make delete-old" before running buildworld
> (which is the wrong order!),

I did recently run "delete-old" ... but only as directed by the official
documentation.
But ... let's say that somehow happened. How do I recover? Is there a
bootstrap process?

> or have done an earlier buildworld where
> they explicitly disabled clang, so the intrinsics headers never get
> installed.

Other than a custom kernel config file, I don't touch the system build
process.

Respectfully,

Robert Huff
Re: "make buildworld" fails for r360785?
On 9 May 2020, at 05:10, Robert Huff wrote:
>
> Chris writes:
>>> "make buildworld" fails with:
>>> ...
>>> /usr/src/contrib/llvm-project/clang/lib/Basic/SourceManager.cpp:1228:10: fatal error: 'emmintrin.h' file not found
>>> #include <emmintrin.h>
>>>          ^
>>> ...
> In file included from
> /usr/src/contrib/llvm-project/clang/lib/Basic/SourceManager.cpp:1228:
> In file included from /usr/include/emmintrin.h:13:
> /usr/include/xmmintrin.h:27:10: fatal error: 'mm_malloc.h' file not found
> #include <mm_malloc.h>
>          ^
> 1 error generated.
> *** Error code 1

During which stage of buildworld is this? If it is during the
cross-tools stage, your host environment is busted somehow.

This appears to happen to some people on this list, for unknown reasons.
My guess is they either run "make delete-old" before running buildworld
(which is the wrong order!), or have done an earlier buildworld where
they explicitly disabled clang, so the intrinsics headers never get
installed.

-Dimitry
Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311
[I caused nfsd to have things shifted in memory some to see if it
tracked content vs. page boundary for where the zeros stop.
Non-nfsd examples omitted.]

> . . .
>> nfsd hit an assert, failing ret == sz_size2index_compute(size)
>
> [Correction: That should have referenced sz_index2size_lookup(index).]
>
>> (also, but a different caller of sz_size2index):
>
> [Correction: The "also" comment should be ignored:
> sz_index2size_lookup(index) is referenced below.]
>
>>
>> (gdb) bt
>> #0 thr_kill () at thr_kill.S:4
>> #1 0x502b2170 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
>> #2 0x50211cc0 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
>> #3 0x50206104 in sz_index2size_lookup (index=) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200
>> #4 sz_index2size (index=) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:207
>> #5 ifree (tsd=0x50094018, ptr=0x50041028, tcache=0x50094138, slow_path=) at jemalloc_jemalloc.c:2583
>> #6 0x50205cac in __je_free_default (ptr=0x50041028) at jemalloc_jemalloc.c:2784
>> #7 0x50206294 in __free (ptr=0x50041028) at jemalloc_jemalloc.c:2852
>> #8 0x50287ec8 in ns_src_free (src=0x50329004, srclistsize=) at /usr/src/lib/libc/net/nsdispatch.c:452
>> #9 ns_dbt_free (dbt=0x50329000) at /usr/src/lib/libc/net/nsdispatch.c:436
>> #10 vector_free (vec=0x50329000, count=, esize=12, free_elem=) at /usr/src/lib/libc/net/nsdispatch.c:253
>> #11 nss_atexit () at /usr/src/lib/libc/net/nsdispatch.c:578
>> #12 0x5028d958 in __cxa_finalize (dso=0x0) at /usr/src/lib/libc/stdlib/atexit.c:240
>> #13 0x502117f8 in exit (status=0) at /usr/src/lib/libc/stdlib/exit.c:74
>> #14 0x10013f9c in child_cleanup (signo=) at /usr/src/usr.sbin/nfsd/nfsd.c:969
>> #15
>> #16 0x in ?? ()
>>
>> (gdb) up 3
>> #3 0x50206104 in sz_index2size_lookup (index=) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200
>> 200 assert(ret == sz_index2size_compute(index));
>>
>> (ret is optimized out.)
>>
>> 197 JEMALLOC_ALWAYS_INLINE size_t
>> 198 sz_index2size_lookup(szind_t index) {
>> 199 size_t ret = (size_t)sz_index2size_tab[index];
>> 200 assert(ret == sz_index2size_compute(index));
>> 201 return ret;
>> 202 }
>
> (gdb) print/x __je_sz_index2size_tab
> $3 = {0x0 }
>
> Also:
>
> (gdb) x/4x __je_arenas+16368/4
> 0x5030cab0 <__je_arenas+16368>: 0x 0x 0x 0x
> (gdb) print/x __je_arenas_lock
>
> $8 = {{{prof_data = {tot_wait_time = {ns = 0x0}, max_wait_time = {ns = 0x0},
> n_wait_times = 0x0, n_spin_acquired = 0x0, max_n_thds = 0x0, n_waiting_thds =
> {repr = 0x0}, n_owner_switches = 0x0,
> prev_owner = 0x0, n_lock_ops = 0x0}, lock = 0x0, postponed_next = 0x0,
> locked = {repr = 0x0}}}, witness = {name = 0x0, rank = 0x0, comp = 0x0,
> opaque = 0x0, link = {qre_next = 0x0,
> qre_prev = 0x0}}, lock_order = 0x0}
> (gdb) print/x __je_narenas_auto
> $9 = 0x0
> (gdb) print/x malloc_conf
> $10 = 0x0
> (gdb) print/x __je_ncpus
> $11 = 0x0
> (gdb) print/x __je_manual_arena_base
> $12 = 0x0
> (gdb) print/x __je_sz_pind2sz_tab
> $13 = {0x0 }
> (gdb) print/x __je_sz_size2index_tab
> $1 = {0x0 , 0x1a, 0x1b , 0x1c <repeats 64 times>}
>
>> Booting and immediately trying something like:
>>
>> service nfsd stop
>>
>> did not lead to a failure. But maybe after
>> a while it would and be less drastic than a
>> reboot or power down.
>
> More detail:
>
> So, for rpcbind and nfsd at some point a large part of
> __je_sz_size2index_tab is being stomped on, as is all of
> __je_sz_index2size_tab and more.
>
> . . .
> > For nfsd, it is similar (again showing the partially > non-zero live process context instead of the all-zeros > from the .core file): > > 0x5030cab0 <__je_arenas+16368>: 0x 0x > 0x 0x0009 > 0x5030cac0 <__je_arenas_lock>:0x 0x > 0x 0x > 0x5030cad0 <__je_arenas_lock+16>: 0x 0x > 0x 0x > 0x5030cae0 <__je_arenas_lock+32>: 0x 0x > 0x 0x > 0x5030caf0 <__je_arenas_lock+48>: 0x 0x > 0x 0x > 0x5030cb00 <__je_arenas_lock+64>: 0x 0x502ff070 > 0x 0x > 0x5030cb10 <__je_arenas_lock+80>: 0x500ebb04 0x0003 > 0x 0x > 0x5030cb20 <__je_arenas_lock+96>: 0x5030cb10 0x5030cb10 > 0x 0x > > Then the memory in the crash continues to be zero until: > > 0x5030d000 <__je_sz_size2index_tab+384>: 0x1a1b1b1b 0x1b1b1b1b > 0x1b1b1b1b 0x1b1b1b1b > > Notice the interesting page boundary for where non-zero > is first available again! >