Re: systematic crash in amdgpu init since 5.10.20. Not fixed with 5.10.21
On 09/03/2021 18:27, Holger Hoffstätte wrote:
On 2021-03-07 17:18, Eric Valette wrote:

I have the following systematic crash at boot since 5.10.20 (.19 was ok)

This laptop has two graphic cards:

03:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 14 [Radeon RX 5500/5500M / Pro 5500M] (rev c1)
07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Renoir (rev c6)

NB: cc me I'm not subscribed

CPU: 13 PID: 721 Comm: systemd-udevd Not tainted 5.10.21 #2
[ 4.446170] Hardware name: Micro-Star International Co., Ltd. Bravo 17 A4DDR/MS-17FK, BIOS E17FKAMS.117 10/29/2020
[ 4.446175] RIP: 0010:kernel_fpu_begin_mask+0xc5/0xe0
[ 4.446179] Code: 65 8a 05 86 32 9f 52 84 c0 74 9a 0f 0b eb 96 48 8b 07 f6 c4 40 75 b0 f0 80 4f 01 40 48 81 c7 00 0c 00 00 e8 cd fb ff ff eb 9d <0f> 0b eb 82 db e3 eb b8 e8 3e 63 e0 00 66 66 2e 0f 1f 84 00 00 00
[ 4.446182] RSP: 0018:bc70012ef5e8 EFLAGS: 00010202
[ 4.446185] RAX: 8001 RBX: 0003 RCX: bc70012ef65c
[ 4.446186] RDX: 9bd4415b4000 RSI: 9bd4525c RDI: 0003
[ 4.446188] RBP: 9bd433e2 R08: bc70012ef660 R09:
[ 4.446190] R10: 9bd415ba4000 R11: 9bd4525c10f0 R12: c0c46560
[ 4.446191] R13: R14: 9bd4415b4000 R15: 0001
[ 4.446194] FS: 7f00024218c0() GS:9bd71f94() knlGS:
[ 4.446196] CS: 0010 DS: ES: CR0: 80050033
[ 4.446199] CR2: 5575ca8bb8e8 CR3: 0001117e8000 CR4: 00350ee0
[ 4.446200] Call Trace:
[ 4.446532] dcn21_calculate_wm+0x49/0x410 [amdgpu]
[ 4.446848] dcn21_validate_bandwidth_fp+0x174/0x280 [amdgpu]
[ 4.447162] dcn21_validate_bandwidth+0x29/0x40 [amdgpu]
[ 4.447415] dc_validate_global_state+0x2f2/0x390 [amdgpu]
[ 4.447667] amdgpu_dm_atomic_check+0xb0d/0xc00 [amdgpu]
[ 4.447704] drm_atomic_check_only+0x55a/0x7d0 [drm]
[ 4.447735] drm_atomic_commit+0x13/0x50 [drm]
[ 4.447765] drm_client_modeset_commit_atomic+0x1e4/0x220 [drm]
[ 4.447795] drm_client_modeset_commit_locked+0x56/0x150 [drm]
[ 4.447822] drm_client_modeset_commit+0x24/0x40 [drm]
[ 4.447840] drm_fb_helper_set_par+0xa5/0xd0 [drm_kms_helper]
[ 4.447846] fbcon_init+0x2b3/0x570
[ 4.447850] visual_init+0xce/0x130
[ 4.447853] do_bind_con_driver.isra.0+0x1db/0x2e0
[ 4.447857] do_take_over_console+0x116/0x180
[ 4.447861] do_fbcon_takeover+0x5c/0xc0
[ 4.447864] register_framebuffer+0x1e4/0x300
[ 4.447881] __drm_fb_helper_initial_config_and_unlock+0x321/0x4a0 [drm_kms_helper]
[ 4.448081] amdgpu_fbdev_init+0xb9/0xf0 [amdgpu]
[ 4.448326] amdgpu_device_init.cold+0x166b/0x1a4d [amdgpu]
[ 4.448334] ? pci_bus_read_config_word+0x49/0x70
[ 4.448527] amdgpu_driver_load_kms+0x2b/0x1f0 [amdgpu]
[ 4.448718] amdgpu_pci_probe+0x114/0x1a0 [amdgpu]
[ 4.448761] local_pci_probe+0x42/0x80
[ 4.448770] ? _cond_resched+0x16/0x40
[ 4.448774] pci_device_probe+0xfa/0x1b0
[ 4.448781] really_probe+0xf2/0x440
[ 4.448786] driver_probe_device+0xe1/0x150
[ 4.448789] device_driver_attach+0xa1/0xb0
[ 4.448792] __driver_attach+0x8a/0x150
[ 4.448794] ? device_driver_attach+0xb0/0xb0
[ 4.448797] ? device_driver_attach+0xb0/0xb0
[ 4.448800] bus_for_each_dev+0x78/0xc0
[ 4.448805] bus_add_driver+0x12b/0x1e0
[ 4.448808] driver_register+0x8b/0xe0
[ 4.448812] ? 0xc134a000
[ 4.448817] do_one_initcall+0x44/0x1d0
[ 4.448822] ? do_init_module+0x23/0x260
[ 4.448828] ? kmem_cache_alloc_trace+0xf5/0x200
[ 4.448831] do_init_module+0x5c/0x260
[ 4.448834] __do_sys_finit_module+0xb1/0x110
[ 4.448840] do_syscall_64+0x33/0x80
[ 4.448844] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 4.448848] RIP: 0033:0x7f00028da9b9
[ 4.448853] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a7 54 0c 00 f7 d8 64 89 01 48
[ 4.448855] RSP: 002b:7ffcab625508 EFLAGS: 0246 ORIG_RAX: 0139
[ 4.448860] RAX: ffda RBX: 56314a23dff0 RCX: 7f00028da9b9
[ 4.448862] RDX: RSI: 7f0002a65e2d RDI: 0017
[ 4.448864] RBP: 0002 R08: R09: 56314a243020
[ 4.448866] R10: 0017 R11: 0246 R12: 7f0002a65e2d
[ 4.448868] R13: R14: 56314a24a2d0 R15: 56314a23dff0
[ 4.448873] ---[ end trace 72b8a47f60a3c4b2 ]---
[ 4.449556] [ cut here ]
[ 4.449562] WARNING: CPU: 13 PID: 721 at arch/x86/kernel/fpu/core.c:155 kernel_fpu_end+0x19/0x20
[ 4.449563] Modules linked in: uinput binfmt_misc amdgpu(+) uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev iwlmvm videobuf2_common gpu_sched ttm msi_wmi dr
systematic crash in amdgpu init since 5.10.20. Not fixed with 5.10.21
I have the following systematic crash at boot since 5.10.20 (.19 was ok)

This laptop has two graphic cards:

03:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 14 [Radeon RX 5500/5500M / Pro 5500M] (rev c1)
07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Renoir (rev c6)

NB: cc me I'm not subscribed

CPU: 13 PID: 721 Comm: systemd-udevd Not tainted 5.10.21 #2
[4.446170] Hardware name: Micro-Star International Co., Ltd. Bravo 17 A4DDR/MS-17FK, BIOS E17FKAMS.117 10/29/2020
[4.446175] RIP: 0010:kernel_fpu_begin_mask+0xc5/0xe0
[4.446179] Code: 65 8a 05 86 32 9f 52 84 c0 74 9a 0f 0b eb 96 48 8b 07 f6 c4 40 75 b0 f0 80 4f 01 40 48 81 c7 00 0c 00 00 e8 cd fb ff ff eb 9d <0f> 0b eb 82 db e3 eb b8 e8 3e 63 e0 00 66 66 2e 0f 1f 84 00 00 00
[4.446182] RSP: 0018:bc70012ef5e8 EFLAGS: 00010202
[4.446185] RAX: 8001 RBX: 0003 RCX: bc70012ef65c
[4.446186] RDX: 9bd4415b4000 RSI: 9bd4525c RDI: 0003
[4.446188] RBP: 9bd433e2 R08: bc70012ef660 R09:
[4.446190] R10: 9bd415ba4000 R11: 9bd4525c10f0 R12: c0c46560
[4.446191] R13: R14: 9bd4415b4000 R15: 0001
[4.446194] FS: 7f00024218c0() GS:9bd71f94() knlGS:
[4.446196] CS: 0010 DS: ES: CR0: 80050033
[4.446199] CR2: 5575ca8bb8e8 CR3: 0001117e8000 CR4: 00350ee0
[4.446200] Call Trace:
[4.446532] dcn21_calculate_wm+0x49/0x410 [amdgpu]
[4.446848] dcn21_validate_bandwidth_fp+0x174/0x280 [amdgpu]
[4.447162] dcn21_validate_bandwidth+0x29/0x40 [amdgpu]
[4.447415] dc_validate_global_state+0x2f2/0x390 [amdgpu]
[4.447667] amdgpu_dm_atomic_check+0xb0d/0xc00 [amdgpu]
[4.447704] drm_atomic_check_only+0x55a/0x7d0 [drm]
[4.447735] drm_atomic_commit+0x13/0x50 [drm]
[4.447765] drm_client_modeset_commit_atomic+0x1e4/0x220 [drm]
[4.447795] drm_client_modeset_commit_locked+0x56/0x150 [drm]
[4.447822] drm_client_modeset_commit+0x24/0x40 [drm]
[4.447840] drm_fb_helper_set_par+0xa5/0xd0 [drm_kms_helper]
[4.447846] fbcon_init+0x2b3/0x570
[4.447850] visual_init+0xce/0x130
[4.447853] do_bind_con_driver.isra.0+0x1db/0x2e0
[4.447857] do_take_over_console+0x116/0x180
[4.447861] do_fbcon_takeover+0x5c/0xc0
[4.447864] register_framebuffer+0x1e4/0x300
[4.447881] __drm_fb_helper_initial_config_and_unlock+0x321/0x4a0 [drm_kms_helper]
[4.448081] amdgpu_fbdev_init+0xb9/0xf0 [amdgpu]
[4.448326] amdgpu_device_init.cold+0x166b/0x1a4d [amdgpu]
[4.448334] ? pci_bus_read_config_word+0x49/0x70
[4.448527] amdgpu_driver_load_kms+0x2b/0x1f0 [amdgpu]
[4.448718] amdgpu_pci_probe+0x114/0x1a0 [amdgpu]
[4.448761] local_pci_probe+0x42/0x80
[4.448770] ? _cond_resched+0x16/0x40
[4.448774] pci_device_probe+0xfa/0x1b0
[4.448781] really_probe+0xf2/0x440
[4.448786] driver_probe_device+0xe1/0x150
[4.448789] device_driver_attach+0xa1/0xb0
[4.448792] __driver_attach+0x8a/0x150
[4.448794] ? device_driver_attach+0xb0/0xb0
[4.448797] ? device_driver_attach+0xb0/0xb0
[4.448800] bus_for_each_dev+0x78/0xc0
[4.448805] bus_add_driver+0x12b/0x1e0
[4.448808] driver_register+0x8b/0xe0
[4.448812] ? 0xc134a000
[4.448817] do_one_initcall+0x44/0x1d0
[4.448822] ? do_init_module+0x23/0x260
[4.448828] ? kmem_cache_alloc_trace+0xf5/0x200
[4.448831] do_init_module+0x5c/0x260
[4.448834] __do_sys_finit_module+0xb1/0x110
[4.448840] do_syscall_64+0x33/0x80
[4.448844] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[4.448848] RIP: 0033:0x7f00028da9b9
[4.448853] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a7 54 0c 00 f7 d8 64 89 01 48
[4.448855] RSP: 002b:7ffcab625508 EFLAGS: 0246 ORIG_RAX: 0139
[4.448860] RAX: ffda RBX: 56314a23dff0 RCX: 7f00028da9b9
[4.448862] RDX: RSI: 7f0002a65e2d RDI: 0017
[4.448864] RBP: 0002 R08: R09: 56314a243020
[4.448866] R10: 0017 R11: 0246 R12: 7f0002a65e2d
[4.448868] R13: R14: 56314a24a2d0 R15: 56314a23dff0
[4.448873] ---[ end trace 72b8a47f60a3c4b2 ]---
[4.449556] [ cut here ]
[4.449562] WARNING: CPU: 13 PID: 721 at arch/x86/kernel/fpu/core.c:155 kernel_fpu_end+0x19/0x20
[4.449563] Modules linked in: uinput binfmt_misc amdgpu(+) uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev iwlmvm videobuf2_common gpu_sched ttm msi_wmi drm_kms_helper pcspkr serio_raw sparse_keymap sp5100_tco cec watchdog btusb i2c_algo_bit fb_sys_fops
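For anyone triaging a trace like this one, a small helper can split it into frames and show which modules are involved. This is only a sketch: `Frame` and `parse_frames` are names made up for illustration, and it assumes the usual `symbol+0xoffset/0xsize [module]` layout, where a leading `?` marks a frame the unwinder could not confirm.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    symbol: str
    offset: int            # offset of the return address inside the function
    size: int              # total size of the function
    module: Optional[str]  # None for built-in (non-module) code
    reliable: bool         # False when the frame was printed with a leading "?"


# Hypothetical pattern for "? symbol+0xoff/0xsize [module]" entries.
FRAME_RE = re.compile(
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[A-Za-z_]\w*)\])?"
)


def parse_frames(text: str) -> List[Frame]:
    """Extract every stack frame found in a chunk of dmesg output."""
    return [
        Frame(
            symbol=m["symbol"],
            offset=int(m["offset"], 16),
            size=int(m["size"], 16),
            module=m["module"],
            reliable=m["unreliable"] is None,
        )
        for m in FRAME_RE.finditer(text)
    ]


# A few lines copied from the trace in this thread.
SAMPLE = """\
[ 4.446532] dcn21_calculate_wm+0x49/0x410 [amdgpu]
[ 4.448334] ? pci_bus_read_config_word+0x49/0x70
[ 4.448527] amdgpu_driver_load_kms+0x2b/0x1f0 [amdgpu]
[ 4.448761] local_pci_probe+0x42/0x80
"""

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        flag = " " if frame.reliable else "?"
        print(f"{flag} {frame.symbol} ({frame.module or 'built-in'})")
```

Filtering on `reliable` and `module` makes it easier to see that the confirmed frames above the PCI probe path all sit in amdgpu and drm.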
Re: Periodic Null pointer access with 4.14.29-30 (while using minidlna) in sendfile system call
On 3/25/18 11:39 PM, Eric Valette wrote:
On 3/25/18 11:07 PM, Eric Valette wrote:

Since a few kernels I get this error when streaming video to TV via minidlna:

Checked: it started with at least 4.14.25. My older kern.log.2.gz file says that before the first of March I did not get this error. But I removed older kernels from my /boot directory. I usually update the kernel regularly as releases appear, but sometimes I have a week or two of delay. If this helps bisecting.

Opened bug 199211 in bugzilla. Other users have confirmed it and even proposed a possible explanation.

-- eric
Re: Periodic Null pointer access with 4.14.29-30 (while using minidlna) in sendfile system call
On 3/25/18 11:07 PM, Eric Valette wrote:

Since a few kernels I get this error when streaming video to TV via minidlna:

Checked: it started with at least 4.14.25. My older kern.log.2.gz file says that before the first of March I did not get this error. But I removed older kernels from my /boot directory. I usually update the kernel regularly as releases appear, but sometimes I have a week or two of delay. If this helps bisecting.

-- eric
Periodic Null pointer access with 4.14.29-30 (while using minidlna) in sendfile system call
pe_to_sendpage+0x57/0x59
[ 3467.935762] __splice_from_pipe+0xa4/0x150
[ 3467.935765] splice_from_pipe+0x50/0x66
[ 3467.935770] ? generic_pipe_buf_nosteal+0x6/0x6
[ 3467.935774] direct_splice_actor+0x2e/0x2f
[ 3467.935779] splice_direct_to_actor+0xf2/0x18b
[ 3467.935782] ? vmsplice_to_user+0x10f/0x10f
[ 3467.935787] do_splice_direct+0x87/0x9b
[ 3467.935793] do_sendfile+0x186/0x2d6
[ 3467.935797] SyS_sendfile64+0x50/0x7f
[ 3467.935802] do_syscall_64+0x5c/0xe7
[ 3467.935808] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 3467.935812] RIP: 0033:0x7f96f3c30bea
[ 3467.935815] RSP: 002b:7fff5ca2d008 EFLAGS: 0206 ORIG_RAX: 0028
[ 3467.935819] RAX: ffda RBX: RCX: 7f96f3c30bea
[ 3467.935823] RDX: 7fff5ca2d018 RSI: 000b RDI: 000a
[ 3467.935826] RBP: af0c410c R08: R09: 5633e3b45fe7
[ 3467.935830] R10: 7fff R11: 0206 R12: 000b
[ 3467.935833] R13: 0001 R14: 5633e3ddefa0 R15: 1b31
[ 3467.935837] Code: 44 03 8f 44 06 00 00 79 10 80 48 38 08 8b 8f 3c 06 00 00 89 8f 44 06 00 00 40 80 e6 01 74 0c 8b 8f 3c 06 00 00 89 8f cc 05 00 00 <44> 39 40 78 72 13 89 de 45 85 db b8 02 00 00 00 5b 0f 45 d0 e9
[ 3467.935882] RIP: tcp_push+0x70/0xeb RSP: c900015b7bc0
[ 3467.935885] CR2: 0078
[ 3467.935909] ---[ end trace 4fc2992bf2697a63 ]---
[ 9308.859580] BUG: unable to handle kernel NULL pointer dereference at 0078
[ 9308.859596] IP: tcp_push+0x70/0xeb
[ 9308.859598] PGD 0 P4D 0
[ 9308.859603] Oops: [#6] SMP PTI
[ 9308.859606] Modules linked in: uas usb_storage ehci_pci ehci_hcd xhci_pci xhci_hcd usbcore usb_common [last unloaded: usb_storage]
[ 9308.859621] CPU: 0 PID: 4785 Comm: minidlnad Tainted: G D 4.14.30 #1
[ 9308.859624] Hardware name: MotherBoard By ZOTAC MotherBoard H67ITX-C-E/H67ITX-C-E, BIOS 4.6.4.B 08/25/2011
[ 9308.859629] task: 88022f1d6280 task.stack: c900016b
[ 9308.859634] RIP: 0010:tcp_push+0x70/0xeb
[ 9308.859637] RSP: 0018:c900016b3bc0 EFLAGS: 00010246
[ 9308.859640] RAX: RBX: 05a8 RCX:
[ 9308.859644] RDX: RSI: 00028000 RDI: 880236f0e000
[ 9308.859647] RBP: 880236f0e138 R08: 05a8 R09: 0002
[ 9308.859651] R10: 880236f0e138 R11: 8000 R12: 88023662ba80
[ 9308.859654] R13: 880236f0e000 R14: 05a8 R15: ffe0
[ 9308.859659] FS: 7f96f5e7ce80() GS:88023fa0() knlGS:
[ 9308.859662] CS: 0010 DS: ES: CR0: 80050033
[ 9308.859666] CR2: 0078 CR3: 00022f308005 CR4: 000606f0
[ 9308.859669] Call Trace:
[ 9308.859675] tcp_sendmsg_locked+0xac3/0xc1b
[ 9308.859682] ? generic_pipe_buf_nosteal+0x6/0x6
[ 9308.859687] sock_no_sendpage_locked+0x74/0x7b
[ 9308.859692] tcp_sendpage_locked+0x36/0x72
[ 9308.859697] ? release_sock+0x3b/0x74
[ 9308.859701] ? lock_sock_nested+0xd/0x38
[ 9308.859705] tcp_sendpage+0x38/0x4f
[ 9308.859711] inet_sendpage+0x74/0xd3
[ 9308.859715] kernel_sendpage+0x15/0x1c
[ 9308.859719] sock_sendpage+0x1b/0x1e
[ 9308.859723] pipe_to_sendpage+0x57/0x59
[ 9308.859727] __splice_from_pipe+0xa4/0x150
[ 9308.859730] splice_from_pipe+0x50/0x66
[ 9308.859735] ? generic_pipe_buf_nosteal+0x6/0x6
[ 9308.859739] direct_splice_actor+0x2e/0x2f
[ 9308.859743] splice_direct_to_actor+0xf2/0x18b
[ 9308.859747] ? vmsplice_to_user+0x10f/0x10f
[ 9308.859752] do_splice_direct+0x87/0x9b
[ 9308.859757] do_sendfile+0x186/0x2d6
[ 9308.859761] SyS_sendfile64+0x50/0x7f
[ 9308.859766] do_syscall_64+0x5c/0xe7
[ 9308.859771] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 9308.859776] RIP: 0033:0x7f96f3c30bea
[ 9308.859778] RSP: 002b:7fff5ca2d008 EFLAGS: 0212 ORIG_RAX: 0028
[ 9308.859783] RAX: ffda RBX: RCX: 7f96f3c30bea
[ 9308.859787] RDX: 7fff5ca2d018 RSI: 000b RDI: 000a
[ 9308.859790] RBP: af0c410c R08: R09: 5633e3b45fe7
[ 9308.859793] R10: 2f0c35e0 R11: 0212 R12: 000b
[ 9308.859797] R13: 0001 R14: 5633e3ddefa0 R15: 8b2d
[ 9308.859800] Code: 44 03 8f 44 06 00 00 79 10 80 48 38 08 8b 8f 3c 06 00 00 89 8f 44 06 00 00 40 80 e6 01 74 0c 8b 8f 3c 06 00 00 89 8f cc 05 00 00 <44> 39 40 78 72 13 89 de 45 85 db b8 02 00 00 00 5b 0f 45 d0 e9
[ 9308.859846] RIP: tcp_push+0x70/0xeb RSP: c900016b3bc0
[ 9308.859848] CR2: 0078
[ 9308.859851] ---[ end trace 4fc2992bf2697a64 ]---
root@nas2:~#

-- __ / ` Eric Valette /-- __ o _. 6 rue Paul Le Flem (___, / (_(_(__ 35740 Pace Tel: +33 (0)2 99 85 26 76 Fax: +33 (0)2 99 85 26 76 E-mail: eric.vale...@free.fr
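The `Code:` line can be cross-checked against CR2. The byte in angle brackets is where RIP pointed when the oops fired. Read as x86-64, `44 39 40 78` looks like `cmp %r8d,0x78(%rax)`, so a NULL `%rax` would produce exactly the 0078 fault address reported. RAX prints blank in this copy of the log, so that reading is an inference and not something the log states. A minimal sketch of the check follows; the function names are invented for illustration, and `disp8_of_cmp` only recognises this one instruction shape, it is not a disassembler.

```python
from typing import List, Optional, Tuple

# "Code:" line copied from the oops above.
CODE_LINE = (
    "Code: 44 03 8f 44 06 00 00 79 10 80 48 38 08 8b 8f 3c 06 00 00 89 8f 44 "
    "06 00 00 40 80 e6 01 74 0c 8b 8f 3c 06 00 00 89 8f cc 05 00 00 <44> 39 "
    "40 78 72 13 89 de 45 85 db b8 02 00 00 00 5b 0f 45 d0 e9"
)


def split_code_line(line: str) -> Tuple[List[int], List[int]]:
    """Return (bytes before the fault, bytes from the faulting instruction on)."""
    tokens = line.split("Code:", 1)[1].split()
    marked = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))

    def to_int(tok: str) -> int:
        return int(tok.strip("<>"), 16)

    return [to_int(t) for t in tokens[:marked]], [to_int(t) for t in tokens[marked:]]


def disp8_of_cmp(insn: List[int]) -> Optional[int]:
    """For the shape REX, 0x39 (CMP r/m32, r32), ModRM mod=01, disp8: return disp8.

    Returns None for anything else, including forms that need a SIB byte.
    """
    if len(insn) < 4:
        return None
    rex, opcode, modrm, disp = insn[:4]
    has_rex = 0x40 <= rex <= 0x4F
    mod, rm = modrm >> 6, modrm & 0b111
    if has_rex and opcode == 0x39 and mod == 0b01 and rm != 0b100:
        return disp - 256 if disp >= 0x80 else disp  # disp8 is signed
    return None


if __name__ == "__main__":
    before, faulting = split_code_line(CODE_LINE)
    print("bytes before RIP:", len(before))
    print("displacement:", hex(disp8_of_cmp(faulting)))
```

If the displacement matches CR2, the base register was most likely NULL, which narrows the search to whichever pointer `tcp_push` loads just before that compare.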
Re: Linux 4.9.55 break network setup because dhcp client gets an error
On 12/10/2017 19:57, Shuah Khan wrote:
On Thu, Oct 12, 2017 at 11:52 AM, Eric Valette <eric.vale...@free.fr> wrote:

Hi,

Just compiled a fresh 4.9.55, with the same .config and the same user space as 4.9.54, and discovered I had no network because ifup fails because the dhcp client fails. As everything is identical and 4.9.54 still works, I guess one patch in net/core/... did break something.

Please revert

commit 345c66695569db83eed100723e4df72cb54df7de
Author: Eric Dumazet <eduma...@google.com>
Date: Mon Oct 2 12:20:51 2017 -0700

socket, bpf: fix possible use after free

This will fix the problem.

Thanks for the quick answer. Glad to see it is already known and almost fixed. I'd rather wait for 56 (to get the correct patch dependency merged) and remove the 55 kernel image from /boot in the meantime. Glad my headless NAS has a fixed address, because it is trickier to get back in order and I had already switched it to 55.

Thanks for the support,

-- eric
Linux 4.9.55 break network setup because dhcp client gets an error
Hi,

Just compiled a fresh 4.9.55, with the same .config and the same user space as 4.9.54, and discovered I had no network because ifup fails because the dhcp client fails. As everything is identical and 4.9.54 still works, I guess one patch in net/core/... did break something.

ifup eth0
Internet Systems Consortium DHCP Client 4.3.5
Copyright 2004-2016 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Can't install packet filter program: Cannot allocate memory <=

If you think you have received this message due to a bug rather than a configuration issue please read the section on submitting bugs on either our web page at www.isc.org or in the README file before submitting a bug. These pages explain the proper process and the information we find helpful for debugging..

exiting.
ifup: failed to bring up eth0

PS: CC me, I'm not subscribed.

--eric
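To check a kernel for this kind of failure without going through ifup, one can try attaching a trivial classic BPF filter to a socket by hand. The sketch below rests on several assumptions: that dhclient's "Can't install packet filter program" comes from a failed `setsockopt(SO_ATTACH_FILTER)`, that the constant is 26 as on x86 (Python's `socket` module does not export it), and that an ordinary UDP socket exercises the same kernel path as dhclient's packet socket. Whether it reproduces the 4.9.55 error is untested; `build_accept_all` and `try_attach` are hypothetical helper names.

```python
import ctypes
import socket
import struct

# Assumed value from asm-generic/socket.h; a few architectures use another.
SO_ATTACH_FILTER = 26


def build_accept_all():
    """Build a one-instruction classic BPF program that accepts every packet.

    Returns (buf, fprog). buf owns the instruction memory and must stay
    alive for as long as fprog is used, since fprog only holds its address.
    """
    # struct sock_filter { u16 code; u8 jt; u8 jf; u32 k; }
    # code 0x06 is BPF_RET | BPF_K; k is the number of bytes to accept.
    insn = struct.pack("HBBI", 0x06, 0, 0, 0xFFFFFFFF)
    buf = ctypes.create_string_buffer(insn, len(insn))
    # struct sock_fprog { unsigned short len; struct sock_filter *filter; }
    fprog = struct.pack("HP", 1, ctypes.addressof(buf))
    return buf, fprog


def try_attach():
    """Attach the filter to a throwaway UDP socket; return None or the OSError."""
    buf, fprog = build_accept_all()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.setsockopt(socket.SOL_SOCKET, SO_ATTACH_FILTER, fprog)
        except OSError as exc:
            return exc
    return None


if __name__ == "__main__":
    err = try_attach()
    print("filter attached" if err is None else f"attach failed: {err}")
```

An `ENOMEM` ("Cannot allocate memory") result from a one-instruction filter would point at the kernel rather than at the DHCP client or its configuration.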
Freezing of tasks failed ... *0* tasks refusing to freeze, wq_busy=1)
I have a NAS system that suspends itself when rx/tx counts drop below a level meaning no activity... The last working kernel was 3.14.58; anything longterm after that refuses to freeze after a while with:

[ 9692.905439] ehci-pci :00:1d.0: remove, state 1
[ 9692.905448] usb usb4: USB disconnect, device number 1
[ 9692.905451] usb 4-1: USB disconnect, device number 2
[ 9692.905453] usb 4-1.2: USB disconnect, device number 3
[ 9692.913779] ehci-pci :00:1d.0: USB bus 4 deregistered
[ 9692.913950] ehci-pci :00:1a.0: remove, state 4
[ 9692.913957] usb usb1: USB disconnect, device number 1
[ 9692.913959] usb 1-1: USB disconnect, device number 2
[ 9692.918351] ehci-pci :00:1a.0: USB bus 1 deregistered
[ 9693.227548] PM: Syncing filesystems ... done.
[ 9693.357287] Freezing user space processes ... (elapsed 0.001 seconds) done.
[ 9693.358563] Freezing remaining freezable tasks ...
[ 9713.363844] Freezing of tasks failed after 20.015 seconds (0 tasks refusing to freeze, wq_busy=1): << 0)???
[ 9713.363921] Restarting kernel threads ... done.
[ 9713.363956] Restarting tasks ... done.
[ 9713.496684] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 9713.496787] ehci-pci: EHCI PCI platform driver
[ 9713.496902] ehci-pci :00:1a.0: EHCI Host Controller
[ 9713.496908] ehci-pci :00:1a.0: new USB bus registered, assigned bus number 1
[ 9713.496921] ehci-pci :00:1a.0: debug port 2
[ 9713.500878] ehci-pci :00:1a.0: cache line size of 64 is not supported
[ 9713.500890] ehci-pci :00:1a.0: irq 16, io mem 0xfe707000
[ 9713.517601] ehci-pci :00:1a.0: USB 2.0 started, EHCI 1.00
[ 9713.518144] hub 1-0:1.0: USB hub found
[ 9713.518191] hub 1-0:1.0: 2 ports detected
[ 9713.567741] ehci-pci :00:1d.0: EHCI Host Controller
[ 9713.567750] ehci-pci :00:1d.0: new USB bus registered, assigned bus number 4
[ 9713.567765] ehci-pci :00:1d.0: debug port 2
[ 9713.571674] ehci-pci :00:1d.0: cache line size of 64 is not supported
[ 9713.571692] ehci-pci :00:1d.0: irq 23, io mem 0xfe706000
[ 9713.587580] ehci-pci :00:1d.0: USB 2.0 started, EHCI 1.00
[ 9713.588260] hub 4-0:1.0: USB hub found
[ 9713.588307] hub 4-0:1.0: 2 ports detected
[ 9713.837399] usb 1-1: new high-speed USB device number 2 using ehci-pci
[ 9713.907412] usb 4-1: new high-speed USB device number 2 using ehci-pci
[ 9713.988167] hub 1-1:1.0: USB hub found
[ 9713.988384] hub 1-1:1.0: 6 ports detected
[ 9714.058113] hub 4-1:1.0: USB hub found
[ 9714.058307] hub 4-1:1.0: 8 ports detected
[ 9714.267224] usb 1-1.1: new full-speed USB device number 3 using ehci-pci
[ 9714.337189] usb 4-1.2: new high-speed USB device number 3 using ehci-pci

Freezing user space processes ... (elapsed 0.001 seconds) done.
[ 9693.358563] Freezing remaining freezable tasks ...
[ 9713.363844] Freezing of tasks failed after 20.015 seconds (0 tasks refusing to freeze, wq_busy=1):

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
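On the "0 tasks" puzzle: as far as I can tell from the freezer code in kernel/power/process.c, `wq_busy` is set when freezable workqueues still have work in flight, so a failure with zero tasks and `wq_busy=1` would point at a workqueue item that never finished rather than at a stuck thread. The sketch below pulls those fields out of a log so that repeated failures can be tallied; the helper names are invented, and the interpretation in `likely_culprit` is my reading of the message, not something the log spells out.

```python
import re
from typing import Dict, List

FREEZE_FAIL_RE = re.compile(
    r"Freezing of tasks failed after (?P<seconds>\d+\.\d+) seconds "
    r"\((?P<tasks>\d+) tasks refusing to freeze, wq_busy=(?P<wq_busy>\d+)\)"
)


def freeze_failures(log_text: str) -> List[Dict[str, object]]:
    """Return one record per 'Freezing of tasks failed' line in the log."""
    return [
        {
            "seconds": float(m["seconds"]),
            "tasks": int(m["tasks"]),
            "wq_busy": bool(int(m["wq_busy"])),
        }
        for m in FREEZE_FAIL_RE.finditer(log_text)
    ]


def likely_culprit(failure: Dict[str, object]) -> str:
    """Rough classification; an assumption, not an authoritative diagnosis."""
    if failure["tasks"] == 0 and failure["wq_busy"]:
        return "busy freezable workqueue"
    if failure["tasks"]:
        return "task(s) refusing to freeze"
    return "unknown"


# Line copied from the log above.
SAMPLE = (
    "[ 9713.363844] Freezing of tasks failed after 20.015 seconds "
    "(0 tasks refusing to freeze, wq_busy=1):"
)

if __name__ == "__main__":
    for failure in freeze_failures(SAMPLE):
        print(failure, "->", likely_culprit(failure))
```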
3.18.11 longterm refuse to suspend to ram/shutdown
Hi,

I have a NAS built around a dedicated NAS enclosure and a classical mini-atx zotac Intel board. I was happily using 3.14.x (x=36) and sleepd to automatically suspend to RAM when Ethernet traffic was down for 15 minutes, with the wake-on-LAN mechanism for resume.

As 3.18 has been declared longterm, I switched to the 3.18 branch and I can no longer suspend. My dmesg is full of messages saying it can't suspend. And when this happens I cannot shutdown/reboot anymore.

It seems like https://bugzilla.kernel.org/show_bug.cgi?id=86241 but that does not seem to be fixed yet.

Any hint?

PS: CC me, I'm not subscribed.

-- eric
Re: memmap exclude boot command: how to check if it was indeed applied
On 01/07/2013 20:40, Eric Valette wrote:

Hi,

After hunting an unreproducible bug, I decided to run memtest86+ and found that only 8 bytes of memory refuse to write the last two digits, on the last 4GB memory stick. The memory is at 0x31db357558.

So I decided to add a memmap=4K@0x00031db35000 boot option in GRUB_CMDLINE_LINUX. But checking dmesg, I see no mention of this page being reserved.

PS: CC me, I'm not subscribed.

Analysing the /sys/firmware/memmap files, I see no trace of the buggy address being excluded.

Any help?

--eric
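As I understand the sysfs ABI description, /sys/firmware/memmap exposes the raw map handed over by the firmware, so a `memmap=` override would not be expected to show up there. /proc/iomem reflects the map the kernel actually uses and may be the better place to look. A minimal sketch of such a check follows; the helper names are hypothetical, the sample text is invented to show the format and is not taken from the reporter's machine, and on recent kernels the real file may show zeroed addresses unless read as root.

```python
import re
from typing import List, Tuple

# Lines look like "  31db35000-31db35fff : Reserved"; indentation marks nesting.
IOMEM_RE = re.compile(r"^(\s*)([0-9a-f]+)-([0-9a-f]+) : (.*)$")


def regions_containing(iomem_text: str, addr: int) -> List[Tuple[int, int, str]]:
    """Return (start, end, name) for every /proc/iomem entry covering addr."""
    hits = []
    for line in iomem_text.splitlines():
        m = IOMEM_RE.match(line)
        if m is None:
            continue
        start, end = int(m[2], 16), int(m[3], 16)
        if start <= addr <= end:
            hits.append((start, end, m[4].strip()))
    return hits


def is_reserved(iomem_text: str, addr: int) -> bool:
    """True when some entry covering addr is named 'reserved' (any case)."""
    return any(name.lower() == "reserved"
               for _, _, name in regions_containing(iomem_text, addr))


# Invented sample: what a successful one-page reservation might look like.
SAMPLE_IOMEM = """\
00001000-0009ffff : System RAM
100000000-31db34fff : System RAM
31db35000-31db35fff : Reserved
31db36000-41fffffff : System RAM
"""

if __name__ == "__main__":
    # On a real system: text = open("/proc/iomem").read()
    for probe in (0x31db35758, 0x31db36000):
        print(hex(probe), "reserved" if is_reserved(SAMPLE_IOMEM, probe) else "not reserved")
```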
memmap exclude boot command: how to check if it was indeed applied
Hi,

After hunting an unreproducible bug, I decided to run memtest86+ and found that only 8 bytes of memory refuse to write the last two digits, on the last 4GB memory stick. The memory is at 0x31db357558.

So I decided to add a memmap=4K@0x00031db35000 boot option in GRUB_CMDLINE_LINUX. But checking dmesg, I see no mention of this page being reserved.

PS: CC me, I'm not subscribed.

-- eric
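The arithmetic behind such a boot option can be sketched as follows. This is only an illustration under stated assumptions: 4 KiB pages, and the syntax described in the kernel's command-line parameter documentation, where memmap=nn[KMG]$ss[KMG] marks a region as reserved while memmap=nn[KMG]@ss[KMG] forces the region to be used as RAM. The helper names `page_base` and `memmap_reserve_option` are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86


def page_base(addr, page_size=PAGE_SIZE):
    """Round a physical address down to the start of its page."""
    return addr & ~(page_size - 1)


def memmap_reserve_option(addr, pages=1, page_size=PAGE_SIZE):
    """Build a memmap= option that reserves the page(s) starting at addr's page.

    Uses the '$' separator (reserve) rather than '@' (use as RAM).
    """
    size_kib = pages * page_size // 1024
    return f"memmap={size_kib}K${page_base(addr, page_size):#x}"
```

For the address reported by memtest86+ in the message, `page_base(0x31db357558)` gives 0x31db357000, which has one more hex digit than the 0x31db35000 used on the command line, so one of the two values may be mistyped. Note also that some boot loaders, GRUB 2 among them, may need the '$' escaped in their configuration file, otherwise it can be swallowed before the kernel sees it.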
Re: Strange trace in dmesg on kernel 3.8.5
On 04/04/2013 09:03 PM, Eric Valette wrote:
>> I fixed this already.
>>
>> https://git.kernel.org/cgit/linux/kernel/git/jgarzik/libata-dev.git/commit/?h=upstream-fixes&id=8e725c7f8a60feaa88edacd4dee2c754d5ae7706
>
> Thanks for the support. I will apply the patch manually and wait for 3.8.6.

Just to confirm: fixed on my side with the patch. Thanks again.

-- eric
Re: Strange trace in dmesg on kernel 3.8.5
On 04/04/2013 22:45, David Woodhouse wrote:
> On Thu, 2013-04-04 at 21:44 +0200, Borislav Petkov wrote:
>>> ahci :00:1f.2: DMA-API: device driver maps memory fromstack
>>> [addr=88040df2da50]
>
> I fixed this already.
>
> https://git.kernel.org/cgit/linux/kernel/git/jgarzik/libata-dev.git/commit/?h=upstream-fixes&id=8e725c7f8a60feaa88edacd4dee2c754d5ae7706

Thanks for the support. I will apply the patch manually and wait for 3.8.6.

-- eric
Strange trace in dmesg on kernel 3.8.5
Hi,

When booting a brand new, freshly installed machine, I always notice the same worrying trace in the dmesg output:

PS: copy me, I'm not subscribed.

ata2.00: ATA-9: Samsung SSD 840 Series, DXT07B0Q, max UDMA/133
ata2.00: 234441648 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
------------[ cut here ]------------
WARNING: at /usr/src/linux-3.8.5/lib/dma-debug.c:947 check_for_stack+0x92/0xc7()
Hardware name: System Product Name
ahci :00:1f.2: DMA-API: device driver maps memory fromstack [addr=88040df2da50]
Modules linked in:
Pid: 1419, comm: scsi_eh_1 Not tainted 3.8.5 #3
Call Trace:
 [8121915c] ? check_for_stack+0x92/0xc7
 [8102bd25] warn_slowpath_common+0x7c/0x94
 [8102bd84] warn_slowpath_fmt+0x47/0x49
 [8121915c] check_for_stack+0x92/0xc7
 [8121927e] debug_dma_map_sg+0xed/0x150
 [8133fd72] ata_qc_issue+0x22c/0x2e1
 [81340061] ata_exec_internal_sg+0x23a/0x42c
 [813402d1] ata_exec_internal+0x7e/0x8b
 [813482cb] ata_read_log_page+0x78/0x7f
 [813414ea] ata_dev_configure+0xcc1/0x112b
 [81035b0d] ? del_timer_sync+0x2b/0x49
 [815341c1] ? schedule_timeout+0x145/0x165
 [8104c913] ? wake_up_process+0x32/0x32
 [815362ae] ? _raw_spin_lock_irqsave+0x18/0x39
 [8153642d] ? _raw_spin_unlock_irqrestore+0x13/0x2e
 [81340228] ? ata_exec_internal_sg+0x401/0x42c
 [813402d1] ? ata_exec_internal+0x7e/0x8b
 [8134260a] ? ata_std_postreset+0x9e/0xc4
 [81340300] ? ata_do_dev_read_id+0x22/0x24
 [81340499] ? ata_dev_read_id+0xd4/0x3f3
 [8134ea61] ? ahci_error_handler+0x3b/0x3b
 [81347a63] ? ata_ering_map+0x3a/0x5a
 [8134a4f3] ata_eh_recover+0x6b0/0xee2
 [8134ea61] ? ahci_error_handler+0x3b/0x3b
 [8134ee52] ? ahci_dev_classify+0x4d/0x4d
 [8134f9e1] ? ahci_do_softreset+0x177/0x177
 [813426b0] ? ata_phys_link_offline+0x27/0x27
 [8134b417] ata_do_eh+0x46/0x93
 [813426b0] ? ata_phys_link_offline+0x27/0x27
 [8134f9e1] ? ahci_do_softreset+0x177/0x177
 [8134ee52] ? ahci_dev_classify+0x4d/0x4d
 [8134ea61] ? ahci_error_handler+0x3b/0x3b
 [8134ee52] ? ahci_dev_classify+0x4d/0x4d
 [8134b4b9] ata_std_error_handler+0x55/0x5c
 [8134ea49] ahci_error_handler+0x23/0x3b
 [8134affe] ata_scsi_port_error_handler+0x236/0x554
 [8134b3a8] ata_scsi_error+0x8c/0xb5
 [81329f8d] scsi_error_handler+0xa3/0x40a
 [8104c920] ? default_wake_function+0xd/0xf
 [81329eea] ? scsi_eh_get_sense+0xa2/0xa2
 [81042662] kthread+0xb5/0xbd
 [810425ad] ? kthread_stop+0x53/0x53
 [815370ac] ret_from_fork+0x7c/0xb0
 [810425ad] ? kthread_stop+0x53/0x53
---[ end trace b53ccdbc387522c5 ]---

--eric
Re: rtl8187 driver in 2.6.23-rc6-git5: kernel panic if not used as a module. Works as a module.
Rob Hussey wrote:
> On 9/15/07, ポール・ロラン Paul Rolland <[EMAIL PROTECTED]> wrote:
>
>> Hi Eric,
>>
>> On Sat, 15 Sep 2007 20:30:14 +0200
>> Eric Valette <[EMAIL PROTECTED]> wrote:
>>
>>> Rob Hussey wrote:
>>>
>>>> On 9/15/07, Eric Valette <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Eric Valette wrote:
>>>>>
>>> Thanks for your help: it does indeed fix the problem.
>>>
>> Nice it works for you too !
>>
>>> Now I have two side questions:
>>> - the code is no more symmetric "subsys_initcall" -> "module_exit".
>>> Do not know if it is "normal" but I love symmetry in code :-). Did not test
>>> it still works as a module...
>>>
>> Symmetry is not broken, as we have :
>> #define subsys_initcall(fn) module_init(fn)
>> in include/linux/init.h where compiling as a module, and when not compiling
>> as a module, I doubt the exit function is called unless you are shutting
>> down your machine...
>>
>>> - Who takes the responsibility to push a patch to Linus? I guess it
>>> is urgent unless he plans a rc7
>>>
>> Good point ! I expect the patches to be already in some queue waiting to be
>> pulled !
>>
>
> The patches are on their way to making it into 2.6.23:
> http://marc.info/?l=linux-netdev&m=118986368303529&w=2
>
> Regards,
> Rob
>
I'm a little bit worried, as rc8 is out and it's still not in git.

-- eric
Re: rtl8187 driver in 2.6.23-rc6-git5: kernel panic if not used as a module. Works as a module.
Johannes Berg wrote:
> On Sa, 2007-09-15 at 21:00 +0200, Eric Valette wrote:
>
>> I came to this conclusion too. But I would have preferred to have
>> #define subsys_exit(fn) module_exit(fn)
>>
>> in the case of a module and nop in the non module case...
>
> module_exit is a no-op anyway in the non-modular case, it's never
> called, so what's the point?

I would have preferred to see subsys_exit paired with subsys_initcall, instead of module_exit, because:

1) It made me wonder whether it still works in the module case.
2) Going by the comment in init.h (/* Don't use these in modules, but some people do... */), you should not use it in a module. The comment is at least misleading, because for code that can be built either as a module or directly into the kernel you are in some cases indeed forced to use it (I did a grep and found a lot).
3) Non-symmetrical code frequently points to errors or bad design. YMMV :-)
4) If someone someday finds something to do when shutting down (hot restart, fault tolerance, or something equivalent), we would have a place to hook. C++ has destructors called after main and before __exit...

NB: This has nothing to do with the proposed patch, which is definitely correct given the current init.h.

Personally, I would certainly add:

#define subsys_exit(fn)

in init.h, but I would rate it myself as a cosmetic change, as it *only* makes the code more obvious to read :-)

-- eric
Re: rtl8187 driver in 2.6.23-rc6-git5: kernel panic if not used as a module. Works as a module.
Paul Rolland (ポール・ロラン) wrote:
> Hi Eric,
>> Now I have two side questions:
>> - the code is no more symmetric "subsys_initcall" -> "module_exit".
>> Do not know if it is "normal" but I love symmetry in code :-). Did not test
>> it still works as a module...
> Symmetry is not broken, as we have :
> #define subsys_initcall(fn) module_init(fn)
> in include/linux/init.h where compiling as a module, and when not compiling
> as a module, I doubt the exit function is called unless you are shutting
> down your machine...

I came to this conclusion too. But I would have preferred to have

#define subsys_exit(fn) module_exit(fn)

in the case of a module, and a no-op in the non-module case...

-- eric
Re: rtl8187 driver in 2.6.23-rc6-git5: kernel panic if not used as a module. Works as a module.
Rob Hussey wrote:
> On 9/15/07, Eric Valette <[EMAIL PROTECTED]> wrote:
>> Eric Valette wrote:
>>
>>> I can probably take a picture of the backtrace if you want.
>> Just saw that just above my message in the LKML web interface, someone
>> posted a backtrace. Mine is different but at least, we are at least two
>> to have the crash.
>
> This is the same thing I said to Paul Rolland, since I think your
> problems are the same:
> I had this problem as well. It has to do with mac80211, cfg80211 and
> the rate control algorithm not initializing early enough in the boot
> process. These two patches should fix it:
>
> [PATCH] mac80211: fix initialisation when built-in
> http://article.gmane.org/gmane.linux.kernel.wireless.general/5710/match=patch+mac80211+initialisation
>
> [PATCH] cfg80211: fix initialisation if built-in
> http://article.gmane.org/gmane.linux.network/71326/match=patch+cfg80211+initialisation
>
> Regards,
> Rob

Thanks for your help: it does indeed fix the problem.

Now I have two side questions:
- The code is no longer symmetric: "subsys_initcall" -> "module_exit". I do not know if this is "normal", but I love symmetry in code :-). I did not test whether it still works as a module...
- Who takes responsibility for pushing a patch to Linus? I guess it is urgent unless he plans an rc7.

-- eric
Re: rtl8187 driver in 2.6.23-rc6-git5: kernel panic if not used as a module. Works as a module.
Eric Valette wrote:
> I can probably take a picture of the backtrace if you want.

I just saw that, just above my message in the LKML web interface, someone posted a backtrace. Mine is different, but at least two of us are seeing the crash.

-- eric
rtl8187 driver in 2.6.23-rc6-git5: kernel panic if not used as a module. Works as a module.
First, let me start with a thank-you: this was the last piece of hardware in my P5W de luxe based machine that had no driver in the stock kernel. It works like a charm when used as a module:

lsusb
Bus 005 Device 001: ID :
Bus 003 Device 001: ID :
Bus 004 Device 001: ID :
Bus 002 Device 001: ID :
Bus 001 Device 007: ID 0bda:8187 Realtek Semiconductor Corp. <=
Bus 001 Device 005: ID 05e3:0606 Genesys Logic, Inc. D-Link DUB-H4 USB 2.0 Hub
Bus 001 Device 004: ID 03f0:c202 Hewlett-Packard
Bus 001 Device 006: ID 058f:6362 Alcor Micro Corp. Hi-Speed Internal Multi-Card Reader/Writer
Bus 001 Device 001: ID :
Bus 001 Device 003: ID 05e3:0606 Genesys Logic, Inc. D-Link DUB-H4 USB 2.0 Hub
Bus 001 Device 002: ID 05e3:0606 Genesys Logic, Inc. D-Link DUB-H4 USB 2.0 Hub

wlan0     IEEE 802.11g  ESSID:""
          Mode:Managed  Frequency:2.437 GHz  Access Point: 00:18:E7:15:2B:2A
          Bit Rate=11 Mb/s
          Retry min limit:7   RTS thr:off   Fragment thr=2346 B
          Encryption key:x
          Link Quality=43/64  Signal level=39/65
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

wlan0     Link encap:Ethernet  HWaddr 00:15:AF:0A:BE:A9
          inet addr:192.168.1.11  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:637 errors:0 dropped:0 overruns:0 frame:0
          TX packets:562 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:373558 (364.8 KiB)  TX bytes:88216 (86.1 KiB)

However, when I compile it directly into the kernel, I get a panic with a backtrace at USB initialisation. The backtrace points to registering things in /proc (make-directory or something)!

I can probably take a picture of the backtrace if you want.

Config: Asus P5W de luxe, Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz, pure 64-bit install.

-- eric