Hi Zakhar,

Thanks for the quick response.  I came across some of those Proxmox
forum posts as well.  I'm not sure whether moving to the 5.4 kernel will
create any other challenges for us, since we're using dual-port Mellanox
ConnectX-6 200G NICs in the hosts, but it's definitely something we can
try.
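
In case it's useful to anyone following along, this is roughly what we'd
run to drop back from the HWE to the GA kernel on 20.04 -- a sketch only,
assuming the stock Ubuntu metapackage names:

    # Install the GA kernel metapackage (tracks 5.4.x on focal)
    sudo apt update && sudo apt install linux-generic

    # Stop tracking the HWE (5.11.x) series; already-installed HWE
    # kernels may still need to be removed or deprioritized in GRUB
    sudo apt remove linux-generic-hwe-20.04

    # Reboot into 5.4 and confirm
    sudo reboot
    uname -r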

Marco

On Tue, Oct 12, 2021 at 1:53 PM Zakhar Kirpichenko <zak...@gmail.com> wrote:

> Hi,
>
> This could be kernel-related, as I've seen similar reports on the Proxmox
> forum. Specifically, 5.11.x with Ceph seems to be hitting a kernel NULL
> pointer dereference. Perhaps a newer kernel would help. If not, I'm running
> 16.2.6 on kernel 5.4.x without any issues.
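>
> If it helps, a quick way to confirm what each host is actually running:
>
>     uname -r          # running kernel
>     ceph versions     # per-daemon Ceph version breakdown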
>
> Best regards,
> Z
>
> On Tue, Oct 12, 2021 at 8:31 PM Marco Pizzolo <marcopizz...@gmail.com>
> wrote:
>
>> Hello everyone,
>>
>> We are seeing instability on Ubuntu 20.04.3 with the HWE kernel and Ceph
>> 16.2.6 w/Podman.
>>
>> We have OSDs that fail after <24 hours and I'm not sure why.
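>>
>> For reference, the report below came out of Ceph's crash module,
>> roughly:
>>
>>     ceph crash ls                # list recent crash IDs
>>     ceph crash info <crash_id>   # full report for one crash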
>>
>> Seeing this:
>>
>> ceph crash info
>> 2021-10-12T14:32:49.169552Z_d1ee94f7-1aaa-4221-abeb-68bd56d3c763
>> {
>>     "backtrace": [
>>         "/lib64/libpthread.so.0(+0x12b20) [0x7f4d31099b20]",
>>         "pthread_cond_wait()",
>>
>> "(std::condition_variable::wait(std::unique_lock<std::mutex>&)+0x10)
>> [0x7f4d306de8f0]",
>>         "(Throttle::_wait(long, std::unique_lock<std::mutex>&)+0x10d)
>> [0x55c52a0f077d]",
>>         "(Throttle::get(long, long)+0xb9) [0x55c52a0f1199]",
>>         "(BlueStore::BlueStoreThrottle::try_start_transaction(KeyValueDB&,
>> BlueStore::TransContext&, std::chrono::time_point<ceph::mono_clock,
>> std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >
>> >)+0x29)
>> [0x55c529f362c9]",
>>
>>
>> "(BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&,
>> std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction>
>> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x854)
>> [0x55c529fb7664]",
>>         "(non-virtual thunk to
>> PrimaryLogPG::queue_transactions(std::vector<ceph::os::Transaction,
>> std::allocator<ceph::os::Transaction> >&,
>> boost::intrusive_ptr<OpRequest>)+0x58) [0x55c529c0ee98]",
>>         "(ReplicatedBackend::submit_transaction(hobject_t const&,
>> object_stat_sum_t const&, eversion_t const&,
>> std::unique_ptr<PGTransaction,
>> std::default_delete<PGTransaction> >&&, eversion_t const&, eversion_t
>> const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&&,
>> std::optional<pg_hit_set_history_t>&, Context*, unsigned long,
>> osd_reqid_t,
>> boost::intrusive_ptr<OpRequest>)+0xcad) [0x55c529dfbedd]",
>>         "(PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*,
>> PrimaryLogPG::OpContext*)+0xcf0) [0x55c529b7a630]",
>>         "(PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x115d)
>> [0x55c529bd65ed]",
>>         "(PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2de2)
>> [0x55c529bdf162]",
>>         "(PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&,
>> ThreadPool::TPHandle&)+0xd1c) [0x55c529be64ac]",
>>         "(OSD::dequeue_op(boost::intrusive_ptr<PG>,
>> boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x309)
>> [0x55c529a6f1b9]",
>>         "(ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*,
>> boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x68) [0x55c529ccc868]",
>>         "(OSD::ShardedOpWQ::_process(unsigned int,
>> ceph::heartbeat_handle_d*)+0xa58) [0x55c529a8f1e8]",
>>         "(ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5c4)
>> [0x55c52a0fa6c4]",
>>         "(ShardedThreadPool::WorkThreadSharded::entry()+0x14)
>> [0x55c52a0fd364]",
>>         "/lib64/libpthread.so.0(+0x814a) [0x7f4d3108f14a]",
>>         "clone()"
>>     ],
>>     "ceph_version": "16.2.6",
>>     "crash_id":
>> "2021-10-12T14:32:49.169552Z_d1ee94f7-1aaa-4221-abeb-68bd56d3c763",
>>     "entity_name": "osd.14",
>>     "os_id": "centos",
>>     "os_name": "CentOS Linux",
>>     "os_version": "8",
>>     "os_version_id": "8",
>>     "process_name": "ceph-osd",
>>     "stack_sig":
>> "46b81ca079908da081327cbc114a9c1801dfdbb81303b85fff0d4107a1aeeabe",
>>     "timestamp": "2021-10-12T14:32:49.169552Z",
>>     "utsname_hostname": "<HOSTNAME_REMOVED>",
>>     "utsname_machine": "x86_64",
>>     "utsname_release": "5.11.0-37-generic",
>>     "utsname_sysname": "Linux",
>>     "utsname_version": "#41~20.04.2-Ubuntu SMP Fri Sep 24 09:06:38 UTC
>> 2021"
>>
>> dmesg on the host shows:
>>
>> [66258.080040] BUG: kernel NULL pointer dereference, address:
>> 00000000000000c0
>> [66258.080067] #PF: supervisor read access in kernel mode
>> [66258.080081] #PF: error_code(0x0000) - not-present page
>> [66258.080093] PGD 0 P4D 0
>> [66258.080105] Oops: 0000 [#1] SMP NOPTI
>> [66258.080115] CPU: 35 PID: 4955 Comm: zabbix_agentd Not tainted
>> 5.11.0-37-generic #41~20.04.2-Ubuntu
>> [66258.080137] Hardware name: Supermicro SSG-6049P-E1CR60L+/X11DSC+, BIOS
>> 3.3 02/21/2020
>> [66258.080154] RIP: 0010:blk_mq_put_rq_ref+0xa/0x60
>> [66258.080171] Code: 15 0f b6 d3 4c 89 e7 be 01 00 00 00 e8 cf fe ff ff 5b
>> 41 5c 5d c3 0f 0b 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8b 47 10
>> <48> 8b 80 c0 00 00 00 48 89 e5 48 3b 78 40 74 1f 4c 8d 87 e8 00 00
>> [66258.080210] RSP: 0018:ffffa251249fbbc0 EFLAGS: 00010283
>> [66258.080224] RAX: 0000000000000000 RBX: ffffa251249fbc48 RCX:
>> 0000000000000002
>> [66258.080240] RDX: 0000000000000001 RSI: 0000000000000202 RDI:
>> ffff9382f4d8e000
>> [66258.080256] RBP: ffffa251249fbbf8 R08: 0000000000000000 R09:
>> 0000000000000035
>> [66258.080272] R10: abcc77118461cefd R11: ffff93b382257076 R12:
>> ffff9382f4d8e000
>> [66258.080288] R13: ffff9382f5670c00 R14: 0000000000000000 R15:
>> 0000000000000001
>> [66258.080304] FS:  00007fedc0ef46c0(0000) GS:ffff93e13f4c0000(0000)
>> knlGS:0000000000000000
>> [66258.080322] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [66258.080335] CR2: 00000000000000c0 CR3: 000000011cc44005 CR4:
>> 00000000007706e0
>> [66258.080351] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [66258.080367] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>> 0000000000000400
>> [66258.080383] PKRU: 55555554
>> [66258.080391] Call Trace:
>> [66258.080401]  ? bt_iter+0x54/0x90
>> [66258.080413]  blk_mq_queue_tag_busy_iter+0x18b/0x2d0
>> [66258.080427]  ? blk_mq_hctx_mark_pending+0x70/0x70
>> [66258.080440]  ? blk_mq_hctx_mark_pending+0x70/0x70
>> [66258.080452]  blk_mq_in_flight+0x38/0x60
>> [66258.080463]  diskstats_show+0x75/0x2b0
>> [66258.080475]  traverse+0x78/0x200
>> [66258.080485]  seq_lseek+0x61/0xd0
>> [66258.080495]  proc_reg_llseek+0x77/0xa0
>> [66258.080507]  ksys_lseek+0x68/0xb0
>> [66258.080519]  __x64_sys_lseek+0x1a/0x20
>> [66258.080530]  do_syscall_64+0x38/0x90
>> [66258.080542]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> [66258.080557] RIP: 0033:0x7fedc237293b
>> [66258.081098] Code: c3 48 8b 15 8f 96 00 00 f7 d8 64 89 02 b8 ff ff ff ff
>> eb be 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 08 00 00 00 0f 05
>> <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 59 96 00 00 f7 d8
>> [66258.082122] RSP: 002b:00007fffb1cfd8e8 EFLAGS: 00000206 ORIG_RAX:
>> 0000000000000008
>> [66258.082633] RAX: ffffffffffffffda RBX: 0000000000000004 RCX:
>> 00007fedc237293b
>> [66258.083129] RDX: 0000000000000000 RSI: 00000000000022dc RDI:
>> 0000000000000007
>> [66258.083612] RBP: 00007fffb1cfd950 R08: 0000000000000002 R09:
>> 0000000061659bf0
>> [66258.084080] R10: 0000000000000000 R11: 0000000000000206 R12:
>> 000055a02d1acd4d
>> [66258.084536] R13: 0000000000000003 R14: 0000000000000000 R15:
>> 0000000000000000
>> [66258.084979] Modules linked in: binfmt_misc overlay bonding
>> nls_iso8859_1
>> dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua intel_rapl_msr
>> intel_rapl_common isst_if_common skx_edac nfit x86_pkg_temp_thermal
>> coretemp kvm_intel ipmi_ssif kvm rapl intel_cstate efi_pstore joydev
>> input_leds intel_pch_thermal mei_me mei ioatdma acpi_ipmi ipmi_si
>> ipmi_devintf ipmi_msghandler acpi_power_meter mac_hid sch_fq_codel msr
>> ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456
>> async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq
>> libcrc32c raid0 multipath linear mlx5_ib ib_uverbs ib_core hid_generic
>> usbhid hid ses enclosure scsi_transport_sas raid1 crct10dif_pclmul
>> crc32_pclmul ast drm_vram_helper i2c_algo_bit drm_ttm_helper
>> ghash_clmulni_intel ttm mlx5_core drm_kms_helper aesni_intel syscopyarea
>> crypto_simd sysfillrect sysimgblt cryptd glue_helper fb_sys_fops cec
>> pci_hyperv_intf rc_core mlxfw megaraid_sas tls drm ixgbe xfrm_algo dca
>> mdio
>> ahci i2c_i801 vmd xhci_pci lpc_ich
>> [66258.085039]  libahci i2c_smbus xhci_pci_renesas wmi
>> [66258.089332] CR2: 00000000000000c0
>> [66258.089825] ---[ end trace c1ae715ae7a3e043 ]---
>> [66258.144753] RIP: 0010:blk_mq_put_rq_ref+0xa/0x60
>> [66258.145285] Code: 15 0f b6 d3 4c 89 e7 be 01 00 00 00 e8 cf fe ff ff 5b
>> 41 5c 5d c3 0f 0b 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8b 47 10
>> <48> 8b 80 c0 00 00 00 48 89 e5 48 3b 78 40 74 1f 4c 8d 87 e8 00 00
>> [66258.146271] RSP: 0018:ffffa251249fbbc0 EFLAGS: 00010283
>> [66258.146768] RAX: 0000000000000000 RBX: ffffa251249fbc48 RCX:
>> 0000000000000002
>> [66258.147268] RDX: 0000000000000001 RSI: 0000000000000202 RDI:
>> ffff9382f4d8e000
>> [66258.147779] RBP: ffffa251249fbbf8 R08: 0000000000000000 R09:
>> 0000000000000035
>> [66258.148309] R10: abcc77118461cefd R11: ffff93b382257076 R12:
>> ffff9382f4d8e000
>> [66258.148802] R13: ffff9382f5670c00 R14: 0000000000000000 R15:
>> 0000000000000001
>> [66258.149291] FS:  00007fedc0ef46c0(0000) GS:ffff93e13f4c0000(0000)
>> knlGS:0000000000000000
>> [66258.149785] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [66258.150281] CR2: 00000000000000c0 CR3: 000000011cc44005 CR4:
>> 00000000007706e0
>> [66258.150791] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [66258.151290] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>> 0000000000000400
>> [66258.151789] PKRU: 55555554
>> [69435.862190] perf: interrupt took too long (2513 > 2500), lowering
>> kernel.perf_event_max_sample_rate to 79500
>>
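>> If I'm reading the trace right, the oops fires in blk_mq_put_rq_ref
>> while zabbix_agentd reads block-device stats (diskstats_show, reached
>> via an lseek on /proc/diskstats), so the monitoring agent's polling
>> seems to be what trips the 5.11 block layer. A harmless check for
>> other affected hosts:
>>
>>     dmesg | grep -B2 'blk_mq_put_rq_ref'
>>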
>> Any guidance is much appreciated.
>>
>> Thanks,
>> Marco
>
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
