Re: [PATCH] sunrpc: fix crash when cache_head become valid before update
On Wed, Oct 09, 2019 at 09:51:23AM +1100, NeilBrown wrote: > On Tue, Oct 08 2019, J . Bruce Fields wrote: > > > On Tue, Oct 08, 2019 at 10:02:53AM +, Pavel Tikhomirov wrote: > >> Add Neil to CC, sorry, had lost it somehow... > > > > Always happy when we can fix a bug by deleting code, and your > > explanation makes sense to me, but I'll give Neil a chance to look it > > over if he wants. > > Yes, it makes sense to me. But I'm not sure that is worth much. The > original fix got a Reviewed-by from me but was wrong. > Acked-by: NeilBrown > > 'Acked' is weaker than 'reviewed' - isn't it? :-) Hah--"Self-deprecatingly-reviewed-by:"? Anyway, applied thanks. --b.
Re: [PATCH] sunrpc: fix crash when cache_head become valid before update
On Tue, Oct 08 2019, J . Bruce Fields wrote: > On Tue, Oct 08, 2019 at 10:02:53AM +, Pavel Tikhomirov wrote: >> Add Neil to CC, sorry, had lost it somehow... > > Always happy when we can fix a bug by deleting code, and your > explanation makes sense to me, but I'll give Neil a chance to look it > over if he wants. Yes, it makes sense to me. But I'm not sure that is worth much. The original fix got a Reviewed-by from me but was wrong. Acked-by: NeilBrown 'Acked' is weaker than 'reviewed' - isn't it? :-) NeilBrown signature.asc Description: PGP signature
Re: [PATCH] sunrpc: fix crash when cache_head become valid before update
On Tue, Oct 08, 2019 at 10:02:53AM +, Pavel Tikhomirov wrote: > Add Neil to CC, sorry, had lost it somehow... Always happy when we can fix a bug by deleting code, and your explanation makes sense to me, but I'll give Neil a chance to look it over if he wants. --b. > > On 10/1/19 11:03 AM, Pavel Tikhomirov wrote: > > I was investigating a crash in our Virtuozzo7 kernel which happened in > > in svcauth_unix_set_client. I found out that we access m_client field > > in ip_map structure, which was received from sunrpc_cache_lookup (we > > have a bit older kernel, now the code is in sunrpc_cache_add_entry), and > > these field looks uninitialized (m_client == 0x74 don't look like a > > pointer) but in the cache_head in flags we see 0x1 which is CACHE_VALID. > > > > It looks like the problem appeared from our previous fix to sunrpc (1): > > commit 4ecd55ea0742 ("sunrpc: fix cache_head leak due to queued > > request") > > > > And we've also found a patch already fixing our patch (2): > > commit d58431eacb22 ("sunrpc: don't mark uninitialised items as VALID.") > > > > Though the crash is eliminated, I think the core of the problem is not > > completely fixed: > > > > Neil in the patch (2) makes cache_head CACHE_NEGATIVE, before > > cache_fresh_locked which was added in (1) to fix crash. These way > > cache_is_valid won't say the cache is valid anymore and in > > svcauth_unix_set_client the function cache_check will return error > > instead of 0, and we don't count entry as initialized. > > > > But it looks like we need to remove cache_fresh_locked completely in > > sunrpc_cache_lookup: > > > > In (1) we've only wanted to make cache_fresh_unlocked->cache_dequeue so > > that cache_requests with no readers also release corresponding > > cache_head, to fix their leak. We with Vasily were not sure if > > cache_fresh_locked and cache_fresh_unlocked should be used in pair or > > not, so we've guessed to use them in pair. > > > > Now we see that we don't want the CACHE_VALID bit set here by > > cache_fresh_locked, as "valid" means "initialized" and there is no > > initialization in sunrpc_cache_add_entry. Both expiry_time and > > last_refresh are not used in cache_fresh_unlocked code-path and also not > > required for the initial fix. > > > > So to conclude cache_fresh_locked was called by mistake, and we can just > > safely remove it instead of crutching it with CACHE_NEGATIVE. It looks > > ideologically better for me. Hope I don't miss something here. > > > > Here is our crash backtrace: > > [13108726.326291] BUG: unable to handle kernel NULL pointer dereference at > > 0074 > > [13108726.326365] IP: [] > > svcauth_unix_set_client+0x2ab/0x520 [sunrpc] > > [13108726.326448] PGD 0 > > [13108726.326468] Oops: 0002 [#1] SMP > > [13108726.326497] Modules linked in: nbd isofs xfs loop > > kpatch_cumulative_81_0_r1(O) xt_physdev nfnetlink_queue bluetooth rfkill > > ip6table_nat nf_nat_ipv6 ip_vs_wrr ip_vs_wlc ip_vs_sh nf_conntrack_netlink > > ip_vs_sed ip_vs_pe_sip nf_conntrack_sip ip_vs_nq ip_vs_lc ip_vs_lblcr > > ip_vs_lblc ip_vs_ftp ip_vs_dh nf_nat_ftp nf_conntrack_ftp iptable_raw > > xt_recent nf_log_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common xt_LOG > > xt_limit xt_TCPMSS xt_tcpmss vxlan ip6_udp_tunnel udp_tunnel xt_statistic > > xt_NFLOG nfnetlink_log dummy xt_mark xt_REDIRECT nf_nat_redirect raw_diag > > udp_diag tcp_diag inet_diag netlink_diag af_packet_diag unix_diag > > rpcsec_gss_krb5 xt_addrtype ip6t_rpfilter ipt_REJECT nf_reject_ipv4 > > ip6t_REJECT nf_reject_ipv6 ebtable_nat ebtable_broute nf_conntrack_ipv6 > > nf_defrag_ipv6 ip6table_mangle ip6table_raw nfsv4 > > [13108726.327173] dns_resolver cls_u32 binfmt_misc arptable_filter > > arp_tables ip6table_filter ip6_tables devlink fuse_kio_pcs ipt_MASQUERADE > > nf_nat_masquerade_ipv4 xt_nat iptable_nat nf_nat_ipv4 xt_comment > > nf_conntrack_ipv4 nf_defrag_ipv4 xt_wdog_tmo xt_multiport bonding xt_set > > xt_conntrack iptable_filter iptable_mangle kpatch(O) ebtable_filter > > ebt_among ebtables ip_set_hash_ip ip_set nfnetlink vfat fat skx_edac > > intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass fuse > > pcspkr ses enclosure joydev sg mei_me hpwdt hpilo lpc_ich mei ipmi_si > > shpchp ipmi_devintf ipmi_msghandler xt_ipvs acpi_power_meter ip_vs_rr nfsv3 > > nfsd auth_rpcgss nfs_acl nfs lockd grace fscache nf_nat cls_fw sch_htb > > sch_cbq sch_sfq ip_vs em_u32 nf_conntrack tun br_netfilter veth overlay > > ip6_vzprivnet ip6_vznetstat ip_vznetstat > > [13108726.327817] ip_vzprivnet vziolimit vzevent vzlist vzstat vznetstat > > vznetdev vzmon vzdev bridge pio_kaio pio_nfs pio_direct pfmt_raw > > pfmt_ploop1 ploop ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif > > crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper scsi_transport_iscsi > > 8021q syscopyarea sysfillrect garp sysimgblt fb_sys_fops mrp stp ttm llc > > bnx2x crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel drm > > d
Re: [PATCH] sunrpc: fix crash when cache_head become valid before update
Add Neil to CC, sorry, had lost it somehow... On 10/1/19 11:03 AM, Pavel Tikhomirov wrote: > I was investigating a crash in our Virtuozzo7 kernel which happened in > in svcauth_unix_set_client. I found out that we access m_client field > in ip_map structure, which was received from sunrpc_cache_lookup (we > have a bit older kernel, now the code is in sunrpc_cache_add_entry), and > these field looks uninitialized (m_client == 0x74 don't look like a > pointer) but in the cache_head in flags we see 0x1 which is CACHE_VALID. > > It looks like the problem appeared from our previous fix to sunrpc (1): > commit 4ecd55ea0742 ("sunrpc: fix cache_head leak due to queued > request") > > And we've also found a patch already fixing our patch (2): > commit d58431eacb22 ("sunrpc: don't mark uninitialised items as VALID.") > > Though the crash is eliminated, I think the core of the problem is not > completely fixed: > > Neil in the patch (2) makes cache_head CACHE_NEGATIVE, before > cache_fresh_locked which was added in (1) to fix crash. These way > cache_is_valid won't say the cache is valid anymore and in > svcauth_unix_set_client the function cache_check will return error > instead of 0, and we don't count entry as initialized. > > But it looks like we need to remove cache_fresh_locked completely in > sunrpc_cache_lookup: > > In (1) we've only wanted to make cache_fresh_unlocked->cache_dequeue so > that cache_requests with no readers also release corresponding > cache_head, to fix their leak. We with Vasily were not sure if > cache_fresh_locked and cache_fresh_unlocked should be used in pair or > not, so we've guessed to use them in pair. > > Now we see that we don't want the CACHE_VALID bit set here by > cache_fresh_locked, as "valid" means "initialized" and there is no > initialization in sunrpc_cache_add_entry. Both expiry_time and > last_refresh are not used in cache_fresh_unlocked code-path and also not > required for the initial fix. > > So to conclude cache_fresh_locked was called by mistake, and we can just > safely remove it instead of crutching it with CACHE_NEGATIVE. It looks > ideologically better for me. Hope I don't miss something here. > > Here is our crash backtrace: > [13108726.326291] BUG: unable to handle kernel NULL pointer dereference at > 0074 > [13108726.326365] IP: [] > svcauth_unix_set_client+0x2ab/0x520 [sunrpc] > [13108726.326448] PGD 0 > [13108726.326468] Oops: 0002 [#1] SMP > [13108726.326497] Modules linked in: nbd isofs xfs loop > kpatch_cumulative_81_0_r1(O) xt_physdev nfnetlink_queue bluetooth rfkill > ip6table_nat nf_nat_ipv6 ip_vs_wrr ip_vs_wlc ip_vs_sh nf_conntrack_netlink > ip_vs_sed ip_vs_pe_sip nf_conntrack_sip ip_vs_nq ip_vs_lc ip_vs_lblcr > ip_vs_lblc ip_vs_ftp ip_vs_dh nf_nat_ftp nf_conntrack_ftp iptable_raw > xt_recent nf_log_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common xt_LOG xt_limit > xt_TCPMSS xt_tcpmss vxlan ip6_udp_tunnel udp_tunnel xt_statistic xt_NFLOG > nfnetlink_log dummy xt_mark xt_REDIRECT nf_nat_redirect raw_diag udp_diag > tcp_diag inet_diag netlink_diag af_packet_diag unix_diag rpcsec_gss_krb5 > xt_addrtype ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT > nf_reject_ipv6 ebtable_nat ebtable_broute nf_conntrack_ipv6 nf_defrag_ipv6 > ip6table_mangle ip6table_raw nfsv4 > [13108726.327173] dns_resolver cls_u32 binfmt_misc arptable_filter > arp_tables ip6table_filter ip6_tables devlink fuse_kio_pcs ipt_MASQUERADE > nf_nat_masquerade_ipv4 xt_nat iptable_nat nf_nat_ipv4 xt_comment > nf_conntrack_ipv4 nf_defrag_ipv4 xt_wdog_tmo xt_multiport bonding xt_set > xt_conntrack iptable_filter iptable_mangle kpatch(O) ebtable_filter ebt_among > ebtables ip_set_hash_ip ip_set nfnetlink vfat fat skx_edac intel_powerclamp > coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass fuse pcspkr ses > enclosure joydev sg mei_me hpwdt hpilo lpc_ich mei ipmi_si shpchp > ipmi_devintf ipmi_msghandler xt_ipvs acpi_power_meter ip_vs_rr nfsv3 nfsd > auth_rpcgss nfs_acl nfs lockd grace fscache nf_nat cls_fw sch_htb sch_cbq > sch_sfq ip_vs em_u32 nf_conntrack tun br_netfilter veth overlay ip6_vzprivnet > ip6_vznetstat ip_vznetstat > [13108726.327817] ip_vzprivnet vziolimit vzevent vzlist vzstat vznetstat > vznetdev vzmon vzdev bridge pio_kaio pio_nfs pio_direct pfmt_raw pfmt_ploop1 > ploop ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mgag200 > i2c_algo_bit drm_kms_helper scsi_transport_iscsi 8021q syscopyarea > sysfillrect garp sysimgblt fb_sys_fops mrp stp ttm llc bnx2x crct10dif_pclmul > crct10dif_common crc32_pclmul crc32c_intel drm dm_multipath > ghash_clmulni_intel uas aesni_intel lrw gf128mul glue_helper ablk_helper > cryptd tg3 smartpqi scsi_transport_sas mdio libcrc32c i2c_core usb_storage > ptp pps_core wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod [last > unloaded: kpatch_cumulative_82_0_r1] > [13108726.328403] CPU: 35 PID: 63742 Comm: nfsd ve: 51332 Kdump: loaded > Tainted: GW O