[PATCH bpf-next v2] bpf: kprobe: remove unused declaring of bpf_kprobe_override

2024-07-29 Thread Menglong Dong
After the commit 5ad2f102 ("tracing/kprobe: bpf: Compare instruction
pointer with original one"), "bpf_kprobe_override" is not used anywhere
anymore, and we can remove it now.

Fixes: 5ad2f102 ("tracing/kprobe: bpf: Compare instruction pointer with 
original one")
Signed-off-by: Menglong Dong 
---
v2: add the Fixes tag
---
 include/linux/trace_events.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 9df3e2973626..9435185c10ef 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -880,7 +880,6 @@ do {
\
 struct perf_event;
 
 DECLARE_PER_CPU(struct pt_regs, perf_trace_regs);
-DECLARE_PER_CPU(int, bpf_kprobe_override);
 
 extern int  perf_trace_init(struct perf_event *event);
 extern void perf_trace_destroy(struct perf_event *event);
-- 
2.39.2




Re: [PATCH bpf-next] bpf: kprobe: remove unused declaring of bpf_kprobe_override

2024-07-17 Thread Menglong Dong
On Wed, Jul 10, 2024 at 10:18 PM Masami Hiramatsu  wrote:
>
> On Wed, 10 Jul 2024 15:13:07 +0200
> Jiri Olsa  wrote:
>
> > On Wed, Jul 10, 2024 at 04:59:39PM +0800, Menglong Dong wrote:
> > > After the commit 5ad2f102 ("tracing/kprobe: bpf: Compare instruction
> >
> > should be in Fixes: tag probably ?
>
> Yes, I'll add a Fixed tag.
>

Hello!

Should I send a v2 with the "Fixes" tag? It seems that this commit has
been pending for a while.

Thanks!
Menglong Dong

> >
> > > pointer with original one"), "bpf_kprobe_override" is not used anywhere
> > > anymore, and we can remove it now.
> > >
> > > Signed-off-by: Menglong Dong 
> >
> > lgtm, cc-ing Masami
> >
> > Acked-by: Jiri Olsa 
>
> Thanks!
>
> >
> > jirka
> >
> > > ---
> > >  include/linux/trace_events.h | 1 -
> > >  1 file changed, 1 deletion(-)
> > >
> > > diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
> > > index 9df3e2973626..9435185c10ef 100644
> > > --- a/include/linux/trace_events.h
> > > +++ b/include/linux/trace_events.h
> > > @@ -880,7 +880,6 @@ do {  
> > >   \
> > >  struct perf_event;
> > >
> > >  DECLARE_PER_CPU(struct pt_regs, perf_trace_regs);
> > > -DECLARE_PER_CPU(int, bpf_kprobe_override);
> > >
> > >  extern int  perf_trace_init(struct perf_event *event);
> > >  extern void perf_trace_destroy(struct perf_event *event);
> > > --
> > > 2.39.2
> > >
> > >
>
>
> --
> Masami Hiramatsu (Google) 



[PATCH bpf-next] bpf: kprobe: remove unused declaring of bpf_kprobe_override

2024-07-10 Thread Menglong Dong
After the commit 5ad2f102 ("tracing/kprobe: bpf: Compare instruction
pointer with original one"), "bpf_kprobe_override" is not used anywhere
anymore, and we can remove it now.

Signed-off-by: Menglong Dong 
---
 include/linux/trace_events.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 9df3e2973626..9435185c10ef 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -880,7 +880,6 @@ do {
\
 struct perf_event;
 
 DECLARE_PER_CPU(struct pt_regs, perf_trace_regs);
-DECLARE_PER_CPU(int, bpf_kprobe_override);
 
 extern int  perf_trace_init(struct perf_event *event);
 extern void perf_trace_destroy(struct perf_event *event);
-- 
2.39.2




Re: [PATCH net-next 2/2] net: socket: change MSG_CMSG_COMPAT to BIT(21)

2021-03-21 Thread Menglong Dong
On Sun, Mar 21, 2021 at 8:49 PM Herbert Xu  wrote:
>
...
>
> Shouldn't you add some comment here to stop people from trying to
> use BIT(31) in the future?
>
> Thanks,

Yeah, I think it's necessary. Thank you for your reminder~

With Regards,
Menglong Dong


Re: [PATCH v4 RESEND net-next] net: socket: use BIT() for MSG_*

2021-03-17 Thread Menglong Dong
On Wed, Mar 17, 2021 at 11:12 PM David Laight  wrote:
>
...
>
> Isn't MSG_CMSG_COMPAT an internal value?
> Could it be changed to 1u << 30 instead of 1u << 31 ?
> Then it wouldn't matter if the high bit of flags got replicated.
>

Yeah, MSG_CMSG_COMPAT is an internal value, and maybe
it's why it is defined as 1<< 31, to make it look different.

I think it's a good idea to change it to other value which is
not used, such as 1u<<21.

I will test it and resend this patch later, thanks~

With Regards,
Menglong Dong


Re: [PATCH v4 RESEND net-next] net: socket: use BIT() for MSG_*

2021-03-17 Thread Menglong Dong
On Wed, Mar 17, 2021 at 9:53 PM Menglong Dong  wrote:
>
...
>
> Seems that the inconsistent usages of 'msg_flags' is a lot, for example the
> 'recvmsg()' in 'struct proto' and 'recvmsg()' in 'struct proto_ops':
>
> int (*recvmsg)(struct sock *sk, struct msghdr *msg,
> size_t len, int noblock, int flags,
> int *addr_len);
>
> This function prototype is used in many places, It's not easy to fix them.
> This patch is already reverted, and I think maybe
> I can resend it after I fix these 'int' flags.
>

I doubt it now...there are hundreds of functions that are defined as
'proto_ops->recvmsg()'.
enn...will this kind of patch be acceptable? Is it the time to give up?

With Best Regards,
Menglong Dong


Re: [PATCH v4 RESEND net-next] net: socket: use BIT() for MSG_*

2021-03-17 Thread Menglong Dong
On Wed, Mar 17, 2021 at 5:36 PM Andy Shevchenko
 wrote:
>
...
>
> The problematic code is negation of the flags when it's done in
> operations like &.
> It maybe fixed by swapping positions of the arguments, i.e. ~(FOO |
> BAR) & flags.
>
> All this is a beast called "integer promotions" in the C standard.
>
> The best is to try to get flags to be unsigned. By how invasive it may be?

Seems that the inconsistent usages of 'msg_flags' is a lot, for example the
'recvmsg()' in 'struct proto' and 'recvmsg()' in 'struct proto_ops':

int (*recvmsg)(struct sock *sk, struct msghdr *msg,
size_t len, int noblock, int flags,
int *addr_len);

This function prototype is used in many places, It's not easy to fix them.
This patch is already reverted, and I think maybe
I can resend it after I fix these 'int' flags.

>
> --
> With Best Regards,
> Andy Shevchenko

Thanks!
Menglong Dong


Re: [PATCH v4 RESEND net-next] net: socket: use BIT() for MSG_*

2021-03-17 Thread Menglong Dong
Hello,

On Wed, Mar 17, 2021 at 9:38 AM Guenter Roeck  wrote:
>
> On Wed, Mar 17, 2021 at 01:02:51AM +0200, Andy Shevchenko wrote:
> > On Wednesday, March 17, 2021, Guenter Roeck  wrote:
> >
...
>
> The problem is in net/packet/af_packet.c:packet_recvmsg(). This function,
> as well as all other rcvmsg functions, is declared as
>
> static int packet_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
>   int flags)
>
> MSG_CMSG_COMPAT (0x8000) is set in flags, meaning its value is negative.
> This is then evaluated in
>
>if (flags & 
> ~(MSG_PEEK|MSG_DONTWAIT|MSG_TRUNC|MSG_CMSG_COMPAT|MSG_ERRQUEUE))
> goto out;
>
> If any of those flags is declared as BIT() and thus long, flags is
> sign-extended to long. Since it is negative, its upper 32 bits will be set,
> the if statement evaluates as true, and the function bails out.
>
> This is relatively easy to fix here with, for example,
>
> if ((unsigned int)flags & 
> ~(MSG_PEEK|MSG_DONTWAIT|MSG_TRUNC|MSG_CMSG_COMPAT|MSG_ERRQUEUE))
> goto out;
>
> but that is just a hack, and it doesn't solve the real problem:
> Each function in struct proto_ops which passes flags passes it as int
> (see include/linux/net.h:struct proto_ops). Each such function, if
> called with MSG_CMSG_COMPAT set, will fail a match against
> ~(MSG_anything) if MSG_anything is declared as BIT() or long.
>
> As it turns out, I was kind of lucky to catch the problem: So far I have
> seen it only on mips64 kernels with N32 userspace.
>
> Guenter

 Wow, now the usages of 'msg_flag' really puzzle me. Seems that
it is used as 'unsigned int' somewhere, but 'int' somewhere
else.

As I found, It is used as 'int' in 'netlink_recvmsg()',
'io_sr_msg->msg_flags', 'atalk_sendmsg()',
'dn_recvmsg()',  'proto_ops->recvmsg()', etc.

So what should I do? Revert this patch? Or fix the usages of 'flags'?
Or change the type of MSG_* to 'unsigned int'? I prefer the last
one(the usages of 'flags' can be fixed too, maybe later).


Thanks!
Menglong Dong


Re: [PATCH v4 net-next] net: socket: use BIT() for MSG_*

2021-03-09 Thread Menglong Dong
Hello!

On Wed, Feb 24, 2021 at 4:30 AM Jakub Kicinski  wrote:
>
...
> Please repost when net-next reopens after 5.12-rc1 is cut.
>
> Look out for the announcement on the mailing list or check:
> http://vger.kernel.org/~davem/net-next.html
>
> RFC patches sent for review only are obviously welcome at any time.

Is 'net-next' open? Can I resend this patch now? It seems that a long
time has passed.

Thanks~
Menglong Dong


Re: [PATCH v2 net-next] net: socket: use BIT() for MSG_*

2021-02-07 Thread Menglong Dong
On Sun, Feb 7, 2021 at 7:52 PM Andy Shevchenko
 wrote:
>
> On Sun, Feb 7, 2021 at 6:32 AM  wrote:
> >
> > From: Menglong Dong 
> >
> > The bit mask for MSG_* seems a little confused here. Replace it
> > with BIT() to make it clear to understand.
>
> Now it's confusing which version maintainer should take (you forgot,
> it seems twice, to bump the patch version and mention the changes in
> the changelog).

Sorry, a 'BIT_MASK()' escaped in the first one, and I just thought that the
second one will override the first one, as long as I send it quick enough
before anyone see it:)

Thanks~
Menglong Dong


Re: [PATCH net-next] net: socket: use BIT_MASK for MSG_*

2021-02-06 Thread Menglong Dong
Hello!

On Sat, Feb 6, 2021 at 4:20 PM Andy Shevchenko
 wrote:
>
>
>
> On Saturday, February 6, 2021,  wrote:
>>
>> From: Menglong Dong 
>>
>> The bit mask for MSG_* seems a little confused here. Replace it
>> with BIT_MASK to make it clear to understand.
>
>
> It makes it more confusing if you understand the difference between 
> BIT_MASK() and BIT(). I think you have to use the latter. And note () when 
> referring to the function or macro.
>

I replaced BIT_MASK() with BIT() in the patch of v2, and it looks much
more tidy.
I can't figure out the difference between BIT() and BIT_MASK(), seems
the latter one more safe... isn't it?

Thanks:)
Menglong Dong


Re: [PATCH net-next 3/3] net: core: Namespace-ify sysctl_rmem_max and sysctl_wmem_max

2021-01-20 Thread Menglong Dong
Hello~

On Wed, Jan 20, 2021 at 6:46 PM Florian Westphal  wrote:
>
> >
> > For that reason, make sysctl_wmem_max and sysctl_rmem_max
> > per-namespace.
>
> I think having those values be restricted by init netns is a desirable
> property.

I just thought that having these values per-namespace can be more flexible,
and users can have more choices. Is there any bad influence that I didn't
realize?

Thanks~
Menglong Dong


Re: [PATCH] ata: remove redundant error print in rb532_pata_driver_probe

2021-01-14 Thread Menglong Dong
On Thu, Jan 14, 2021 at 4:30 PM Sergei Shtylyov
 wrote:
[...]
> >
> > What does this 'MBR' mean? I am a novice~~~
>
> Generally speaking, Master Boot Record. But I also use it to send you My
> Best Regards. :-)

Haha~,

> > So, is it better to replace 'platform_get_irq' with
> > 'platform_get_irq_optional' here?
>
> No. You should stop overriding the result to -ENOENT and pass the result
> up the call chain instead. In order to do it, you should only check for (irq 
> < 0).

Well, I didn't even notice this. It does seem to be another problem...

---
 Best Regards
 Menglong Dong


Re: [PATCH] ata: remove redundant error print in rb532_pata_driver_probe

2021-01-13 Thread Menglong Dong
Hello, Sergei

On Tue, Jan 12, 2021 at 7:15 PM Sergei Shtylyov
 wrote:
>
> Hello!
>
> On 1/12/21 5:36 AM, menglong8.d...@gmail.com wrote:
>
[]
> >   irq = platform_get_irq(pdev, 0);
> > - if (irq <= 0) {
> > - dev_err(&pdev->dev, "no IRQ resource found\n");
> > + if (irq <= 0)
> >   return -ENOENT;
>
>This still beaks the probe deferral. :-(
>But that's another problem...
>
> [...]
>
> MBR, Sergei

What does this 'MBR' mean? I am a novice~~~
So, is it better to replace 'platform_get_irq' with
'platform_get_irq_optional' here?

--
 Best Regards
 Menglong Dong


Re: [PATCH] net: sched: fix misspellings using misspell-fixer tool

2020-11-10 Thread Menglong Dong
Hi, John.

On Tue, Nov 10, 2020 at 5:23 AM John Fastabend  wrote:
>
> Hi, you will want to add net-next to the [PATCH *] line next time
> to make it clear this is for net-next. The contents make it
> obvious in this case though.
>
> Also I'm not sure why the bpf@ include but OK.

Thanks for your suggestion, and I will pay attention to the [PATCH *] next time.

As for the bpf@, I guess that 'get_maintainer.pl' listed it to me
because of 'act_bpf.c'.

Cheers,
Menglong Dong


Re: [PATCH] net: tcp: ratelimit warnings in tcp_recvmsg

2020-11-09 Thread Menglong Dong
On Mon, Nov 9, 2020 at 9:36 PM Eric Dumazet  wrote:
>
> I do not think this patch is useful. That is simply code churn.
>
> Can you trigger the WARN() in the latest upstream version ?
> If yes this is a serious bug that needs urgent attention.
>
> Make sure you have backported all needed fixes into your kernel, if
> you get this warning on a non pristine kernel.

Theoretically, this WARN() shouldn't be triggered in any branches.
Somehow, it just happened in kernel v3.10. This really confused me. I
wasn't able to keep tracing it, as it is a product environment.

I notice that the codes for tcp skb receiving didn't change much
between v3.10 and the latest upstream version, and guess the latest
version can be triggered too.

If something is fixed and this WARN() won't be triggered, just ignore me.

Cheers,
Menglong Dong


[PATCH v2] net: ipv4: remove redundant initialization in inet_rtm_deladdr

2020-11-07 Thread Menglong Dong
The initialization for 'err' with '-EINVAL' is redundant and
can be removed, as it is updated soon.

Changes since v1:
- Remove redundant empty line

Signed-off-by: Menglong Dong 
---
 net/ipv4/devinet.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 123a6d39438f..43e04382c593 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -650,8 +650,7 @@ static int inet_rtm_deladdr(struct sk_buff *skb, struct 
nlmsghdr *nlh,
struct in_device *in_dev;
struct ifaddrmsg *ifm;
struct in_ifaddr *ifa;
-
-   int err = -EINVAL;
+   int err;
 
ASSERT_RTNL();
 
-- 
2.29.2



Re: [PATCH] net: ipv4: remove redundant initialization in inet_rtm_deladdr

2020-11-07 Thread Menglong Dong
Dear Jakub,

On Sun, Nov 8, 2020 at 7:48 AM Jakub Kicinski  wrote:
>
> On Fri,  6 Nov 2020 01:42:37 -0500 menglong8.d...@gmail.com wrote:
> > From: Menglong Dong 
> >
> > The initialization for 'err' with '-EINVAL' is redundant and
> > can be removed, as it is updated soon.
> >
> > Signed-off-by: Menglong Dong 
>
> How many changes like this are there in the kernel right now?
>
> I'm afraid that if there are too many it's not worth the effort.
>
> Also - what tool do you use to find those, we need to make sure new
> instances don't get into the tree.
>

I didn't use any tools. Maybe some general tools, such as kw,
coverity, coccicheck,
are able to find these changes(as far as I know, they can).

In fact, I find these changes by my eyes. I believe 'err' is the most
likely victim
and checked every usage of it in 'net' directory. Here are all the
changes I found,
and I think there won't be too many.

Cheers,
Menglong Dong


Re: [PATCH] net: bridge: disable multicast while delete bridge

2020-11-04 Thread Menglong Dong
Dear Nik,

On Wed, Nov 4, 2020 at 12:26 AM Nikolay Aleksandrov  wrote:
>
> On Mon, 2020-11-02 at 22:38 +0800, Menglong Dong wrote:
> > From: Menglong Dong 
> >
> > This commit seems make no sense, as bridge is destroyed when
> > br_multicast_dev_del is called.
> >
> > In commit b1b9d366028f
> > ("bridge: move bridge multicast cleanup to ndo_uninit"), Xin Long
> > fixed the use-after-free panic in br_multicast_group_expired by
> > moving br_multicast_dev_del to ndo_uninit. However, that patch is
> > not applied to 4.4.X, and the bug exists.
> >
> > Fix that bug by disabling multicast in br_multicast_dev_del for
> > 4.4.X, and there is no harm for other branches.
> >
> > Signed-off-by: Menglong Dong 
> > ---
> >  net/bridge/br_multicast.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
> > index eae898c3cff7..9992fdff2951 100644
> > --- a/net/bridge/br_multicast.c
> > +++ b/net/bridge/br_multicast.c
> > @@ -3369,6 +3369,7 @@ void br_multicast_dev_del(struct net_bridge *br)
> >   hlist_for_each_entry_safe(mp, tmp, &br->mdb_list, mdb_node)
> >   br_multicast_del_mdb_entry(mp);
> >   hlist_move_list(&br->mcast_gc_list, &deleted_head);
> > + br_opt_toggle(br, BROPT_MULTICAST_ENABLED, false);
> >   spin_unlock_bh(&br->multicast_lock);
> >
> >   br_multicast_gc(&deleted_head);
>
> This doesn't make any sense. It doesn't fix anything.
> If 4.4 has a problem then the relevant patches should get backported to it.
> We don't add random changes to fix older releases.
>
> Cheers,
>  Nik
>
> Nacked-by: Nikolay Aleksandrov 

Thanks for your patient explanation, and I see it now~

Cheers,
Menglong Dong


[PATCH] net: bridge: disable multicast while delete bridge

2020-11-02 Thread Menglong Dong
From: Menglong Dong 

This commit seems make no sense, as bridge is destroyed when
br_multicast_dev_del is called.

In commit b1b9d366028f
("bridge: move bridge multicast cleanup to ndo_uninit"), Xin Long
fixed the use-after-free panic in br_multicast_group_expired by
moving br_multicast_dev_del to ndo_uninit. However, that patch is
not applied to 4.4.X, and the bug exists.

Fix that bug by disabling multicast in br_multicast_dev_del for
4.4.X, and there is no harm for other branches.

Signed-off-by: Menglong Dong 
---
 net/bridge/br_multicast.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index eae898c3cff7..9992fdff2951 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -3369,6 +3369,7 @@ void br_multicast_dev_del(struct net_bridge *br)
hlist_for_each_entry_safe(mp, tmp, &br->mdb_list, mdb_node)
br_multicast_del_mdb_entry(mp);
hlist_move_list(&br->mcast_gc_list, &deleted_head);
+   br_opt_toggle(br, BROPT_MULTICAST_ENABLED, false);
spin_unlock_bh(&br->multicast_lock);
 
br_multicast_gc(&deleted_head);
-- 
2.25.1



[PATCH] net: ipv6: remove redundant blank in ip6_frags_ns_sysctl_register

2020-11-02 Thread Menglong Dong
From: Menglong Dong 

This blank seems redundant.

Signed-off-by: Menglong Dong 
---
 net/ipv6/reassembly.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
index 1f5d4d196dcc..b1b8d104063b 100644
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -437,7 +437,6 @@ static int __net_init ip6_frags_ns_sysctl_register(struct 
net *net)
table = kmemdup(table, sizeof(ip6_frags_ns_ctl_table), 
GFP_KERNEL);
if (!table)
goto err_alloc;
-
}
table[0].data   = &net->ipv6.fqdir->high_thresh;
table[0].extra1 = &net->ipv6.fqdir->low_thresh;
-- 
2.28.0



Re: [PATCH] net: udp: increase UDP_MIB_RCVBUFERRORS when ENOBUFS

2020-10-26 Thread Menglong Dong
Hello~

On Mon, Oct 26, 2020 at 5:52 PM Paolo Abeni  wrote:
>
> Hello,
>
> On Mon, 2020-10-26 at 17:39 +0800, Menglong Dong wrote:
> > The error returned from __udp_enqueue_schedule_skb is ENOMEM or ENOBUFS.
> > For now, only ENOMEM is counted into UDP_MIB_RCVBUFERRORS in
> > __udp_queue_rcv_skb. UDP_MIB_RCVBUFERRORS should count all of the
> > failed skb because of memory errors during udp receiving, not just because 
> > of the limit of sock receive queue. We can see this
> > in __udp4_lib_mcast_deliver:
> >
> >   nskb = skb_clone(skb, GFP_ATOMIC);
> >
> >   if (unlikely(!nskb)) {
> >   atomic_inc(&sk->sk_drops);
> >   __UDP_INC_STATS(net, UDP_MIB_RCVBUFERRORS,
> >   IS_UDPLITE(sk));
> >   __UDP_INC_STATS(net, UDP_MIB_INERRORS,
> >   IS_UDPLITE(sk));
> >   continue;
> >   }
> >
> > See, UDP_MIB_RCVBUFERRORS is increased when skb clone failed. From this
> > point, ENOBUFS from __udp_enqueue_schedule_skb should be counted, too.
> > It means that the buffer used by all of the UDP sock is to the limit, and
> > it ought to be counted.
> >
> > Signed-off-by: Menglong Dong 
> > ---
> >  net/ipv4/udp.c | 4 +---
> >  net/ipv6/udp.c | 4 +---
> >  2 files changed, 2 insertions(+), 6 deletions(-)
> >
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index 09f0a23d1a01..49a69d8d55b3 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -2035,9 +2035,7 @@ static int __udp_queue_rcv_skb(struct sock *sk, 
> > struct sk_buff *skb)
> >   int is_udplite = IS_UDPLITE(sk);
> >
> >   /* Note that an ENOMEM error is charged twice */
> > - if (rc == -ENOMEM)
> > - UDP_INC_STATS(sock_net(sk), UDP_MIB_RCVBUFERRORS,
> > - is_udplite);
> > + UDP_INC_STATS(sock_net(sk), UDP_MIB_RCVBUFERRORS, is_udplite);
> >   UDP_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
> >   kfree_skb(skb);
> >   trace_udp_fail_queue_rcv_skb(rc, sk);
> > diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> > index 29d9691359b9..d5e23b150fd9 100644
> > --- a/net/ipv6/udp.c
> > +++ b/net/ipv6/udp.c
> > @@ -634,9 +634,7 @@ static int __udpv6_queue_rcv_skb(struct sock *sk, 
> > struct sk_buff *skb)
> >   int is_udplite = IS_UDPLITE(sk);
> >
> >   /* Note that an ENOMEM error is charged twice */
> > - if (rc == -ENOMEM)
> > - UDP6_INC_STATS(sock_net(sk),
> > -  UDP_MIB_RCVBUFERRORS, is_udplite);
> > + UDP6_INC_STATS(sock_net(sk), UDP_MIB_RCVBUFERRORS, 
> > is_udplite);
> >   UDP6_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
> >   kfree_skb(skb);
> >   return -1;
>
> The diffstat is nice, but I'm unsure we can do this kind of change
> (well, I really think we should not do it): it will fool any kind of
> existing users (application, scripts, admin) currently reading the
> above counters and expecting UDP_MIB_RCVBUFERRORS being increased with
> the existing schema.
>
> Cheers,
>
> Paolo
>

Well, your words make sense, this change isn't friendly for the existing users.
It really puzzled me when this ENOBUFS happened, no counters were done and
I hardly figured out what happened.

So, is it a good idea to introduce a 'UDP_MIB_MEMERRORS'?

Cheers,

Menglong Dong


[PATCH] net: udp: increase UDP_MIB_RCVBUFERRORS when ENOBUFS

2020-10-26 Thread Menglong Dong
The error returned from __udp_enqueue_schedule_skb is ENOMEM or ENOBUFS.
For now, only ENOMEM is counted into UDP_MIB_RCVBUFERRORS in
__udp_queue_rcv_skb. UDP_MIB_RCVBUFERRORS should count all of the
failed skb because of memory errors during udp receiving, not just
those because of the limit of sock receive queue. We can see this
in __udp4_lib_mcast_deliver:

nskb = skb_clone(skb, GFP_ATOMIC);

if (unlikely(!nskb)) {
atomic_inc(&sk->sk_drops);
__UDP_INC_STATS(net, UDP_MIB_RCVBUFERRORS,
IS_UDPLITE(sk));
__UDP_INC_STATS(net, UDP_MIB_INERRORS,
IS_UDPLITE(sk));
continue;
}

See, UDP_MIB_RCVBUFERRORS is increased when skb clone failed. From this
point, ENOBUFS from __udp_enqueue_schedule_skb should be counted, too.
It means that the buffer used by all of the UDP sock is to the limit, and
it ought to be counted.

Signed-off-by: Menglong Dong 
---
 net/ipv4/udp.c | 4 +---
 net/ipv6/udp.c | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 09f0a23d1a01..49a69d8d55b3 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2035,9 +2035,7 @@ static int __udp_queue_rcv_skb(struct sock *sk, struct 
sk_buff *skb)
int is_udplite = IS_UDPLITE(sk);
 
/* Note that an ENOMEM error is charged twice */
-   if (rc == -ENOMEM)
-   UDP_INC_STATS(sock_net(sk), UDP_MIB_RCVBUFERRORS,
-   is_udplite);
+   UDP_INC_STATS(sock_net(sk), UDP_MIB_RCVBUFERRORS, is_udplite);
UDP_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
kfree_skb(skb);
trace_udp_fail_queue_rcv_skb(rc, sk);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 29d9691359b9..d5e23b150fd9 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -634,9 +634,7 @@ static int __udpv6_queue_rcv_skb(struct sock *sk, struct 
sk_buff *skb)
int is_udplite = IS_UDPLITE(sk);
 
/* Note that an ENOMEM error is charged twice */
-   if (rc == -ENOMEM)
-   UDP6_INC_STATS(sock_net(sk),
-UDP_MIB_RCVBUFERRORS, is_udplite);
+   UDP6_INC_STATS(sock_net(sk), UDP_MIB_RCVBUFERRORS, is_udplite);
UDP6_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
kfree_skb(skb);
return -1;
-- 
2.28.0