Re: [PATCH net-next v3 1/3] ipv4: support sport, dport and ip_proto in RTM_GETROUTE
On Wed, May 16, 2018 at 7:36 PM, David Millerwrote: > From: Roopa Prabhu > Date: Wed, 16 May 2018 13:30:28 -0700 > >> yes, but we hold rcu read lock before calling the reply function for >> fib result. I did consider allocating the skb before the read >> lock..but then the refactoring (into a separate netlink reply func) >> would seem unnecessary. >> >> I am fine with pre-allocating and undoing the refactoring if that works >> better. > > Hmmm... I also notice that with this change we end up doing the > rtnl_unicast() under the RCU lock which is unnecessary too. that was unintentional, it seemed like the previous code did that too.. and you are right it did not. > > So yes, please pull the "out_skb" allocation before the > rcu_read_lock(), and push the rtnl_unicast() after the > rcu_read_unlock(). agreed, will do. > > It really is a shame that sharing the ETH_P_IP skb between the route > route lookup and the netlink response doesn't work properly. I did try a few things before giving up on the same skb...since it also seemed like keeping the netlink code separate would be a good thing for the future. > > I was using RTM_GETROUTE at one point for route/fib lookup performance > measurements. It never was great at that, but now that there is going > to be two SKB allocations instead of one it is going to be even less > useful for that kind of usage. oh...did not realize this use of it. It certainly seems like we should try to retain the single skb in that case. let me see what i can do. thanks.
Re: [PATCH net-next v3 1/3] ipv4: support sport, dport and ip_proto in RTM_GETROUTE
From: Roopa PrabhuDate: Wed, 16 May 2018 13:30:28 -0700 > yes, but we hold rcu read lock before calling the reply function for > fib result. I did consider allocating the skb before the read > lock..but then the refactoring (into a separate netlink reply func) > would seem unnecessary. > > I am fine with pre-allocating and undoing the refactoring if that works > better. Hmmm... I also notice that with this change we end up doing the rtnl_unicast() under the RCU lock which is unnecessary too. So yes, please pull the "out_skb" allocation before the rcu_read_lock(), and push the rtnl_unicast() after the rcu_read_unlock(). It really is a shame that sharing the ETH_P_IP skb between the route route lookup and the netlink response doesn't work properly. I was using RTM_GETROUTE at one point for route/fib lookup performance measurements. It never was great at that, but now that there is going to be two SKB allocations instead of one it is going to be even less useful for that kind of usage.
Re: [PATCH net-next v3 1/3] ipv4: support sport, dport and ip_proto in RTM_GETROUTE
On Wed, May 16, 2018 at 11:37 AM, David Millerwrote: > From: Roopa Prabhu > Date: Tue, 15 May 2018 20:55:06 -0700 > >> +static int inet_rtm_getroute_reply(struct sk_buff *in_skb, struct nlmsghdr >> *nlh, >> +__be32 dst, __be32 src, struct flowi4 *fl4, >> +struct rtable *rt, struct fib_result *res) >> +{ >> + struct net *net = sock_net(in_skb->sk); >> + struct rtmsg *rtm = nlmsg_data(nlh); >> + u32 table_id = RT_TABLE_MAIN; >> + struct sk_buff *skb; >> + int err = 0; >> + >> + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC); >> + if (!skb) >> + return -ENOMEM; > > If the caller can use GFP_KERNEL, so can this allocation. yes, but we hold rcu read lock before calling the reply function for fib result. I did consider allocating the skb before the read lock..but then the refactoring (into a separate netlink reply func) would seem unnecessary. I am fine with pre-allocating and undoing the refactoring if that works better.
Re: [PATCH net-next v3 1/3] ipv4: support sport, dport and ip_proto in RTM_GETROUTE
From: Roopa PrabhuDate: Tue, 15 May 2018 20:55:06 -0700 > +static int inet_rtm_getroute_reply(struct sk_buff *in_skb, struct nlmsghdr > *nlh, > +__be32 dst, __be32 src, struct flowi4 *fl4, > +struct rtable *rt, struct fib_result *res) > +{ > + struct net *net = sock_net(in_skb->sk); > + struct rtmsg *rtm = nlmsg_data(nlh); > + u32 table_id = RT_TABLE_MAIN; > + struct sk_buff *skb; > + int err = 0; > + > + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC); > + if (!skb) > + return -ENOMEM; If the caller can use GFP_KERNEL, so can this allocation.
[PATCH net-next v3 1/3] ipv4: support sport, dport and ip_proto in RTM_GETROUTE
From: Roopa PrabhuThis is a followup to fib rules sport, dport and ipproto match support. Only supports tcp, udp and icmp for ipproto. Used by fib rule self tests. Before this patch getroute used same skb to pass through the route lookup and for the netlink getroute reply msg. This patch allocates separate skb's to keep flow dissector happy. Signed-off-by: Roopa Prabhu --- include/net/ip.h | 3 + include/uapi/linux/rtnetlink.h | 3 + net/ipv4/Makefile | 2 +- net/ipv4/fib_frontend.c| 4 + net/ipv4/netlink.c | 23 + net/ipv4/route.c | 190 ++--- 6 files changed, 173 insertions(+), 52 deletions(-) create mode 100644 net/ipv4/netlink.c diff --git a/include/net/ip.h b/include/net/ip.h index bada1f1..0d2281b 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -664,4 +664,7 @@ extern int sysctl_icmp_msgs_burst; int ip_misc_proc_init(void); #endif +int rtm_getroute_parse_ip_proto(struct nlattr *attr, u8 *ip_proto, + struct netlink_ext_ack *extack); + #endif /* _IP_H */ diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h index 9b15005..cabb210 100644 --- a/include/uapi/linux/rtnetlink.h +++ b/include/uapi/linux/rtnetlink.h @@ -327,6 +327,9 @@ enum rtattr_type_t { RTA_PAD, RTA_UID, RTA_TTL_PROPAGATE, + RTA_IP_PROTO, + RTA_SPORT, + RTA_DPORT, __RTA_MAX }; diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile index b379520..13f2ba9 100644 --- a/net/ipv4/Makefile +++ b/net/ipv4/Makefile @@ -14,7 +14,7 @@ obj-y := route.o inetpeer.o protocol.o \ udp_offload.o arp.o icmp.o devinet.o af_inet.o igmp.o \ fib_frontend.o fib_semantics.o fib_trie.o fib_notifier.o \ inet_fragment.o ping.o ip_tunnel_core.o gre_offload.o \ -metrics.o +metrics.o netlink.o obj-$(CONFIG_NET_IP_TUNNEL) += ip_tunnel.o obj-$(CONFIG_SYSCTL) += sysctl_net_ipv4.o diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index f05afaf..d7092c5 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -643,6 +643,10 @@ const struct nla_policy rtm_ipv4_policy[RTA_MAX + 1] = { [RTA_ENCAP] = { .type = NLA_NESTED }, [RTA_UID] = { .type = NLA_U32 }, [RTA_MARK] = { .type = NLA_U32 }, + [RTA_TABLE] = { .type = NLA_U32 }, + [RTA_IP_PROTO] = { .type = NLA_U8 }, + [RTA_SPORT] = { .type = NLA_U16 }, + [RTA_DPORT] = { .type = NLA_U16 }, }; static int rtm_to_fib_config(struct net *net, struct sk_buff *skb, diff --git a/net/ipv4/netlink.c b/net/ipv4/netlink.c new file mode 100644 index 000..f86bb4f --- /dev/null +++ b/net/ipv4/netlink.c @@ -0,0 +1,23 @@ +#include +#include +#include +#include +#include +#include + +int rtm_getroute_parse_ip_proto(struct nlattr *attr, u8 *ip_proto, + struct netlink_ext_ack *extack) +{ + *ip_proto = nla_get_u8(attr); + + switch (*ip_proto) { + case IPPROTO_TCP: + case IPPROTO_UDP: + case IPPROTO_ICMP: + return 0; + default: + NL_SET_ERR_MSG(extack, "Unsupported ip proto"); + return -EOPNOTSUPP; + } +} +EXPORT_SYMBOL_GPL(rtm_getroute_parse_ip_proto); diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 29268ef..073c849 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -2569,11 +2569,10 @@ struct rtable *ip_route_output_flow(struct net *net, struct flowi4 *flp4, EXPORT_SYMBOL_GPL(ip_route_output_flow); /* called with rcu_read_lock held */ -static int rt_fill_info(struct net *net, __be32 dst, __be32 src, u32 table_id, - struct flowi4 *fl4, struct sk_buff *skb, u32 portid, - u32 seq) +static int rt_fill_info(struct net *net, __be32 dst, __be32 src, + struct rtable *rt, u32 table_id, struct flowi4 *fl4, + struct sk_buff *skb, u32 portid, u32 seq) { - struct rtable *rt = skb_rtable(skb); struct rtmsg *r; struct nlmsghdr *nlh; unsigned long expires = 0; @@ -2669,7 +2668,7 @@ static int rt_fill_info(struct net *net, __be32 dst, __be32 src, u32 table_id, } } else #endif - if (nla_put_u32(skb, RTA_IIF, skb->dev->ifindex)) + if (nla_put_u32(skb, RTA_IIF, fl4->flowi4_iif)) goto nla_put_failure; } @@ -2684,43 +2683,128 @@ static int rt_fill_info(struct net *net, __be32 dst, __be32 src, u32 table_id, return -EMSGSIZE; } +static int inet_rtm_getroute_reply(struct sk_buff *in_skb, struct nlmsghdr *nlh, + __be32 dst, __be32 src,