[iproute PATCH 12/51] iproute_lwtunnel: csum_mode value checking was ineffective
ila_csum_name2mode() returning -1 on error but being declared as returning __u8 doesn't make much sense. Change the code to correctly detect this issue. Checking for __u8 overruns shouldn't be necessary though since ila_csum_name2mode() return values are well-defined. Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/iproute_lwtunnel.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/ip/iproute_lwtunnel.c b/ip/iproute_lwtunnel.c index 5c0c7d110d23e..398ab5e077ed8 100644 --- a/ip/iproute_lwtunnel.c +++ b/ip/iproute_lwtunnel.c @@ -171,7 +171,7 @@ static char *ila_csum_mode2name(__u8 csum_mode) } } -static __u8 ila_csum_name2mode(char *name) +static int ila_csum_name2mode(char *name) { if (strcmp(name, "adj-transport") == 0) return ILA_CSUM_ADJUST_TRANSPORT; @@ -517,7 +517,7 @@ static int parse_encap_ila(struct rtattr *rta, size_t len, while (argc > 0) { if (strcmp(*argv, "csum-mode") == 0) { - __u8 csum_mode; + int csum_mode; NEXT_ARG(); @@ -526,7 +526,8 @@ static int parse_encap_ila(struct rtattr *rta, size_t len, invarg("\"csum-mode\" value is invalid\n", *argv); - rta_addattr8(rta, 1024, ILA_ATTR_CSUM_MODE, csum_mode); + rta_addattr8(rta, 1024, ILA_ATTR_CSUM_MODE, +(__u8)csum_mode); argc--; argv++; } else { -- 2.13.1
[iproute PATCH 44/51] tipc/bearer: Fix resource leak in error path
Signed-off-by: Phil Sutter <p...@nwl.cc> --- tipc/bearer.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tipc/bearer.c b/tipc/bearer.c index 810344f672af1..c3d4491f8f6ef 100644 --- a/tipc/bearer.c +++ b/tipc/bearer.c @@ -163,6 +163,7 @@ static int nl_add_udp_enable_opts(struct nlmsghdr *nlh, struct opt *opts, if (!remip) { if (generate_multicast(loc->ai_family, buf, sizeof(buf))) { fprintf(stderr, "Failed to generate multicast address\n"); + freeaddrinfo(loc); return -EINVAL; } remip = buf; @@ -177,6 +178,8 @@ static int nl_add_udp_enable_opts(struct nlmsghdr *nlh, struct opt *opts, if (rem->ai_family != loc->ai_family) { fprintf(stderr, "UDP local and remote AF mismatch\n"); + freeaddrinfo(rem); + freeaddrinfo(loc); return -EINVAL; } -- 2.13.1
[iproute PATCH 46/51] tipc/node: Fix socket fd check in cmd_node_get_addr()
socket() returns -1 on error, not 0. Signed-off-by: Phil Sutter <p...@nwl.cc> --- tipc/node.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tipc/node.c b/tipc/node.c index 201fe1a4df3bd..fe085aec9b4ac 100644 --- a/tipc/node.c +++ b/tipc/node.c @@ -109,7 +109,8 @@ static int cmd_node_get_addr(struct nlmsghdr *nlh, const struct cmd *cmd, socklen_t sz = sizeof(struct sockaddr_tipc); struct sockaddr_tipc addr; - if (!(sk = socket(AF_TIPC, SOCK_RDM, 0))) { + sk = socket(AF_TIPC, SOCK_RDM, 0); + if (sk < 0) { fprintf(stderr, "opening TIPC socket: %s\n", strerror(errno)); return -1; } -- 2.13.1
[iproute PATCH 43/51] tc/tc_filter: Make sure filter name is not empty
The later check for 'k[0] != 0' requires a non-empty filter name, otherwise NULL pointer dereference in 'q' might happen. Signed-off-by: Phil Sutter <p...@nwl.cc> --- tc/tc_filter.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tc/tc_filter.c b/tc/tc_filter.c index b13fb9185d4fd..a799edb35886d 100644 --- a/tc/tc_filter.c +++ b/tc/tc_filter.c @@ -412,6 +412,9 @@ static int tc_filter_get(int cmd, unsigned int flags, int argc, char **argv) usage(); return 0; } else { + if (!strlen(*argv)) + invarg("invalid filter name", *argv); + strncpy(k, *argv, sizeof(k)-1); q = get_filter_kind(k); -- 2.13.1
[iproute PATCH 15/51] ipvrf: Fix error path of vrf_switch()
Apart from trying to close(-1), this also leaked memory. Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/ipvrf.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/ip/ipvrf.c b/ip/ipvrf.c index 92e2db98ca7d7..75cc026d072b8 100644 --- a/ip/ipvrf.c +++ b/ip/ipvrf.c @@ -373,12 +373,12 @@ static int vrf_switch(const char *name) /* -1 on length to add '/' to the end */ if (ipvrf_get_netns(netns, sizeof(netns) - 1) < 0) - return -1; + goto out; if (vrf_path(vpath, sizeof(vpath)) < 0) { fprintf(stderr, "Failed to get base cgroup path: %s\n", strerror(errno)); - return -1; + goto out; } /* if path already ends in netns then don't add it again */ @@ -429,13 +429,14 @@ static int vrf_switch(const char *name) snprintf(pid, sizeof(pid), "%d", getpid()); if (write(fd, pid, strlen(pid)) < 0) { fprintf(stderr, "Failed to join cgroup\n"); - goto out; + goto out2; } rc = 0; +out2: + close(fd); out: free(mnt); - close(fd); return rc; } -- 2.13.1
[iproute PATCH 06/51] iplink_vrf: Complain if main table is not found
Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/iplink_vrf.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/ip/iplink_vrf.c b/ip/iplink_vrf.c index 917630e853375..809eda5de8f6e 100644 --- a/ip/iplink_vrf.c +++ b/ip/iplink_vrf.c @@ -131,7 +131,10 @@ __u32 ipvrf_get_table(const char *name) , sizeof(answer)) < 0) { /* special case "default" vrf to be the main table */ if (errno == ENODEV && !strcmp(name, "default")) - rtnl_rttable_a2n(_id, "main"); + if (rtnl_rttable_a2n(_id, "main")) + fprintf(stderr, + "BUG: RTTable \"main\" not found.\n"); + return tb_id; } -- 2.13.1
[iproute PATCH 07/51] ipmaddr: Avoid accessing uninitialized data
Looks like this can only happen if /proc/net/igmp is malformed, but better be sure. Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/ipmaddr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ip/ipmaddr.c b/ip/ipmaddr.c index 4f726fdd976f1..85a69e779563d 100644 --- a/ip/ipmaddr.c +++ b/ip/ipmaddr.c @@ -136,7 +136,7 @@ static void read_igmp(struct ma_info **result_p) while (fgets(buf, sizeof(buf), fp)) { struct ma_info *ma; - size_t len; + size_t len = 0; if (buf[0] != '\t') { sscanf(buf, "%d%s", , m.name); -- 2.13.1
[iproute PATCH 42/51] tc/q_netem: Don't dereference possibly NULL pointer
Assuming 'opt' might be NULL, move the call to RTA_PAYLOAD to after the check since it dereferences its parameter. Signed-off-by: Phil Sutter <p...@nwl.cc> --- tc/q_netem.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/tc/q_netem.c b/tc/q_netem.c index 0975ae111de97..7e3330512041a 100644 --- a/tc/q_netem.c +++ b/tc/q_netem.c @@ -538,7 +538,7 @@ static int netem_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt) int *ecn = NULL; struct tc_netem_qopt qopt; const struct tc_netem_rate *rate = NULL; - int len = RTA_PAYLOAD(opt) - sizeof(qopt); + int len; __u64 rate64 = 0; SPRINT_BUF(b1); @@ -546,6 +546,8 @@ static int netem_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt) if (opt == NULL) return 0; + len = RTA_PAYLOAD(opt) - sizeof(qopt); + if (len < 0) { fprintf(stderr, "options size error\n"); return -1; -- 2.13.1
[iproute PATCH 17/51] lib/bpf: Don't leak fp in bpf_find_mntpt()
If fopen() succeeded but len != PATH_MAX, the function leaks the open FILE pointer. Fix this by checking len value before calling fopen(). Signed-off-by: Phil Sutter <p...@nwl.cc> --- lib/bpf.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/lib/bpf.c b/lib/bpf.c index 4f52ad4a8f023..1dcb261dc915f 100644 --- a/lib/bpf.c +++ b/lib/bpf.c @@ -537,8 +537,11 @@ static const char *bpf_find_mntpt(const char *fstype, unsigned long magic, } } + if (len != PATH_MAX) + return NULL; + fp = fopen("/proc/mounts", "r"); - if (fp == NULL || len != PATH_MAX) + if (fp == NULL) return NULL; while (fscanf(fp, "%*s %" textify(PATH_MAX) "s %99s %*s %*d %*d\n", -- 2.13.1
[iproute PATCH 37/51] netem/maketable: Check return value of fscanf()
Signed-off-by: Phil Sutter <p...@nwl.cc> --- netem/maketable.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/netem/maketable.c b/netem/maketable.c index ad660e7d457f0..ccb8f0c68b062 100644 --- a/netem/maketable.c +++ b/netem/maketable.c @@ -38,8 +38,8 @@ readdoubles(FILE *fp, int *number) } for (i=0; i<limit; ++i){ - fscanf(fp, "%lf", [i]); - if (feof(fp)) + if (fscanf(fp, "%lf", [i]) != 1 || + feof(fp)) break; ++n; } -- 2.13.1
[iproute PATCH 18/51] lib/fs: Fix format string in find_fs_mount()
A field width of 4096 allows fscanf() to store that amount of characters into the given buffer, though that doesn't include the terminating NULL byte. Decrease the value by one to leave space for it. Signed-off-by: Phil Sutter <p...@nwl.cc> --- lib/fs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/fs.c b/lib/fs.c index c59ac564581d0..1ff881ecfcd8c 100644 --- a/lib/fs.c +++ b/lib/fs.c @@ -45,7 +45,7 @@ static char *find_fs_mount(const char *fs_to_find) return NULL; } - while (fscanf(fp, "%*s %4096s %127s %*s %*d %*d\n", + while (fscanf(fp, "%*s %4095s %127s %*s %*d %*d\n", path, fstype) == 2) { if (strcmp(fstype, fs_to_find) == 0) { mnt = strdup(path); -- 2.13.1
[iproute PATCH 19/51] lib/fs: Fix and simplify make_path()
Calling stat() before mkdir() is racey: The entry might change in between. Also, the call to stat() seems to exist only to check if the directory exists already. So simply call mkdir() unconditionally and catch only errors other than EEXIST. Signed-off-by: Phil Sutter <p...@nwl.cc> --- lib/fs.c | 20 +--- 1 file changed, 5 insertions(+), 15 deletions(-) diff --git a/lib/fs.c b/lib/fs.c index 1ff881ecfcd8c..ebe05cd44e11b 100644 --- a/lib/fs.c +++ b/lib/fs.c @@ -102,7 +102,6 @@ out: int make_path(const char *path, mode_t mode) { char *dir, *delim; - struct stat sbuf; int rc = -1; delim = dir = strdup(path); @@ -120,20 +119,11 @@ int make_path(const char *path, mode_t mode) if (delim) *delim = '\0'; - if (stat(dir, ) != 0) { - if (errno != ENOENT) { - fprintf(stderr, - "stat failed for %s: %s\n", - dir, strerror(errno)); - goto out; - } - - if (mkdir(dir, mode) != 0) { - fprintf(stderr, - "mkdir failed for %s: %s\n", - dir, strerror(errno)); - goto out; - } + rc = mkdir(dir, mode); + if (mkdir(dir, mode) != 0 && errno != EEXIST) { + fprintf(stderr, "mkdir failed for %s: %s\n", + dir, strerror(errno)); + goto out; } if (delim == NULL) -- 2.13.1
[iproute PATCH 16/51] xfrm_state: Make sure alg_name is NULL-terminated
Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/xfrm_state.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/ip/xfrm_state.c b/ip/xfrm_state.c index e11c93bf1c3b5..7c0389038986e 100644 --- a/ip/xfrm_state.c +++ b/ip/xfrm_state.c @@ -125,7 +125,8 @@ static int xfrm_algo_parse(struct xfrm_algo *alg, enum xfrm_attr_type_t type, fprintf(stderr, "warning: ALGO-NAME/ALGO-KEYMAT values will be sent to the kernel promiscuously! (verifying them isn't implemented yet)\n"); #endif - strncpy(alg->alg_name, name, sizeof(alg->alg_name)); + strncpy(alg->alg_name, name, sizeof(alg->alg_name) - 1); + alg->alg_name[sizeof(alg->alg_name) - 1] = '\0'; if (slen > 2 && strncmp(key, "0x", 2) == 0) { /* split two chars "0x" from the top */ -- 2.13.1
[iproute PATCH 14/51] ipvrf: Don't try to close an invalid fd
Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/ipvrf.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/ip/ipvrf.c b/ip/ipvrf.c index 0094cf8557cd7..92e2db98ca7d7 100644 --- a/ip/ipvrf.c +++ b/ip/ipvrf.c @@ -268,7 +268,7 @@ static int vrf_configure_cgroup(const char *path, int ifindex) fprintf(stderr, "Failed to open cgroup path: '%s'\n", strerror(errno)); - goto out; + return rc; } /* @@ -290,13 +290,14 @@ static int vrf_configure_cgroup(const char *path, int ifindex) if (bpf_prog_attach_fd(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE)) { fprintf(stderr, "Failed to attach prog to cgroup: '%s'\n", strerror(errno)); - goto out; + goto out2; } rc = 0; +out2: + close(prog_fd); out: close(cg_fd); - close(prog_fd); return rc; } -- 2.13.1
[iproute PATCH 50/51] Check user supplied interface name lengths
The original problem was that something like: | strncpy(ifr.ifr_name, *argv, IFNAMSIZ); might leave ifr.ifr_name unterminated if length of *argv exceeds IFNAMSIZ. In order to fix this, I thought about replacing all those cases with (equivalent) calls to snprintf() or even introducing strlcpy(). But as Ulrich Drepper correctly pointed out when rejecting the latter from being added to glibc, truncating a string without notifying the user is not to be considered good practice. So let's excercise what he suggested and reject empty or overlong interface names right from the start - this way calls to strncpy() like shown above become safe and the user has a chance to reconsider what he was trying to do. Note that this doesn't add calls to assert_valid_dev_name() to all places where user supplied interface name is parsed. In many cases, the interface must exist already and is therefore being looked up using ll_name_to_index(), so if_nametoindex() will perform the necessary checks already. Signed-off-by: Phil Sutter <p...@nwl.cc> --- include/utils.h | 1 + ip/ip6tunnel.c | 6 -- ip/ipl2tp.c | 1 + ip/iplink.c | 27 --- ip/ipmaddr.c| 1 + ip/iprule.c | 4 ip/iptunnel.c | 12 ip/iptuntap.c | 4 +++- lib/utils.c | 8 misc/arpd.c | 1 + 10 files changed, 39 insertions(+), 26 deletions(-) diff --git a/include/utils.h b/include/utils.h index 6080b962fb411..103def33c92f3 100644 --- a/include/utils.h +++ b/include/utils.h @@ -132,6 +132,7 @@ void missarg(const char *) __attribute__((noreturn)); void invarg(const char *, const char *) __attribute__((noreturn)); void duparg(const char *, const char *) __attribute__((noreturn)); void duparg2(const char *, const char *) __attribute__((noreturn)); +void assert_valid_dev_name(const char *, const char *); int matches(const char *arg, const char *pattern); int inet_addr_match(const inet_prefix *a, const inet_prefix *b, int bits); diff --git a/ip/ip6tunnel.c b/ip/ip6tunnel.c index b4a7def144226..b3ccfcf87be8f 100644 --- a/ip/ip6tunnel.c +++ b/ip/ip6tunnel.c @@ -180,7 +180,8 @@ static int parse_args(int argc, char **argv, int cmd, struct ip6_tnl_parm2 *p) memcpy(>laddr, , sizeof(p->laddr)); } else if (strcmp(*argv, "dev") == 0) { NEXT_ARG(); - strncpy(medium, *argv, IFNAMSIZ - 1); + assert_valid_dev_name("dev", *argv); + strncpy(medium, *argv, IFNAMSIZ); } else if (strcmp(*argv, "encaplimit") == 0) { NEXT_ARG(); if (strcmp(*argv, "none") == 0) { @@ -273,7 +274,8 @@ static int parse_args(int argc, char **argv, int cmd, struct ip6_tnl_parm2 *p) usage(); if (p->name[0]) duparg2("name", *argv); - strncpy(p->name, *argv, IFNAMSIZ - 1); + assert_valid_dev_name("name", *argv); + strncpy(p->name, *argv, IFNAMSIZ); if (cmd == SIOCCHGTUNNEL && count == 0) { struct ip6_tnl_parm2 old_p = {}; diff --git a/ip/ipl2tp.c b/ip/ipl2tp.c index 88664c909e11f..cdf9f4d8425d4 100644 --- a/ip/ipl2tp.c +++ b/ip/ipl2tp.c @@ -545,6 +545,7 @@ static int parse_args(int argc, char **argv, int cmd, struct l2tp_parm *p) } } else if (strcmp(*argv, "name") == 0) { NEXT_ARG(); + assert_valid_dev_name("name", *argv); p->ifname = *argv; } else if (strcmp(*argv, "remote") == 0) { NEXT_ARG(); diff --git a/ip/iplink.c b/ip/iplink.c index 5aff2fde38dae..8f76f0467fc9c 100644 --- a/ip/iplink.c +++ b/ip/iplink.c @@ -869,7 +869,6 @@ int iplink_parse(int argc, char **argv, struct iplink_req *req, static int iplink_modify(int cmd, unsigned int flags, int argc, char **argv) { - int len; char *dev = NULL; char *name = NULL; char *link = NULL; @@ -959,13 +958,9 @@ static int iplink_modify(int cmd, unsigned int flags, int argc, char **argv) } if (name) { - len = strlen(name) + 1; - if (len == 1) - invarg("\"\" is not a valid device identifier\n", - "name"); - if (len > IFNAMSIZ) - invarg("\"name\" too long\n", name); - addattr_l(, sizeof(req), IFLA_IFNAME, name, len); + assert_valid_dev_name("name", name); + addattr_l(, sizeof(req), + IFLA_IFNAME, name, strlen(name) +
[iproute PATCH 08/51] ipntable: No need to check and assign to parms_rta
This variable is initialized at declaration and nowhere else does any assignment to it happen, so just drop the check. Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/ipntable.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/ip/ipntable.c b/ip/ipntable.c index 879626ee4f491..1837909fa42e7 100644 --- a/ip/ipntable.c +++ b/ip/ipntable.c @@ -202,8 +202,6 @@ static int ipntable_modify(int cmd, int flags, int argc, char **argv) if (get_u32(, *argv, 0)) invarg("\"queue\" value is invalid", *argv); - if (!parms_rta) - parms_rta = (struct rtattr *)_buf; rta_addattr32(parms_rta, sizeof(parms_buf), NDTPA_QUEUE_LEN, queue); parms_change = 1; -- 2.13.1
[iproute PATCH 04/51] ipaddress: Avoid accessing uninitialized variable lcl
If no address was given, ipaddr_modify() accesses uninitialized data when assigning to req.ifa.ifa_prefixlen. Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/ipaddress.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ip/ipaddress.c b/ip/ipaddress.c index 3c9decb51b412..9307c9416dde3 100644 --- a/ip/ipaddress.c +++ b/ip/ipaddress.c @@ -1888,7 +1888,7 @@ static int ipaddr_modify(int cmd, int flags, int argc, char **argv) char *lcl_arg = NULL; char *valid_lftp = NULL; char *preferred_lftp = NULL; - inet_prefix lcl; + inet_prefix lcl = {}; inet_prefix peer; int local_len = 0; int peer_len = 0; -- 2.13.1
[iproute PATCH 33/51] ss: Don't leak fd in tcp_show_netlink_file()
Signed-off-by: Phil Sutter <p...@nwl.cc> --- misc/ss.c | 32 1 file changed, 20 insertions(+), 12 deletions(-) diff --git a/misc/ss.c b/misc/ss.c index 4d2f75b571ea6..cda5e3b6a2d6f 100644 --- a/misc/ss.c +++ b/misc/ss.c @@ -2687,41 +2687,44 @@ static int tcp_show_netlink_file(struct filter *f) { FILE*fp; charbuf[16384]; + int err = -1; if ((fp = fopen(getenv("TCPDIAG_FILE"), "r")) == NULL) { perror("fopen($TCPDIAG_FILE)"); - return -1; + return err; } while (1) { - int status, err; + int status, err2; struct nlmsghdr *h = (struct nlmsghdr *)buf; struct sockstat s = {}; status = fread(buf, 1, sizeof(*h), fp); if (status < 0) { perror("Reading header from $TCPDIAG_FILE"); - return -1; + break; } if (status != sizeof(*h)) { perror("Unexpected EOF reading $TCPDIAG_FILE"); - return -1; + break; } status = fread(h+1, 1, NLMSG_ALIGN(h->nlmsg_len-sizeof(*h)), fp); if (status < 0) { perror("Reading $TCPDIAG_FILE"); - return -1; + break; } if (status + sizeof(*h) < h->nlmsg_len) { perror("Unexpected EOF reading $TCPDIAG_FILE"); - return -1; + break; } /* The only legal exit point */ - if (h->nlmsg_type == NLMSG_DONE) - return 0; + if (h->nlmsg_type == NLMSG_DONE) { + err = 0; + break; + } if (h->nlmsg_type == NLMSG_ERROR) { struct nlmsgerr *err = (struct nlmsgerr *)NLMSG_DATA(h); @@ -2732,7 +2735,7 @@ static int tcp_show_netlink_file(struct filter *f) errno = -err->error; perror("TCPDIAG answered"); } - return -1; + break; } parse_diag_msg(h, ); @@ -2741,10 +2744,15 @@ static int tcp_show_netlink_file(struct filter *f) if (f && f->f && run_ssfilter(f->f, ) == 0) continue; - err = inet_show_sock(h, ); - if (err < 0) - return err; + err2 = inet_show_sock(h, ); + if (err2 < 0) { + err = err2; + break; + } } + + fclose(fp); + return err; } static int tcp_show(struct filter *f, int socktype) -- 2.13.1
[iproute PATCH 34/51] ss: Make sure scanned index value to unix_state_map is sane
Signed-off-by: Phil Sutter <p...@nwl.cc> --- misc/ss.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/misc/ss.c b/misc/ss.c index cda5e3b6a2d6f..667b8faad6528 100644 --- a/misc/ss.c +++ b/misc/ss.c @@ -3150,7 +3150,8 @@ static int unix_show(struct filter *f) if (flags & (1 << 16)) { u->state = SS_LISTEN; - } else { + } else if (u->state > 0 && + u->state <= ARRAY_SIZE(unix_state_map)) { u->state = unix_state_map[u->state-1]; if (u->type == SOCK_DGRAM && u->state == SS_CLOSE && u->rport) u->state = SS_ESTABLISHED; -- 2.13.1
[iproute PATCH 28/51] nstat: Avoid passing negative fd to fdopen()
Introduce a wrapper which does the sanity checking and returns NULL in case fd is invalid. Signed-off-by: Phil Sutter <p...@nwl.cc> --- misc/nstat.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/misc/nstat.c b/misc/nstat.c index 23e1569d7872b..c1e7ddec271e2 100644 --- a/misc/nstat.c +++ b/misc/nstat.c @@ -252,9 +252,16 @@ static void load_ugly_table(FILE *fp) } } +static FILE *fdopen_null(int fd, const char *mode) +{ + if (fd < 0) + return NULL; + return fdopen(fd, mode); +} + static void load_sctp_snmp(void) { - FILE *fp = fdopen(net_sctp_snmp_open(), "r"); + FILE *fp = fdopen_null(net_sctp_snmp_open(), "r"); if (fp) { load_good_table(fp); @@ -264,7 +271,7 @@ static void load_sctp_snmp(void) static void load_snmp(void) { - FILE *fp = fdopen(net_snmp_open(), "r"); + FILE *fp = fdopen_null(net_snmp_open(), "r"); if (fp) { load_ugly_table(fp); @@ -274,7 +281,7 @@ static void load_snmp(void) static void load_snmp6(void) { - FILE *fp = fdopen(net_snmp6_open(), "r"); + FILE *fp = fdopen_null(net_snmp6_open(), "r"); if (fp) { load_good_table(fp); @@ -284,7 +291,7 @@ static void load_snmp6(void) static void load_netstat(void) { - FILE *fp = fdopen(net_netstat_open(), "r"); + FILE *fp = fdopen_null(net_netstat_open(), "r"); if (fp) { load_ugly_table(fp); -- 2.13.1
[iproute PATCH 30/51] ss: Skip useless check in parse_hostcond()
The passed 'addr' parameter is dereferenced by caller before and in parse_hostcond() multiple times before this check, so assume it is always true. Signed-off-by: Phil Sutter <p...@nwl.cc> --- misc/ss.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/misc/ss.c b/misc/ss.c index b4f89c85c2d52..5ea388fbf1c1a 100644 --- a/misc/ss.c +++ b/misc/ss.c @@ -1671,7 +1671,7 @@ void *parse_hostcond(char *addr, bool is_port) } } } - if (!is_port && addr && *addr && *addr != '*') { + if (!is_port && *addr && *addr != '*') { if (get_prefix_1(, addr, fam)) { if (get_dns_host(, addr, fam)) { fprintf(stderr, "Error: an inet prefix is expected rather than \"%s\".\n", addr); -- 2.13.1
[iproute PATCH 41/51] tc/q_multiq: Don't pass garbage in TCA_OPTIONS
multiq_parse_opt() doesn't change 'opt' at all. So at least make sure it doesn't fill TCA_OPTIONS attribute with garbage from stack. Signed-off-by: Phil Sutter <p...@nwl.cc> --- tc/q_multiq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tc/q_multiq.c b/tc/q_multiq.c index 7823931494563..9c09c9a7748f6 100644 --- a/tc/q_multiq.c +++ b/tc/q_multiq.c @@ -43,7 +43,7 @@ static void explain(void) static int multiq_parse_opt(struct qdisc_util *qu, int argc, char **argv, struct nlmsghdr *n) { - struct tc_multiq_qopt opt; + struct tc_multiq_qopt opt = {}; if (argc) { if (strcmp(*argv, "help") == 0) { -- 2.13.1
[iproute PATCH 40/51] tc/m_xt: Fix for potential string buffer overflows
- Use strncpy() when writing to target->t->u.user.name and make sure the final byte remains untouched (xtables_calloc() set it to zero). - 'tname' length sanitization was completely wrong: If it's length exceeded the 16 bytes available in 'k', passing a length value of 16 to strncpy() would overwrite the previously NULL'ed 'k[15]'. Also, the sanitization has to happen if 'tname' is exactly 16 bytes long as well. Signed-off-by: Phil Sutter <p...@nwl.cc> --- tc/m_xt.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/tc/m_xt.c b/tc/m_xt.c index ad52d239caf61..9218b14594403 100644 --- a/tc/m_xt.c +++ b/tc/m_xt.c @@ -95,7 +95,8 @@ build_st(struct xtables_target *target, struct xt_entry_target *t) if (t == NULL) { target->t = xtables_calloc(1, size); target->t->u.target_size = size; - strcpy(target->t->u.user.name, target->name); + strncpy(target->t->u.user.name, target->name, + sizeof(target->t->u.user.name) - 1); target->t->u.user.revision = target->revision; if (target->init != NULL) @@ -277,8 +278,8 @@ static int parse_ipt(struct action_util *a, int *argc_p, } fprintf(stdout, " index %d\n", index); - if (strlen(tname) > 16) { - size = 16; + if (strlen(tname) >= 16) { + size = 15; k[15] = 0; } else { size = 1 + strlen(tname); -- 2.13.1
[iproute PATCH 39/51] tc/m_gact: Drop dead code
The use of 'ok' variable in parse_gact() is ineffective: The second conditional increments it either if *argv is 'gact' or if parse_action_control() doesn't fail (in which case exit() is called). So this is effectively an unconditional increment and since no decrement happens anywhere, all remaining checks for 'ok != 0' can be dropped. Signed-off-by: Phil Sutter <p...@nwl.cc> --- tc/m_gact.c | 14 +++--- 1 file changed, 3 insertions(+), 11 deletions(-) diff --git a/tc/m_gact.c b/tc/m_gact.c index 1a2583372c34e..df143c9e0953e 100644 --- a/tc/m_gact.c +++ b/tc/m_gact.c @@ -76,7 +76,6 @@ parse_gact(struct action_util *a, int *argc_p, char ***argv_p, { int argc = *argc_p; char **argv = *argv_p; - int ok = 0; struct tc_gact p = { 0 }; #ifdef CONFIG_GACT_PROB int rd = 0; @@ -89,17 +88,14 @@ parse_gact(struct action_util *a, int *argc_p, char ***argv_p, if (matches(*argv, "gact") == 0) { - ok++; argc--; argv++; - } else { - if (parse_action_control(, , , false) == -1) - usage(); - ok++; + } else if (parse_action_control(, , , false) == -1) { + usage();/* does not return */ } #ifdef CONFIG_GACT_PROB - if (ok && argc > 0) { + if (argc > 0) { if (matches(*argv, "random") == 0) { rd = 1; NEXT_ARG(); @@ -142,15 +138,11 @@ parse_gact(struct action_util *a, int *argc_p, char ***argv_p, } argc--; argv++; - ok++; } else if (matches(*argv, "help") == 0) { usage(); } } - if (!ok) - return -1; - tail = NLMSG_TAIL(n); addattr_l(n, MAX_MSG, tca_id, NULL, 0); addattr_l(n, MAX_MSG, TCA_GACT_PARMS, , sizeof(p)); -- 2.13.1
[iproute PATCH 03/51] ipaddress: Make buffer for filter.flushb static
The buffer is accessed outside of the function defining it, so make it static. Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/ipaddress.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ip/ipaddress.c b/ip/ipaddress.c index 4d37c5e045071..3c9decb51b412 100644 --- a/ip/ipaddress.c +++ b/ip/ipaddress.c @@ -1488,7 +1488,7 @@ static void ipaddr_filter(struct nlmsg_chain *linfo, struct nlmsg_chain *ainfo) static int ipaddr_flush(void) { int round = 0; - char flushb[4096-512]; + static char flushb[4096-512]; filter.flushb = flushb; filter.flushp = 0; -- 2.13.1
[iproute PATCH 45/51] tipc/bearer: Prevent NULL pointer dereference
Signed-off-by: Phil Sutter <p...@nwl.cc> --- tipc/bearer.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tipc/bearer.c b/tipc/bearer.c index c3d4491f8f6ef..0598328ab1f1b 100644 --- a/tipc/bearer.c +++ b/tipc/bearer.c @@ -438,8 +438,8 @@ static int cmd_bearer_enable(struct nlmsghdr *nlh, const struct cmd *cmd, if (err) return err; - opt = get_opt(opts, "media"); - if (strcmp(opt->val, "udp") == 0) { + if ((opt = get_opt(opts, "media")) && + strcmp(opt->val, "udp") == 0) { err = nl_add_udp_enable_opts(nlh, opts, cmdl); if (err) return err; -- 2.13.1
[iproute PATCH 00/51] Fix potential issues detected by Coverity tool
Covscan really wasn't amused (indicated by the number of patches in this series). Try to make it happy. Phil Sutter (51): devlink: Check return code of strslashrsplit() devlink: No need for this self-assignment ipaddress: Make buffer for filter.flushb static ipaddress: Avoid accessing uninitialized variable lcl iplink_can: Prevent overstepping array bounds iplink_vrf: Complain if main table is not found ipmaddr: Avoid accessing uninitialized data ipntable: No need to check and assign to parms_rta ipntable: Make sure filter.name is NULL-terminated iproute: Fix for missing 'Oifs:' display iproute: Check mark value input iproute_lwtunnel: csum_mode value checking was ineffective iproute_lwtunnel: Argument to strerror must be positive ipvrf: Don't try to close an invalid fd ipvrf: Fix error path of vrf_switch() xfrm_state: Make sure alg_name is NULL-terminated lib/bpf: Don't leak fp in bpf_find_mntpt() lib/fs: Fix format string in find_fs_mount() lib/fs: Fix and simplify make_path() lib/inet_proto: Make sure destination buffers are NULL-terminated lib/libnetlink: Don't pass NULL parameter to memcpy() lib/rt_names: Drop dead code in rtnl_rttable_n2a() ifstat: Fix memleak in error case ifstat, nstat: Check fdopen() return value ifstat: Fix memleak in dump_kern_db() for json output lnstat_util: Simplify alloc_and_open() a bit nstat: Fix for potential NULL pointer dereference nstat: Avoid passing negative fd to fdopen() ss: Use C99 initializer in netlink_show_one() ss: Skip useless check in parse_hostcond() ss: Drop useless assignment ss: Make sure index variable is >= 0 ss: Don't leak fd in tcp_show_netlink_file() ss: Make sure scanned index value to unix_state_map is sane ss: Fix potential memleak in unix_stats_print() netem/maketable: Check return value of fstat() netem/maketable: Check return value of fscanf() tc/em_ipset: Don't leak sockfd on error path tc/m_gact: Drop dead code tc/m_xt: Fix for potential string buffer overflows tc/q_multiq: Don't pass garbage in TCA_OPTIONS tc/q_netem: Don't dereference possibly NULL pointer tc/tc_filter: Make sure filter name is not empty tipc/bearer: Fix resource leak in error path tipc/bearer: Prevent NULL pointer dereference tipc/node: Fix socket fd check in cmd_node_get_addr() examples: Some shell fixes to cbq.init ifcfg: Quote left-hand side of [ ] expression lib/ll_map: Make sure im->name is NULL-terminated Check user supplied interface name lengths lib/bpf: Check return value of write() devlink/devlink.c| 18 ++- examples/cbq.init-v0.7.3 | 24 ++-- include/utils.h | 1 + ip/ifcfg | 2 +- ip/ip6tunnel.c | 6 +++-- ip/ipaddress.c | 4 ++-- ip/ipl2tp.c | 1 + ip/iplink.c | 27 +++ ip/iplink_can.c | 4 ++-- ip/iplink_vrf.c | 5 - ip/ipmaddr.c | 3 ++- ip/ipntable.c| 5 ++--- ip/iproute.c | 14 +++- ip/iproute_lwtunnel.c| 9 ip/iprule.c | 4 ip/iptunnel.c| 12 ++ ip/iptuntap.c| 4 +++- ip/ipvrf.c | 16 -- ip/xfrm_state.c | 3 ++- lib/bpf.c| 8 +-- lib/fs.c | 22 +-- lib/inet_proto.c | 9 +--- lib/libnetlink.c | 6 +++-- lib/ll_map.c | 4 ++-- lib/rt_names.c | 4 lib/utils.c | 8 +++ misc/arpd.c | 1 + misc/ifstat.c| 28 +--- misc/lnstat_util.c | 7 ++ misc/nstat.c | 33 +++- misc/ss.c| 57 +--- netem/maketable.c| 8 +++ tc/em_ipset.c| 1 + tc/m_gact.c | 14 +++- tc/m_xt.c| 7 +++--- tc/q_multiq.c| 2 +- tc/q_netem.c | 4 +++- tc/tc_filter.c | 3 +++ tipc/bearer.c| 7 -- tipc/node.c | 3 ++- 40 files changed, 230 insertions(+), 168 deletions(-) -- 2.13.1
[iproute PATCH 20/51] lib/inet_proto: Make sure destination buffers are NULL-terminated
Signed-off-by: Phil Sutter <p...@nwl.cc> --- lib/inet_proto.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/lib/inet_proto.c b/lib/inet_proto.c index ceda082b12a2e..87ed4769fc3da 100644 --- a/lib/inet_proto.c +++ b/lib/inet_proto.c @@ -35,8 +35,10 @@ const char *inet_proto_n2a(int proto, char *buf, int len) pe = getprotobynumber(proto); if (pe) { icache = proto; - strncpy(ncache, pe->p_name, 16); - strncpy(buf, pe->p_name, len); + strncpy(ncache, pe->p_name, 15); + ncache[15] = '\0'; + strncpy(buf, pe->p_name, len - 1); + buf[len] = '\0'; return buf; } snprintf(buf, len, "ipproto-%d", proto); @@ -62,7 +64,8 @@ int inet_proto_a2n(const char *buf) pe = getprotobyname(buf); if (pe) { icache = pe->p_proto; - strncpy(ncache, pe->p_name, 16); + strncpy(ncache, pe->p_name, 15); + ncache[15] = '\0'; return pe->p_proto; } return -1; -- 2.13.1
[iproute PATCH 36/51] netem/maketable: Check return value of fstat()
Otherwise info.st_size may contain garbage. Signed-off-by: Phil Sutter <p...@nwl.cc> --- netem/maketable.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/netem/maketable.c b/netem/maketable.c index 6aff927be7040..ad660e7d457f0 100644 --- a/netem/maketable.c +++ b/netem/maketable.c @@ -24,8 +24,8 @@ readdoubles(FILE *fp, int *number) int limit; int n=0, i; - fstat(fileno(fp), ); - if (info.st_size > 0) { + if (!fstat(fileno(fp), ) && + info.st_size > 0) { limit = 2*info.st_size/sizeof(double); /* @@ approximate */ } else { limit = 1; -- 2.13.1
[iproute PATCH 23/51] ifstat: Fix memleak in error case
Signed-off-by: Phil Sutter <p...@nwl.cc> --- misc/ifstat.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/misc/ifstat.c b/misc/ifstat.c index a853ee6d7e3b3..8fa354265a9a1 100644 --- a/misc/ifstat.c +++ b/misc/ifstat.c @@ -143,8 +143,10 @@ static int get_nlmsg_extended(const struct sockaddr_nl *who, struct rtattr *attr; attr = parse_rtattr_one_nested(sub_type, tb[filter_type]); - if (attr == NULL) + if (attr == NULL) { + free(n); return 0; + } memcpy(>val, RTA_DATA(attr), sizeof(n->val)); } memset(>rate, 0, sizeof(n->rate)); -- 2.13.1
[iproute PATCH 47/51] examples: Some shell fixes to cbq.init
This addresses the following issues: - $@ is an array, so don't use it in quoted strings - use $* instead. - Add missing quotes to components of [ ] expressions. These are not strictly necessary since the output of 'wc -l' should be a single word only, but in case of errors, bash prints "integer expression expected" instead of "too many arguments". - Use -print0/-0 when piping from find to xargs to allow for filenames which contain whitespace. - Quote arguments to 'eval' to prevent word-splitting. Signed-off-by: Phil Sutter <p...@nwl.cc> --- examples/cbq.init-v0.7.3 | 24 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/examples/cbq.init-v0.7.3 b/examples/cbq.init-v0.7.3 index 1bc0d446f8983..66448d88f0053 100644 --- a/examples/cbq.init-v0.7.3 +++ b/examples/cbq.init-v0.7.3 @@ -532,7 +532,7 @@ cbq_off () { ### Prefixed message cbq_message () { - echo -e "**CBQ: $@" + echo -e "**CBQ: $*" } # cbq_message ### Failure message @@ -560,15 +560,15 @@ cbq_time2abs () { ### Display CBQ setup cbq_show () { for dev in `cbq_device_list`; do - [ `tc qdisc show dev $dev| wc -l` -eq 0 ] && continue + [ "`tc qdisc show dev $dev| wc -l`" -eq 0 ] && continue echo -e "### $dev: queueing disciplines\n" tc $1 qdisc show dev $dev; echo - [ `tc class show dev $dev| wc -l` -eq 0 ] && continue + [ "`tc class show dev $dev| wc -l`" -eq 0 ] && continue echo -e "### $dev: traffic classes\n" tc $1 class show dev $dev; echo - [ `tc filter show dev $dev| wc -l` -eq 0 ] && continue + [ "`tc filter show dev $dev| wc -l`" -eq 0 ] && continue echo -e "### $dev: filtering rules\n" tc $1 filter show dev $dev; echo done @@ -585,7 +585,7 @@ cbq_init () { ### Gather all DEVICE fields from $1/cbq-* DEVFIELDS=`find $1 -maxdepth 1 \( -type f -or -type l \) -name 'cbq-*' \ - -not -name '*~' | xargs sed -n 's/#.*//; \ + -not -name '*~' -print0 | xargs -0 sed -n 's/#.*//; \ s/[[:space:]]//g; /^DEVICE=[^,]*,[^,]*\(,[^,]*\)\?/ \ { s/.*=//; p; }'| sort -u` [ -z "$DEVFIELDS" ] && @@ -593,7 +593,7 @@ cbq_init () { ### Check for different DEVICE fields for the same device DEVICES=`echo "$DEVFIELDS"| sed 's/,.*//'| sort -u` - [ `echo "$DEVICES"| wc -l` -ne `echo "$DEVFIELDS"| wc -l` ] && + [ "`echo "$DEVICES"| wc -l`" -ne "`echo "$DEVFIELDS"| wc -l`" ] && cbq_failure "different DEVICE fields for single device!\n$DEVFIELDS" } # cbq_init @@ -618,7 +618,7 @@ cbq_load_class () { PRIO_MARK=$PRIO_MARK_DEFAULT PRIO_REALM=$PRIO_REALM_DEFAULT - eval `echo "$CFILE"| grep -E "^($CBQ_WORDS)="` + eval "`echo "$CFILE"| grep -E "^($CBQ_WORDS)="`" ### Require RATE/WEIGHT [ -z "$RATE" -o -z "$WEIGHT" ] && @@ -661,7 +661,7 @@ if [ "$1" = "compile" ]; then ### echo-only version of "tc" command tc () { - echo "$TC $@" + echo "$TC $*" } # tc elif [ -n "$CBQ_DEBUG" ]; then @@ -669,13 +669,13 @@ elif [ -n "$CBQ_DEBUG" ]; then ### Logging version of "ip" command ip () { - echo -e "\n# ip $@" >> $CBQ_DEBUG + echo -e "\n# ip $*" >> $CBQ_DEBUG $IP "$@" 2>&1 | tee -a $CBQ_DEBUG } # ip ### Logging version of "tc" command tc () { - echo -e "\n# tc $@" >> $CBQ_DEBUG + echo -e "\n# tc $*" >> $CBQ_DEBUG $TC "$@" 2>&1 | tee -a $CBQ_DEBUG } # tc else @@ -711,8 +711,8 @@ if [ "$1" != "compile" -a "$2" != "nocache" -a -z "$CBQ_DEBUG" ]; then ### validate the cache [ "$2" = "invalidate" -o ! -f $CBQ_CACHE ] && VALID=0 if [ $VALID -eq 1 ]; then - [ `find $CBQ_PATH -maxdepth 1 -newer $CBQ_CACHE| \ - wc -l` -gt 0 ] && VALID=0 + [ "`find $CBQ_PATH -maxdepth 1 -newer $CBQ_CACHE| \ + wc -l`" -gt 0 ] && VALID=0 fi ### compile the config if the cache is invalid -- 2.13.1
[iproute PATCH 09/51] ipntable: Make sure filter.name is NULL-terminated
Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/ipntable.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/ip/ipntable.c b/ip/ipntable.c index 1837909fa42e7..30907146e85a3 100644 --- a/ip/ipntable.c +++ b/ip/ipntable.c @@ -631,7 +631,8 @@ static int ipntable_show(int argc, char **argv) } else if (strcmp(*argv, "name") == 0) { NEXT_ARG(); - strncpy(filter.name, *argv, sizeof(filter.name)); + strncpy(filter.name, *argv, sizeof(filter.name) - 1); + filter.name[sizeof(filter.name) - 1] = '\0'; } else invarg("unknown", *argv); -- 2.13.1
[iproute PATCH 11/51] iproute: Check mark value input
Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/iproute.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/ip/iproute.c b/ip/iproute.c index 89caac124f489..5fe8a3a75d5b7 100644 --- a/ip/iproute.c +++ b/ip/iproute.c @@ -1495,7 +1495,8 @@ static int iproute_list_flush_or_save(int argc, char **argv, int action) id = *argv; } else if (strcmp(*argv, "mark") == 0) { NEXT_ARG(); - get_unsigned(, *argv, 0); + if (get_unsigned(, *argv, 0)) + invarg("invalid mark value\n", *argv); filter.markmask = -1; } else if (strcmp(*argv, "via") == 0) { int family; @@ -1712,7 +1713,8 @@ static int iproute_get(int argc, char **argv) idev = *argv; } else if (matches(*argv, "mark") == 0) { NEXT_ARG(); - get_unsigned(, *argv, 0); + if (get_unsigned(, *argv, 0)) + invarg("invalid mark value\n", *argv); } else if (matches(*argv, "oif") == 0 || strcmp(*argv, "dev") == 0) { NEXT_ARG(); -- 2.13.1
[iproute PATCH 10/51] iproute: Fix for missing 'Oifs:' display
Covscan complained about dead code but after reading it, I assume the author's intention was to prefix the interface list with 'Oifs: '. Initializing first to 1 and setting it to 0 after above prefix was printed should fix it. Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/iproute.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/ip/iproute.c b/ip/iproute.c index cb695ad4141a7..89caac124f489 100644 --- a/ip/iproute.c +++ b/ip/iproute.c @@ -624,7 +624,7 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg) } if (tb[RTA_MULTIPATH]) { struct rtnexthop *nh = RTA_DATA(tb[RTA_MULTIPATH]); - int first = 0; + int first = 1; len = RTA_PAYLOAD(tb[RTA_MULTIPATH]); @@ -634,10 +634,12 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg) if (nh->rtnh_len > len) break; if (r->rtm_flags_F_CLONED && r->rtm_type == RTN_MULTICAST) { - if (first) + if (first) { fprintf(fp, "Oifs: "); - else + first = 0; + } else { fprintf(fp, " "); + } } else fprintf(fp, "%s\tnexthop ", _SL_); if (nh->rtnh_len > sizeof(*nh)) { -- 2.13.1
[iproute PATCH 05/51] iplink_can: Prevent overstepping array bounds
can_state_names array contains at most CAN_STATE_MAX fields, so allowing an index to it to be equal to that number is wrong. While here, also make sure the array is indeed that big so nothing bad happens if CAN_STATE_MAX ever increases. Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/iplink_can.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ip/iplink_can.c b/ip/iplink_can.c index 5df56b23b..2954010fefa22 100644 --- a/ip/iplink_can.c +++ b/ip/iplink_can.c @@ -251,7 +251,7 @@ static int can_parse_opt(struct link_util *lu, int argc, char **argv, return 0; } -static const char *can_state_names[] = { +static const char *can_state_names[CAN_STATE_MAX] = { [CAN_STATE_ERROR_ACTIVE] = "ERROR-ACTIVE", [CAN_STATE_ERROR_WARNING] = "ERROR-WARNING", [CAN_STATE_ERROR_PASSIVE] = "ERROR-PASSIVE", @@ -275,7 +275,7 @@ static void can_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[]) if (tb[IFLA_CAN_STATE]) { uint32_t state = rta_getattr_u32(tb[IFLA_CAN_STATE]); - fprintf(f, "state %s ", state <= CAN_STATE_MAX ? + fprintf(f, "state %s ", state < CAN_STATE_MAX ? can_state_names[state] : "UNKNOWN"); } -- 2.13.1
[iproute PATCH 48/51] ifcfg: Quote left-hand side of [ ] expression
This prevents word-splitting and therefore leads to more accurate error message in case 'grep -c' prints something other than a number. Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/ifcfg | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ip/ifcfg b/ip/ifcfg index 083d9df36742f..30a2dc49816cd 100644 --- a/ip/ifcfg +++ b/ip/ifcfg @@ -131,7 +131,7 @@ noarp=$? ip route add unreachable 224.0.0.0/24 >& /dev/null ip route add unreachable 255.255.255.255 >& /dev/null -if [ `ip link ls $dev | grep -c MULTICAST` -ge 1 ]; then +if [ "`ip link ls $dev | grep -c MULTICAST`" -ge 1 ]; then ip route add 224.0.0.0/4 dev $dev scope global >& /dev/null fi -- 2.13.1
[iproute PATCH 13/51] iproute_lwtunnel: Argument to strerror must be positive
Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/iproute_lwtunnel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ip/iproute_lwtunnel.c b/ip/iproute_lwtunnel.c index 398ab5e077ed8..1a3dc4d4c0ed9 100644 --- a/ip/iproute_lwtunnel.c +++ b/ip/iproute_lwtunnel.c @@ -643,7 +643,7 @@ static int lwt_parse_bpf(struct rtattr *rta, size_t len, err = bpf_parse_common(bpf_type, , _cb_ops, ); if (err < 0) { fprintf(stderr, "Failed to parse eBPF program: %s\n", - strerror(err)); + strerror(-err)); return -1; } rta_nest_end(rta, nest); -- 2.13.1
[iproute PATCH 01/51] devlink: Check return code of strslashrsplit()
This function shouldn't fail because all callers of __dl_argv_handle_port() make sure the passed string contains enough slashes already, but better make sure if this changes in future the function won't access uninitialized data. Signed-off-by: Phil Sutter <p...@nwl.cc> --- devlink/devlink.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/devlink/devlink.c b/devlink/devlink.c index f9bc16c350c40..de41a9f4aae10 100644 --- a/devlink/devlink.c +++ b/devlink/devlink.c @@ -526,18 +526,26 @@ static int __dl_argv_handle_port(char *str, char **p_bus_name, char **p_dev_name, uint32_t *p_port_index) { - char *handlestr = handlestr; - char *portstr = portstr; + char *handlestr; + char *portstr; int err; - strslashrsplit(str, , ); + err = strslashrsplit(str, , ); + if (err) { + pr_err("Port identification \"%s\" is invalid\n", str); + return err; + } err = strtouint32_t(portstr, p_port_index); if (err) { pr_err("Port index \"%s\" is not a number or not within range\n", portstr); return err; } - strslashrsplit(handlestr, p_bus_name, p_dev_name); + err = strslashrsplit(handlestr, p_bus_name, p_dev_name); + if (err) { + pr_err("Port identification \"%s\" is invalid\n", str); + return err; + } return 0; } -- 2.13.1
[iproute PATCH 38/51] tc/em_ipset: Don't leak sockfd on error path
Signed-off-by: Phil Sutter <p...@nwl.cc> --- tc/em_ipset.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tc/em_ipset.c b/tc/em_ipset.c index fab975f5ea563..b59756515d239 100644 --- a/tc/em_ipset.c +++ b/tc/em_ipset.c @@ -84,6 +84,7 @@ static int get_version(unsigned int *version) res = getsockopt(sockfd, SOL_IP, SO_IP_SET, _version, ); if (res != 0) { perror("xt_set getsockopt"); + close(sockfd); return -1; } -- 2.13.1
[iproute PATCH 26/51] lnstat_util: Simplify alloc_and_open() a bit
Relying upon callers and using unsafe strcpy() is probably not the best idea. Aside from that, using snprintf() allows to format the string for lf->path in one go. Signed-off-by: Phil Sutter <p...@nwl.cc> --- misc/lnstat_util.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/misc/lnstat_util.c b/misc/lnstat_util.c index cc54598fe1bef..ec19238c24b94 100644 --- a/misc/lnstat_util.c +++ b/misc/lnstat_util.c @@ -180,11 +180,8 @@ static struct lnstat_file *alloc_and_open(const char *path, const char *file) } /* initialize */ - /* de->d_name is guaranteed to be <= NAME_MAX */ - strcpy(lf->basename, file); - strcpy(lf->path, path); - strcat(lf->path, "/"); - strcat(lf->path, lf->basename); + snprintf(lf->basename, sizeof(lf->basename), "%s", file); + snprintf(lf->path, sizeof(lf->path), "%s/%s", path, file); /* initialize to default */ lf->interval.tv_sec = 1; -- 2.13.1
[iproute PATCH 27/51] nstat: Fix for potential NULL pointer dereference
If the string at 'p' contains neither space not newline, 'p' will become NULL. Make sure this isn't the case before dereferencing it. Signed-off-by: Phil Sutter <p...@nwl.cc> --- misc/nstat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/misc/nstat.c b/misc/nstat.c index a4dd405d43a93..23e1569d7872b 100644 --- a/misc/nstat.c +++ b/misc/nstat.c @@ -198,7 +198,7 @@ static void load_ugly_table(FILE *fp) off = p - buf; p += 2; - while (*p) { + while (p && *p) { char *next; if ((next = strchr(p, ' ')) != NULL) -- 2.13.1
[iproute PATCH 25/51] ifstat: Fix memleak in dump_kern_db() for json output
Looks like this was forgotten when converting to common json output formatter. Fixes: fcc16c2287bf8 ("provide common json output formatter") Signed-off-by: Phil Sutter <p...@nwl.cc> --- misc/ifstat.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/misc/ifstat.c b/misc/ifstat.c index 6c8cfa0bd94f5..ac3eff6b870a9 100644 --- a/misc/ifstat.c +++ b/misc/ifstat.c @@ -535,8 +535,12 @@ static void dump_kern_db(FILE *fp) else print_one_if(fp, n, n->val); } - if (json_output) - fprintf(fp, "\n} }\n"); + if (jw) { + jsonw_end_object(jw); + + jsonw_end_object(jw); + jsonw_destroy(); + } } static void dump_incr_db(FILE *fp) -- 2.13.1
[iproute PATCH 31/51] ss: Drop useless assignment
After '*b = *a', 'b->next' already has the same value as 'a->next'. Signed-off-by: Phil Sutter <p...@nwl.cc> --- misc/ss.c | 1 - 1 file changed, 1 deletion(-) diff --git a/misc/ss.c b/misc/ss.c index 5ea388fbf1c1a..d767b1103ea81 100644 --- a/misc/ss.c +++ b/misc/ss.c @@ -1440,7 +1440,6 @@ static int remember_he(struct aafilter *a, struct hostent *he) if ((b = malloc(sizeof(*b))) == NULL) return cnt; *b = *a; - b->next = a->next; a->next = b; } memcpy(b->addr.data, *ptr, len); -- 2.13.1
[iproute PATCH 35/51] ss: Fix potential memleak in unix_stats_print()
Fixes: 2d0e538f3e1cd ("ss: Drop list traversal from unix_stats_print()") Signed-off-by: Phil Sutter <p...@nwl.cc> --- misc/ss.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/misc/ss.c b/misc/ss.c index 667b8faad6528..7d84b83c8ad71 100644 --- a/misc/ss.c +++ b/misc/ss.c @@ -3170,8 +3170,10 @@ static int unix_show(struct filter *f) if (name[0]) { u->name = strdup(name); - if (!u->name) + if (!u->name) { + free(u); break; + } } if (u->rport) { -- 2.13.1
[iproute PATCH 51/51] lib/bpf: Check return value of write()
This is merely to silence the compiler warning. If write to stderr failed, assume that printing an error message will fail as well so don't even try. Signed-off-by: Phil Sutter <p...@nwl.cc> --- lib/bpf.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/lib/bpf.c b/lib/bpf.c index 1dcb261dc915f..825e071cea572 100644 --- a/lib/bpf.c +++ b/lib/bpf.c @@ -591,7 +591,8 @@ int bpf_trace_pipe(void) ret = read(fd, buff, sizeof(buff) - 1); if (ret > 0) { - write(2, buff, ret); + if (write(STDERR_FILENO, buff, ret) != ret) + return -1; fflush(stderr); } } -- 2.13.1
[iproute PATCH 29/51] ss: Use C99 initializer in netlink_show_one()
This has the additional benefit of initializing st.ino to zero which is used later in is_sctp_assoc() function. Signed-off-by: Phil Sutter <p...@nwl.cc> --- misc/ss.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/misc/ss.c b/misc/ss.c index f0d1c22f75cff..b4f89c85c2d52 100644 --- a/misc/ss.c +++ b/misc/ss.c @@ -3470,17 +3470,18 @@ static int netlink_show_one(struct filter *f, int rq, int wq, unsigned long long sk, unsigned long long cb) { - struct sockstat st; + struct sockstat st = { + .state = SS_CLOSE, + .rq = rq, + .wq = wq, + .local.family = AF_NETLINK, + .remote.family = AF_NETLINK, + }; SPRINT_BUF(prot_buf) = {}; const char *prot_name; char procname[64] = {}; - st.state = SS_CLOSE; - st.rq= rq; - st.wq= wq; - st.local.family = st.remote.family = AF_NETLINK; - if (f->f) { st.rport = -1; st.lport = pid; -- 2.13.1
[iproute PATCH 21/51] lib/libnetlink: Don't pass NULL parameter to memcpy()
Both addattr_l() and rta_addattr_l() may be called with NULL data pointer and 0 alen parameters. Avoid calling memcpy() in that case. Signed-off-by: Phil Sutter <p...@nwl.cc> --- lib/libnetlink.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/lib/libnetlink.c b/lib/libnetlink.c index 81a344abff27a..5e3adade77077 100644 --- a/lib/libnetlink.c +++ b/lib/libnetlink.c @@ -866,7 +866,8 @@ int addattr_l(struct nlmsghdr *n, int maxlen, int type, const void *data, rta = NLMSG_TAIL(n); rta->rta_type = type; rta->rta_len = len; - memcpy(RTA_DATA(rta), data, alen); + if (alen) + memcpy(RTA_DATA(rta), data, alen); n->nlmsg_len = NLMSG_ALIGN(n->nlmsg_len) + RTA_ALIGN(len); return 0; } @@ -953,7 +954,8 @@ int rta_addattr_l(struct rtattr *rta, int maxlen, int type, subrta = (struct rtattr *)(((char *)rta) + RTA_ALIGN(rta->rta_len)); subrta->rta_type = type; subrta->rta_len = len; - memcpy(RTA_DATA(subrta), data, alen); + if (alen) + memcpy(RTA_DATA(subrta), data, alen); rta->rta_len = NLMSG_ALIGN(rta->rta_len) + RTA_ALIGN(len); return 0; } -- 2.13.1
[iproute PATCH 22/51] lib/rt_names: Drop dead code in rtnl_rttable_n2a()
Since 'id' is 32bit unsigned, it can never exceed RT_TABLE_MAX (which is defined to 0x). Therefore drop that never matching conditional. Signed-off-by: Phil Sutter <p...@nwl.cc> --- lib/rt_names.c | 4 1 file changed, 4 deletions(-) diff --git a/lib/rt_names.c b/lib/rt_names.c index 04c15ff5b15f8..e5efd78e6f810 100644 --- a/lib/rt_names.c +++ b/lib/rt_names.c @@ -410,10 +410,6 @@ const char *rtnl_rttable_n2a(__u32 id, char *buf, int len) { struct rtnl_hash_entry *entry; - if (id > RT_TABLE_MAX) { - snprintf(buf, len, "%u", id); - return buf; - } if (!rtnl_rttable_init) rtnl_rttable_initialize(); entry = rtnl_rttable_hash[id & 255]; -- 2.13.1
[iproute PATCH] tc-simple: Fix documentation
- CONTROL has to come last, otherwise 'index' applies to gact and not simple itself. - Man page wasn't updated to reflect syntax changes. Signed-off-by: Phil Sutter <p...@nwl.cc> --- man/man8/tc-simple.8 | 29 ++--- tc/m_simple.c| 4 ++-- 2 files changed, 28 insertions(+), 5 deletions(-) diff --git a/man/man8/tc-simple.8 b/man/man8/tc-simple.8 index 2206dc3b88614..7363ab563e189 100644 --- a/man/man8/tc-simple.8 +++ b/man/man8/tc-simple.8 @@ -6,15 +6,37 @@ simple - basic example action .in +8 .ti -8 .BR tc " ... " "action simple" -.I STRING +[ +.BI sdata " STRING" +] [ +.BI index " INDEX" +] [ +.I CONTROL +] + +.ti -8 +.IR CONTROL " := {" +.BR reclassify " | " pipe " | " drop " | " continue " | " ok " }" + .SH DESCRIPTION This is a pedagogical example rather than an actually useful action. Upon every access, it prints the given .I STRING which may be of arbitrary length. .SH OPTIONS .TP -.I STRING +.BI sdata " STRING" The actual string to print. +.TP +.BI index " INDEX" +Optional action index value. +.TP +.I CONTROL +Indicate how +.B tc +should proceed after executing the action. For a description of the possible +.I CONTROL +values, see +.BR tc-actions (8). .SH EXAMPLES The following example makes the kernel yell "Incoming ICMP!" every time it sees an incoming ICMP on eth0. Steps are: @@ -36,7 +58,7 @@ display stats again and observe increment by 1 .EX hadi@noma1:$ tc qdisc add dev eth0 ingress hadi@noma1:$tc filter add dev eth0 parent : protocol ip prio 5 \\ -u32 match ip protocol 1 0xff flowid 1:1 action simple "Incoming ICMP" +u32 match ip protocol 1 0xff flowid 1:1 action simple sdata "Incoming ICMP" hadi@noma1:$ sudo tc -s filter ls dev eth0 parent : filter protocol ip pref 5 u32 @@ -74,3 +96,4 @@ display stats again and observe increment by 1 .EE .SH SEE ALSO .BR tc (8) +.BR tc-actions (8) diff --git a/tc/m_simple.c b/tc/m_simple.c index a4457c70324ee..800cf7d703be6 100644 --- a/tc/m_simple.c +++ b/tc/m_simple.c @@ -81,10 +81,10 @@ #endif static void explain(void) { - fprintf(stderr, "Usage:... simple [sdata STRING] [CONTROL] [index INDEX]\n"); + fprintf(stderr, "Usage:... simple [sdata STRING] [index INDEX] [CONTROL]\n"); fprintf(stderr, "\tSTRING being an arbitrary string\n" - "\tCONTROL := reclassify|pipe|drop|continue|ok\n" "\tINDEX := optional index value used\n"); + "\tCONTROL := reclassify|pipe|drop|continue|ok\n" } static void usage(void) -- 2.13.1
[iproute PATCH] bpf: Make bytecode-file reading a little more robust
bpf_parse_string() will now correctly handle: - Extraneous whitespace, - OPs on multiple lines and - overlong file names. The added feature of allowing to have OPs on multiple lines (like e.g. tcpdump prints them) is rather a side effect of fixing detection of malformed bytecode files having random content on a second line, like e.g.: | 4,40 0 0 12,21 0 1 2048,6 0 0 262144,6 0 0 0 | foobar Cc: Daniel Borkmann <dan...@iogearbox.net> Signed-off-by: Phil Sutter <p...@nwl.cc> --- lib/bpf.c | 32 1 file changed, 24 insertions(+), 8 deletions(-) diff --git a/lib/bpf.c b/lib/bpf.c index e7a4d12fdf2f9..4f52ad4a8f023 100644 --- a/lib/bpf.c +++ b/lib/bpf.c @@ -208,11 +208,11 @@ static int bpf_parse_string(char *arg, bool from_file, __u16 *bpf_len, if (from_file) { size_t tmp_len, op_len = sizeof("65535 255 255 4294967295,"); - char *tmp_string, *last; + char *tmp_string, *pos, c, c_prev = ' '; FILE *fp; tmp_len = sizeof("4096,") + BPF_MAXINSNS * op_len; - tmp_string = calloc(1, tmp_len); + tmp_string = pos = calloc(1, tmp_len); if (tmp_string == NULL) return -ENOMEM; @@ -223,17 +223,33 @@ static int bpf_parse_string(char *arg, bool from_file, __u16 *bpf_len, return -ENOENT; } - if (!fgets(tmp_string, tmp_len, fp)) { + while ((c = fgetc(fp)) != EOF) { + switch (c) { + case '\n': + if (c_prev != ',') + *(pos++) = ','; + break; + case ' ': + case '\t': + if (c_prev != ' ') + *(pos++) = c; + break; + default: + *(pos++) = c; + } + if (pos - tmp_string == tmp_len) + break; + c_prev = c; + } + + if (!feof(fp)) { free(tmp_string); fclose(fp); - return -EIO; + return -E2BIG; } fclose(fp); - - last = _string[strlen(tmp_string) - 1]; - if (*last == '\n') - *last = 0; + *pos = 0; *need_release = true; *bpf_string = tmp_string; -- 2.13.1
[iproute PATCH] iplink: Notify user if EEXIST error might be spurious
Back in the days when RTM_NEWLINK wasn't yet implemented, people had to rely upon kernel modules to create (virtual) interfaces for them. The number of those was usually defined via module parameter, and a sane default value was chosen. Now that iproute2 allows users to instantiate new interfaces at will, this is no longer required - though for backwards compatibility reasons, we're stuck with both methods which collide at the point when one tries to create an interface with a standard name for a type which exists in a kernel module: The kernel will load the module, which instantiates the interface and the following RTM_NEWLINK request will fail since the interface exists already. For instance: | # lsmod | grep dummy | # ip link show | grep dummy0 | # ip link add dummy0 type dummy | RTNETLINK answers: File exists | # ip link show | grep -c dummy0 | 1 There is no race-free solution in userspace for this dilemma as far as I can tell, so try to detect whether a user might have run into this and notify that the given error message might be irrelevant. Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/iplink.c | 35 ++- 1 file changed, 34 insertions(+), 1 deletion(-) diff --git a/ip/iplink.c b/ip/iplink.c index 5aff2fde38dae..f94fa96668d21 100644 --- a/ip/iplink.c +++ b/ip/iplink.c @@ -867,6 +867,33 @@ int iplink_parse(int argc, char **argv, struct iplink_req *req, return ret - argc; } +static bool link_name_is_standard(const char *type, const char *name) +{ + const struct { + const char *type; + const char *name; + } standard_links[] = { + { "dummy", "dummy0" }, + { "ifb", "ifb0" }, + { "ifb", "ifb1" }, + { "bond", "bond0" }, + { "ipip", "tunl0" }, + { "gre", "gre0" }, + { "gretap", "gretap0" }, + { "vti", "ip_vti0" }, + { "sit", "sit0" }, + }; + int i; + + for (i = 0; i < ARRAY_SIZE(standard_links); i++) { + if (strcmp(type, standard_links[i].type) || + strcmp(name, standard_links[i].name)) + continue; + return true; + } + return false; +} + static int iplink_modify(int cmd, unsigned int flags, int argc, char **argv) { int len; @@ -1007,8 +1034,14 @@ static int iplink_modify(int cmd, unsigned int flags, int argc, char **argv) return -1; } - if (rtnl_talk(, , NULL, 0) < 0) + if (rtnl_talk(, , NULL, 0) < 0) { + if (errno == EEXIST && + flags & NLM_F_CREATE && + link_name_is_standard(type, name)) + fprintf(stderr, + "Note: Kernel might have created interface upon module load.\n"); return -2; + } return 0; } -- 2.13.1
[iproute PATCH] Really fix get_addr() and get_prefix() error messages
Both functions take the desired address family as a parameter. So using that to notify the user what address family was expected is correct, unlike using dst->family which will tell the user only what address family was specified. The situation which commit 334af76143368 tried to fix was when 'ip' would accept addresses from multiple families. In that case, the family parameter is set to AF_UNSPEC so that get_addr_1() may accept any valid address. This patch introduces a wrapper around family_name() which returns the string "any valid" for AF_UNSPEC instead of the three question marks unsuitable for use in error messages. Tests for AF_UNSPEC: | # ip a a 256.10.166.1/24 dev d0 | Error: any valid prefix is expected rather than "256.10.166.1/24". | # ip neighbor add proxy 2001:db8::g dev d0 | Error: any valid address is expected rather than "2001:db8::g". Tests for explicit address family: | # ip -6 addrlabel add prefix 1.1.1.1/24 label 123 | Error: inet6 prefix is expected rather than "1.1.1.1/24". | # ip -4 addrlabel add prefix dead:beef::1/24 label 123 | Error: inet prefix is expected rather than "dead:beef::1/24". Reported-by: Jaroslav Aster <jas...@redhat.com> Fixes: 334af76143368 ("fix get_addr() and get_prefix() error messages") Signed-off-by: Phil Sutter <p...@nwl.cc> --- lib/utils.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/lib/utils.c b/lib/utils.c index 9aa3219c5547d..9143ed2284870 100644 --- a/lib/utils.c +++ b/lib/utils.c @@ -616,12 +616,19 @@ done: return err; } +static const char *family_name_verbose(int family) +{ + if (family == AF_UNSPEC) + return "any valid"; + return family_name(family); +} + int get_addr(inet_prefix *dst, const char *arg, int family) { if (get_addr_1(dst, arg, family)) { fprintf(stderr, "Error: %s address is expected rather than \"%s\".\n", - family_name(dst->family), arg); + family_name_verbose(family), arg); exit(1); } return 0; @@ -639,7 +646,7 @@ int get_prefix(inet_prefix *dst, char *arg, int family) if (get_prefix_1(dst, arg, family)) { fprintf(stderr, "Error: %s prefix is expected rather than \"%s\".\n", - family_name(dst->family), arg); + family_name_verbose(family), arg); exit(1); } return 0; -- 2.13.1
Re: [PATCH v3] ss: Enclose IPv6 address in brackets
On Tue, Aug 01, 2017 at 12:05:13PM +0200, Florian Lehner wrote: [...] > @@ -114,9 +114,13 @@ int addr64_n2a(__u64 addr, char *buff, size_t len); > int af_bit_len(int af); > int af_byte_len(int af); > > -const char *format_host_r(int af, int len, const void *addr, > -char *buf, int buflen); > -const char *format_host(int af, int lne, const void *addr); > +const char *format_host_rb(int af, int len, const void *addr, > +char *buf, int buflen, bool *resolved); > +#define format_host_r(af, len, addr, buf, buflen) \ > + format_host_rb(af, len, addr, buf, buflen, NULL) > +const char *format_host_b(int af, int lne, const void *addr, bool > *resolved); > +#define format_host(af, lne, addr) \ > + format_host_b(af, lne, addr, NULL) > #define format_host_rta(af, rta) \ > format_host(af, RTA_PAYLOAD(rta), RTA_DATA(rta)) > const char *rt_addr_n2a_r(int af, int len, const void *addr, > diff --git a/lib/utils.c b/lib/utils.c > index 9aa3219..42c3bf5 100644 > --- a/lib/utils.c > +++ b/lib/utils.c > @@ -898,8 +898,8 @@ static const char *resolve_address(const void *addr, > int len, int af) > } > #endif > > -const char *format_host_r(int af, int len, const void *addr, > - char *buf, int buflen) > +const char *format_host_rb(int af, int len, const void *addr, > + char *buf, int buflen, bool *resolved) > { > #ifdef RESOLVE_HOSTNAMES > if (resolve_hosts) { > @@ -909,17 +909,20 @@ const char *format_host_r(int af, int len, const > void *addr, > > if (len > 0 && > (n = resolve_address(addr, len, af)) != NULL) > + { > + *resolved = true; > return n; > + } > } > #endif > return rt_addr_n2a_r(af, len, addr, buf, buflen); > } Did you test that? I guess calling format_host() will lead to dereference of a NULL pointer. Cheers, Phil
Re: [PATCH v2] ss: Enclose IPv6 address in brackets
On Mon, Jul 31, 2017 at 09:50:04PM +0200, Florian Lehner wrote: > This updated patch adds support for RFC2732 IPv6 address format with > brackets for the tool ss. Resolved hostnames will not be enclosed in > brackets, therefore the global variable resolve_hosts is initialized and > checked. > > Signed-off-by: Lehner Florian> --- > misc/ss.c | 8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/misc/ss.c b/misc/ss.c > index 12763c9..ac94537 100644 > --- a/misc/ss.c > +++ b/misc/ss.c > @@ -88,7 +88,7 @@ static int security_get_initial_context(char *name, > char **context) > } > #endif > > -int resolve_hosts; > +int resolve_hosts = 0; Global variables are guaranteed to be initialized to zero. According to the web this is by C89. > int resolve_services = 1; > int preferred_family = AF_UNSPEC; > int show_options; > @@ -1059,7 +1059,11 @@ static void inet_addr_print(const inet_prefix *a, > int port, unsigned int ifindex > ap = format_host(AF_INET, 4, a->data); > } > } else { > - ap = format_host(a->family, 16, a->data); > + if (a->family == AF_INET6 && !resolve_hosts) { > + sprintf(buf, "[%s]", format_host(a->family, 16, > a->data)); > + } else { > + ap = format_host(a->family, 16, a->data); > + } > est_len = strlen(ap); > if (est_len <= addr_width) > est_len = addr_width; This won't work if name resolution was requested but failed. In that case an IPv6 address is returned but not enclosed in brackets. Cheers, Phil
Re: [PATCH] ss: Enclose IPv6 address in brackets
On Mon, Jul 31, 2017 at 09:27:55AM -0700, Stephen Hemminger wrote: > On Mon, 31 Jul 2017 12:30:10 +0200 > Phil Sutter <p...@nwl.cc> wrote: > > > On Sat, Jul 29, 2017 at 02:29:10PM +0200, Florian Lehner wrote: > > > This patch adds support for RFC2732 IPv6 address format with brackets > > > for the tool ss. So output for ss changes from > > > 2a00:1450:400a:804::200e:443 to [2a00:1450:400a:804::200e]:443 for IPv6 > > > addresses with attached port number. > > > > > > Signed-off-by: Lehner Florian <d...@der-flo.net> > > > --- > > > misc/ss.c | 6 +- > > > 1 file changed, 5 insertions(+), 1 deletion(-) > > > > > > diff --git a/misc/ss.c b/misc/ss.c > > > index 12763c9..db39c93 100644 > > > --- a/misc/ss.c > > > +++ b/misc/ss.c > > > @@ -1059,7 +1059,11 @@ static void inet_addr_print(const inet_prefix *a, > > > int port, unsigned int ifindex > > > ap = format_host(AF_INET, 4, a->data); > > > } > > > } else { > > > - ap = format_host(a->family, 16, a->data); > > > + if (a->family == AF_INET6) { > > > + sprintf(buf, "[%s]", format_host(a->family, 16, > > > a->data)); > > > + } else { > > > + ap = format_host(a->family, 16, a->data); > > > + } > > > est_len = strlen(ap); > > > if (est_len <= addr_width) > > > est_len = addr_width; > > > > Note that this will enclosed resolved hostnames in brackets as well, not > > sure if that's intended. Looks like fixing that is not exactly trivial: > > Hostname resolution is buried in format_host() which resides in > > lib/utils.c so is shared code with ip, tc, etc. Hence, adding the > > brackets in rt_addr_n2a_r() is not an option, either. Adding a 'bool *' > > param to format_host() and format_host_r() indicating that name > > resolution has happened might help here. > > > > Cheers, Phil > > Also, this code should return "*" for IN6ADDR_ANY Oh, really? It doesn't do that currently, and I always thought the IPv6 all-zero address was written '::' (or '[::]:1234' if a port is present). Cheers, Phil
Re: [PATCH] ss: Enclose IPv6 address in brackets
On Sat, Jul 29, 2017 at 02:29:10PM +0200, Florian Lehner wrote: > This patch adds support for RFC2732 IPv6 address format with brackets > for the tool ss. So output for ss changes from > 2a00:1450:400a:804::200e:443 to [2a00:1450:400a:804::200e]:443 for IPv6 > addresses with attached port number. > > Signed-off-by: Lehner Florian> --- > misc/ss.c | 6 +- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/misc/ss.c b/misc/ss.c > index 12763c9..db39c93 100644 > --- a/misc/ss.c > +++ b/misc/ss.c > @@ -1059,7 +1059,11 @@ static void inet_addr_print(const inet_prefix *a, > int port, unsigned int ifindex > ap = format_host(AF_INET, 4, a->data); > } > } else { > - ap = format_host(a->family, 16, a->data); > + if (a->family == AF_INET6) { > + sprintf(buf, "[%s]", format_host(a->family, 16, > a->data)); > + } else { > + ap = format_host(a->family, 16, a->data); > + } > est_len = strlen(ap); > if (est_len <= addr_width) > est_len = addr_width; Note that this will enclosed resolved hostnames in brackets as well, not sure if that's intended. Looks like fixing that is not exactly trivial: Hostname resolution is buried in format_host() which resides in lib/utils.c so is shared code with ip, tc, etc. Hence, adding the brackets in rt_addr_n2a_r() is not an option, either. Adding a 'bool *' param to format_host() and format_host_r() indicating that name resolution has happened might help here. Cheers, Phil
Re: [PATCHv2 iproute2] utils: return default family when rtm_family is not RTNL_FAMILY_IPMR/IP6MR
On Thu, Jul 27, 2017 at 05:44:15PM +0800, Hangbin Liu wrote: > When we get a multicast route, the rtm_type is RTN_MULTICAST, but the > rtm_family may be AF_INET. If we only check the type with RTNL_FAMILY_IPMR, > we will get malformed address. e.g. > > + ip -4 route add multicast 172.111.1.1 dev em1 table main > > Before fix: > + ip route list type multicast table main > multicast ac6f:101:800:400:400:0:3c00:0 dev em1 scope link > > After fix: > + ip route list type multicast table main > multicast 172.111.1.1 dev em1 scope link > > Fixes: 56e3eb4c3400 ("ip: route: fix multicast route dumps") > Signed-off-by: Hangbin Liu <liuhang...@gmail.com> Acked-by: Phil Sutter <p...@nwl.cc>
Re: [PATCH iproute2] utils: also check AF_INET family when rtm_type is RTN_MULTICAST
Hi Hangbin, On Thu, Jul 27, 2017 at 05:01:49PM +0800, Hangbin Liu wrote: [...] > diff --git a/lib/utils.c b/lib/utils.c > index e77bd30..0479e00 100644 > --- a/lib/utils.c > +++ b/lib/utils.c > @@ -1215,5 +1215,6 @@ int get_real_family(int rtm_type, int rtm_family) > if (rtm_type != RTN_MULTICAST) > return rtm_family; > > - return rtm_family == RTNL_FAMILY_IPMR ? AF_INET : AF_INET6; > + return (rtm_family == RTNL_FAMILY_IPMR || > + rtm_family == AF_INET) ? AF_INET : AF_INET6; > } I think this is not very readable. How about this instead: - return rtm_family == RTNL_FAMILY_IPMR ? AF_INET : AF_INET6; + if (rtm_family == RTNL_FAMILY_IPMR) + return AF_INET; + + return rtm_family; Thanks, Phil
Re: TC-pedit man page examples error
Hi Tyler, On Wed, Jul 26, 2017 at 09:47:51AM -0700, Tyler Bautista wrote: > Thank you for taking my message. I have gotten a solution working so > feel free to ignore the rest of this email. By vanilla iproute2 do you > mean the latest package through the yum or apt repository? All I can > tell you is that in the example I gave the iproute2 package was the > latest one through the RHEL 7 yum repository. 'Vanilla' is a widely used term[1], I meant using latest iproute2 from upstream[2], ideally from Git. You have to report problems with RHEL packages in Red Hat bugzilla[3]. Cheers, Phil [1] https://en.wikipedia.org/wiki/Vanilla_software [2] https://wiki.linuxfoundation.org/networking/iproute2 [3] https://bugzilla.redhat.com/
Re: TC-pedit man page examples error
Hi Tyler, On Tue, Jul 25, 2017 at 08:33:40AM -0700, Tyler Bautista wrote: > To whom it may concern, > I recently attempted to use simple tc action pedit commands on the man > page and I ran into some errors. The following is some information > about my version of iproute and my machine: > > the following is my iproute package information > > Loaded plugins: fastestmirror, langpacks > iproute-3.10.0-74.el7.x86_64 This mailing list is for upstream issues only. Can you reproduce the problem with vanilla iproute2? You are using RHEL7's iproute package, which has a number of known issues in pedit action: https://bugzilla.redhat.com/show_bug.cgi?id=1322406 These are planned to be resolved for RHEL7.5, not sure when it will land in CentOS. Cheers, Phil
[PATCH] lib: test_rhashtable: Fix KASAN warning
I forgot one spot when introducing struct test_obj_val. Fixes: e859afe1ee0c5 ("lib: test_rhashtable: fix for large entry counts") Reported by: kernel test robot <fengguang...@intel.com> Signed-off-by: Phil Sutter <p...@nwl.cc> --- lib/test_rhashtable.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c index 16949d219291a..0ffca990a8337 100644 --- a/lib/test_rhashtable.c +++ b/lib/test_rhashtable.c @@ -223,7 +223,9 @@ static s64 __init test_rhashtable(struct rhashtable *ht) pr_info(" Deleting %d keys\n", entries); for (i = 0; i < entries; i++) { - u32 key = i * 2; + struct test_obj_val key = { + .id = i * 2, + }; if (array[i].value.id != TEST_INSERT_FAIL) { obj = rhashtable_lookup_fast(ht, , test_rht_params); -- 2.13.1
[PATCH] lib: test_rhashtable: fix for large entry counts
During concurrent access testing, threadfunc() concatenated thread ID and object index to create a unique key like so: | tdata->objs[i].value = (tdata->id << 16) | i; This breaks if a user passes an entries parameter of 64k or higher, since 'i' might use more than 16 bits then. Effectively, this will lead to duplicate keys in the table. Fix the problem by introducing a struct holding object and thread ID and using that as key instead of a single integer type field. Fixes: f4a3e90ba5739 ("rhashtable-test: extend to test concurrency") Reported by: Manuel Messner <m...@skelett.io> Signed-off-by: Phil Sutter <p...@nwl.cc> --- lib/test_rhashtable.c | 53 +++ 1 file changed, 32 insertions(+), 21 deletions(-) diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c index 64e899b633371..16949d219291a 100644 --- a/lib/test_rhashtable.c +++ b/lib/test_rhashtable.c @@ -56,8 +56,13 @@ static bool enomem_retry = false; module_param(enomem_retry, bool, 0); MODULE_PARM_DESC(enomem_retry, "Retry insert even if -ENOMEM was returned (default: off)"); +struct test_obj_val { + int id; + int tid; +}; + struct test_obj { - int value; + struct test_obj_val value; struct rhash_head node; }; @@ -72,7 +77,7 @@ static struct test_obj array[MAX_ENTRIES]; static struct rhashtable_params test_rht_params = { .head_offset = offsetof(struct test_obj, node), .key_offset = offsetof(struct test_obj, value), - .key_len = sizeof(int), + .key_len = sizeof(struct test_obj_val), .hashfn = jhash, .nulls_base = (3U << RHT_BASE_SHIFT), }; @@ -109,24 +114,26 @@ static int __init test_rht_lookup(struct rhashtable *ht) for (i = 0; i < entries * 2; i++) { struct test_obj *obj; bool expected = !(i % 2); - u32 key = i; + struct test_obj_val key = { + .id = i, + }; - if (array[i / 2].value == TEST_INSERT_FAIL) + if (array[i / 2].value.id == TEST_INSERT_FAIL) expected = false; obj = rhashtable_lookup_fast(ht, , test_rht_params); if (expected && !obj) { - pr_warn("Test failed: Could not find key %u\n", key); + pr_warn("Test failed: Could not find key %u\n", key.id); return -ENOENT; } else if (!expected && obj) { pr_warn("Test failed: Unexpected entry found for key %u\n", - key); + key.id); return -EEXIST; } else if (expected && obj) { - if (obj->value != i) { + if (obj->value.id != i) { pr_warn("Test failed: Lookup value mismatch %u!=%u\n", - obj->value, i); + obj->value.id, i); return -EINVAL; } } @@ -195,7 +202,7 @@ static s64 __init test_rhashtable(struct rhashtable *ht) for (i = 0; i < entries; i++) { struct test_obj *obj = [i]; - obj->value = i * 2; + obj->value.id = i * 2; err = insert_retry(ht, >node, test_rht_params); if (err > 0) insert_retries += err; @@ -218,7 +225,7 @@ static s64 __init test_rhashtable(struct rhashtable *ht) for (i = 0; i < entries; i++) { u32 key = i * 2; - if (array[i].value != TEST_INSERT_FAIL) { + if (array[i].value.id != TEST_INSERT_FAIL) { obj = rhashtable_lookup_fast(ht, , test_rht_params); BUG_ON(!obj); @@ -242,18 +249,21 @@ static int thread_lookup_test(struct thread_data *tdata) for (i = 0; i < entries; i++) { struct test_obj *obj; - int key = (tdata->id << 16) | i; + struct test_obj_val key = { + .id = i, + .tid = tdata->id, + }; obj = rhashtable_lookup_fast(, , test_rht_params); - if (obj && (tdata->objs[i].value == TEST_INSERT_FAIL)) { - pr_err(" found unexpected object %d\n", key); + if (obj && (tdata->objs[i].value.id == TEST_INSERT_FAIL)) { + pr_err(" found unexpected object %d-%d\n", key.tid, key.id); err++; - } else if (!obj && (tdata->objs[i].value != TE
Re: [PATCH v2] netns: avoid directory traversal (was: ip netns: Make sure netns name is sane)
Hi Matteo, I don't think git-am understands the subject change semantics you use here, so please drop the '(was: ...)' part from it. On Thu, Jul 20, 2017 at 12:36:32AM +0200, Matteo Croce wrote: > v2: reword commit message In my opinion, this belongs to patch meta info (below the three-dashes line), not the commit message. Please put the maintainer in Cc when submitting patches. It helps them to keep track of it. This counts for all upstream projects, not just iproute2. Thanks, Phil
Re: [iproute PATCH] ip netns: Make sure netns name is sane
On Wed, Jul 12, 2017 at 09:38:34AM -0700, Stephen Hemminger wrote: > On Mon, 10 Jul 2017 13:19:12 +0200 > Phil Sutter <p...@nwl.cc> wrote: > > > +static bool is_basename(const char *name) > > +{ > > + char *name_dup = strdup(name); > > + bool rc = true; > > + > > + if (!name_dup) > > + return false; > > + > > + if (strcmp(basename(name_dup), name)) > > + rc = false; > > + > > + free(name_dup); > > + return rc; > > +} > > Looks like natural place to use strdupa. > > static bool is_basename(const char *name) > { > return strcmp(basename(strdupa(name), name) == 0; > } Good point! Anyway, my patch fails to cover 'ip netns del' command (apart from the '..' issue), so I'd suggest to instead apply Matteo's version (Message-ID 20170710120831.9355-1-mcr...@redhat.com). Thanks, Phil
Re: [iproute PATCH] ip netns: Make sure netns name is sane
On Mon, Jul 10, 2017 at 08:17:02AM -0700, Stephen Hemminger wrote: > On Mon, 10 Jul 2017 13:19:12 +0200 > Phil Sutter <p...@nwl.cc> wrote: > > > +static bool is_basename(const char *name) > > +{ > > + char *name_dup = strdup(name); > > + bool rc = true; > > + > > + if (!name_dup) > > + return false; > > + > > + if (strcmp(basename(name_dup), name)) > > + rc = false; > > + > > + free(name_dup); > > + return rc; > > +} > > Why not just: > > static bool is_basename(const char *name) > { > return strchr(name '/') == NULL; > } This is not sufficient since it doesn't cover netns names of '..' and '.', as Matteo correctly pointed out. Cheers, Phil
Re: net-next STATUS page
On Tue, Jul 11, 2017 at 05:05:06PM +0200, Jiri Pirko wrote: > Tue, Jul 11, 2017 at 04:24:16PM CEST, da...@davemloft.net wrote: > > > >It has gotten to the point that even casually walking around > >Faro, Portugal last week, random German tourists would stop > >me in the street and ask if net-next was open or not. > > > >Therefore, in order to avoid any and all confusion I have created this > >web site: > > > > http://vger.kernel.org/~davem/net-next.html > > Hallelujah! > > You remember I suggested this couple of years ago? :) Never underestimate the power of German tourists. I even heard stories about them causing bad dreams! Cheers, Phil
Re: [PATCH] netns: avoid directory traversal (was: ip netns: Make sure netns name is sane)
Hi Matteo, On Mon, Jul 10, 2017 at 02:08:31PM +0200, Matteo Croce wrote: > I noticed that your patch still leaves an uncovered scenario, the one where > the > namespace name is "." or "..". > Calling 'ip netns del ..' will remove /var/run which is a symlink to /run on > most systems causing some daemons, eg. dbus, to fail. Oh, indeed. My patch even checks the name for 'ip netns add' only! > ip netns doesn't validate input, allowing creation and deletion of files > relatives to /var/run/netns. > This patch denies creation or deletion of namespaces with names contaning > "/" or that matches exactly "." or "..". You might want to have a look at --scissors option to git-am for a more elegant way of sending a reply with attached patch. [...] > int do_netns(int argc, char **argv) > { > netns_nsid_socket_init(); > @@ -775,6 +780,11 @@ int do_netns(int argc, char **argv) > return netns_list(0, NULL); > } > > + if (argc > 1 && invalid_name(argv[1])) { > + fprintf(stderr, "Invalid netns name \"%s\"\n", argv[1]); > + exit(-1); > + } Maybe worth noting, this assumes argv[1] will be used for the netns name which doesn't hold for 'ip netns identify' command. It doesn't matter though, since netns_identify() enforces the parameter to be either "self" or a decimal number. Yet, in future a new subcommand might be added which requires this check to be refactored. Thanks, Phil
[iproute PATCH] ip netns: Make sure netns name is sane
In order to keep track of created netns, 'ip netns' creates a mount point inside NETNS_RUN_DIR. By not checking the user-specified name, it allowed to create that mount point outside of NETNS_RUN_DIR and hence lose track of it afterwards: | # ip netns add ../../tmp/foobar | # ip netns list | # mount | grep foobar | nsfs on /tmp/foobar type nsfs (rw) Prevent this by making sure basename() does not see a need to alter the given netns name. Fixes: 0dc34c7713bb7 ("iproute2: Add processless network namespace support") Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/ipnetns.c | 20 1 file changed, 20 insertions(+) diff --git a/ip/ipnetns.c b/ip/ipnetns.c index 0b0378ab6560c..4eee85e146b3d 100644 --- a/ip/ipnetns.c +++ b/ip/ipnetns.c @@ -595,6 +595,21 @@ static int create_netns_dir(void) return 0; } +static bool is_basename(const char *name) +{ + char *name_dup = strdup(name); + bool rc = true; + + if (!name_dup) + return false; + + if (strcmp(basename(name_dup), name)) + rc = false; + + free(name_dup); + return rc; +} + static int netns_add(int argc, char **argv) { /* This function creates a new network namespace and @@ -616,6 +631,11 @@ static int netns_add(int argc, char **argv) } name = argv[0]; + if (!is_basename(name)) { + fprintf(stderr, "Invalid netns name: contains path components\n"); + return -1; + } + snprintf(netns_path, sizeof(netns_path), "%s/%s", NETNS_RUN_DIR, name); if (create_netns_dir()) -- 2.13.1
[iproute PATCH] man: Collect names of man pages automatically
As it turned out, forgetting to add a man page to the respective Makefile when introducing it is a common mistake. Overcome this once and for all by using $(wildcard) function in Makefiles. Fixes: 7124942942e53 ("genl: add manpage") Fixes: 958cd210942c8 ("ifcfg: add manpage") Fixes: e1b7f883e50de ("man: add documentation for IPv6 SR commands") Fixes: 1949f82cdf62c ("Introduce ip vrf command") Fixes: 535194a172d23 ("tipc: add peer remove functionality") Signed-off-by: Phil Sutter <p...@nwl.cc> --- man/man3/Makefile | 2 +- man/man7/Makefile | 2 +- man/man8/Makefile | 21 + 3 files changed, 3 insertions(+), 22 deletions(-) diff --git a/man/man3/Makefile b/man/man3/Makefile index bf55658c96746..a98741de29262 100644 --- a/man/man3/Makefile +++ b/man/man3/Makefile @@ -1,4 +1,4 @@ -MAN3PAGES=libnetlink.3 +MAN3PAGES = $(wildcard *.3) all: diff --git a/man/man7/Makefile b/man/man7/Makefile index ccfd8398c5f29..689fc713b6729 100644 --- a/man/man7/Makefile +++ b/man/man7/Makefile @@ -1,4 +1,4 @@ -MAN7PAGES = tc-hfsc.7 +MAN7PAGES = $(wildcard *.7) all: diff --git a/man/man8/Makefile b/man/man8/Makefile index f33186446819e..12af66be4bc73 100644 --- a/man/man8/Makefile +++ b/man/man8/Makefile @@ -1,25 +1,6 @@ TARGETS = ip-address.8 ip-link.8 ip-route.8 -MAN8PAGES = $(TARGETS) ip.8 arpd.8 lnstat.8 routel.8 rtacct.8 rtmon.8 rtpr.8 ss.8 \ - tc.8 tc-bfifo.8 tc-bpf.8 tc-cbq.8 tc-cbq-details.8 tc-choke.8 tc-codel.8 \ - tc-fq.8 \ - tc-drr.8 tc-ematch.8 tc-fq_codel.8 tc-hfsc.8 tc-htb.8 tc-pie.8 \ - tc-mqprio.8 tc-netem.8 tc-pfifo.8 tc-pfifo_fast.8 tc-prio.8 tc-red.8 \ - tc-sfb.8 tc-sfq.8 tc-stab.8 tc-tbf.8 \ - bridge.8 rtstat.8 ctstat.8 nstat.8 routef.8 \ - ip-addrlabel.8 ip-fou.8 ip-gue.8 ip-l2tp.8 ip-macsec.8 \ - ip-maddress.8 ip-monitor.8 ip-mroute.8 ip-neighbour.8 \ - ip-netns.8 ip-ntable.8 ip-rule.8 ip-tunnel.8 ip-xfrm.8 \ - ip-tcp_metrics.8 ip-netconf.8 ip-token.8 \ - tipc.8 tipc-bearer.8 tipc-link.8 tipc-media.8 tipc-nametable.8 \ - tipc-node.8 tipc-socket.8 \ - tc-basic.8 tc-cgroup.8 tc-flow.8 tc-flower.8 tc-fw.8 tc-route.8 \ - tc-tcindex.8 tc-u32.8 tc-matchall.8 \ - tc-connmark.8 tc-csum.8 tc-mirred.8 tc-nat.8 tc-pedit.8 tc-police.8 \ - tc-simple.8 tc-skbedit.8 tc-vlan.8 tc-xt.8 tc-skbmod.8 tc-ife.8 \ - tc-tunnel_key.8 tc-sample.8 \ - devlink.8 devlink-dev.8 devlink-monitor.8 devlink-port.8 devlink-sb.8 \ - ifstat.8 +MAN8PAGES = $(TARGETS) $(filter-out $(TARGETS),$(wildcard *.8)) all: $(TARGETS) -- 2.13.1
[iproute PATCH] tc: m_xt: Prevent a segfault in libipt
From: Phil Sutter <psut...@redhat.com> This happens with NAT targets, such as SNAT, DNAT and MASQUERADE. These are still not usable with this patch, but at least tc doesn't crash anymore when one tries to use them. Signed-off-by: Phil Sutter <p...@nwl.cc> --- tc/m_xt.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/tc/m_xt.c b/tc/m_xt.c index e59df8e10afef..ad52d239caf61 100644 --- a/tc/m_xt.c +++ b/tc/m_xt.c @@ -146,6 +146,9 @@ static int parse_ipt(struct action_util *a, int *argc_p, char ***argv_p, int tca_id, struct nlmsghdr *n) { struct xtables_target *m = NULL; +#if XTABLES_VERSION_CODE >= 6 + struct ipt_entry fw = {}; +#endif struct rtattr *tail; int c; @@ -206,7 +209,7 @@ static int parse_ipt(struct action_util *a, int *argc_p, default: #if XTABLES_VERSION_CODE >= 6 if (m != NULL && m->x6_parse != NULL) { - xtables_option_tpcall(c, argv, 0, m, NULL); + xtables_option_tpcall(c, argv, 0, m, ); #else if (m != NULL && m->parse != NULL) { m->parse(c - m->option_offset, argv, 0, -- 2.11.0
Re: iproute2 ss outputs duplicate tcp sockets info on kernel 3.10.105
Hi, Cc'ing Cyrill who wrote the code in question. Maybe he has an idea what's going wrong here. Cheers, Phil On Mon, May 08, 2017 at 06:56:04PM -0700, Li Er wrote: > i'm using v4.11.0 release of iproute2 and kernel 3.10.105, simply > running > > $ ss > Netid State Recv-Q Send-Q Local Address:Port > Peer Address:Port > tcpCLOSE-WAIT 4340 10.0.0.1:47931 > 65.49.18.136:https > tcpCLOSE-WAIT 4320 10.0.0.1:47932 > 65.49.18.136:https > tcpCLOSE-WAIT 4340 10.0.0.1:47931 > 65.49.18.136:https > tcpCLOSE-WAIT 4320 10.0.0.1:47932 > 65.49.18.136:https > > as you can see, there's one duplicate entry of each tcp socket, > however, if i explicitly specify tcp socket by adding the -t > switch, > > $ ss -t > State Recv-Q Send-Q Local Address:Port Peer > Address:Port > CLOSE-WAIT 4340 10.0.0.1:47931 > 65.49.18.136:https > CLOSE-WAIT 4320 10.0.0.1:47932 > 65.49.18.136:https > > the duplication is gone. > > this problem also occurs on git master, but not on iproute2 > v4.3.0, so i did a git bisect and found out the commit which > caused this is 9f66764e308e9c645b3fb2d1cfbb7fb207eb5de1, and by > revert this commit on git master, i.e. removing > > err = rtnl_dump_done(rth, h); > if (err < 0) > return -1; > > these 3 lines of code of lib/libnetlink.c, the problem is gone. > > since i'm not familiar with the source code, i doubt this is the > right way to solve the problem, what's your suggestions? thanks.
Re: [RFC] iproute: Add support for extended ack to rtnl_talk
Hi, On Thu, May 04, 2017 at 09:43:56AM -0700, Stephen Hemminger wrote: > On Thu, 04 May 2017 10:41:03 -0400 (EDT) > David Millerwrote: > > > From: David Ahern > > Date: Thu, 4 May 2017 08:27:35 -0600 > > > > > On 5/4/17 3:36 AM, Daniel Borkmann wrote: > > >> What is the clear benefit/rationale of outsourcing this to > > >> libmnl? I always was the impression we should strive for as little > > >> dependencies as possible? > > > > > > +1 > > > > Agreed, all else being equal iproute2 should be as self contained > > as possible since it is such a fundamental tool. > > Sorry, the old netlink code is more difficult to understand than libmnl. > Having dependency on a library is not a problem. There already is > an alternative implementation of ip commands in busybox for those > people trying to work in small environments. I second that. If you can't afford the extra ~24KB of libmnl on your system, you much rather can't afford the 20 times bigger ip binary, either. Regarding conversion to libmnl, which I investigated and started working on once: My gut feeling back then was that it's not quite worth the effor since iproute2 requires an intermediate layer of functions anyway. Another detail which I didn't like that much was libmnl's idiom of creating netlink messages on base of just a plain buffer and using mnl_nlmsg_put_header() et al. to populate it with data. I'm probably a bit biased since I did the conversion to c99-style initializers for the various struct req data types, but I didn't like the added run-time overhead to achieve just the same. So in summary, given that very little change happens to iproute2's internal libnetlink, I don't see much urge to make it use libmnl as backend. In my opinion it just adds another potential source of errors. Eventually this should be a maintainer level decision, though. :) Cheers, Phil
[iproute PATCH] man: ip.8: Document -brief flag
Brief output is especially useful for new users, so at least mention it's existence in ip man page. Signed-off-by: Phil Sutter <p...@nwl.cc> --- man/man8/ip.8 | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/man/man8/ip.8 b/man/man8/ip.8 index 1c5a7419e4fc2..ae018fdf11ac9 100644 --- a/man/man8/ip.8 +++ b/man/man8/ip.8 @@ -48,7 +48,8 @@ ip \- show / manipulate routing, devices, policy routing and tunnels \fB\-ts\fR[\fIhort\fR] | \fB\-n\fR[\fIetns\fR] name | \fB\-a\fR[\fIll\fR] | -\fB\-c\fR[\fIolor\fR] } +\fB\-c\fR[\fIolor\fR] +\fB\-br\fR[\fIief\fR] } .SH OPTIONS @@ -206,6 +207,11 @@ Set the netlink socket receive buffer size, defaults to 1MB. .BR "\-iec" print human readable rates in IEC units (e.g. 1Ki = 1024). +.TP +.BR "\-br" , "\-brief" +Print only basic information in a tabular format for better readability. This option is currently only supported by +.BR "ip addr show " and " ip link show " commands. + .SH IP - COMMAND SYNTAX .SS -- 2.11.0
[iproute PATCH] man: ip-rule.8: Further clarify how to interpret priority value
Despite the past changes, users seemed to get confused by the seemingly contradictory relation of priority value and actual rule priority. Signed-off-by: Phil Sutter <p...@nwl.cc> --- man/man8/ip-rule.8 | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/man/man8/ip-rule.8 b/man/man8/ip-rule.8 index 7de80f3e6db9f..a5c479811927f 100644 --- a/man/man8/ip-rule.8 +++ b/man/man8/ip-rule.8 @@ -95,7 +95,10 @@ Each policy routing rule consists of a .B selector and an .B action predicate. -The RPDB is scanned in order of decreasing priority. The selector +The RPDB is scanned in order of decreasing priority (note that lower number +means higher priority, see the description of +.I PREFERENCE +below). The selector of each rule is applied to {source address, destination address, incoming interface, tos, fwmark} and, if the selector matches the packet, the action is performed. The action predicate may return with success. @@ -225,7 +228,8 @@ value to match. .BI priority " PREFERENCE" the priority of this rule. .I PREFERENCE -is an unsigned integer value, higher number means lower priority. Each rule +is an unsigned integer value, higher number means lower priority, and rules get +processed in order of increasing number. Each rule should have an explicitly set .I unique priority value. -- 2.11.0
Re: [iproute PATCH] ip-route: Prevent some other double spaces in output
On Wed, Apr 05, 2017 at 07:10:26PM +0200, Timothy Redaelli wrote: > Print spaces only after text. > > CC: Phil Sutter <p...@nwl.cc> > Signed-off-by: Timothy Redaelli <tredae...@redhat.com> Acked-by: Phil Sutter <p...@nwl.cc> > Fixed all the problems reported in V2 > > Tested with improved script https://da.gd/tUVso that tests ip route and ip get > for both ipv4 and ipv6. > > If it's OK I'd like to send it upstream This is obviously a leftover from internal review. :) Cheers, Phil
Re: [PATCH iproute2] man: fix man page warnings
On Sun, Mar 26, 2017 at 09:11:14PM +0200, Alexander Alemayhu wrote: > While generating PDFs from the man pages, I saw the warning below from > several files. Compared the tc-matchall.8 with bridge.8 and used .RI > instead of .R. It should have no effect on the man page rendering. > > `R' is a string (producing the registered sign), not a macro. > > Signed-off-by: Alexander AlemayhuI know this has been applied already, but let me get this straight: People using '.R' try to force "normal" font, like for every second parameter to '.IR' but in fact they could just leave the macro away since lines starting without any macro will get normal font settings anyway. With this in mind, I see better solutions for the changes in this patch: [...] > @@ -20,7 +20,7 @@ flower \- flow based traffic control filter > .B indev > .IR ifname " | " > .BR skip_sw " | " skip_hw > -.R " | { " > +.RI " | { " Just merge this into previous line: | .BR skip_sw " | " skip_hw " | {" Note the missing space at the end, as that comes for free anyway. :) [...] > @@ -13,7 +13,7 @@ IFE - encapsulate/decapsulate metadata > .IR SMAC " ] " > .RB "[ " type > .IR TYPE " ] " > -.R "[ " > +.RI "[ " Same here: | .IR TYPE " ] [" [...] > @@ -7,7 +7,7 @@ matchall \- traffic control filter that matches every packet > .ti -8 > .BR tc " " filter " ... " matchall " [ " > .BR skip_sw " | " skip_hw > -.R " ] [ " > +.RI " ] [ " And here as well: | .BR skip_sw " | " skip_hw " ] [" Cheers, Phil
[iproute PATCH] man: ip-link: Specify min/max values for bridge slave priority and cost
The values are parsed as u16/u32, but kernel limits allowed values. Signed-off-by: Phil Sutter <p...@nwl.cc> --- man/man8/ip-link.8.in | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/man/man8/ip-link.8.in b/man/man8/ip-link.8.in index 3f5d57c28885f..12ec330a9c38e 100644 --- a/man/man8/ip-link.8.in +++ b/man/man8/ip-link.8.in @@ -1485,10 +1485,10 @@ is a number representing the following states: .BR 4 " (blocking)." .BI priority " PRIO" -- set port priority (a 16bit unsigned value). +- set port priority (allowed values are between 0 and 63, inclusively). .BI cost " COST" -- set port cost (a 32bit unsigned value). +- set port cost (allowed values are between 1 and 65535, inclusively). .BR guard " { " on " | " off " }" - block incoming BPDU packets on this port. -- 2.11.0
[iproute PATCH 1/4] ip: link: bond: Fix whitespace in help text
Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/iplink_bond.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ip/iplink_bond.c b/ip/iplink_bond.c index fe83479a091a8..772b05fd89eaf 100644 --- a/ip/iplink_bond.c +++ b/ip/iplink_bond.c @@ -133,7 +133,7 @@ static void print_explain(FILE *f) "[ min_links MIN_LINKS ]\n" "[ lp_interval LP_INTERVAL ]\n" "[ packets_per_slave PACKETS_PER_SLAVE ]\n" - "[ tlb_dynamic_lb TLB_DYNAMIC_LB ]\n" + "[ tlb_dynamic_lb TLB_DYNAMIC_LB ]\n" "[ lacp_rate LACP_RATE ]\n" "[ ad_select AD_SELECT ]\n" "[ ad_user_port_key PORTKEY ]\n" -- 2.11.0
[iproute PATCH 4/4] ip: link: Add missing link type help texts
These are basically stubs: The types which lacked their own help text simply don't accept any options (yet). Still it might be a bit confusing to users if they are presented with the generic 'ip link' help text instead of something saying there are no type specific options. Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/Makefile | 3 ++- ip/iplink_dummy.c | 16 ip/iplink_ifb.c | 16 ip/iplink_nlmon.c | 16 ip/iplink_team.c | 25 + ip/iplink_vcan.c | 16 6 files changed, 91 insertions(+), 1 deletion(-) create mode 100644 ip/iplink_dummy.c create mode 100644 ip/iplink_ifb.c create mode 100644 ip/iplink_nlmon.c create mode 100644 ip/iplink_team.c create mode 100644 ip/iplink_vcan.c diff --git a/ip/Makefile b/ip/Makefile index 4276a34b529e3..035d42c74f90b 100644 --- a/ip/Makefile +++ b/ip/Makefile @@ -1,7 +1,8 @@ IPOBJ=ip.o ipaddress.o ipaddrlabel.o iproute.o iprule.o ipnetns.o \ rtm_map.o iptunnel.o ip6tunnel.o tunnel.o ipneigh.o ipntable.o iplink.o \ ipmaddr.o ipmonitor.o ipmroute.o ipprefix.o iptuntap.o iptoken.o \ -ipxfrm.o xfrm_state.o xfrm_policy.o xfrm_monitor.o \ +ipxfrm.o xfrm_state.o xfrm_policy.o xfrm_monitor.o iplink_dummy.o \ +iplink_ifb.o iplink_nlmon.o iplink_team.o iplink_vcan.o \ iplink_vlan.o link_veth.o link_gre.o iplink_can.o iplink_xdp.o \ iplink_macvlan.o ipl2tp.o link_vti.o link_vti6.o \ iplink_vxlan.o tcp_metrics.o iplink_ipoib.o ipnetconf.o link_ip6tnl.o \ diff --git a/ip/iplink_dummy.c b/ip/iplink_dummy.c new file mode 100644 index 0..cf78ea5bca926 --- /dev/null +++ b/ip/iplink_dummy.c @@ -0,0 +1,16 @@ +#include +#include + +#include "utils.h" +#include "ip_common.h" + +static void dummy_print_help(struct link_util *lu, + int argc, char **argv, FILE *f) +{ + fprintf(f, "Usage: ... dummy\n"); +} + +struct link_util dummy_link_util = { + .id = "dummy", + .print_help = dummy_print_help, +}; diff --git a/ip/iplink_ifb.c b/ip/iplink_ifb.c new file mode 100644 index 0..d7dc8f987d120 --- /dev/null +++ b/ip/iplink_ifb.c @@ -0,0 +1,16 @@ +#include +#include + +#include "utils.h" +#include "ip_common.h" + +static void ifb_print_help(struct link_util *lu, + int argc, char **argv, FILE *f) +{ + fprintf(f, "Usage: ... ifb\n"); +} + +struct link_util ifb_link_util = { + .id = "ifb", + .print_help = ifb_print_help, +}; diff --git a/ip/iplink_nlmon.c b/ip/iplink_nlmon.c new file mode 100644 index 0..51d5919a75d3d --- /dev/null +++ b/ip/iplink_nlmon.c @@ -0,0 +1,16 @@ +#include +#include + +#include "utils.h" +#include "ip_common.h" + +static void nlmon_print_help(struct link_util *lu, + int argc, char **argv, FILE *f) +{ + fprintf(f, "Usage: ... nlmon\n"); +} + +struct link_util nlmon_link_util = { + .id = "nlmon", + .print_help = nlmon_print_help, +}; diff --git a/ip/iplink_team.c b/ip/iplink_team.c new file mode 100644 index 0..6225268dda2dc --- /dev/null +++ b/ip/iplink_team.c @@ -0,0 +1,25 @@ +#include +#include + +#include "utils.h" +#include "ip_common.h" + +static void team_print_help(struct link_util *lu, + int argc, char **argv, FILE *f) +{ + fprintf(f, "Usage: ... team\n"); +} + +static void team_slave_print_help(struct link_util *lu, + int argc, char **argv, FILE *f) +{ + fprintf(f, "Usage: ... team_slave\n"); +} + +struct link_util team_link_util = { + .id = "team", + .print_help = team_print_help, +}, team_slave_link_util = { + .id = "team_slave", + .print_help = team_slave_print_help, +}; diff --git a/ip/iplink_vcan.c b/ip/iplink_vcan.c new file mode 100644 index 0..b7ae15f072a4e --- /dev/null +++ b/ip/iplink_vcan.c @@ -0,0 +1,16 @@ +#include +#include + +#include "utils.h" +#include "ip_common.h" + +static void vcan_print_help(struct link_util *lu, + int argc, char **argv, FILE *f) +{ + fprintf(f, "Usage: ... vcan\n"); +} + +struct link_util vcan_link_util = { + .id = "vcan", + .print_help = vcan_print_help, +}; -- 2.11.0
[iproute PATCH 3/4] ip: link: Unify link type help functions a bit
Take help function in iplink_bridge.c as an example and make other link types' help functions similar: * Use a single fprintf() call (if possible). * Don't state a full command line, just "... type OPTIONS". * Put every option in it's own line, align options by column. * List mandatory options first. link_veth.c is intentionally left untouched because it's 'peer' option eats all kinds of generic link options and the help text points this out without duplicating all the options there again. Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/iplink_geneve.c | 28 ++-- ip/iplink_ipoib.c | 4 +++- ip/iplink_vlan.c | 15 +-- ip/iplink_vxlan.c | 44 +--- ip/link_gre.c | 36 +++- ip/link_gre6.c | 47 --- ip/link_ip6tnl.c | 46 +++--- ip/link_iptnl.c| 38 ++ ip/link_vti.c | 17 + 9 files changed, 172 insertions(+), 103 deletions(-) diff --git a/ip/iplink_geneve.c b/ip/iplink_geneve.c index 1e6669d07d603..2c510fceb3c97 100644 --- a/ip/iplink_geneve.c +++ b/ip/iplink_geneve.c @@ -17,16 +17,24 @@ static void print_explain(FILE *f) { - fprintf(f, "Usage: ... geneve id VNI remote ADDR\n"); - fprintf(f, " [ ttl TTL ] [ tos TOS ] [ flowlabel LABEL ]\n"); - fprintf(f, " [ dstport PORT ] [ [no]external ]\n"); - fprintf(f, " [ [no]udpcsum ] [ [no]udp6zerocsumtx ] [ [no]udp6zerocsumrx ]\n"); - fprintf(f, "\n"); - fprintf(f, "Where: VNI := 0-16777215\n"); - fprintf(f, " ADDR := IP_ADDRESS\n"); - fprintf(f, " TOS := { NUMBER | inherit }\n"); - fprintf(f, " TTL := { 1..255 | inherit }\n"); - fprintf(f, " LABEL := 0-1048575\n"); + fprintf(f, + "Usage: ... geneve id VNI\n" + " remote ADDR\n" + " [ ttl TTL ]\n" + " [ tos TOS ]\n" + " [ flowlabel LABEL ]\n" + " [ dstport PORT ]\n" + " [ [no]external ]\n" + " [ [no]udpcsum ]\n" + " [ [no]udp6zerocsumtx ]\n" + " [ [no]udp6zerocsumrx ]\n" + "\n" + "Where: VNI := 0-16777215\n" + " ADDR := IP_ADDRESS\n" + " TOS := { NUMBER | inherit }\n" + " TTL := { 1..255 | inherit }\n" + " LABEL := 0-1048575\n" + ); } static void explain(void) diff --git a/ip/iplink_ipoib.c b/ip/iplink_ipoib.c index cb204af4a25b5..86dc65caa5e01 100644 --- a/ip/iplink_ipoib.c +++ b/ip/iplink_ipoib.c @@ -22,7 +22,9 @@ static void print_explain(FILE *f) { fprintf(f, - "Usage: ... ipoib [pkey PKEY] [mode {datagram | connected}][umcast {0|1}]\n" + "Usage: ... ipoib [ pkey PKEY ]\n" + " [ mode {datagram | connected} ]\n" + " [ umcast {0|1} ]\n" "\n" "PKEY := 0x8001-0x\n" ); diff --git a/ip/iplink_vlan.c b/ip/iplink_vlan.c index 144c83cbf1f37..b47236d8054de 100644 --- a/ip/iplink_vlan.c +++ b/ip/iplink_vlan.c @@ -21,14 +21,17 @@ static void print_explain(FILE *f) { fprintf(f, - "Usage: ... vlan [ protocol VLANPROTO ] id VLANID [ FLAG-LIST ]\n" - "[ ingress-qos-map QOS-MAP ] [ egress-qos-map QOS-MAP ]\n" + "Usage: ... vlan id VLANID\n" + "[ protocol VLANPROTO ]\n" + "[ reorder_hdr { on | off } ]\n" + "[ gvrp { on | off } ]\n" + "[ mvrp { on | off } ]\n" + "[ loose_binding { on | off } ]\n" + "[ ingress-qos-map QOS-MAP ]\n" + "[ egress-qos-map QOS-MAP ]\n" "\n" - "VLANPROTO: [ 802.1Q / 802.1ad ]\n" "VLANID := 0-4095\n" - "FLAG-LIST := [ FLAG-LIST ] FLAG\n" - "FLAG := [ reorder_hdr { on | off } ] [ gvrp { on | off } ] [ mvr
[iproute PATCH 0/4] Smaller link type help review
This series addresses some minor nits with link type specific help texts: * Unify coding style of print_help() callbacks (or the functions they call. * Unify output as much as possible for a common look and feel. * Make sure there's type specific help for each type listed in 'ip link help'. Phil Sutter (4): ip: link: bond: Fix whitespace in help text ip: link: macvlan: Add newline to help output ip: link: Unify link type help functions a bit ip: link: Add missing link type help texts ip/Makefile | 3 ++- ip/iplink_bond.c| 2 +- ip/iplink_dummy.c | 16 ip/iplink_geneve.c | 28 ++-- ip/iplink_ifb.c | 16 ip/iplink_ipoib.c | 4 +++- ip/iplink_macvlan.c | 1 + ip/iplink_nlmon.c | 16 ip/iplink_team.c| 25 + ip/iplink_vcan.c| 16 ip/iplink_vlan.c| 15 +-- ip/iplink_vxlan.c | 44 +--- ip/link_gre.c | 36 +++- ip/link_gre6.c | 47 --- ip/link_ip6tnl.c| 46 +++--- ip/link_iptnl.c | 38 ++ ip/link_vti.c | 17 + 17 files changed, 265 insertions(+), 105 deletions(-) create mode 100644 ip/iplink_dummy.c create mode 100644 ip/iplink_ifb.c create mode 100644 ip/iplink_nlmon.c create mode 100644 ip/iplink_team.c create mode 100644 ip/iplink_vcan.c -- 2.11.0
[iproute PATCH 2/4] ip: link: macvlan: Add newline to help output
A newline between synopsis and variable definition looks nice and is consistent with others. Signed-off-by: Phil Sutter <p...@nwl.cc> --- ip/iplink_macvlan.c | 1 + 1 file changed, 1 insertion(+) diff --git a/ip/iplink_macvlan.c b/ip/iplink_macvlan.c index b9a146f271186..662eb6ff9507c 100644 --- a/ip/iplink_macvlan.c +++ b/ip/iplink_macvlan.c @@ -31,6 +31,7 @@ static void print_explain(struct link_util *lu, FILE *f) { fprintf(f, "Usage: ... %s mode MODE [flag MODE_FLAG] MODE_OPTS\n" + "\n" "MODE: private | vepa | bridge | passthru | source\n" "MODE_FLAG: null | nopromisc\n" "MODE_OPTS: for mode \"source\":\n" -- 2.11.0
[iproute PATCH] man: ss.8: Add missing protocols to description of -A
The list was missing dccp and sctp protocols. Signed-off-by: Phil Sutter <p...@nwl.cc> --- man/man8/ss.8 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/man/man8/ss.8 b/man/man8/ss.8 index 4ef11523b4268..81de69de8042e 100644 --- a/man/man8/ss.8 +++ b/man/man8/ss.8 @@ -132,7 +132,7 @@ Currently the following families are supported: unix, inet, inet6, link, netlink .B \-A QUERY, \-\-query=QUERY, \-\-socket=QUERY List of socket tables to dump, separated by commas. The following identifiers are understood: all, inet, tcp, udp, raw, unix, packet, netlink, unix_dgram, -unix_stream, unix_seqpacket, packet_raw, packet_dgram. +unix_stream, unix_seqpacket, packet_raw, packet_dgram, dccp, sctp. .TP .B \-D FILE, \-\-diag=FILE Do not display anything, just dump raw information about TCP sockets to FILE after applying filters. If FILE is - stdout is used. -- 2.11.0
Re: [PATCH 2/2] iproute2: add support for invisible qdisc dumping
On Sat, Feb 25, 2017 at 10:29:17PM +0100, Jiri Kosina wrote: > From: Jiri Kosina> > Support the new TCA_DUMP_INVISIBLE netlink attribute that allows asking > kernel to perform 'full qdisc dump', as for historical reasons some of the > default qdiscs are being hidden by the kernel. > > The command syntax is being extended by voluntary 'invisible' argument to > 'tc qdisc show'. > > Signed-off-by: Jiri Kosina > --- > tc/tc_qdisc.c | 25 +++-- > 1 file changed, 23 insertions(+), 2 deletions(-) Would you mind adding a description of the new keyword to tc man page as well? > diff --git a/tc/tc_qdisc.c b/tc/tc_qdisc.c > index 3a3701c2..29da9269 100644 > --- a/tc/tc_qdisc.c > +++ b/tc/tc_qdisc.c > @@ -34,7 +34,7 @@ static int usage(void) > fprintf(stderr, " [ stab [ help | STAB_OPTIONS] ]\n"); > fprintf(stderr, " [ [ QDISC_KIND ] [ help | OPTIONS ] ]\n"); > fprintf(stderr, "\n"); > - fprintf(stderr, " tc qdisc show [ dev STRING ] [ ingress | clsact > ]\n"); > + fprintf(stderr, " tc qdisc show [ dev STRING ] [ ingress | clsact > | invisible ]\n"); Doesn't look like these are mutually exclusive. Therefore I would suggest fixing the syntax to: | + fprintf(stderr, " tc qdisc show [ dev STRING ] [ ingress | clsact ] [ invisible ]\n"); Cheers, Phil
Re: [PATCH] iproute2: hide devices starting with period by default
On Thu, Feb 23, 2017 at 05:31:14PM -0800, Stephen Hemminger wrote: > On Thu, 23 Feb 2017 18:07:07 -0700 > David Ahernwrote: > > > On 2/23/17 5:30 PM, Stephen Hemminger wrote: > > > On Thu, 23 Feb 2017 16:39:52 -0700 > > > David Ahern wrote: > > > > > >> On 2/23/17 12:50 PM, Stephen Hemminger wrote: > > >>> Some use cases create Linux networking devices which are not intended > > >>> for use > > >>> by normal networking. This is an enhancement to ip command to hide > > >>> network > > >>> devices starting with period (like files in normal directory). > > >>> Interfaces whose > > >>> name start with "." are not shown by default, and the -a (or -all) flag > > >>> must > > >>> be used to show these devices. > > >> > > >> Agree that some devices need to be hidden by default -- not just from > > >> users but also other processes. > > >> > > >> This solution is very narrow, only affecting iproute2 users. Any other > > >> programs that use netlink or /proc files will continue to see those > > >> devices. > > > > > > I want solution that works broadly. And this works for sysfs already. > > > > for 'ls' maybe, but not general walking of /sys. It does not hide > > devices from snmpd, from ifconfig, etc., etc. > > > > > > >> I started a patch a year ago that allows devices to marked as invisible > > >> (attribute can be toggled at any time). Invisible devices do not show up > > >> in netlink dumps, proc files or notifications. Netlink dumps can request > > >> invisible devices to be included in a link dump. While it is more > > >> intrusive, it is also more complete covering all of the paths in which > > >> the device is shows up. > > >> > > >> Also, changing the default behavior for iproute2 could break existing > > >> users that have such device names. > > > > > > I am less worried about this. The only people using . in name already > > > are probably Brocade, and they have similar thing in CLI to hide these > > > devices. > > > > > > seems like a big assumption. > > Need a solution now, not something that requires kernel and command changes. Why the haste? This doesn't seem like an urgent thing to fix and given the mixed feelings this provoked giving it a second thought might not be the worst idea, no? Cheers, Phil
[iproute PATCH v2 1/2] testsuite: Generate nlmsg blob at runtime
Since netlink messages are in host byte order, shipping a pre-generated nlmsg blob won't suffice on systems with different endianness. Therefore generate the blob at runtime, so it's content fits the hosts endianness. Note that the generated message will contain only a single interface featuring two VFs instead of the full list before. Yet this is sufficient, as it triggers the crash with iproute versions prior to commit 8c29ae7cc2494 ("ip link: Fix crash on older kernels when show VF dev"). Signed-off-by: Phil Sutter <p...@nwl.cc> --- Changes since v1: - Dropped the previous workaround altogether. - Instead implemented this simple nlmsg generator. --- .gitignore| 2 + testsuite/Makefile| 2 + testsuite/tests/ip/link/dev_wo_vf_rate.nl | Bin 14076 -> 0 bytes testsuite/tools/Makefile | 2 + testsuite/tools/generate_nlmsg.c | 116 ++ 5 files changed, 122 insertions(+) delete mode 100644 testsuite/tests/ip/link/dev_wo_vf_rate.nl create mode 100644 testsuite/tools/Makefile create mode 100644 testsuite/tools/generate_nlmsg.c diff --git a/.gitignore b/.gitignore index 74a5496ddf7aa..c3b8d386450bf 100644 --- a/.gitignore +++ b/.gitignore @@ -36,6 +36,8 @@ series # tests testsuite/results testsuite/iproute2/iproute2-this +testsuite/tools/generate_nlmsg +testsuite/tests/ip/link/dev_wo_vf_rate.nl # doc files generated at runtime doc/*.aux diff --git a/testsuite/Makefile b/testsuite/Makefile index 2027650051d48..fa7ddb8620754 100644 --- a/testsuite/Makefile +++ b/testsuite/Makefile @@ -43,6 +43,8 @@ $(TESTS): clean ifeq (,$(HAVE_UNSHARED_UTIL)) $(error Please install util-linux tools to run tests in separated network namespace) endif + @./tools/generate_nlmsg + @mkdir -p $(RESULTS_DIR) @for d in $(TESTS_DIR); do \ diff --git a/testsuite/tests/ip/link/dev_wo_vf_rate.nl b/testsuite/tests/ip/link/dev_wo_vf_rate.nl deleted file mode 100644 index 40fa87ff1b158fb972e064efb38f76b0e9a56697.. GIT binary patch literal 0 HcmV?d1 literal 14076 zcmeI2QD_`h6o$|4X4B1fZMT}ns!(GsCQqft8lx5o_Nk#2gn}Z4n3_#@VcVp2BSi%l zK`m6<ASk{l3L;Xdtzh4LXw?UwEWQ;H>|<XAeW{rB{P)gj@7$T$?##V+ z*}MnQzaXnJq-Lk$)l|{C)nwxqpi^KR{HbES=`#u}x?l$YprDm#`>(o55*6q! zb)4tkkUXi*T+a%)Z-E|A^#$7Mln*Km1sYXlo*q(Vi3aKD;<4q*a)q|j2GuucGkCJq zDyp+|k0Sdi@-ln2m0iT|)VKO4ZE=*})4fVJbioD$cFa;AC2zZuy`N6-ST&5X6EATo z_D|Z(Qu>&6?e@!KMAvgf!`ULof*!D8j*?c3=uZxngEPAQq%()pN?GQnhhmRa(B~Ye z(A8pTJ~c$m=QA9IL@)C6>*%|64N6VG20Ecbn#D)dZ_ng7{!(c-0=xKL9c<UZr)`+; zOIvp4ZN_9&6znPMI2q#Etu(4TaUM8>c>-r(+vLpn!)W(%rl8In%!oLR?bA34VJ ztQhmlKitHza7C5aH$^e#k$UUmB_c2eF@vZ9V+O7g?SGr-@R?S2`HvUO)6{;^Q==<_ z>do$84D>@_0|GlJ{cNG$mN0hAbE5V)%c8v{}vJPc*{gl%a9wPJSu7_~x%4nmyh7 z_^#58Q`ra<_Op1tpM9YpZCUc_~rNvu9zm6*12K{mj{n*D3`lUV%g zUn49sc-a(QK6ek%rHYrAh=1`}t|P~0gqN_`DKAAobcv6sGkGaYxZrel#r2B zQ{U?X846^*F1%akI<{ai+lX9uP#2Sz_NB|1*yo1jC0w(3`6zUChnGJ%y!^2}FE4v} zc^~`4;Y(kR5Iyaa-ff1{cPQW`^h00+0y_{tTe`))?Zm!YQ?#QXUVc;E_S@cCu#d zdFYwH@;|S(*YAYC>a-Sl`4vy<)WPKPQdrOom<Gw1u9*`N(6<-W;%^Wy<vPl@;ic#a z^4a92-ABg6uJ_|*!?T9Xs7xa3EEb*ORO!BU&22#zx<t34!`^o(_*e;p=TrK zmrDz^g~i#0xmO#KXQCxokY`GMWapU}Yh3fbA)*<df$Jo112<(ZWn5tg;%7^@xVN3y zcWa7v1a3kSlB1F2Xjs#Jeag!m#E{&^DDlH>((4>evgfMfjQGT^D3Lh0ZZ}AW9+z?D zXmZ^_?wic9FI~pOJ~u3LVB2EOW6;%a=E$1Kk2%Oo5Afy_nE*KKh=T|@j8vM;k=z${ zAbz%Vi+kINeYd7)M@eSR)Y+6d(PlHaw}ia>H+p(HfVkbdu%FDqn$%*>E?h?+nbSm` zBkLNQIT&+rc8UjWMkW9b#~|u?nxjWp4m*%2w(S1}t)aWyBr|8__2e=~QsrjA^hg|B z>+}_#|!wF=yI2C&_CojH$(($KiG#nZwWRGbGPC{FsB(11wnSB6B3q5#~g(C2@gq z?3E;$In!rS=EM~T5?gD9IZJ$x;_1w-ZxZl*frmMh7@s>~^sjlu+V*A;kT@P*nwP z)ceBsZ)C3Q^FIF$JSm0-<Tq<ogf#>FzA%Z;a&|wPT;8nJIq$aL=Chozw)~#u`0)nM z9;_Va3E1c>Z}z9W2|Le8=S_7_%A2rrnUsEU@uumM-{Ad<yx-r%Zw!`}57le0%+A%9 zs|{bktqR>%XRe18dh$aOw@YwX6@yD57x9$}<Rck}x;UXLEH%^$8L=H1oJ z?~6wGmD=tEudl+4@<;T>LrV!yl{-g+KTPWWMs+?8L;AedW$F?nt>4HrA8Q zozu*nt1D^V6Lvn6(yyD`DH+^Z&)jJouhr}PC3rvcoyj*o+V<UhDR;ugU(=%G@cv PlX55Qd?uw|T-^B&_@nmR diff --git a/testsuite/tools/Makefile b/testsuite/tools/Makefile new file mode 100644 index 0..9581de51fd358 --- /dev/null +++ b/testsuite/tools/Makefile @@ -0,0 +1,2 @@ +generate_nlmsg: generate_nlmsg.c ../../lib/libnetlink.c + $(CC) -o $@ $^ diff --git a/testsuite/tools/generate_nlmsg.c b/testsuite/tools/generate_nlmsg.c new file mode 100644 index 0..fe96f26220471 --- /dev/null +++ b/testsuite/tools/generate_nlmsg.c @@ -0,0 +1,116 @@ +/* + * generate_nlmsg.cTestsuite helper generating nlmsg blob + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at y
[iproute PATCH v2 0/2] Two minor testsuite fixes
While playing around with testsuite, I noticed two minor nits which this series attempts to fix. Changes since v1: - Replaced patch1 completely. Phil Sutter (2): testsuite: Generate nlmsg blob at runtime testsuite: Search kernel config in modules dir also .gitignore| 2 + testsuite/Makefile| 8 +++ testsuite/tests/ip/link/dev_wo_vf_rate.nl | Bin 14076 -> 0 bytes testsuite/tools/Makefile | 2 + testsuite/tools/generate_nlmsg.c | 116 ++ 5 files changed, 128 insertions(+) delete mode 100644 testsuite/tests/ip/link/dev_wo_vf_rate.nl create mode 100644 testsuite/tools/Makefile create mode 100644 testsuite/tools/generate_nlmsg.c -- 2.11.0
[iproute PATCH v2 2/2] testsuite: Search kernel config in modules dir also
At least in Fedora there is no /proc/config.gz but instead /lib/modules/`uname -r`/config, so use that as a fallback. Signed-off-by: Phil Sutter <p...@nwl.cc> --- testsuite/Makefile | 6 ++ 1 file changed, 6 insertions(+) diff --git a/testsuite/Makefile b/testsuite/Makefile index fa7ddb8620754..50a7bafaca897 100644 --- a/testsuite/Makefile +++ b/testsuite/Makefile @@ -15,6 +15,12 @@ IPVERS := $(filter-out iproute2/Makefile,$(wildcard iproute2/*)) ifneq (,$(wildcard /proc/config.gz)) KENV := $(shell cat /proc/config.gz | gunzip | grep ^CONFIG) +else +KVER := $(shell uname -r) +KCPATH := /lib/modules/${KVER}/config +ifneq (,$(wildcard ${KCPATH})) + KENV := $(shell cat ${KCPATH} | grep ^CONFIG) +endif endif .PHONY: compile listtests alltests configure $(TESTS) -- 2.11.0
Re: [iproute PATCH 1/2] testsuite: skip link show test on big endian systems
On Tue, Feb 07, 2017 at 03:12:49PM -0800, Stephen Hemminger wrote: > On Wed, 8 Feb 2017 00:04:21 +0100 > Phil Sutter <p...@nwl.cc> wrote: > > > Netlink protocol is in host byte order, so the provided binary netlink > > message buffer being in little endian format will cause the test to fail > > on big endian systems. > > > > Signed-off-by: Phil Sutter <p...@nwl.cc> > > Maybe better to figure out how to generate the files in host order? Yes, I had thought about that as well. Not sure whether it's worth the effort though to write a program which constructs the messages and dumps them into a file for 'ip monitor' to read. I think there's a certain chance this will eventually test the message builder instead of iproute. :) Cheers, Phil
[iproute PATCH 2/2] testsuite: Search kernel config in modules dir also
At least in Fedora there is no /proc/config.gz but instead /lib/modules/`uname -r`/config, so use that as a fallback. Signed-off-by: Phil Sutter <p...@nwl.cc> --- testsuite/Makefile | 6 ++ 1 file changed, 6 insertions(+) diff --git a/testsuite/Makefile b/testsuite/Makefile index 17568881b8b66..41a3afc92ae98 100644 --- a/testsuite/Makefile +++ b/testsuite/Makefile @@ -15,6 +15,12 @@ IPVERS := $(filter-out iproute2/Makefile,$(wildcard iproute2/*)) ifneq (,$(wildcard /proc/config.gz)) KENV := $(shell cat /proc/config.gz | gunzip | grep ^CONFIG) +else +KVER := $(shell uname -r) +KCPATH := /lib/modules/${KVER}/config +ifneq (,$(wildcard ${KCPATH})) + KENV := $(shell cat ${KCPATH} | grep ^CONFIG) +endif endif .PHONY: compile listtests alltests configure $(TESTS) -- 2.11.0
[iproute PATCH 1/2] testsuite: skip link show test on big endian systems
Netlink protocol is in host byte order, so the provided binary netlink message buffer being in little endian format will cause the test to fail on big endian systems. Signed-off-by: Phil Sutter <p...@nwl.cc> --- .gitignore| 1 + testsuite/Makefile| 1 + testsuite/tests/ip/link/show_dev_wo_vf_rate.t | 5 + testsuite/tools/Makefile | 2 ++ testsuite/tools/is_big_endian.c | 7 +++ 5 files changed, 16 insertions(+) create mode 100644 testsuite/tools/Makefile create mode 100644 testsuite/tools/is_big_endian.c diff --git a/.gitignore b/.gitignore index 74a5496ddf7aa..a1f295dac3dd2 100644 --- a/.gitignore +++ b/.gitignore @@ -36,6 +36,7 @@ series # tests testsuite/results testsuite/iproute2/iproute2-this +testsuite/tools/is_big_endian # doc files generated at runtime doc/*.aux diff --git a/testsuite/Makefile b/testsuite/Makefile index 2027650051d48..17568881b8b66 100644 --- a/testsuite/Makefile +++ b/testsuite/Makefile @@ -24,6 +24,7 @@ configure: compile: configure echo "Entering iproute2" && cd iproute2 && $(MAKE) && cd ..; + echo "Entering tools" && cd tools && $(MAKE) && cd ..; listtests: @for t in $(TESTS); do \ diff --git a/testsuite/tests/ip/link/show_dev_wo_vf_rate.t b/testsuite/tests/ip/link/show_dev_wo_vf_rate.t index a600ba65c5bec..ad90af5400271 100755 --- a/testsuite/tests/ip/link/show_dev_wo_vf_rate.t +++ b/testsuite/tests/ip/link/show_dev_wo_vf_rate.t @@ -2,5 +2,10 @@ source lib/generic.sh +if ./tools/is_big_endian; then + ts_log "won't work on big endian system" + ts_skip +fi + NL_FILE="tests/ip/link/dev_wo_vf_rate.nl" ts_ip "$0" "Show VF devices w/o VF rate info" -d monitor file $NL_FILE diff --git a/testsuite/tools/Makefile b/testsuite/tools/Makefile new file mode 100644 index 0..ecbea16c2c1cc --- /dev/null +++ b/testsuite/tools/Makefile @@ -0,0 +1,2 @@ +is_big_endian: is_big_endian.c + $(CC) -o $@ $< diff --git a/testsuite/tools/is_big_endian.c b/testsuite/tools/is_big_endian.c new file mode 100644 index 0..303e91b4603e8 --- /dev/null +++ b/testsuite/tools/is_big_endian.c @@ -0,0 +1,7 @@ +//#include +#include + +int main(void) +{ + return 1 != ntohs(1); +} -- 2.11.0
[iproute PATCH 0/2] Two minor testsuite fixes
While playing around with testsuite, I noticed two minor nits which this series attempts to fix. Phil Sutter (2): testsuite: skip link show test on big endian systems testsuite: Search kernel config in modules dir also .gitignore| 1 + testsuite/Makefile| 7 +++ testsuite/tests/ip/link/show_dev_wo_vf_rate.t | 5 + testsuite/tools/Makefile | 2 ++ testsuite/tools/is_big_endian.c | 7 +++ 5 files changed, 22 insertions(+) create mode 100644 testsuite/tools/Makefile create mode 100644 testsuite/tools/is_big_endian.c -- 2.11.0
[iproute PATCH] man: ip-route.8: Fix 'expires' indenting
Descriptions of each route sub-command's arguments are enclosed in .RS/.RE pairs. For 'replace' sub-command, '.RE' was incorrectly put before the last argument ('expires'). Fixes: 3fbe7ca847367 ("iproute2: ip-route.8.in: Add expires option for ip route") Signed-off-by: Phil Sutter <p...@nwl.cc> --- man/man8/ip-route.8.in | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/man/man8/ip-route.8.in b/man/man8/ip-route.8.in index 85191531b688c..d6e06649a4640 100644 --- a/man/man8/ip-route.8.in +++ b/man/man8/ip-route.8.in @@ -704,13 +704,13 @@ is a set of encapsulation attributes specific to the .sp .in -8 -.RE .TP .BI expires " TIME " "(4.4+ only)" the route will be deleted after the expires time. .B Only support IPv6 at present. +.RE .TP ip route delete -- 2.11.0
[iproute PATCH v2] man: tc-csum.8: Fix example
This fixes two issues with the provided example: - Add missing 'dev' keyword to second command. - Use a real IPv4 address instead of a bogus hex value since that will be rejected by get_addr_ipv4(). Fixes: dbfb17a67f9c7 ("man: tc-csum.8: Add an example") Reported-by: Davide Caratti <dcara...@redhat.com> Signed-off-by: Phil Sutter <p...@nwl.cc> --- Changes since v1: - Instead of using potentially valid IP addresses, use RFC 5737 ones. --- man/man8/tc-csum.8 | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/man/man8/tc-csum.8 b/man/man8/tc-csum.8 index 3a64c82f09ba8..68e5610513a51 100644 --- a/man/man8/tc-csum.8 +++ b/man/man8/tc-csum.8 @@ -57,9 +57,9 @@ packets, both IP and UDP checksums have to be recalculated: .RS .EX # tc qdisc add dev eth0 ingress handle : -# tc filter add eth0 prio 1 protocol ip parent : \\ - u32 match ip src 192.168.1.100/32 flowid :1 \\ - action pedit munge ip dst set 0x12345678 pipe \\ +# tc filter add dev eth0 prio 1 protocol ip parent : \\ + u32 match ip src 192.0.2.100/32 flowid :1 \\ + action pedit munge ip dst set 198.51.100.1 pipe \\ csum ip and udp .EE .RE -- 2.11.0
Re: [iproute PATCH] man: tc-csum.8: Fix example
On Fri, Jan 27, 2017 at 09:49:58PM +0100, Guillaume Nault wrote: > On Fri, Jan 27, 2017 at 12:15:01PM +0100, Phil Sutter wrote: > > +# tc filter add dev eth0 prio 1 protocol ip parent : \\ > > u32 match ip src 192.168.1.100/32 flowid :1 \\ > > - action pedit munge ip dst set 0x12345678 pipe \\ > > + action pedit munge ip dst set 1.2.3.4 pipe \\ > > > Just nitpicking here, but IMHO examples like this should better use IP > addresses reserved for documentation (192.0.2.0/24, 198.51.100.0/24 or > 203.0.113.0/24). Good point! This wasn't on my radar yet and I didn't know there were IPv4 ranges specifically for that purpose. I guess the reasoning here is analogous to why one shouldn't use 'example.com' everywhere. Luckily, 1.2.3.0/24 seems to be reserved by APNIC for testing purposes. :) I'll respin using another example address. Thanks, Phil
Re: [PATCH iproute2 1/1] tc: distinguish Add/Replace action operations.
On Sun, Jan 22, 2017 at 08:55:33AM -0500, Roman Mashak wrote: > Signed-off-by: Roman Mashak <m...@mojatatu.com> > Signed-off-by: Jamal Hadi Salim <j...@mojatatu.com> Acked-by: Phil Sutter <p...@nwl.cc>
[iproute PATCH] man: tc-csum.8: Fix example
This fixes two issues with the provided example: - Add missing 'dev' keyword to second command. - Use a real IPv4 address instead of a bogus hex value since that will be rejected by get_addr_ipv4(). Fixes: dbfb17a67f9c7 ("man: tc-csum.8: Add an example") Reported-by: Davide Caratti <dcara...@redhat.com> Signed-off-by: Phil Sutter <p...@nwl.cc> --- man/man8/tc-csum.8 | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/man/man8/tc-csum.8 b/man/man8/tc-csum.8 index 3a64c82f09ba8..43561d9f90a8e 100644 --- a/man/man8/tc-csum.8 +++ b/man/man8/tc-csum.8 @@ -57,9 +57,9 @@ packets, both IP and UDP checksums have to be recalculated: .RS .EX # tc qdisc add dev eth0 ingress handle : -# tc filter add eth0 prio 1 protocol ip parent : \\ +# tc filter add dev eth0 prio 1 protocol ip parent : \\ u32 match ip src 192.168.1.100/32 flowid :1 \\ - action pedit munge ip dst set 0x12345678 pipe \\ + action pedit munge ip dst set 1.2.3.4 pipe \\ csum ip and udp .EE .RE -- 2.11.0
[net-next PATCH v7] net: dummy: Introduce dummy virtual functions
The idea for this was born when testing VF support in iproute2 which was impeded by hardware requirements. In fact, not every VF-capable hardware driver implements all netdev ops, so testing the interface is still hard to do even with a well-sorted hardware shelf. To overcome this and allow for testing the user-kernel interface, this patch allows to turn dummy into a PF with a configurable amount of VFs. Since my patch series 'bus-agnostic-num-vf' has been accepted, implementing the required interfaces is pretty straightforward: Iff 'num_vfs' module parameter was given a value >0, a dummy bus type is being registered which implements the 'num_vf()' callback. Additionally, a dummy parent device common to all dummy devices is registered which sits on the above dummy bus. Joint work with Sabrina Dubroca. Signed-off-by: Sabrina Dubroca <s...@queasysnail.net> Signed-off-by: Phil Sutter <p...@nwl.cc> --- Changes since v6: - Dropped ndo_get_vf_count callback since it wasn't accepted. - Implemented dummy bus type and parent to allow for integration with rtnetlink. - Dropped 'num_vfs' field from struct dummy priv since having interfaces with differing numbers of VFs is no longer possible anyway (due to common parent device). - Redefine pr_fmt to make use of pr_err() macro. - Adjusted commit message to reflect code changes. Changes since v5: - Got rid of fake PCI parent hack altogether, implement ndo_get_vf_count instead. Changes since v4: - Initialize pci_pdev.sriov at runtime - older gcc versions don't allow initializing fields of anonymous unions at declaration time. - Rebased onto current net-next/master. Changes since v3: - Changed type of vf_mac field from unsigned char to u8. - Column-aligned structs' field names. Changes since v2: - Fixed oops on reboot (need to initialize parent device mutex). - Got rid of potential mem leak noticed by Eric Dumazet. - Dropped stray newline insertion. Changes since v1: - Fixed issues reported by kbuild test robot: - pci_dev->sriov is only present if CONFIG_PCI_ATS is active. - pci_bus_type does not exist if CONFIG_PCI is not defined. --- drivers/net/dummy.c | 217 +++- 1 file changed, 215 insertions(+), 2 deletions(-) diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c index 1f2de4e8207c5..2c80611b94aef 100644 --- a/drivers/net/dummy.c +++ b/drivers/net/dummy.c @@ -41,7 +41,48 @@ #define DRV_NAME "dummy" #define DRV_VERSION"1.0" +#undef pr_fmt +#define pr_fmt(fmt) DRV_NAME ": " fmt + static int numdummies = 1; +static int num_vfs; + +struct vf_data_storage { + u8 vf_mac[ETH_ALEN]; + u16 pf_vlan; /* When set, guest VLAN config not allowed. */ + u16 pf_qos; + __be16 vlan_proto; + u16 min_tx_rate; + u16 max_tx_rate; + u8 spoofchk_enabled; + boolrss_query_enabled; + u8 trusted; + int link_state; +}; + +struct dummy_priv { + struct vf_data_storage *vfinfo; +}; + +static int dummy_num_vf(struct device *dev) +{ + return num_vfs; +} + +static struct bus_type dummy_bus = { + .name = "dummy", + .num_vf = dummy_num_vf, +}; + +static void release_dummy_parent(struct device *dev) +{ +} + +static struct device dummy_parent = { + .init_name = "dummy", + .bus= _bus, + .release= release_dummy_parent, +}; /* fake multicast ability */ static void set_multicast_list(struct net_device *dev) @@ -90,10 +131,25 @@ static netdev_tx_t dummy_xmit(struct sk_buff *skb, struct net_device *dev) static int dummy_dev_init(struct net_device *dev) { + struct dummy_priv *priv = netdev_priv(dev); + dev->dstats = netdev_alloc_pcpu_stats(struct pcpu_dstats); if (!dev->dstats) return -ENOMEM; + priv->vfinfo = NULL; + + if (!num_vfs) + return 0; + + dev->dev.parent = _parent; + priv->vfinfo = kcalloc(num_vfs, sizeof(struct vf_data_storage), + GFP_KERNEL); + if (!priv->vfinfo) { + free_percpu(dev->dstats); + return -ENOMEM; + } + return 0; } @@ -111,6 +167,117 @@ static int dummy_change_carrier(struct net_device *dev, bool new_carrier) return 0; } +static int dummy_set_vf_mac(struct net_device *dev, int vf, u8 *mac) +{ + struct dummy_priv *priv = netdev_priv(dev); + + if (!is_valid_ether_addr(mac) || (vf >= num_vfs)) + return -EINVAL; + + memcpy(priv->vfinfo[vf].vf_mac, mac, ETH_ALEN); + + return 0; +} + +static int dummy_set_vf_vlan(struct net_device *dev, int vf, +u16 vlan, u8 qos, __be16 vlan_proto) +{ + struct dummy_priv *priv = netdev_priv(dev); + + if ((vf >= num_vfs) || (vlan > 4095) || (qos >
Re: [net-next PATCH 0/3] Retrieve number of VFs in a bus-agnostic way
On Fri, Jan 20, 2017 at 11:43:46AM -0500, David Miller wrote: > From: Phil Sutter <p...@nwl.cc> > Date: Wed, 18 Jan 2017 14:04:36 +0100 > > > Previously, it was assumed that only PCI NICs would be capable of having > > virtual functions - with my proposed enhancement of dummy NIC driver > > implementing (fake) ones for testing purposes, this is no longer true. > > > > Discussion of said patch has led to the suggestion of implementing a > > bus-agnostic method for VF count retrieval so rtnetlink could work with > > both real VF-capable PCI NICs as well as my dummy modifications without > > introducing ugly hacks. > > > > The following series tries to achieve just that by introducing a bus > > type callback to retrieve a device's number of VFs, implementing this > > callback for PCI bus and finally adjusting rtnetlink to make use of the > > generalized infrastructure. > > This is really nice and clean, compare it to your original approach :-) Yes, indeed! Thanks a lot for pointing me into the right direction. :) Cheers, Phil
[net-next PATCH 0/3] Retrieve number of VFs in a bus-agnostic way
Previously, it was assumed that only PCI NICs would be capable of having virtual functions - with my proposed enhancement of dummy NIC driver implementing (fake) ones for testing purposes, this is no longer true. Discussion of said patch has led to the suggestion of implementing a bus-agnostic method for VF count retrieval so rtnetlink could work with both real VF-capable PCI NICs as well as my dummy modifications without introducing ugly hacks. The following series tries to achieve just that by introducing a bus type callback to retrieve a device's number of VFs, implementing this callback for PCI bus and finally adjusting rtnetlink to make use of the generalized infrastructure. Phil Sutter (3): device: bus_type: Introduce num_vf callback PCI: implement num_vf bus type callback device: Implement a bus agnostic dev_num_vf routine drivers/pci/pci-driver.c | 6 ++ include/linux/device.h | 11 +++ include/linux/pci.h | 2 -- net/core/rtnetlink.c | 3 +-- 4 files changed, 18 insertions(+), 4 deletions(-) -- 2.11.0