[PATCH net] jhash: fix -Wimplicit-fallthrough warnings
GCC 7 added a new -Wimplicit-fallthrough warning. It's only enabled with W=1, but since linux/jhash.h is included in over hundred places (including other global headers) it seems worthwhile fixing this warning. Signed-off-by: Jakub Kicinski --- If it looks good, would it be OK to take it via the net tree? include/linux/jhash.h | 29 ++--- 1 file changed, 14 insertions(+), 15 deletions(-) diff --git a/include/linux/jhash.h b/include/linux/jhash.h index 348c6f47e4cc..8037850f3104 100644 --- a/include/linux/jhash.h +++ b/include/linux/jhash.h @@ -85,19 +85,18 @@ static inline u32 jhash(const void *key, u32 length, u32 initval) k += 12; } /* Last block: affect all 32 bits of (c) */ - /* All the case statements fall through */ switch (length) { - case 12: c += (u32)k[11]<<24; - case 11: c += (u32)k[10]<<16; - case 10: c += (u32)k[9]<<8; - case 9: c += k[8]; - case 8: b += (u32)k[7]<<24; - case 7: b += (u32)k[6]<<16; - case 6: b += (u32)k[5]<<8; - case 5: b += k[4]; - case 4: a += (u32)k[3]<<24; - case 3: a += (u32)k[2]<<16; - case 2: a += (u32)k[1]<<8; + case 12: c += (u32)k[11]<<24; /* fall through */ + case 11: c += (u32)k[10]<<16; /* fall through */ + case 10: c += (u32)k[9]<<8; /* fall through */ + case 9: c += k[8]; /* fall through */ + case 8: b += (u32)k[7]<<24;/* fall through */ + case 7: b += (u32)k[6]<<16;/* fall through */ + case 6: b += (u32)k[5]<<8; /* fall through */ + case 5: b += k[4]; /* fall through */ + case 4: a += (u32)k[3]<<24;/* fall through */ + case 3: a += (u32)k[2]<<16;/* fall through */ + case 2: a += (u32)k[1]<<8; /* fall through */ case 1: a += k[0]; __jhash_final(a, b, c); case 0: /* Nothing left to add */ @@ -131,10 +130,10 @@ static inline u32 jhash2(const u32 *k, u32 length, u32 initval) k += 3; } - /* Handle the last 3 u32's: all the case statements fall through */ + /* Handle the last 3 u32's */ switch (length) { - case 3: c += k[2]; - case 2: b += k[1]; + case 3: c += k[2]; /* fall through */ + case 2: b += k[1]; /* fall through */ case 1: a += k[0]; __jhash_final(a, b, c); case 0: /* Nothing left to add */ -- 2.11.0
[PATCH 00/10] Constify isdn pci_device_id's.
pci_device_id are not supposed to change at runtime. All functions working with pci_device_id provided by work with const pci_device_id. So mark the non-const structs as const. Arvind Yadav (10): [PATCH 01/10] isdn: hisax: constify pci_device_id. [PATCH 02/10] isdn: hisax: hfc4s8s_l1: constify pci_device_id. [PATCH 03/10] isdn: hisax: hisax_fcpcipnp: constify pci_device_id. [PATCH 04/10] isdn: eicon: constify pci_device_id. [PATCH 05/10] isdn: mISDN: netjet: constify pci_device_id. [PATCH 06/10] isdn: mISDN: hfcmulti: constify pci_device_id. [PATCH 07/10] isdn: mISDN: w6692: constify pci_device_id. [PATCH 08/10] isdn: mISDN: avmfritz: constify pci_device_id. [PATCH 09/10] isdn: mISDN: hfcpci: constify pci_device_id. [PATCH 10/10] isdn: avm: c4: constify pci_device_id. drivers/isdn/hardware/avm/c4.c | 2 +- drivers/isdn/hardware/eicon/divasmain.c | 2 +- drivers/isdn/hardware/mISDN/avmfritz.c | 2 +- drivers/isdn/hardware/mISDN/hfcmulti.c | 2 +- drivers/isdn/hardware/mISDN/hfcpci.c| 2 +- drivers/isdn/hardware/mISDN/netjet.c| 2 +- drivers/isdn/hardware/mISDN/w6692.c | 2 +- drivers/isdn/hisax/config.c | 2 +- drivers/isdn/hisax/hfc4s8s_l1.c | 2 +- drivers/isdn/hisax/hisax_fcpcipnp.c | 2 +- 10 files changed, 10 insertions(+), 10 deletions(-) -- 2.7.4
[PATCH 01/10] isdn: hisax: constify pci_device_id.
pci_device_id are not supposed to change at runtime. All functions working with pci_device_id provided by work with const pci_device_id. So mark the non-const structs as const. File size before: textdata bss dec hex filename 1368620644416 201664ec6 drivers/isdn/hisax/config.o File size After adding 'const': textdata bss dec hex filename 15030 7204416 201664ec6 drivers/isdn/hisax/config.o Signed-off-by: Arvind Yadav --- drivers/isdn/hisax/config.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/isdn/hisax/config.c b/drivers/isdn/hisax/config.c index c7d6867..7108bdb8 100644 --- a/drivers/isdn/hisax/config.c +++ b/drivers/isdn/hisax/config.c @@ -1909,7 +1909,7 @@ static void EChannel_proc_rcv(struct hisax_d_if *d_if) #ifdef CONFIG_PCI #include -static struct pci_device_id hisax_pci_tbl[] __used = { +static const struct pci_device_id hisax_pci_tbl[] __used = { #ifdef CONFIG_HISAX_FRITZPCI {PCI_VDEVICE(AVM, PCI_DEVICE_ID_AVM_A1)}, #endif -- 2.7.4
Re: [PATCH 05/14] isdn: isdnloop: suppress a gcc-7 warning
On Fri, Jul 14, 2017 at 12:37:05PM +0200, Arnd Bergmann wrote: > On Fri, Jul 14, 2017 at 12:08 PM, Joe Perches wrote: > > On Fri, 2017-07-14 at 11:25 +0200, Arnd Bergmann wrote: > >> We test whether a bit is set in a mask here, which is correct > >> but gcc warns about it as it thinks it might be confusing: > >> > >> drivers/isdn/isdnloop/isdnloop.c:412:37: error: ?: using integer constants > >> in boolean context, the expression will always evaluate to 'true' > >> [-Werror=int-in-bool-context] ... > > Perhaps this is a logic defect and should be: > > > > if (!(card->flags & ((channel) ? ISDNLOOP_FLAGS_B2ACTIVE : > > ISDNLOOP_FLAGS_B1ACTIVE))) > > Yes, good catch. I had thought about it for a bit whether that would be > the answer, but come to the wrong conclusion on my own. > > Note that the version you suggested will still have the warning, so I think > it needs to be It shouldn't - the warning is for using an integer *constant* in boolean context, but the result of & isn't a constant and should be fine. !(flags & mask) is a very common idiom. - Kevin
[PATCH 02/10] isdn: hisax: hfc4s8s_l1: constify pci_device_id.
pci_device_id are not supposed to change at runtime. All functions working with pci_device_id provided by work with const pci_device_id. So mark the non-const structs as const. File size before: textdata bss dec hex filename 10512 536 4 110522b2c drivers/isdn/hisax/hfc4s8s_l1.o File size After adding 'const': textdata bss dec hex filename 10672 376 4 110522b2c drivers/isdn/hisax/hfc4s8s_l1.o Signed-off-by: Arvind Yadav --- drivers/isdn/hisax/hfc4s8s_l1.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/isdn/hisax/hfc4s8s_l1.c b/drivers/isdn/hisax/hfc4s8s_l1.c index 90f051c..9090cc1 100644 --- a/drivers/isdn/hisax/hfc4s8s_l1.c +++ b/drivers/isdn/hisax/hfc4s8s_l1.c @@ -86,7 +86,7 @@ typedef struct { char *device_name; } hfc4s8s_param; -static struct pci_device_id hfc4s8s_ids[] = { +static const struct pci_device_id hfc4s8s_ids[] = { {.vendor = PCI_VENDOR_ID_CCD, .device = PCI_DEVICE_ID_4S, .subvendor = 0x1397, -- 2.7.4
[PATCH 03/10] isdn: hisax: hisax_fcpcipnp: constify pci_device_id.
pci_device_id are not supposed to change at runtime. All functions working with pci_device_id provided by work with const pci_device_id. So mark the non-const structs as const. File size before: textdata bss dec hex filename 5989 576 0656519a5 isdn/hisax/hisax_fcpcipnp.o File size After adding 'const': textdata bss dec hex filename 6085 480 0656519a5 isdn/hisax/hisax_fcpcipnp.o Signed-off-by: Arvind Yadav --- drivers/isdn/hisax/hisax_fcpcipnp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/isdn/hisax/hisax_fcpcipnp.c b/drivers/isdn/hisax/hisax_fcpcipnp.c index 5a9f39e..e4f7573 100644 --- a/drivers/isdn/hisax/hisax_fcpcipnp.c +++ b/drivers/isdn/hisax/hisax_fcpcipnp.c @@ -52,7 +52,7 @@ module_param(debug, int, 0); MODULE_AUTHOR("Kai Germaschewski /Karsten Keil "); MODULE_DESCRIPTION("AVM Fritz!PCI/PnP ISDN driver"); -static struct pci_device_id fcpci_ids[] = { +static const struct pci_device_id fcpci_ids[] = { { .vendor = PCI_VENDOR_ID_AVM, .device = PCI_DEVICE_ID_AVM_A1, .subvendor = PCI_ANY_ID, -- 2.7.4
[PATCH 04/10] isdn: eicon: constify pci_device_id.
pci_device_id are not supposed to change at runtime. All functions working with pci_device_id provided by work with const pci_device_id. So mark the non-const structs as const. File size before: textdata bss dec hex filename 6224 655 868871ae7 isdn/hardware/eicon/divasmain.o File size After adding 'const': textdata bss dec hex filename 6608 271 868871ae7 isdn/hardware/eicon/divasmain.o Signed-off-by: Arvind Yadav --- drivers/isdn/hardware/eicon/divasmain.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/isdn/hardware/eicon/divasmain.c b/drivers/isdn/hardware/eicon/divasmain.c index 8b7ad4f..b2023e0 100644 --- a/drivers/isdn/hardware/eicon/divasmain.c +++ b/drivers/isdn/hardware/eicon/divasmain.c @@ -110,7 +110,7 @@ typedef struct _diva_os_thread_dpc { /* This table should be sorted by PCI device ID */ -static struct pci_device_id divas_pci_tbl[] = { +static const struct pci_device_id divas_pci_tbl[] = { /* Diva Server BRI-2M PCI 0xE010 */ { PCI_VDEVICE(EICON, PCI_DEVICE_ID_EICON_MAESTRA), CARDTYPE_MAESTRA_PCI }, -- 2.7.4
[PATCH 06/10] isdn: mISDN: hfcmulti: constify pci_device_id.
pci_device_id are not supposed to change at runtime. All functions working with pci_device_id provided by work with const pci_device_id. So mark the non-const structs as const. File size before: textdata bss dec hex filename 6345015361492 66478 103ae isdn/hardware/mISDN/hfcmulti.o File size After adding 'const': textdata bss dec hex filename 64698 2881492 66478 103ae isdn/hardware/mISDN/hfcmulti.o Signed-off-by: Arvind Yadav --- drivers/isdn/hardware/mISDN/hfcmulti.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/isdn/hardware/mISDN/hfcmulti.c b/drivers/isdn/hardware/mISDN/hfcmulti.c index aea0c96..3cf07b8 100644 --- a/drivers/isdn/hardware/mISDN/hfcmulti.c +++ b/drivers/isdn/hardware/mISDN/hfcmulti.c @@ -5348,7 +5348,7 @@ static const struct hm_map hfcm_map[] = { #undef H #define H(x) ((unsigned long)&hfcm_map[x]) -static struct pci_device_id hfmultipci_ids[] = { +static const struct pci_device_id hfmultipci_ids[] = { /* Cards with HFC-4S Chip */ { PCI_VENDOR_ID_CCD, PCI_DEVICE_ID_CCD_HFC4S, PCI_VENDOR_ID_CCD, -- 2.7.4
[PATCH 07/10] isdn: mISDN: w6692: constify pci_device_id.
pci_device_id are not supposed to change at runtime. All functions working with pci_device_id provided by work with const pci_device_id. So mark the non-const structs as const. File size before: textdata bss dec hex filename 139594080 24 18063468f isdn/hardware/mISDN/w6692.o File size After adding 'const': textdata bss dec hex filename 140873952 24 18063468f isdn/hardware/mISDN/w6692.o Signed-off-by: Arvind Yadav --- drivers/isdn/hardware/mISDN/w6692.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/isdn/hardware/mISDN/w6692.c b/drivers/isdn/hardware/mISDN/w6692.c index 3052c83..d80072f 100644 --- a/drivers/isdn/hardware/mISDN/w6692.c +++ b/drivers/isdn/hardware/mISDN/w6692.c @@ -1398,7 +1398,7 @@ w6692_remove_pci(struct pci_dev *pdev) pr_notice("%s: drvdata already removed\n", __func__); } -static struct pci_device_id w6692_ids[] = { +static const struct pci_device_id w6692_ids[] = { { PCI_VENDOR_ID_DYNALINK, PCI_DEVICE_ID_DYNALINK_IS64PH, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (ulong)&w6692_map[0]}, { PCI_VENDOR_ID_WINBOND2, PCI_DEVICE_ID_WINBOND2_6692, -- 2.7.4
[PATCH 05/10] isdn: mISDN: netjet: constify pci_device_id.
pci_device_id are not supposed to change at runtime. All functions working with pci_device_id provided by work with const pci_device_id. So mark the non-const structs as const. File size before: textdata bss dec hex filename 109411776 16 1273331bd isdn/hardware/mISDN/netjet.o File size After adding 'const': textdata bss dec hex filename 110051712 16 1273331bd isdn/hardware/mISDN/netjet.o Signed-off-by: Arvind Yadav --- drivers/isdn/hardware/mISDN/netjet.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/isdn/hardware/mISDN/netjet.c b/drivers/isdn/hardware/mISDN/netjet.c index afde4ed..6a6d848 100644 --- a/drivers/isdn/hardware/mISDN/netjet.c +++ b/drivers/isdn/hardware/mISDN/netjet.c @@ -1137,7 +1137,7 @@ static void nj_remove(struct pci_dev *pdev) /* We cannot select cards with PCI_SUB... IDs, since here are cards with * SUB IDs set to PCI_ANY_ID, so we need to match all and reject * known other cards which not work with this driver - see probe function */ -static struct pci_device_id nj_pci_ids[] = { +static const struct pci_device_id nj_pci_ids[] = { { PCI_VENDOR_ID_TIGERJET, PCI_DEVICE_ID_TIGERJET_300, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0}, { } -- 2.7.4
[PATCH 08/10] isdn: mISDN: avmfritz: constify pci_device_id.
pci_device_id are not supposed to change at runtime. All functions working with pci_device_id provided by work with const pci_device_id. So mark the non-const structs as const. File size before: textdata bss dec hex filename 99631936 16 119152e8b isdn/hardware/mISDN/avmfritz.o File size After adding 'const': textdata bss dec hex filename 100911808 16 119152e8b isdn/hardware/mISDN/avmfritz.o Signed-off-by: Arvind Yadav --- drivers/isdn/hardware/mISDN/avmfritz.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/isdn/hardware/mISDN/avmfritz.c b/drivers/isdn/hardware/mISDN/avmfritz.c index e3fa1cd..dce6632 100644 --- a/drivers/isdn/hardware/mISDN/avmfritz.c +++ b/drivers/isdn/hardware/mISDN/avmfritz.c @@ -1142,7 +1142,7 @@ fritz_remove_pci(struct pci_dev *pdev) pr_info("%s: drvdata already removed\n", __func__); } -static struct pci_device_id fcpci_ids[] = { +static const struct pci_device_id fcpci_ids[] = { { PCI_VENDOR_ID_AVM, PCI_DEVICE_ID_AVM_A1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, (unsigned long) "Fritz!Card PCI"}, { PCI_VENDOR_ID_AVM, PCI_DEVICE_ID_AVM_A1_V2, PCI_ANY_ID, PCI_ANY_ID, -- 2.7.4
[PATCH 09/10] isdn: mISDN: hfcpci: constify pci_device_id.
pci_device_id are not supposed to change at runtime. All functions working with pci_device_id provided by work with const pci_device_id. So mark the non-const structs as const. File size before: textdata bss dec hex filename 216561024 96 2277658f8 isdn/hardware/mISDN/hfcpci.o File size After adding 'const': textdata bss dec hex filename 22424 256 96 2277658f8 isdn/hardware/mISDN/hfcpci.o Signed-off-by: Arvind Yadav --- drivers/isdn/hardware/mISDN/hfcpci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/isdn/hardware/mISDN/hfcpci.c b/drivers/isdn/hardware/mISDN/hfcpci.c index 5dc246d..d2e401a 100644 --- a/drivers/isdn/hardware/mISDN/hfcpci.c +++ b/drivers/isdn/hardware/mISDN/hfcpci.c @@ -2161,7 +2161,7 @@ static const struct _hfc_map hfc_map[] = {}, }; -static struct pci_device_id hfc_ids[] = +static const struct pci_device_id hfc_ids[] = { { PCI_VDEVICE(CCD, PCI_DEVICE_ID_CCD_2BD0), (unsigned long) &hfc_map[0] }, -- 2.7.4
[PATCH 10/10] isdn: avm: c4: constify pci_device_id.
pci_device_id are not supposed to change at runtime. All functions working with pci_device_id provided by work with const pci_device_id. So mark the non-const structs as const. File size before: textdata bss dec hex filename 11803 544 1 12348303c isdn/hardware/avm/c4.o File size After adding 'const': textdata bss dec hex filename 11931 416 1 12348303c isdn/hardware/avm/c4.o Signed-off-by: Arvind Yadav --- drivers/isdn/hardware/avm/c4.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/isdn/hardware/avm/c4.c b/drivers/isdn/hardware/avm/c4.c index 40c7e2c..034caba 100644 --- a/drivers/isdn/hardware/avm/c4.c +++ b/drivers/isdn/hardware/avm/c4.c @@ -42,7 +42,7 @@ static char *revision = "$Revision: 1.1.2.2 $"; static bool suppress_pollack; -static struct pci_device_id c4_pci_tbl[] = { +static const struct pci_device_id c4_pci_tbl[] = { { PCI_VENDOR_ID_DEC, PCI_DEVICE_ID_DEC_21285, PCI_VENDOR_ID_AVM, PCI_DEVICE_ID_AVM_C4, 0, 0, (unsigned long)4 }, { PCI_VENDOR_ID_DEC, PCI_DEVICE_ID_DEC_21285, PCI_VENDOR_ID_AVM, PCI_DEVICE_ID_AVM_C2, 0, 0, (unsigned long)2 }, { } /* Terminating entry */ -- 2.7.4
Re: [PATCH 1/6] [next-queue]net: mqprio: Introduce new hardware offload mode in mqprio for offloading full TC configurations
On 7/14/2017 1:36 AM, Jamal Hadi Salim wrote: On 17-07-11 06:18 AM, Amritha Nambiar wrote: This patch introduces a new hardware offload mode in mqprio which makes full use of the mqprio options, the TCs, the queue configurations and the bandwidth rates for the TCs. This is achieved by setting the value 2 for the "hw" option. This new offload mode supports new attributes for traffic class such as minimum and maximum values for bandwidth rate limits. Introduces a new datastructure 'tc_mqprio_qopt_offload' for offloading mqprio queue options and use this to be shared between the kernel and device driver. This contains a copy of the exisiting datastructure for mqprio queue options. This new datastructure can be extended when adding new attributes for traffic class such as bandwidth rate limits. The existing datastructure for mqprio queue options will be shared between the kernel and userspace. This patch enables configuring additional attributes associated with a traffic class such as minimum and maximum bandwidth rates and can be offloaded to the hardware in the new offload mode. The min and max limits for bandwidth rates are provided by the user along with the the TCs and the queue configurations when creating the mqprio qdisc. Example: # tc qdisc add dev eth0 root mqprio num_tc 2 map 0 0 0 0 1 1 1 1\ queues 4@0 4@4 min_rate 0Mbit 0Mbit max_rate 55Mbit 60Mbit hw 2 I know this has nothing to do with your patches - but that is very unfriendly ;-> Most mortals will have a problem with the map (but you can argue it has been there since prio qdisc was introduced) - leave alone the 4@4 syntax and now min_rate where i have to type in obvious defaults like "0Mbit". The min_rate and max_rate are optional attributes for the traffic class and it is not mandatory to have both. It is also possible to have either one of them, say, devices that do not support setting min rate need to specify only the max rate and need not type in the default 0Mbit. My bad, I should probably have given a better example. # tc qdisc add dev eth0 root mqprio num_tc 2 map 0 0 0 0 1 1 1 1\ queues 4@0 4@4 max_rate 55Mbit 60Mbit hw 2 You have some great features that not many people can use as a result. Note: This is just a comment maybe someone can be kind enough to fix (or it would get annoying enough I will fix it); i.e should not be holding your good work. To dump the bandwidth rates: # tc qdisc show dev eth0 qdisc mqprio 804a: root tc 2 map 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 queues:(0:3) (4:7) min rates:0bit 0bit max rates:55Mbit 60Mbit Signed-off-by: Amritha Nambiar --- include/linux/netdevice.h |2 include/net/pkt_cls.h |7 ++ include/uapi/linux/pkt_sched.h | 13 +++ net/sched/sch_mqprio.c | 170 +--- 4 files changed, 181 insertions(+), 11 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index e48ee2e..12c6c3f 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -779,6 +779,7 @@ enum { TC_SETUP_CLSFLOWER, TC_SETUP_MATCHALL, TC_SETUP_CLSBPF, + TC_SETUP_MQPRIO_EXT, }; struct tc_cls_u32_offload; @@ -791,6 +792,7 @@ struct tc_to_netdev { struct tc_cls_matchall_offload *cls_mall; struct tc_cls_bpf_offload *cls_bpf; struct tc_mqprio_qopt *mqprio; + struct tc_mqprio_qopt_offload *mqprio_qopt; }; bool egress_dev; }; diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h index 537d0a0..9facda2 100644 --- a/include/net/pkt_cls.h +++ b/include/net/pkt_cls.h @@ -569,6 +569,13 @@ struct tc_cls_bpf_offload { u32 gen_flags; }; +struct tc_mqprio_qopt_offload { + /* struct tc_mqprio_qopt must always be the first element */ + struct tc_mqprio_qopt qopt; + u32 flags; + u64 min_rate[TC_QOPT_MAX_QUEUE]; + u64 max_rate[TC_QOPT_MAX_QUEUE]; +}; Quickly scanned code. My opinion is: struct tc_mqprio_qopt is messed up in terms of alignments. And you just made it worse. Why not create a new struct call it "tc_mqprio_qopt_hw" or something indicating it is for hw offload. You can then fixup stuff. I think it will depend on whether you can have both hw priority and rate in all network cards. If some hw cannot support rate offload then I would suggest it becomes optional via TLVs etc. If you are willing to do that clean up I can say more. The existing struct tc_mqprio_qopt does have alignment issues, but is shared with the userspace. The new struct tc_mqprio_qopt_offload is shared with the device driver. This contains a copy of the existing tc_mqprio_qopt for mqprio queue options for legacy users. The rates are optional attributes obtained as TLVs from the userspace via additional netlink attributes. This would be clear from the corresponding iproute2 RFC patch I submitted. ([PATCH RFC, ip
Re: [PATCH net-next 2/4] net-next: mediatek: add platform data to adapt into various hardware
On Wed, 2017-07-12 at 16:50 +0200, Andrew Lunn wrote: > > +static int mtk_clk_enable(struct mtk_eth *eth) > > +{ > > + int clk, ret; > > + > > + for (clk = 0; clk < MTK_CLK_MAX ; clk++) { > > + if (eth->clks[clk]) { > > + ret = clk_prepare_enable(eth->clks[clk]); > > + if (ret) > > + goto err_disable_clks; > > + } > > + } > > + > > + return 0; > > + > > +err_disable_clks: > > + while (--clk >= 0) { > > + if (eth->clks[clk]) > > + clk_disable_unprepare(eth->clks[clk]); > > + } > > + > > + return ret; > > +} > > > + > > static int mtk_hw_init(struct mtk_eth *eth) > > { > > int i, val; > > @@ -1847,10 +1881,8 @@ static int mtk_hw_init(struct mtk_eth *eth) > > pm_runtime_enable(eth->dev); > > pm_runtime_get_sync(eth->dev); > > > > - clk_prepare_enable(eth->clks[MTK_CLK_ETHIF]); > > - clk_prepare_enable(eth->clks[MTK_CLK_ESW]); > > - clk_prepare_enable(eth->clks[MTK_CLK_GP1]); > > - clk_prepare_enable(eth->clks[MTK_CLK_GP2]); > > + mtk_clk_enable(eth); > > + > > mtk_clk_enable() returns an error code. It is probably a good idea to > use it, especially if it could be EPRODE_DEFER. okay, I will improve those clocks handling better along with Florian's suggestion in the next version Sean > > Andrew
[PATCH RFC, iproute2] tc/mqprio: Add support to configure bandwidth rate limit through mqprio
Support bandwidth rate limit information for a traffic class in addition to the number of TCs and associated queue configuration data. This is supported in the new hardware offload mode in mqprio by setting the value of 'hw' option to 2. This new hardware offload mode in mqprio makes full use of the mqprio options, the TCs, the queue configurations and the bandwidth rates for the TCs. # tc qdisc add dev eth0 root mqprio num_tc 2 map 0 0 0 0 1 1 1 1\ queues 4@0 4@4 min_rate 0Mbit 0Mbit max_rate 55Mbit 60Mbit hw 2 # tc qdisc show dev eth0 qdisc mqprio 804a: root tc 2 map 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 queues:(0:3) (4:7) min rates:0bit 0bit max rates:55Mbit 60Mbit Signed-off-by: Amritha Nambiar --- include/linux/pkt_sched.h | 12 tc/q_mqprio.c | 128 ++--- 2 files changed, 132 insertions(+), 8 deletions(-) diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h index 099bf55..bbad3ec 100644 --- a/include/linux/pkt_sched.h +++ b/include/linux/pkt_sched.h @@ -633,6 +633,18 @@ struct tc_mqprio_qopt { __u16 offset[TC_QOPT_MAX_QUEUE]; }; +#define TC_MQPRIO_F_MIN_RATE 0x1 +#define TC_MQPRIO_F_MAX_RATE 0x2 + +enum { + TCA_MQPRIO_UNSPEC, + TCA_MQPRIO_MIN_RATE64, + TCA_MQPRIO_MAX_RATE64, + __TCA_MQPRIO_MAX, +}; + +#define TCA_MQPRIO_MAX (__TCA_MQPRIO_MAX - 1) + /* SFB */ enum { diff --git a/tc/q_mqprio.c b/tc/q_mqprio.c index fa1022b..b7826ac 100644 --- a/tc/q_mqprio.c +++ b/tc/q_mqprio.c @@ -26,7 +26,7 @@ static void explain(void) { fprintf(stderr, "Usage: ... mqprio [num_tc NUMBER] [map P0 P1 ...]\n"); fprintf(stderr, " [queues count1@offset1 count2@offset2 ...] "); - fprintf(stderr, "[hw 1|0]\n"); + fprintf(stderr, "[hw 2|1|0]\n"); } static int mqprio_parse_opt(struct qdisc_util *qu, int argc, @@ -38,6 +38,10 @@ static int mqprio_parse_opt(struct qdisc_util *qu, int argc, {0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 1, 1, 3, 3, 3, 3}, 1, }; + __u64 min_rate64[TC_QOPT_MAX_QUEUE] = {0}; + __u64 max_rate64[TC_QOPT_MAX_QUEUE] = {0}; + struct rtattr *tail; + __u32 flags = 0; while (argc > 0) { idx = 0; @@ -83,6 +87,34 @@ static int mqprio_parse_opt(struct qdisc_util *qu, int argc, free(tmp); idx++; } + } else if (strcmp(*argv, "min_rate") == 0) { + while (idx < TC_QOPT_MAX_QUEUE && NEXT_ARG_OK()) { + if (idx > opt.num_tc) { + fprintf(stderr, "Illegal number of min rates\n"); + return -1; + } + NEXT_ARG(); + if (get_rate64(&min_rate64[idx], *argv)) { + PREV_ARG(); + break; + } + idx++; + } + flags |= TC_MQPRIO_F_MIN_RATE; + } else if (strcmp(*argv, "max_rate") == 0) { + while (idx < TC_QOPT_MAX_QUEUE && NEXT_ARG_OK()) { + if (idx > opt.num_tc) { + fprintf(stderr, "Illegal number of max rates\n"); + return -1; + } + NEXT_ARG(); + if (get_rate64(&max_rate64[idx], *argv)) { + PREV_ARG(); + break; + } + idx++; + } + flags |= TC_MQPRIO_F_MAX_RATE; } else if (strcmp(*argv, "hw") == 0) { NEXT_ARG(); if (get_u8(&opt.hw, *argv, 10)) { @@ -100,27 +132,107 @@ static int mqprio_parse_opt(struct qdisc_util *qu, int argc, argc--; argv++; } + tail = NLMSG_TAIL(n); addattr_l(n, 1024, TCA_OPTIONS, &opt, sizeof(opt)); + + if (flags & TC_MQPRIO_F_MIN_RATE) { + struct rtattr *start; + + start = addattr_nest(n, 1024, +TCA_MQPRIO_MIN_RATE64 | NLA_F_NESTED); + + for (idx = 0; idx < opt.num_tc; idx++) + addattr_l(n, 1024, TCA_MQPRIO_MIN_RATE64, + &min_rate64[idx], sizeof(min_rate64[idx])); + + addattr_nest_end(n, start); + } + + if (flags & TC_MQPRIO_F_MAX_RATE) { + struct rtattr *start; + + start = a
Re: [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe
On Fri, Jul 14, 2017 at 7:15 PM, Stephen Hemminger wrote: > On Fri, 14 Jul 2017 18:54:02 -0400 > Neal Cardwell wrote: > >> On Fri, Jul 14, 2017 at 6:36 PM, Stephen Hemminger >> wrote: >> > On Fri, 14 Jul 2017 17:49:21 -0400 >> > Neal Cardwell wrote: >> > >> >> In bbr_set_pacing_rate(), which decides whether to cut the pacing >> >> rate, there was some code that considered exiting STARTUP to be >> >> equivalent to the notion of filling the pipe (i.e., >> >> bbr_full_bw_reached()). Specifically, as the code was structured, >> >> exiting STARTUP and going into PROBE_RTT could cause us to cut the >> >> pacing rate down to something silly and low, based on whatever >> >> bandwidth samples we've had so far, when it's possible that all of >> >> them have been small app-limited bandwidth samples that are not >> >> representative of the bandwidth available in the path. (The code was >> >> correct at the time it was written, but the state machine changed >> >> without this spot being adjusted correspondingly.) >> >> >> >> Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") >> >> Signed-off-by: Neal Cardwell >> >> Signed-off-by: Yuchung Cheng >> >> Signed-off-by: Soheil Hassas Yeganeh >> > > > You are correct, these look more like bug fixes. I was a little concerned > that the changes would be visible but they really aren't user visible. Yes, exactly. > Should they go to stable as well? Yes, please. The intention was for this whole 5-patch BBR pacing bug-fix series to go into "net" and into the -stable queue together. thanks, neal
Re: [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe
On Fri, 14 Jul 2017 18:54:02 -0400 Neal Cardwell wrote: > On Fri, Jul 14, 2017 at 6:36 PM, Stephen Hemminger > wrote: > > On Fri, 14 Jul 2017 17:49:21 -0400 > > Neal Cardwell wrote: > > > >> In bbr_set_pacing_rate(), which decides whether to cut the pacing > >> rate, there was some code that considered exiting STARTUP to be > >> equivalent to the notion of filling the pipe (i.e., > >> bbr_full_bw_reached()). Specifically, as the code was structured, > >> exiting STARTUP and going into PROBE_RTT could cause us to cut the > >> pacing rate down to something silly and low, based on whatever > >> bandwidth samples we've had so far, when it's possible that all of > >> them have been small app-limited bandwidth samples that are not > >> representative of the bandwidth available in the path. (The code was > >> correct at the time it was written, but the state machine changed > >> without this spot being adjusted correspondingly.) > >> > >> Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") > >> Signed-off-by: Neal Cardwell > >> Signed-off-by: Yuchung Cheng > >> Signed-off-by: Soheil Hassas Yeganeh > > You are correct, these look more like bug fixes. I was a little concerned that the changes would be visible but they really aren't user visible. Should they go to stable as well?
[PATCH net 1/2] net: bcmgenet: Fix unmapping of fragments in bcmgenet_xmit()
In case we fail to map a single fragment, we would be leaving the transmit ring populated with stale entries. This commit introduces the helper function bcmgenet_put_txcb() which takes care of rewinding the per-ring write pointer back to where we left. It also consolidates the functionality of bcmgenet_xmit_single() and bcmgenet_xmit_frag() into the bcmgenet_xmit() function to make the unmapping of control blocks cleaner. Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file") Suggested-by: Florian Fainelli Signed-off-by: Doug Berger --- drivers/net/ethernet/broadcom/genet/bcmgenet.c | 191 +++-- 1 file changed, 85 insertions(+), 106 deletions(-) diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c index daca1c9d254b..20021525f795 100644 --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c @@ -1202,6 +1202,23 @@ static struct enet_cb *bcmgenet_get_txcb(struct bcmgenet_priv *priv, return tx_cb_ptr; } +static struct enet_cb *bcmgenet_put_txcb(struct bcmgenet_priv *priv, +struct bcmgenet_tx_ring *ring) +{ + struct enet_cb *tx_cb_ptr; + + tx_cb_ptr = ring->cbs; + tx_cb_ptr += ring->write_ptr - ring->cb_ptr; + + /* Rewinding local write pointer */ + if (ring->write_ptr == ring->cb_ptr) + ring->write_ptr = ring->end_ptr; + else + ring->write_ptr--; + + return tx_cb_ptr; +} + /* Simple helper to free a control block's resources */ static void bcmgenet_free_cb(struct enet_cb *cb) { @@ -1380,95 +1397,6 @@ static void bcmgenet_tx_reclaim_all(struct net_device *dev) bcmgenet_tx_reclaim(dev, &priv->tx_rings[DESC_INDEX]); } -/* Transmits a single SKB (either head of a fragment or a single SKB) - * caller must hold priv->lock - */ -static int bcmgenet_xmit_single(struct net_device *dev, - struct sk_buff *skb, - u16 dma_desc_flags, - struct bcmgenet_tx_ring *ring) -{ - struct bcmgenet_priv *priv = netdev_priv(dev); - struct device *kdev = &priv->pdev->dev; - struct enet_cb *tx_cb_ptr; - unsigned int skb_len; - dma_addr_t mapping; - u32 length_status; - int ret; - - tx_cb_ptr = bcmgenet_get_txcb(priv, ring); - - if (unlikely(!tx_cb_ptr)) - BUG(); - - tx_cb_ptr->skb = skb; - - skb_len = skb_headlen(skb); - - mapping = dma_map_single(kdev, skb->data, skb_len, DMA_TO_DEVICE); - ret = dma_mapping_error(kdev, mapping); - if (ret) { - priv->mib.tx_dma_failed++; - netif_err(priv, tx_err, dev, "Tx DMA map failed\n"); - dev_kfree_skb(skb); - return ret; - } - - dma_unmap_addr_set(tx_cb_ptr, dma_addr, mapping); - dma_unmap_len_set(tx_cb_ptr, dma_len, skb_len); - length_status = (skb_len << DMA_BUFLENGTH_SHIFT) | dma_desc_flags | - (priv->hw_params->qtag_mask << DMA_TX_QTAG_SHIFT) | - DMA_TX_APPEND_CRC; - - if (skb->ip_summed == CHECKSUM_PARTIAL) - length_status |= DMA_TX_DO_CSUM; - - dmadesc_set(priv, tx_cb_ptr->bd_addr, mapping, length_status); - - return 0; -} - -/* Transmit a SKB fragment */ -static int bcmgenet_xmit_frag(struct net_device *dev, - skb_frag_t *frag, - u16 dma_desc_flags, - struct bcmgenet_tx_ring *ring) -{ - struct bcmgenet_priv *priv = netdev_priv(dev); - struct device *kdev = &priv->pdev->dev; - struct enet_cb *tx_cb_ptr; - unsigned int frag_size; - dma_addr_t mapping; - int ret; - - tx_cb_ptr = bcmgenet_get_txcb(priv, ring); - - if (unlikely(!tx_cb_ptr)) - BUG(); - - tx_cb_ptr->skb = NULL; - - frag_size = skb_frag_size(frag); - - mapping = skb_frag_dma_map(kdev, frag, 0, frag_size, DMA_TO_DEVICE); - ret = dma_mapping_error(kdev, mapping); - if (ret) { - priv->mib.tx_dma_failed++; - netif_err(priv, tx_err, dev, "%s: Tx DMA map failed\n", - __func__); - return ret; - } - - dma_unmap_addr_set(tx_cb_ptr, dma_addr, mapping); - dma_unmap_len_set(tx_cb_ptr, dma_len, frag_size); - - dmadesc_set(priv, tx_cb_ptr->bd_addr, mapping, - (frag_size << DMA_BUFLENGTH_SHIFT) | dma_desc_flags | - (priv->hw_params->qtag_mask << DMA_TX_QTAG_SHIFT)); - - return 0; -} - /* Reallocate the SKB to put enough headroom in front of it and insert * the transmit checksum offsets in the descriptors */ @@ -1535,11 +1463,16 @@ static struct sk_buff *bcmgenet_put_tx_csum(struct net_device *dev, static netdev_tx_t bcmgenet_xmi
[PATCH net 0/2] Fragmented SKB corrections
Two issues were observed in a review of the bcmgenet driver support for fragmented SKBs which are addressed by this patch set. The first addresses a problem that could occur if the driver is not able to DMA map a fragment of the SKB. This would be a highly unusual event but it would leave the hardware descriptors in an invalid state which should be prevented. The second is a hazard that could occur if the driver is able to reclaim the first control block of a fragmented SKB before all of its fragments have completed processing by the hardware. In this case the SKB could be freed leading to reuse of memory that is still in use by hardware. Doug Berger (2): net: bcmgenet: Fix unmapping of fragments in bcmgenet_xmit() net: bcmgenet: Free skb after last Tx frag drivers/net/ethernet/broadcom/genet/bcmgenet.c | 299 + drivers/net/ethernet/broadcom/genet/bcmgenet.h | 2 + 2 files changed, 152 insertions(+), 149 deletions(-) -- 2.13.0
[PATCH net 2/2] net: bcmgenet: Free skb after last Tx frag
Since the skb is attached to the first control block of a fragmented skb it is possible that the skb could be freed when reclaiming that control block before all fragments of the skb have been consumed by the hardware and unmapped. This commit introduces first_cb and last_cb pointers to the skb control block used by the driver to keep track of which transmit control blocks within a transmit ring are the first and last ones associated with the skb. It then splits the bcmgenet_free_cb() function into transmit (bcmgenet_free_tx_cb) and receive (bcmgenet_free_rx_cb) versions that can handle the unmapping of dma mapped memory and cleaning up the corresponding control block structure so that the skb is only freed after the last associated transmit control block is reclaimed. Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file") Signed-off-by: Doug Berger --- drivers/net/ethernet/broadcom/genet/bcmgenet.c | 142 ++--- drivers/net/ethernet/broadcom/genet/bcmgenet.h | 2 + 2 files changed, 84 insertions(+), 60 deletions(-) diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c index 20021525f795..7b0b399aaedd 100644 --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c @@ -1219,14 +1219,6 @@ static struct enet_cb *bcmgenet_put_txcb(struct bcmgenet_priv *priv, return tx_cb_ptr; } -/* Simple helper to free a control block's resources */ -static void bcmgenet_free_cb(struct enet_cb *cb) -{ - dev_kfree_skb_any(cb->skb); - cb->skb = NULL; - dma_unmap_addr_set(cb, dma_addr, 0); -} - static inline void bcmgenet_rx_ring16_int_disable(struct bcmgenet_rx_ring *ring) { bcmgenet_intrl2_0_writel(ring->priv, UMAC_IRQ_RXDMA_DONE, @@ -1277,18 +1269,72 @@ static inline void bcmgenet_tx_ring_int_disable(struct bcmgenet_tx_ring *ring) INTRL2_CPU_MASK_SET); } +/* Simple helper to free a transmit control block's resources + * Returns an skb when the last transmit control block associated with the + * skb is freed. The skb should be freed by the caller if necessary. + */ +static struct sk_buff *bcmgenet_free_tx_cb(struct device *dev, + struct enet_cb *cb) +{ + struct sk_buff *skb; + + skb = cb->skb; + + if (skb) { + cb->skb = NULL; + if (cb == GENET_CB(skb)->first_cb) + dma_unmap_single(dev, dma_unmap_addr(cb, dma_addr), +dma_unmap_len(cb, dma_len), +DMA_TO_DEVICE); + else + dma_unmap_page(dev, dma_unmap_addr(cb, dma_addr), + dma_unmap_len(cb, dma_len), + DMA_TO_DEVICE); + dma_unmap_addr_set(cb, dma_addr, 0); + + if (cb == GENET_CB(skb)->last_cb) + return skb; + + } else if (dma_unmap_addr(cb, dma_addr)) { + dma_unmap_page(dev, + dma_unmap_addr(cb, dma_addr), + dma_unmap_len(cb, dma_len), + DMA_TO_DEVICE); + dma_unmap_addr_set(cb, dma_addr, 0); + } + + return 0; +} + +/* Simple helper to free a receive control block's resources */ +static struct sk_buff *bcmgenet_free_rx_cb(struct device *dev, + struct enet_cb *cb) +{ + struct sk_buff *skb; + + skb = cb->skb; + cb->skb = NULL; + + if (dma_unmap_addr(cb, dma_addr)) { + dma_unmap_single(dev, dma_unmap_addr(cb, dma_addr), +dma_unmap_len(cb, dma_len), DMA_FROM_DEVICE); + dma_unmap_addr_set(cb, dma_addr, 0); + } + + return skb; +} + /* Unlocked version of the reclaim routine */ static unsigned int __bcmgenet_tx_reclaim(struct net_device *dev, struct bcmgenet_tx_ring *ring) { struct bcmgenet_priv *priv = netdev_priv(dev); - struct device *kdev = &priv->pdev->dev; - struct enet_cb *tx_cb_ptr; - unsigned int pkts_compl = 0; + unsigned int txbds_processed = 0; unsigned int bytes_compl = 0; - unsigned int c_index; + unsigned int pkts_compl = 0; unsigned int txbds_ready; - unsigned int txbds_processed = 0; + unsigned int c_index; + struct sk_buff *skb; /* Clear status before servicing to reduce spurious interrupts */ if (ring->index == DESC_INDEX) @@ -1309,21 +1355,12 @@ static unsigned int __bcmgenet_tx_reclaim(struct net_device *dev, /* Reclaim transmitted buffers */ while (txbds_processed < txbds_ready) { - tx_cb_ptr = &priv->tx_cbs[ring->clean_ptr]; - if (tx_cb_ptr->skb) { +
Re: [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe
On Fri, Jul 14, 2017 at 6:36 PM, Stephen Hemminger wrote: > On Fri, 14 Jul 2017 17:49:21 -0400 > Neal Cardwell wrote: > >> In bbr_set_pacing_rate(), which decides whether to cut the pacing >> rate, there was some code that considered exiting STARTUP to be >> equivalent to the notion of filling the pipe (i.e., >> bbr_full_bw_reached()). Specifically, as the code was structured, >> exiting STARTUP and going into PROBE_RTT could cause us to cut the >> pacing rate down to something silly and low, based on whatever >> bandwidth samples we've had so far, when it's possible that all of >> them have been small app-limited bandwidth samples that are not >> representative of the bandwidth available in the path. (The code was >> correct at the time it was written, but the state machine changed >> without this spot being adjusted correspondingly.) >> >> Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") >> Signed-off-by: Neal Cardwell >> Signed-off-by: Yuchung Cheng >> Signed-off-by: Soheil Hassas Yeganeh > > Looks good, but net-next is closed at this time. Please resubmit later. > > http://vger.kernel.org/~davem/net-next.html Thanks, Stephen. We see your point on the net-next patch "tcp: adjust tail loss probe timeout"; we'll resubmit that patch when net-next opens. Sorry about that! But for the tcp_bbr patch series, those are bug fixes, and we were marking them as being for "net" with Fixes: footers in the hopes that they could go into the "net" branch and be queued up for inclusion in -stable releases. Are you saying that in your estimation the substance of the fixes doesn't rise to the level of "net" material? If that is the consensus then we can resubmit for net-next when that opens. thanks, neal
Re: [PATCH 13/22] liquidio: fix possible eeprom format string overflow
The 07/14/2017 09:04, David Miller wrote: > From: Arnd Bergmann > Date: Fri, 14 Jul 2017 14:07:05 +0200 > > > gcc reports that the temporary buffer for computing the > > string length may be too small here: > > > > drivers/net/ethernet/cavium/liquidio/lio_ethtool.c: In function > 'lio_get_eeprom_len': > > /drivers/net/ethernet/cavium/liquidio/lio_ethtool.c:345:21: error: 'sprintf' > may write a terminating nul past the end of the destination [-Werror= > format-overflow=] > > len = sprintf(buf, "boardname:%s serialnum:%s maj:%lld min:%lld\n", > > ^~~ > > drivers/net/ethernet/cavium/liquidio/lio_ethtool.c:345:6: note: 'sprintf' > output between 35 and 167 bytes into a destination of size 128 > > len = sprintf(buf, "boardname:%s serialnum:%s maj:%lld min:%lld\n", > > > > This extends it to 192 bytes, which is certainly enough. As far > > as I could tell, there are no other constraints that require a specific > > maximum size. > > > > Signed-off-by: Arnd Bergmann > > Applied. I had raised a bug for this earlier and attached a patch as well. http://cabugzilla1.caveonetworks.com/octeon_bugzilla/show_bug.cgi?id=26421 -- Regards Satanand
Re: [PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe
On Fri, 14 Jul 2017 17:49:21 -0400 Neal Cardwell wrote: > In bbr_set_pacing_rate(), which decides whether to cut the pacing > rate, there was some code that considered exiting STARTUP to be > equivalent to the notion of filling the pipe (i.e., > bbr_full_bw_reached()). Specifically, as the code was structured, > exiting STARTUP and going into PROBE_RTT could cause us to cut the > pacing rate down to something silly and low, based on whatever > bandwidth samples we've had so far, when it's possible that all of > them have been small app-limited bandwidth samples that are not > representative of the bandwidth available in the path. (The code was > correct at the time it was written, but the state machine changed > without this spot being adjusted correspondingly.) > > Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") > Signed-off-by: Neal Cardwell > Signed-off-by: Yuchung Cheng > Signed-off-by: Soheil Hassas Yeganeh Looks good, but net-next is closed at this time. Please resubmit later. http://vger.kernel.org/~davem/net-next.html
Re: Quirks of the Atheros 8035 PHY
On 14/07/2017 23:28, Florian Fainelli wrote: > On 07/14/2017 02:08 PM, Mason wrote: > >> I've discussed this subject in the past, but we never reached >> a conclusion, AFAIR. >> >> The Atheros 8035 GigE PHY has IMO 2 quirks wrt to clock delays. >> >> https://www.redeszone.net/app/uploads/2014/04/AR8035.pdf >> >> >> 1) RX clock delay >> >> Commit 2e5f9f281ee8369f56d403b4a52942f19b6978f8 >> >> In fact, RX clock delay is *ENABLED* by default at >> reset. So if a board actually required it *disabled* >> then we actually need to set the bit to 0. >> >> Reference: >> 4.2.25 rgmii rx clock delay control >> >> >> 2) TX clock delay >> >> Commit 1ca6d1b1aef4628ff0fe458135ddb008d134ad8f >> >> TX clock delay is DISABLED by default, so no surprises >> there... except that it is only DISABLED after *HW reset*. >> >> After a SW reset, such as that performed in Linux IIUC, >> the PHY retains the value previously set. >> >> So if a bootloader such a Uboot enabled TX delay, then >> Linux would "inherit" the setting, even if it performs >> a reset. If the PHY setting calls for no TX clock delay, >> the Linux driver would have to actively disable it. >> >> Reference: >> 4.2.26 rgmii tx clock delay control >> >> >> I can (of course) send a patch fixing both issues, but >> what was said last time was that "it's too late to >> fix it now, since the fix risks breaking working >> setups". Is that an accurate paraphrase? > > More or less, this particular problematic PHY has come up with some many > different platforms, and people and typically no one being able to have > insights on its internal design that it is really hard to comment on > what would break, it's already apparently pretty broken. When you say broken, do you mean the situation, or the HW? I don't think the HW is broken. The API is confusing, and violates the principle of least astonishment, which led several SW devs to make implementation errors, but the documentation is pretty clear and well-written, as far as docs go. The fix is straight-forward: if (PHY_INTERFACE_MODE_RGMII) disable_rx_delay; disable_tx_delay; if (PHY_INTERFACE_MODE_RGMII_RXID) enable_rx_delay; disable_tx_delay; if (PHY_INTERFACE_MODE_RGMII_TXID) disable_rx_delay; enable_tx_delay; if (PHY_INTERFACE_MODE_RGMII_ID) enable_rx_delay; enable_tx_delay; Then the setting will be as requested in the DT. >> 3) There's also a RGMII GTX_CLK documented as >> "RGMII transmit clock, 125 MHz digital. Adding a 22 ohm >> damping resistor is recommended for EMI design near MAC side" >> => Is that TX_CLK? >> There's a 2-bit related field called Gtx_dly_val >> which allows tweaking the delay >> >> Select the delay of gtx_clk. >> 00: 0.25ns >> 01: 1.3ns >> 10: 2.4ns >> 11: 3.4ns >> (DEFAULT 10b = 2.4 ns, BUT Retain val on SW reset, >> so inherit bootloader value) >> I'm not sure the current DT allows such fine-grained >> tweaking? Should we enhance it? > > What is "the current DT" in that context? There is no binding for > at8033x defined and there is not one either for nb8800. I meant Documentation/devicetree/bindings/net/ethernet.txt Documentation/devicetree/bindings/net/phy.txt There is no prop to define the required delay, in case the specific PHY supports multiple delays, as the AR8035 seems to do. This is the current DT http://elixir.free-electrons.com/linux/latest/source/arch/arm/boot/dts/tango4-vantage-1172.dts ð0 { phy-connection-type = "rgmii"; phy-handle = <ð0_phy>; #address-cells = <1>; #size-cells = <0>; /* Atheros AR8035 */ eth0_phy: ethernet-phy@4 { compatible = "ethernet-phy-id004d.d072", "ethernet-phy-ieee802.3-c22"; interrupts = <37 IRQ_TYPE_EDGE_RISING>; reg = <4>; }; }; phy-connection-type is wrong, since I need both delays. So I will change to rgmii-id, and fix the MAC driver to honor the doc blurb: * "rgmii" (RX and TX delays are added by the MAC when required) * "rgmii-id" (RGMII with internal RX and TX delays provided by the PHY, the MAC should not add the RX or TX delays in this case) * "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC should not add an RX delay in this case) * "rgmii-txid" (RGMII with internal TX delay provided by the PHY, the MAC should not add an TX delay in this case) > Some bindings do define RGMII RX/TX delay, but they can't agree on that > either (net/cavium-pip.txt, net/apm-xgene-enet.txt and > net/dwmac-sun8i.txt). The latest in date: net/dwmac-sun8i.txt is > probably the best example of what should be defined and generalized. OK, I'll take a close look at it. >> 4) There's the whole mess of having clock delays >> supported both by the PHY *AND* the MAC. If both >> add a delay, things won't work as expected. >> What's the recommended approach there? > > Submit patches that fix problems for your particular u
Re: [PATCH v2 2/2] net: ethernet: nixge: Add support for National Instruments XGE netdev
On 07/14/2017 01:48 PM, Moritz Fischer wrote: > Add support for the National Instruments XGE 1/10G network device. > > It uses the EEPROM on the board via NVMEM. > > Signed-off-by: Moritz Fischer > --- > + > +static void nixge_handle_link_change(struct net_device *ndev) > +{ > + struct nixge_priv *priv = netdev_priv(ndev); > + struct phy_device *phydev = ndev->phydev; > + unsigned long flags; > + int status_change = 0; > + > + spin_lock_irqsave(&priv->lock, flags); The adjust_link function is called with the PHY device mutex held so the spinlock here looks completely unnecessary. > + > + if (phydev->link != priv->link || phydev->speed != priv->speed || > + phydev->duplex != priv->duplex) { > + priv->link = phydev->link; > + priv->speed = phydev->speed; > + priv->duplex = phydev->duplex; > + status_change = 1; > + } > + > + spin_unlock_irqrestore(&priv->lock, flags); > + > + if (status_change) > + phy_print_status(phydev); It's fine to print what changed, but surely the hardware should also react to link changes, like change of duplex, speed, pause etc. > +} > + > +static void nixge_start_xmit_done(struct net_device *ndev) > +{ This should be done in a NAPI context (soft IRQ) as well, except that for TX you don't need to bind the reclaiming process against the NAPI budget. > + u32 size = 0; > + u32 packets = 0; > + struct nixge_priv *priv = netdev_priv(ndev); > + struct nixge_dma_bd *cur_p; > + unsigned int status = 0; > + > + cur_p = &priv->tx_bd_v[priv->tx_bd_ci]; > + status = cur_p->status; > + > + while (status & XAXIDMA_BD_STS_COMPLETE_MASK) { > + dma_unmap_single(ndev->dev.parent, cur_p->phys, > + (cur_p->cntrl & XAXIDMA_BD_CTRL_LENGTH_MASK), > + DMA_TO_DEVICE); Fragments are unmapped with dma_unmap_page(), how are you unmapping them at the moment? > + if (cur_p->app4) > + dev_kfree_skb_irq((struct sk_buff *)cur_p->app4); > + /*cur_p->phys = 0;*/ > + cur_p->app0 = 0; > + cur_p->app1 = 0; > + cur_p->app2 = 0; > + cur_p->app4 = 0; > + cur_p->status = 0; Is this really necessary? Your descriptor is in coherent memory which means that you are doing slow uncached/writethrough accesses to the memory that holds them. Can't you just set status to 0 for the HW to ignore this descriptor? > + > + size += status & XAXIDMA_BD_STS_ACTUAL_LEN_MASK; > + packets++; > + > + ++priv->tx_bd_ci; > + priv->tx_bd_ci %= TX_BD_NUM; > + cur_p = &priv->tx_bd_v[priv->tx_bd_ci]; > + status = cur_p->status; > + } > + > + ndev->stats.tx_packets += packets; > + ndev->stats.tx_bytes += size; > + netif_wake_queue(ndev); You can only wake the queue if you were successful transmitting packets. > +} > + > +static inline int nixge_check_tx_bd_space(struct nixge_priv *priv, > + int num_frag) > +{ > + struct nixge_dma_bd *cur_p; > + > + cur_p = &priv->tx_bd_v[(priv->tx_bd_tail + num_frag) % TX_BD_NUM]; > + if (cur_p->status & XAXIDMA_BD_STS_ALL_MASK) > + return NETDEV_TX_BUSY; You are not propagating this to the caller, so just return a boolean for this. > + return 0; > +} > + > +static int nixge_start_xmit(struct sk_buff *skb, struct net_device *ndev) > +{ > + u32 ii; > + u32 num_frag; > + skb_frag_t *frag; > + dma_addr_t tail_p; > + struct nixge_priv *priv = netdev_priv(ndev); > + struct nixge_dma_bd *cur_p; > + > + num_frag = skb_shinfo(skb)->nr_frags; > + cur_p = &priv->tx_bd_v[priv->tx_bd_tail]; > + > + if (nixge_check_tx_bd_space(priv, num_frag)) { > + if (!netif_queue_stopped(ndev)) > + netif_stop_queue(ndev); > + return NETDEV_TX_BUSY; NETDEV_TX_OK is what you should return since you properly asserted flow contro with netif_stop_queue(). > + } > + > + cur_p->cntrl = skb_headlen(skb) | XAXIDMA_BD_CTRL_TXSOF_MASK; > + cur_p->phys = dma_map_single(ndev->dev.parent, skb->data, > + skb_headlen(skb), DMA_TO_DEVICE); This needs to be checked with dma_mapping_error(). > + > + for (ii = 0; ii < num_frag; ii++) { > + ++priv->tx_bd_tail; > + priv->tx_bd_tail %= TX_BD_NUM; > + cur_p = &priv->tx_bd_v[priv->tx_bd_tail]; > + frag = &skb_shinfo(skb)->frags[ii]; > + cur_p->phys = dma_map_single(ndev->dev.parent, > + skb_frag_address(frag), > + skb_frag_size(frag), > + DMA_TO_DEVICE); Needs to be checked against dma_mapping_error() and you would have to unwind the whol
[PATCH net 3/5] tcp_bbr: introduce bbr_init_pacing_rate_from_rtt() helper
Introduce a helper to initialize the BBR pacing rate unconditionally, based on the current cwnd and RTT estimate. This is a pure refactor, but is needed for two following fixes. Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Signed-off-by: Soheil Hassas Yeganeh --- net/ipv4/tcp_bbr.c | 23 ++- 1 file changed, 18 insertions(+), 5 deletions(-) diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c index 29e23b851b97..3276140c2506 100644 --- a/net/ipv4/tcp_bbr.c +++ b/net/ipv4/tcp_bbr.c @@ -221,6 +221,23 @@ static u32 bbr_bw_to_pacing_rate(struct sock *sk, u32 bw, int gain) return rate; } +/* Initialize pacing rate to: high_gain * init_cwnd / RTT. */ +static void bbr_init_pacing_rate_from_rtt(struct sock *sk) +{ + struct tcp_sock *tp = tcp_sk(sk); + u64 bw; + u32 rtt_us; + + if (tp->srtt_us) { /* any RTT sample yet? */ + rtt_us = max(tp->srtt_us >> 3, 1U); + } else { /* no RTT sample yet */ + rtt_us = USEC_PER_MSEC; /* use nominal default RTT */ + } + bw = (u64)tp->snd_cwnd * BW_UNIT; + do_div(bw, rtt_us); + sk->sk_pacing_rate = bbr_bw_to_pacing_rate(sk, bw, bbr_high_gain); +} + /* Pace using current bw estimate and a gain factor. In order to help drive the * network toward lower queues while maintaining high utilization and low * latency, the average pacing rate aims to be slightly (~1%) lower than the @@ -805,7 +822,6 @@ static void bbr_init(struct sock *sk) { struct tcp_sock *tp = tcp_sk(sk); struct bbr *bbr = inet_csk_ca(sk); - u64 bw; bbr->prior_cwnd = 0; bbr->tso_segs_goal = 0; /* default segs per skb until first ACK */ @@ -821,11 +837,8 @@ static void bbr_init(struct sock *sk) minmax_reset(&bbr->bw, bbr->rtt_cnt, 0); /* init max bw to 0 */ - /* Initialize pacing rate to: high_gain * init_cwnd / RTT. */ - bw = (u64)tp->snd_cwnd * BW_UNIT; - do_div(bw, (tp->srtt_us >> 3) ? : USEC_PER_MSEC); sk->sk_pacing_rate = 0; /* force an update of sk_pacing_rate */ - bbr_set_pacing_rate(sk, bw, bbr_high_gain); + bbr_init_pacing_rate_from_rtt(sk); bbr->restore_cwnd = 0; bbr->round_start = 0; -- 2.13.2.932.g7449e964c-goog
[PATCH net 5/5] tcp_bbr: init pacing rate on first RTT sample
Fixes the following behavior: for connections that had no RTT sample at the time of initializing congestion control, BBR was initializing the pacing rate to a high nominal rate (based an a guess of RTT=1ms, in case this is LAN traffic). Then BBR never adjusted the pacing rate downward upon obtaining an actual RTT sample, if the connection never filled the pipe (e.g. all sends were small app-limited writes()). This fix adjusts the pacing rate upon obtaining the first RTT sample. Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Signed-off-by: Soheil Hassas Yeganeh --- net/ipv4/tcp_bbr.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c index 42e0017f2ebc..69ee877574d0 100644 --- a/net/ipv4/tcp_bbr.c +++ b/net/ipv4/tcp_bbr.c @@ -112,7 +112,8 @@ struct bbr { cwnd_gain:10, /* current gain for setting cwnd */ full_bw_cnt:3, /* number of rounds without large bw gains */ cycle_idx:3,/* current index in pacing_gain cycle array */ - unused_b:6; + has_seen_rtt:1, /* have we seen an RTT sample yet? */ + unused_b:5; u32 prior_cwnd; /* prior cwnd upon entering loss recovery */ u32 full_bw;/* recent bw, to estimate if pipe is full */ }; @@ -225,11 +226,13 @@ static u32 bbr_bw_to_pacing_rate(struct sock *sk, u32 bw, int gain) static void bbr_init_pacing_rate_from_rtt(struct sock *sk) { struct tcp_sock *tp = tcp_sk(sk); + struct bbr *bbr = inet_csk_ca(sk); u64 bw; u32 rtt_us; if (tp->srtt_us) { /* any RTT sample yet? */ rtt_us = max(tp->srtt_us >> 3, 1U); + bbr->has_seen_rtt = 1; } else { /* no RTT sample yet */ rtt_us = USEC_PER_MSEC; /* use nominal default RTT */ } @@ -247,8 +250,12 @@ static void bbr_init_pacing_rate_from_rtt(struct sock *sk) */ static void bbr_set_pacing_rate(struct sock *sk, u32 bw, int gain) { + struct tcp_sock *tp = tcp_sk(sk); + struct bbr *bbr = inet_csk_ca(sk); u32 rate = bbr_bw_to_pacing_rate(sk, bw, gain); + if (unlikely(!bbr->has_seen_rtt && tp->srtt_us)) + bbr_init_pacing_rate_from_rtt(sk); if (bbr_full_bw_reached(sk) || rate > sk->sk_pacing_rate) sk->sk_pacing_rate = rate; } @@ -837,6 +844,7 @@ static void bbr_init(struct sock *sk) minmax_reset(&bbr->bw, bbr->rtt_cnt, 0); /* init max bw to 0 */ + bbr->has_seen_rtt = 0; bbr_init_pacing_rate_from_rtt(sk); bbr->restore_cwnd = 0; -- 2.13.2.932.g7449e964c-goog
[PATCH net 4/5] tcp_bbr: remove sk_pacing_rate=0 transient during init
Fix a corner case noticed by Eric Dumazet, where BBR's setting sk->sk_pacing_rate to 0 during initialization could theoretically cause packets in the sending host to hang if there were packets "in flight" in the pacing infrastructure at the time the BBR congestion control state is initialized. This could occur if the pacing infrastructure happened to race with bbr_init() in a way such that the pacer read the 0 rather than the immediately following non-zero pacing rate. Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") Reported-by: Eric Dumazet Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Signed-off-by: Soheil Hassas Yeganeh --- net/ipv4/tcp_bbr.c | 1 - 1 file changed, 1 deletion(-) diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c index 3276140c2506..42e0017f2ebc 100644 --- a/net/ipv4/tcp_bbr.c +++ b/net/ipv4/tcp_bbr.c @@ -837,7 +837,6 @@ static void bbr_init(struct sock *sk) minmax_reset(&bbr->bw, bbr->rtt_cnt, 0); /* init max bw to 0 */ - sk->sk_pacing_rate = 0; /* force an update of sk_pacing_rate */ bbr_init_pacing_rate_from_rtt(sk); bbr->restore_cwnd = 0; -- 2.13.2.932.g7449e964c-goog
[PATCH net 1/5] tcp_bbr: cut pacing rate only if filled pipe
In bbr_set_pacing_rate(), which decides whether to cut the pacing rate, there was some code that considered exiting STARTUP to be equivalent to the notion of filling the pipe (i.e., bbr_full_bw_reached()). Specifically, as the code was structured, exiting STARTUP and going into PROBE_RTT could cause us to cut the pacing rate down to something silly and low, based on whatever bandwidth samples we've had so far, when it's possible that all of them have been small app-limited bandwidth samples that are not representative of the bandwidth available in the path. (The code was correct at the time it was written, but the state machine changed without this spot being adjusted correspondingly.) Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Signed-off-by: Soheil Hassas Yeganeh --- net/ipv4/tcp_bbr.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c index dbcc9352a48f..743e97511dc8 100644 --- a/net/ipv4/tcp_bbr.c +++ b/net/ipv4/tcp_bbr.c @@ -220,12 +220,11 @@ static u64 bbr_rate_bytes_per_sec(struct sock *sk, u64 rate, int gain) */ static void bbr_set_pacing_rate(struct sock *sk, u32 bw, int gain) { - struct bbr *bbr = inet_csk_ca(sk); u64 rate = bw; rate = bbr_rate_bytes_per_sec(sk, rate, gain); rate = min_t(u64, rate, sk->sk_max_pacing_rate); - if (bbr->mode != BBR_STARTUP || rate > sk->sk_pacing_rate) + if (bbr_full_bw_reached(sk) || rate > sk->sk_pacing_rate) sk->sk_pacing_rate = rate; } -- 2.13.2.932.g7449e964c-goog
[PATCH net 2/5] tcp_bbr: introduce bbr_bw_to_pacing_rate() helper
Introduce a helper to convert a BBR bandwidth and gain factor to a pacing rate in bytes per second. This is a pure refactor, but is needed for two following fixes. Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Signed-off-by: Soheil Hassas Yeganeh --- net/ipv4/tcp_bbr.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c index 743e97511dc8..29e23b851b97 100644 --- a/net/ipv4/tcp_bbr.c +++ b/net/ipv4/tcp_bbr.c @@ -211,6 +211,16 @@ static u64 bbr_rate_bytes_per_sec(struct sock *sk, u64 rate, int gain) return rate >> BW_SCALE; } +/* Convert a BBR bw and gain factor to a pacing rate in bytes per second. */ +static u32 bbr_bw_to_pacing_rate(struct sock *sk, u32 bw, int gain) +{ + u64 rate = bw; + + rate = bbr_rate_bytes_per_sec(sk, rate, gain); + rate = min_t(u64, rate, sk->sk_max_pacing_rate); + return rate; +} + /* Pace using current bw estimate and a gain factor. In order to help drive the * network toward lower queues while maintaining high utilization and low * latency, the average pacing rate aims to be slightly (~1%) lower than the @@ -220,10 +230,8 @@ static u64 bbr_rate_bytes_per_sec(struct sock *sk, u64 rate, int gain) */ static void bbr_set_pacing_rate(struct sock *sk, u32 bw, int gain) { - u64 rate = bw; + u32 rate = bbr_bw_to_pacing_rate(sk, bw, gain); - rate = bbr_rate_bytes_per_sec(sk, rate, gain); - rate = min_t(u64, rate, sk->sk_max_pacing_rate); if (bbr_full_bw_reached(sk) || rate > sk->sk_pacing_rate) sk->sk_pacing_rate = rate; } -- 2.13.2.932.g7449e964c-goog
Re: Quirks of the Atheros 8035 PHY
On 07/14/2017 02:08 PM, Mason wrote: > Hello, > > I've discussed this subject in the past, but we never reached > a conclusion, AFAIR. > > The Atheros 8035 GigE PHY has IMO 2 quirks wrt to clock delays. > > https://www.redeszone.net/app/uploads/2014/04/AR8035.pdf > > > 1) RX clock delay > > Commit 2e5f9f281ee8369f56d403b4a52942f19b6978f8 > > In fact, RX clock delay is *ENABLED* by default at > reset. So if a board actually required it *disabled* > then we actually need to set the bit to 0. > > Reference: > 4.2.25 rgmii rx clock delay control > > > 2) TX clock delay > > Commit 1ca6d1b1aef4628ff0fe458135ddb008d134ad8f > > TX clock delay is DISABLED by default, so no surprises > there... except that it is only DISABLED after *HW reset*. > > After a SW reset, such as that performed in Linux IIUC, > the PHY retains the value previously set. > > So if a bootloader such a Uboot enabled TX delay, then > Linux would "inherit" the setting, even if it performs > a reset. If the PHY setting calls for no TX clock delay, > the Linux driver would have to actively disable it. > > Reference: > 4.2.26 rgmii tx clock delay control > > > I can (of course) send a patch fixing both issues, but > what was said last time was that "it's too late to > fix it now, since the fix risks breaking working > setups". Is that an accurate paraphrase? More or less, this particular problematic PHY has come up with some many different platforms, and people and typically no one being able to have insights on its internal design that it is really hard to comment on what would break, it's already apparently pretty broken. > > > 3) There's also a RGMII GTX_CLK documented as > "RGMII transmit clock, 125 MHz digital. Adding a 22 ohm > damping resistor is recommended for EMI design near MAC side" > => Is that TX_CLK? > There's a 2-bit related field called Gtx_dly_val > which allows tweaking the delay > > Select the delay of gtx_clk. > 00: 0.25ns > 01: 1.3ns > 10: 2.4ns > 11: 3.4ns > (DEFAULT 10b = 2.4 ns, BUT Retain val on SW reset, > so inherit bootloader value) > I'm not sure the current DT allows such fine-grained > tweaking? Should we enhance it? What is "the current DT" in that context? There is no binding for at8033x defined and there is not one either for nb8800. Some bindings do define RGMII RX/TX delay, but they can't agree on that either (net/cavium-pip.txt, net/apm-xgene-enet.txt and net/dwmac-sun8i.txt). The latest in date: net/dwmac-sun8i.txt is probably the best example of what should be defined and generalized. > > > 4) There's the whole mess of having clock delays > supported both by the PHY *AND* the MAC. If both > add a delay, things won't work as expected. > What's the recommended approach there? Submit patches that fix problems for your particular use case that we can review and evaluate, once we have that, it's a lot easier to assess the impact it could have on other platforms. -- Florian
Re: Quirks of the Atheros 8035 PHY
Mugunthan's address bounces. Removing it. Adding Grygorii's address instead. On 14/07/2017 23:08, Mason wrote: > Hello, > > I've discussed this subject in the past, but we never reached > a conclusion, AFAIR. > > The Atheros 8035 GigE PHY has IMO 2 quirks wrt to clock delays. > > https://www.redeszone.net/app/uploads/2014/04/AR8035.pdf > > > 1) RX clock delay > > Commit 2e5f9f281ee8369f56d403b4a52942f19b6978f8 > > In fact, RX clock delay is *ENABLED* by default at > reset. So if a board actually required it *disabled* > then we actually need to set the bit to 0. > > Reference: > 4.2.25 rgmii rx clock delay control > > > 2) TX clock delay > > Commit 1ca6d1b1aef4628ff0fe458135ddb008d134ad8f > > TX clock delay is DISABLED by default, so no surprises > there... except that it is only DISABLED after *HW reset*. > > After a SW reset, such as that performed in Linux IIUC, > the PHY retains the value previously set. > > So if a bootloader such a Uboot enabled TX delay, then > Linux would "inherit" the setting, even if it performs > a reset. If the PHY setting calls for no TX clock delay, > the Linux driver would have to actively disable it. > > Reference: > 4.2.26 rgmii tx clock delay control > > > I can (of course) send a patch fixing both issues, but > what was said last time was that "it's too late to > fix it now, since the fix risks breaking working > setups". Is that an accurate paraphrase? > > > 3) There's also a RGMII GTX_CLK documented as > "RGMII transmit clock, 125 MHz digital. Adding a 22 ohm > damping resistor is recommended for EMI design near MAC side" > => Is that TX_CLK? > There's a 2-bit related field called Gtx_dly_val > which allows tweaking the delay > > Select the delay of gtx_clk. > 00: 0.25ns > 01: 1.3ns > 10: 2.4ns > 11: 3.4ns > (DEFAULT 10b = 2.4 ns, BUT Retain val on SW reset, > so inherit bootloader value) > I'm not sure the current DT allows such fine-grained > tweaking? Should we enhance it? > > > 4) There's the whole mess of having clock delays > supported both by the PHY *AND* the MAC. If both > add a delay, things won't work as expected. > What's the recommended approach there? > > > Regards.
Quirks of the Atheros 8035 PHY
Hello, I've discussed this subject in the past, but we never reached a conclusion, AFAIR. The Atheros 8035 GigE PHY has IMO 2 quirks wrt to clock delays. https://www.redeszone.net/app/uploads/2014/04/AR8035.pdf 1) RX clock delay Commit 2e5f9f281ee8369f56d403b4a52942f19b6978f8 In fact, RX clock delay is *ENABLED* by default at reset. So if a board actually required it *disabled* then we actually need to set the bit to 0. Reference: 4.2.25 rgmii rx clock delay control 2) TX clock delay Commit 1ca6d1b1aef4628ff0fe458135ddb008d134ad8f TX clock delay is DISABLED by default, so no surprises there... except that it is only DISABLED after *HW reset*. After a SW reset, such as that performed in Linux IIUC, the PHY retains the value previously set. So if a bootloader such a Uboot enabled TX delay, then Linux would "inherit" the setting, even if it performs a reset. If the PHY setting calls for no TX clock delay, the Linux driver would have to actively disable it. Reference: 4.2.26 rgmii tx clock delay control I can (of course) send a patch fixing both issues, but what was said last time was that "it's too late to fix it now, since the fix risks breaking working setups". Is that an accurate paraphrase? 3) There's also a RGMII GTX_CLK documented as "RGMII transmit clock, 125 MHz digital. Adding a 22 ohm damping resistor is recommended for EMI design near MAC side" => Is that TX_CLK? There's a 2-bit related field called Gtx_dly_val which allows tweaking the delay Select the delay of gtx_clk. 00: 0.25ns 01: 1.3ns 10: 2.4ns 11: 3.4ns (DEFAULT 10b = 2.4 ns, BUT Retain val on SW reset, so inherit bootloader value) I'm not sure the current DT allows such fine-grained tweaking? Should we enhance it? 4) There's the whole mess of having clock delays supported both by the PHY *AND* the MAC. If both add a delay, things won't work as expected. What's the recommended approach there? Regards.
[PATCH v2 1/2] dt-bindings: net: Add bindings for National Instruments XGE netdev
This adds bindings for the NI XGE 1G/10G network device. Signed-off-by: Moritz Fischer --- Documentation/devicetree/bindings/net/nixge.txt | 32 + 1 file changed, 32 insertions(+) create mode 100644 Documentation/devicetree/bindings/net/nixge.txt diff --git a/Documentation/devicetree/bindings/net/nixge.txt b/Documentation/devicetree/bindings/net/nixge.txt new file mode 100644 index 000..9fff5a7 --- /dev/null +++ b/Documentation/devicetree/bindings/net/nixge.txt @@ -0,0 +1,32 @@ +* NI XGE Ethernet controller + +Required properties: +- compatible: Should be "ni,xge-enet-2.00" +- reg: Address and length of the register set for the device +- interrupts: Should contain tx and rx interrupt +- interrupt-names: Should be "rx-irq" and "tx-irq" +- phy-mode: See ethernet.txt file in the same directory. +- nvmem-cells: Phandle of nvmem cell containing the mac address +- nvmem-cell-names: Should be "address" + +Examples (10G generic PHY): + nixge0: ethernet@4000 { + compatible = "ni,xge-enet-2.00"; + reg = <0x4000 0x6000>; + + nvmem-cells = <ð1_addr>; + nvmem-cell-names = "address"; + + interrupts = <0 29 4>, <0 30 4>; + interrupt-names = "rx-irq", "tx-irq"; + interrupt-parent = <&intc>; + + phy-mode = "xgmii"; + phy-handle = <ðernet_phy1>; + + ethernet_phy1: ethernet-phy@4 { + compatible = "ethernet-phy-ieee802.3-c45"; + reg = <4>; + devices = <0xa>; + }; + }; -- 2.7.4
[PATCH v2 2/2] net: ethernet: nixge: Add support for National Instruments XGE netdev
Add support for the National Instruments XGE 1/10G network device. It uses the EEPROM on the board via NVMEM. Signed-off-by: Moritz Fischer --- Changes from v1: - Added dependency on ARCH_ZYNQ (Kbuild) - Removed unused variables - Use of_phy_connect as suggested - Removed masking of (un)supported modes - Added #define for some constants - Removed empty pm functions - Reworked mac_address handling - Made nixge_mdio_*() static (sparse) - Removed driver version - Addressed timeout loop - Adressed return values on timeout --- drivers/net/ethernet/Kconfig |1 + drivers/net/ethernet/Makefile|1 + drivers/net/ethernet/ni/Kconfig | 27 + drivers/net/ethernet/ni/Makefile |1 + drivers/net/ethernet/ni/nixge.c | 1190 ++ 5 files changed, 1220 insertions(+) create mode 100644 drivers/net/ethernet/ni/Kconfig create mode 100644 drivers/net/ethernet/ni/Makefile create mode 100644 drivers/net/ethernet/ni/nixge.c diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig index edae15ac..2021806 100644 --- a/drivers/net/ethernet/Kconfig +++ b/drivers/net/ethernet/Kconfig @@ -127,6 +127,7 @@ config FEALNX source "drivers/net/ethernet/natsemi/Kconfig" source "drivers/net/ethernet/netronome/Kconfig" +source "drivers/net/ethernet/ni/Kconfig" source "drivers/net/ethernet/8390/Kconfig" config NET_NETX diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile index bf7f450..68f49f7 100644 --- a/drivers/net/ethernet/Makefile +++ b/drivers/net/ethernet/Makefile @@ -58,6 +58,7 @@ obj-$(CONFIG_NET_VENDOR_MYRI) += myricom/ obj-$(CONFIG_FEALNX) += fealnx.o obj-$(CONFIG_NET_VENDOR_NATSEMI) += natsemi/ obj-$(CONFIG_NET_VENDOR_NETRONOME) += netronome/ +obj-$(CONFIG_NET_VENDOR_NI) += ni/ obj-$(CONFIG_NET_NETX) += netx-eth.o obj-$(CONFIG_NET_VENDOR_NUVOTON) += nuvoton/ obj-$(CONFIG_NET_VENDOR_NVIDIA) += nvidia/ diff --git a/drivers/net/ethernet/ni/Kconfig b/drivers/net/ethernet/ni/Kconfig new file mode 100644 index 000..cd30f7d --- /dev/null +++ b/drivers/net/ethernet/ni/Kconfig @@ -0,0 +1,27 @@ +# +# National Instuments network device configuration +# + +config NET_VENDOR_NI + bool "National Instruments Devices" + default y + ---help--- + If you have a network (Ethernet) device belonging to this class, say Y. + + Note that the answer to this question doesn't directly affect the + kernel: saying N will just cause the configurator to skip all + the questions about National Instrument devices. + If you say Y, you will be asked for your specific device in the + following questions. + +if NET_VENDOR_NI + +config NI_XGE_MANAGEMENT_ENET + tristate "National Instruments XGE management enet support" + depends on ARCH_ZYNQ + select PHYLIB + ---help--- + Simple LAN device for debug or management purposes. Can + support either 10G or 1G PHYs via SFP+ ports. + +endif diff --git a/drivers/net/ethernet/ni/Makefile b/drivers/net/ethernet/ni/Makefile new file mode 100644 index 000..99c6646 --- /dev/null +++ b/drivers/net/ethernet/ni/Makefile @@ -0,0 +1 @@ +obj-$(CONFIG_NI_XGE_MANAGEMENT_ENET) += nixge.o diff --git a/drivers/net/ethernet/ni/nixge.c b/drivers/net/ethernet/ni/nixge.c new file mode 100644 index 000..6b52683 --- /dev/null +++ b/drivers/net/ethernet/ni/nixge.c @@ -0,0 +1,1190 @@ +/* + * Copyright (c) 2016-2017, National Instruments Corp. + * + * Network Driver for Ettus Research XGE MAC + * + * This is largely based on the Xilinx AXI Ethernet Driver, + * and uses the same DMA engine in the FPGA + * + * SPDX-License-Identifier: GPL-2.0 + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define TX_BD_NUM 64 +#define RX_BD_NUM 128 + +/* Axi DMA Register definitions */ + +#define XAXIDMA_TX_CR_OFFSET 0x /* Channel control */ +#define XAXIDMA_TX_SR_OFFSET 0x0004 /* Status */ +#define XAXIDMA_TX_CDESC_OFFSET0x0008 /* Current descriptor pointer */ +#define XAXIDMA_TX_TDESC_OFFSET0x0010 /* Tail descriptor pointer */ + +#define XAXIDMA_RX_CR_OFFSET 0x0030 /* Channel control */ +#define XAXIDMA_RX_SR_OFFSET 0x0034 /* Status */ +#define XAXIDMA_RX_CDESC_OFFSET0x0038 /* Current descriptor pointer */ +#define XAXIDMA_RX_TDESC_OFFSET0x0040 /* Tail descriptor pointer */ + +#define XAXIDMA_CR_RUNSTOP_MASK0x0001 /* Start/stop DMA channel */ +#define XAXIDMA_CR_RESET_MASK 0x0004 /* Reset DMA engine */ + +#define XAXIDMA_BD_NDESC_OFFSET0x00 /* Next descriptor pointer */ +#define XAXIDMA_BD_BUFA_OFFSET 0x08 /* Buffer address */ +#define XAXIDMA_BD_CTRL_LEN_OFFSET 0x18 /* Control/buffer length */ +#define XAXIDMA_BD_STS_OFFSET 0x1C /*
Re: [PATCH v2] sctp: don't dereference ptr before leaving _sctp_walk_{params,errors}()
Hi Alexander, [auto build test WARNING on net-next/master] [also build test WARNING on next-20170714] [cannot apply to v4.12] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Alexander-Potapenko/sctp-don-t-dereference-ptr-before-leaving-_sctp_walk_-params-errors/20170715-013318 reproduce: # apt-get install sparse make ARCH=x86_64 allmodconfig make C=1 CF=-D__CHECK_ENDIAN__ sparse warnings: (new ones prefixed by >>) include/linux/compiler.h:260:8: sparse: attribute 'no_sanitize_address': unknown attribute net/sctp/sm_statefuns.c:3871:9: sparse: Expected , in __builtin_offset net/sctp/sm_statefuns.c:3871:9: sparse: got sctp_paramhdr_t >> builtin:0:0: sparse: No right hand side of '+'-expression net/sctp/sm_statefuns.c:3871:9: sparse: Expected ) in 'for' net/sctp/sm_statefuns.c:3871:9: sparse: got ; net/sctp/sm_statefuns.c:3871:9: sparse: Expected ; at end of statement net/sctp/sm_statefuns.c:3871:9: sparse: got ) >> net/sctp/sm_statefuns.c:3903:9: sparse: Trying to use reserved word 'return' >> as identifier net/sctp/sm_statefuns.c:3903:16: sparse: Expected ; at end of declaration net/sctp/sm_statefuns.c:3903:16: sparse: got SCTP_DISPOSITION_CONSUME net/sctp/sm_statefuns.c:3904:1: sparse: Expected ; at the end of type declaration net/sctp/sm_statefuns.c:3904:1: sparse: got } net/sctp/sm_statefuns.c:3933:13: sparse: Expected ) in function declarator net/sctp/sm_statefuns.c:3933:13: sparse: got ! >> net/sctp/sm_statefuns.c:3933:9: sparse: Trying to use reserved word 'if' as >> identifier net/sctp/sm_statefuns.c:3936:17: sparse: Trying to use reserved word 'return' as identifier net/sctp/sm_statefuns.c:3936:24: sparse: Expected ; at end of declaration net/sctp/sm_statefuns.c:3936:24: sparse: got sctp_sf_pdiscard net/sctp/sm_statefuns.c:3937:9: sparse: Expected ; at the end of type declaration net/sctp/sm_statefuns.c:3937:9: sparse: got } net/sctp/sm_statefuns.c:3943:13: sparse: Expected ) in function declarator net/sctp/sm_statefuns.c:3943:13: sparse: got ! net/sctp/sm_statefuns.c:3943:9: sparse: Trying to use reserved word 'if' as identifier net/sctp/sm_statefuns.c:3948:14: sparse: Expected ; at end of declaration net/sctp/sm_statefuns.c:3948:14: sparse: got -> net/sctp/sm_statefuns.c:3950:13: sparse: Expected ; at end of declaration net/sctp/sm_statefuns.c:3950:13: sparse: got -= net/sctp/sm_statefuns.c:3951:23: sparse: Expected ) in function declarator net/sctp/sm_statefuns.c:3951:23: sparse: got -> >> net/sctp/sm_statefuns.c:3954:9: sparse: Trying to use reserved word 'do' as >> identifier net/sctp/sm_statefuns.c:3954:9: sparse: Expected ; at end of declaration net/sctp/sm_statefuns.c:3954:9: sparse: got { net/sctp/sm_statefuns.c:3954:9: sparse: Expected ) in function declarator net/sctp/sm_statefuns.c:3954:9: sparse: got ( net/sctp/sm_statefuns.c:3954:9: sparse: Trying to use reserved word 'if' as identifier net/sctp/sm_statefuns.c:3954:9: sparse: Expected ) in function declarator net/sctp/sm_statefuns.c:3954:9: sparse: got ( net/sctp/sm_statefuns.c:3954:9: sparse: Trying to use reserved word 'if' as identifier >> net/sctp/sm_statefuns.c:3954:9: sparse: Trying to use reserved word 'else' >> as identifier net/sctp/sm_statefuns.c:3954:9: sparse: Expected ; at end of declaration net/sctp/sm_statefuns.c:3954:9: sparse: got if >> net/sctp/sm_statefuns.c:3954:9: sparse: Trying to use reserved word 'else' >> as identifier net/sctp/sm_statefuns.c:3954:9: sparse: Expected ; at end of declaration net/sctp/sm_statefuns.c:3954:9: sparse: got branch net/sctp/sm_statefuns.c:3954:9: sparse: Expected ; at the end of type declaration net/sctp/sm_statefuns.c:3954:9: sparse: got } net/sctp/sm_statefuns.c:3954:9: sparse: Expected ; at the end of type declaration net/sctp/sm_statefuns.c:3954:9: sparse: got } net/sctp/sm_statefuns.c:3959:30: sparse: Expected ) in function declarator net/sctp/sm_statefuns.c:3959:30: sparse: got ( net/sctp/sm_statefuns.c:3959:9: sparse: Trying to use reserved word 'if' as identifier net/sctp/sm_statefuns.c:3963:9: sparse: Expected ) in function declarator net/sctp/sm_statefuns.c:3963:9: sparse: got ( >> net/sctp/sm_statefuns.c:3963:9: sparse: Trying to use reserved word 'for' as >> identifier net/sctp/sm_statefuns.c:3963:9: sparse: Expected ) in nested declarator net/sctp/sm_statefuns.c:3963:9: sparse: got * >> net/sctp/sm_statefuns.c:3963:9: sparse: Trying to use reserved word 'void' >> as identifier net/sctp/sm_statefuns.c:3963:9: sparse: Expected
Re: [iovisor-dev] [PATCH v3 net-next 00/12] bpf: rewrite value tracking in verifier
I did an experiment with one of our internal bpf programs. The program has 1563 insns. Without Edward's patch: processed 13634 insns, stack depth 160 With Edward's patch: processed 15807 insns, stack depth 160 So the number of processed insns regressed by roughly 16%. Did anybody do any similar experiments to quantify the patch's impact in verification performance (e.g., in terms of processed insns)? On Tue, Jun 27, 2017 at 5:53 AM, Edward Cree via iovisor-dev wrote: > This series simplifies alignment tracking, generalises bounds tracking and > fixes some bounds-tracking bugs in the BPF verifier. Pointer arithmetic on > packet pointers, stack pointers, map value pointers and context pointers has > been unified, and bounds on these pointers are only checked when the pointer > is dereferenced. > Operations on pointers which destroy all relation to the original pointer > (such as multiplies and shifts) are disallowed if !env->allow_ptr_leaks, > otherwise they convert the pointer to an unknown scalar and feed it to the > normal scalar arithmetic handling. > Pointer types have been unified with the corresponding adjusted-pointer types > where those existed (e.g. PTR_TO_MAP_VALUE[_ADJ] or FRAME_PTR vs > PTR_TO_STACK); similarly, CONST_IMM and UNKNOWN_VALUE have been unified into > SCALAR_VALUE. > Pointer types (except CONST_PTR_TO_MAP, PTR_TO_MAP_VALUE_OR_NULL and > PTR_TO_PACKET_END, which do not allow arithmetic) have a 'fixed offset' and > a 'variable offset'; the former is used when e.g. adding an immediate or a > known-constant register, as long as it does not overflow. Otherwise the > latter is used, and any operation creating a new variable offset creates a > new 'id' (and, for PTR_TO_PACKET, clears the 'range'). > SCALAR_VALUEs use the 'variable offset' fields to track the range of possible > values; the 'fixed offset' should never be set on a scalar. > > As of patch 12/12, all tests of tools/testing/selftests/bpf/test_verifier > and tools/testing/selftests/bpf/test_align pass. > > v3: added a few more tests; removed RFC tags. > > v2: fixed nfp build, made test_align pass again and extended it with a few > new tests (though still need to add more). > > Edward Cree (12): > selftests/bpf: add test for mixed signed and unsigned bounds checks > bpf/verifier: rework value tracking > nfp: change bpf verifier hooks to match new verifier data structures > bpf/verifier: track signed and unsigned min/max values > bpf/verifier: more concise register state logs for constant var_off > selftests/bpf: change test_verifier expectations > selftests/bpf: rewrite test_align > selftests/bpf: add a test to test_align > selftests/bpf: add test for bogus operations on pointers > selftests/bpf: don't try to access past MAX_PACKET_OFF in > test_verifier > selftests/bpf: add tests for subtraction & negative numbers > selftests/bpf: variable offset negative tests > > drivers/net/ethernet/netronome/nfp/bpf/verifier.c | 24 +- > include/linux/bpf.h | 34 +- > include/linux/bpf_verifier.h | 56 +- > include/linux/tnum.h | 81 + > kernel/bpf/Makefile |2 +- > kernel/bpf/tnum.c | 180 ++ > kernel/bpf/verifier.c | 1943 > - > tools/testing/selftests/bpf/test_align.c | 462 - > tools/testing/selftests/bpf/test_verifier.c | 293 ++-- > 9 files changed, 2034 insertions(+), 1041 deletions(-) > create mode 100644 include/linux/tnum.h > create mode 100644 kernel/bpf/tnum.c > > ___ > iovisor-dev mailing list > iovisor-...@lists.iovisor.org > https://lists.iovisor.org/mailman/listinfo/iovisor-dev
Re: [PATCH 18/22] gpio: acpi: fix string overflow for large pin numbers
On Fri, Jul 14, 2017 at 2:52 PM, Andy Shevchenko wrote: > On Fri, 2017-07-14 at 14:07 +0200, Arnd Bergmann wrote: >> gcc-7 notices that the pin_table is an array of 16-bit numbers, >> but we assume it can be printed as a two-character hexadecimal >> string: >> >> drivers/gpio/gpiolib-acpi.c: In function >> 'acpi_gpiochip_request_interrupt': >> drivers/gpio/gpiolib-acpi.c:206:24: warning: '%02X' directive writing >> between 2 and 4 bytes into a region of size 3 [-Wformat-overflow=] >>sprintf(ev_name, "_%c%02X", >> ^~~~ >> drivers/gpio/gpiolib-acpi.c:206:20: note: directive argument in the >> range [0, 65535] >>sprintf(ev_name, "_%c%02X", >> ^ >> drivers/gpio/gpiolib-acpi.c:206:3: note: 'sprintf' output between 5 >> and 7 bytes into a destination of size 5 >>sprintf(ev_name, "_%c%02X", >>^~~ >> agpio->triggering == ACPI_EDGE_SENSITIVE ? 'E' : 'L', >> ~ >> pin); >> > > > This is obviously a false positive warning. > > Here we have > int pin = u16 pin_table[0] <= 255 (implying >= 0). > > I see few options how to make it more clear > 1) your proposal; > 2) use "%02hhX" instead; > 3) use if (ret >= 0 && ret <= 255) condition. > > I would choose one of the 2-3. > > In case gcc will complain about 3), file a bug to gcc crazy warning. Makes sense. I didn't remember the syntax for 2) and couldn't find it in the man page when I first looked. This seems like a good solution here. I'm pretty sure I tried 3) a few times when the warning first showed up last year, but couldn't get that to work. Filing a gcc bug also seems like a good idea, but I should first see if it's already fixed. The version I use for testing at the moment is from late April, and others may have complained about that already. Arnd
Re: [PATCH 17/22] platform/x86: alienware-wmi: fix format string overflow warning
On Fri, Jul 14, 2017 at 10:37 PM, Arnd Bergmann wrote: > On Fri, Jul 14, 2017 at 9:18 PM, Andy Shevchenko > wrote: >> On Fri, Jul 14, 2017 at 3:07 PM, Arnd Bergmann wrote: >>> gcc points out a possible format string overflow for a large value of >>> 'zone': >> Here we need to convert >> >> int i; >> >> to >> >> u8 i; > > That was my first impulse, but then I decided not to change the > idiomatic 'int i' for the index variable to 'u8' as that would be > less idiomatic. > >> I will take it after addressing above. >> >> P.S. You may do this change across the file. > > How about changing it to 'u8 zone'? I'm ultimately fine with that (just gentle reminder you might fix all 3 occurrences of it in that driver). -- With Best Regards, Andy Shevchenko
[PATCH V3 net] openvswitch: Fix for force/commit action failures
When there is an established connection in direction A->B, it is possible to receive a packet on port B which then executes ct(commit,force) without first performing ct() - ie, a lookup. In this case, we would expect that this packet can delete the existing entry so that we can commit a connection with direction B->A. However, currently we only perform a check in skb_nfct_cached() for whether OVS_CS_F_TRACKED is set and OVS_CS_F_INVALID is not set, ie that a lookup previously occurred. In the above scenario, a lookup has not occurred but we should still be able to statelessly look up the existing entry and potentially delete the entry if it is in the opposite direction. This patch extends the check to also hint that if the action has the force flag set, then we will lookup the existing entry so that the force check at the end of skb_nfct_cached has the ability to delete the connection. Fixes: dd41d330b03 ("openvswitch: Add force commit.") CC: Pravin Shelar CC: d...@openvswitch.org Signed-off-by: Joe Stringer Signed-off-by: Greg Rose --- V2: Make sure nf_conntrack_in() is called for force case V3: Fix compiler warning for newer compilers --- net/openvswitch/conntrack.c | 51 - 1 file changed, 36 insertions(+), 15 deletions(-) diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c index 08679eb..e3c4c6c 100644 --- a/net/openvswitch/conntrack.c +++ b/net/openvswitch/conntrack.c @@ -629,6 +629,34 @@ static int handle_fragments(struct net *net, struct sw_flow_key *key, return ct; } +static +struct nf_conn *ovs_ct_executed(struct net *net, + const struct sw_flow_key *key, + const struct ovs_conntrack_info *info, + struct sk_buff *skb, + bool *ct_executed) +{ + struct nf_conn *ct = NULL; + + /* If no ct, check if we have evidence that an existing conntrack entry +* might be found for this skb. This happens when we lose a skb->_nfct +* due to an upcall, or if the direction is being forced. If the +* connection was not confirmed, it is not cached and needs to be run +* through conntrack again. +*/ + *ct_executed = (key->ct_state & OVS_CS_F_TRACKED) && + !(key->ct_state & OVS_CS_F_INVALID) && + (key->ct_zone == info->zone.id); + + if (*ct_executed || (!key->ct_state && info->force)) { + ct = ovs_ct_find_existing(net, &info->zone, info->family, skb, + !!(key->ct_state & + OVS_CS_F_NAT_MASK)); + } + + return ct; +} + /* Determine whether skb->_nfct is equal to the result of conntrack lookup. */ static bool skb_nfct_cached(struct net *net, const struct sw_flow_key *key, @@ -637,24 +665,17 @@ static bool skb_nfct_cached(struct net *net, { enum ip_conntrack_info ctinfo; struct nf_conn *ct; + bool ct_executed = true; ct = nf_ct_get(skb, &ctinfo); - /* If no ct, check if we have evidence that an existing conntrack entry -* might be found for this skb. This happens when we lose a skb->_nfct -* due to an upcall. If the connection was not confirmed, it is not -* cached and needs to be run through conntrack again. -*/ - if (!ct && key->ct_state & OVS_CS_F_TRACKED && - !(key->ct_state & OVS_CS_F_INVALID) && - key->ct_zone == info->zone.id) { - ct = ovs_ct_find_existing(net, &info->zone, info->family, skb, - !!(key->ct_state -& OVS_CS_F_NAT_MASK)); - if (ct) - nf_ct_get(skb, &ctinfo); - } if (!ct) + ct = ovs_ct_executed(net, key, info, skb, &ct_executed); + + if (ct) + nf_ct_get(skb, &ctinfo); + else return false; + if (!net_eq(net, read_pnet(&ct->ct_net))) return false; if (!nf_ct_zone_equal_any(info->ct, nf_ct_zone(ct))) @@ -679,7 +700,7 @@ static bool skb_nfct_cached(struct net *net, return false; } - return true; + return ct_executed; } #ifdef CONFIG_NF_NAT_NEEDED -- 1.8.3.1
Re: [PATCH 17/22] platform/x86: alienware-wmi: fix format string overflow warning
On Fri, Jul 14, 2017 at 9:18 PM, Andy Shevchenko wrote: > On Fri, Jul 14, 2017 at 3:07 PM, Arnd Bergmann wrote: >> gcc points out a possible format string overflow for a large value of 'zone': >> >> drivers/platform/x86/alienware-wmi.c: In function 'alienware_wmi_init': >> drivers/platform/x86/alienware-wmi.c:461:24: error: '%02X' directive writing >> between 2 and 8 bytes into a region of size 6 [-Werror=format-overflow=] >>sprintf(buffer, "zone%02X", i); >> ^~~~ >> drivers/platform/x86/alienware-wmi.c:461:19: note: directive argument in the >> range [0, 2147483646] >>sprintf(buffer, "zone%02X", i); >>^~ >> drivers/platform/x86/alienware-wmi.c:461:3: note: 'sprintf' output between 7 >> and 13 bytes into a destination of size 10 >> >> While the zone should never be that large, it's easy to make the >> buffer a few bytes longer so gcc can prove this to be safe. > > Please, be a bit smarter on such fixes. > > Here we need to convert > > int i; > > to > > u8 i; That was my first impulse, but then I decided not to change the idiomatic 'int i' for the index variable to 'u8' as that would be less idiomatic. > I will take it after addressing above. > > P.S. You may do this change across the file. How about changing it to 'u8 zone'? Arnd
Re: [PATCH 17/22] platform/x86: alienware-wmi: fix format string overflow warning
On Fri, Jul 14, 2017 at 3:07 PM, Arnd Bergmann wrote: > gcc points out a possible format string overflow for a large value of 'zone': > > drivers/platform/x86/alienware-wmi.c: In function 'alienware_wmi_init': > drivers/platform/x86/alienware-wmi.c:461:24: error: '%02X' directive writing > between 2 and 8 bytes into a region of size 6 [-Werror=format-overflow=] >sprintf(buffer, "zone%02X", i); > ^~~~ > drivers/platform/x86/alienware-wmi.c:461:19: note: directive argument in the > range [0, 2147483646] >sprintf(buffer, "zone%02X", i); >^~ > drivers/platform/x86/alienware-wmi.c:461:3: note: 'sprintf' output between 7 > and 13 bytes into a destination of size 10 > > While the zone should never be that large, it's easy to make the > buffer a few bytes longer so gcc can prove this to be safe. Please, be a bit smarter on such fixes. Here we need to convert int i; to u8 i; I will take it after addressing above. P.S. You may do this change across the file. > Signed-off-by: Arnd Bergmann > --- > drivers/platform/x86/alienware-wmi.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/platform/x86/alienware-wmi.c > b/drivers/platform/x86/alienware-wmi.c > index 0831b428c217..acc01242da82 100644 > --- a/drivers/platform/x86/alienware-wmi.c > +++ b/drivers/platform/x86/alienware-wmi.c > @@ -421,7 +421,7 @@ static DEVICE_ATTR(lighting_control_state, 0644, > show_control_state, > static int alienware_zone_init(struct platform_device *dev) > { > int i; > - char buffer[10]; > + char buffer[13]; > char *name; > > if (interface == WMAX) { > -- > 2.9.0 > -- With Best Regards, Andy Shevchenko
Re: [PATCH] ipv6: initialize treq->txhash in cookie_v6_check()
On Fri, Jul 14, 2017 at 1:35 PM, Alexander Potapenko wrote: > On Fri, Jul 14, 2017 at 7:04 PM, Neal Cardwell wrote: >> On Fri, Jul 14, 2017 at 12:54 PM, Alexander Potapenko >> wrote: >>> KMSAN reported use of uninitialized memory in skb_set_hash_from_sk(), >>> which originated from the TCP request socket created in >>> cookie_v6_check(): >> ... >>> --- a/net/ipv6/syncookies.c >>> +++ b/net/ipv6/syncookies.c >>> @@ -216,6 +216,7 @@ struct sock *cookie_v6_check(struct sock *sk, struct >>> sk_buff *skb) >>> treq->rcv_isn = ntohl(th->seq) - 1; >>> treq->snt_isn = cookie; >>> treq->ts_off = 0; >>> + treq->txhash = 0; >>> >>> /* >>> * We need to lookup the dst_entry to get the correct window size. >> >> I would have thought that the same fix is needed in the corresponding >> line in cookie_v4_check() in net/ipv4/syncookies.c? (I do not see >> txhash being initialized for the IPv4 side.) If it's not needed for >> some reason, then it would be worth a comment in the commit >> description to explain why not. > Most certainly it is needed. I haven't seen reports for that in the > wild and couldn't forge a repro triggering the bug in IPv4, but I'll > give it another shot. If you force syncookies to be used, with: sysctl -q net.ipv4.tcp_syncookies=2 then every passive IPv4 TCP connection that is accepted by a TCP server app should be using that uninitialized memory, I would think (and thus should be liable to fire the KMSAN use of uninitialized memory warning). Does that work? neal
Re: [PATCH v2] sctp: don't dereference ptr before leaving _sctp_walk_{params,errors}()
Hi Alexander, [auto build test ERROR on net-next/master] [also build test ERROR on next-20170714] [cannot apply to v4.12] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Alexander-Potapenko/sctp-don-t-dereference-ptr-before-leaving-_sctp_walk_-params-errors/20170715-013318 config: x86_64-rhel (attached as .config) compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901 reproduce: # save the attached .config to linux build tree make ARCH=x86_64 All error/warnings (new ones prefixed by >>): In file included from include/linux/compiler.h:58:0, from include/uapi/linux/stddef.h:1, from include/linux/stddef.h:4, from include/uapi/linux/posix_types.h:4, from include/uapi/linux/types.h:13, from include/linux/types.h:5, from net/sctp/sm_statefuns.c:48: net/sctp/sm_statefuns.c: In function 'sctp_sf_do_reconf': >> include/net/sctp/sctp.h:472:24: error: unknown type name 'sctp_paramhdr_t' (pos.v + offsetof(sctp_paramhdr_t, length) + sizeof(pos.p->length) <\ ^ include/linux/compiler-gcc.h:161:21: note: in definition of macro '__compiler_offsetof' __builtin_offsetof(a, b) ^ >> include/net/sctp/sctp.h:472:15: note: in expansion of macro 'offsetof' (pos.v + offsetof(sctp_paramhdr_t, length) + sizeof(pos.p->length) <\ ^~~~ >> include/net/sctp/sctp.h:468:1: note: in expansion of macro >> '_sctp_walk_params' _sctp_walk_params((pos), (chunk), ntohs((chunk)->chunk_hdr.length), member) ^ >> net/sctp/sm_statefuns.c:3871:2: note: in expansion of macro >> 'sctp_walk_params' sctp_walk_params(param, hdr, params) { ^~~~ -- In file included from include/linux/compiler.h:58:0, from arch/x86/include/asm/atomic.h:4, from include/linux/atomic.h:4, from include/linux/crypto.h:20, from include/crypto/hash.h:16, from net/sctp/sm_make_chunk.c:48: net/sctp/sm_make_chunk.c: In function 'sctp_verify_init': >> include/net/sctp/sctp.h:472:24: error: unknown type name 'sctp_paramhdr_t' (pos.v + offsetof(sctp_paramhdr_t, length) + sizeof(pos.p->length) <\ ^ include/linux/compiler-gcc.h:161:21: note: in definition of macro '__compiler_offsetof' __builtin_offsetof(a, b) ^ >> include/net/sctp/sctp.h:472:15: note: in expansion of macro 'offsetof' (pos.v + offsetof(sctp_paramhdr_t, length) + sizeof(pos.p->length) <\ ^~~~ >> include/net/sctp/sctp.h:468:1: note: in expansion of macro >> '_sctp_walk_params' _sctp_walk_params((pos), (chunk), ntohs((chunk)->chunk_hdr.length), member) ^ >> net/sctp/sm_make_chunk.c:2262:2: note: in expansion of macro >> 'sctp_walk_params' sctp_walk_params(param, peer_init, init_hdr.params) { ^~~~ >> include/net/sctp/sctp.h:472:24: error: unknown type name 'sctp_paramhdr_t' (pos.v + offsetof(sctp_paramhdr_t, length) + sizeof(pos.p->length) <\ ^ include/linux/compiler-gcc.h:161:21: note: in definition of macro '__compiler_offsetof' __builtin_offsetof(a, b) ^ >> include/net/sctp/sctp.h:472:15: note: in expansion of macro 'offsetof' (pos.v + offsetof(sctp_paramhdr_t, length) + sizeof(pos.p->length) <\ ^~~~ >> include/net/sctp/sctp.h:468:1: note: in expansion of macro >> '_sctp_walk_params' _sctp_walk_params((pos), (chunk), ntohs((chunk)->chunk_hdr.length), member) ^ net/sctp/sm_make_chunk.c:2285:2: note: in expansion of macro 'sctp_walk_params' sctp_walk_params(param, peer_init, init_hdr.params) { ^~~~ net/sctp/sm_make_chunk.c: In function 'sctp_process_init': >> include/net/sctp/sctp.h:472:24: error: unknown type name 'sctp_paramhdr_t' (pos.v + offsetof(sctp_paramhdr_t, length) + sizeof(pos.p->length) <\ ^ include/linux/compiler-gcc.h:161:21: note: in definition of macro '__compiler_offsetof' __builtin_offsetof(a, b) ^ >> include/net/sctp/sctp.h:472:15: note: in expansion of macro 'offsetof' (pos.v + offsetof(sctp_paramhdr_t, length) + sizeof(pos.p->length) <\ ^~~~ >
Re: [PATCH net] openvswitch: Fix for force/commit action failures
On 07/14/2017 11:42 AM, Joe Stringer wrote: On 14 July 2017 at 09:10, Greg Rose wrote: > When there is an established connection in direction A->B, it is > possible to receive a packet on port B which then executes > ct(commit,force) without first performing ct() - ie, a lookup. > In this case, we would expect that this packet can delete the existing > entry so that we can commit a connection with direction B->A. However, > currently we only perform a check in skb_nfct_cached() for whether > OVS_CS_F_TRACKED is set and OVS_CS_F_INVALID is not set, ie that a > lookup previously occurred. In the above scenario, a lookup has not > occurred but we should still be able to statelessly look up the > existing entry and potentially delete the entry if it is in the > opposite direction. > > This patch extends the check to also hint that if the action has the > force flag set, then we will lookup the existing entry so that the > force check at the end of skb_nfct_cached has the ability to delete > the connection. > > Fixes: dd41d330b03 ("openvswitch: Add force commit.") > CC: Pravin Shelar > CC: d...@openvswitch.org > Signed-off-by: Joe Stringer > Signed-off-by: Greg Rose > --- > net/openvswitch/conntrack.c | 50 +++-- > 1 file changed, 35 insertions(+), 15 deletions(-) > > diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c > index 08679eb..1260f2b 100644 > --- a/net/openvswitch/conntrack.c > +++ b/net/openvswitch/conntrack.c > @@ -629,6 +629,33 @@ static int handle_fragments(struct net *net, struct sw_flow_key *key, > return ct; > } > > +struct nf_conn *ovs_ct_executed(struct net *net, > + const struct sw_flow_key *key, > + const struct ovs_conntrack_info *info, > + struct sk_buff *skb, > + bool *ct_executed) Actually, some compilers will report this new warning: net/openvswitch/conntrack.c:632:16: warning: symbol 'ovs_ct_executed' was not declared. Should it be static? Could you make this function static and repost? Thanks. Yes, I just saw that too. Need to update my compiler for my primary build machine. I'll send a V2... Thanks! - Greg
Re: [PATCH net] openvswitch: Fix for force/commit action failures
On 14 July 2017 at 09:10, Greg Rose wrote: > When there is an established connection in direction A->B, it is > possible to receive a packet on port B which then executes > ct(commit,force) without first performing ct() - ie, a lookup. > In this case, we would expect that this packet can delete the existing > entry so that we can commit a connection with direction B->A. However, > currently we only perform a check in skb_nfct_cached() for whether > OVS_CS_F_TRACKED is set and OVS_CS_F_INVALID is not set, ie that a > lookup previously occurred. In the above scenario, a lookup has not > occurred but we should still be able to statelessly look up the > existing entry and potentially delete the entry if it is in the > opposite direction. > > This patch extends the check to also hint that if the action has the > force flag set, then we will lookup the existing entry so that the > force check at the end of skb_nfct_cached has the ability to delete > the connection. > > Fixes: dd41d330b03 ("openvswitch: Add force commit.") > CC: Pravin Shelar > CC: d...@openvswitch.org > Signed-off-by: Joe Stringer > Signed-off-by: Greg Rose > --- > net/openvswitch/conntrack.c | 50 > +++-- > 1 file changed, 35 insertions(+), 15 deletions(-) > > diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c > index 08679eb..1260f2b 100644 > --- a/net/openvswitch/conntrack.c > +++ b/net/openvswitch/conntrack.c > @@ -629,6 +629,33 @@ static int handle_fragments(struct net *net, struct > sw_flow_key *key, > return ct; > } > > +struct nf_conn *ovs_ct_executed(struct net *net, > + const struct sw_flow_key *key, > + const struct ovs_conntrack_info *info, > + struct sk_buff *skb, > + bool *ct_executed) Actually, some compilers will report this new warning: net/openvswitch/conntrack.c:632:16: warning: symbol 'ovs_ct_executed' was not declared. Should it be static? Could you make this function static and repost? Thanks.
RE: [PATCH 17/22] platform/x86: alienware-wmi: fix format string overflow warning
> -Original Message- > From: Arnd Bergmann [mailto:a...@arndb.de] > Sent: Friday, July 14, 2017 7:07 AM > To: linux-ker...@vger.kernel.org; Darren Hart ; Andy > Shevchenko > Cc: Greg Kroah-Hartman ; Linus Torvalds > ; Guenter Roeck ; > a...@linux-foundation.org; netdev@vger.kernel.org; David S . Miller > ; James E . J . Bottomley ; > Martin K . Petersen ; linux-s...@vger.kernel.org; > x...@kernel.org; Arnd Bergmann ; Limonciello, Mario > ; Arvind Yadav ; > platform-driver-...@vger.kernel.org > Subject: [PATCH 17/22] platform/x86: alienware-wmi: fix format string overflow > warning > > gcc points out a possible format string overflow for a large value of 'zone': > > drivers/platform/x86/alienware-wmi.c: In function 'alienware_wmi_init': > drivers/platform/x86/alienware-wmi.c:461:24: error: '%02X' directive writing > between 2 and 8 bytes into a region of size 6 [-Werror=format-overflow=] >sprintf(buffer, "zone%02X", i); > ^~~~ > drivers/platform/x86/alienware-wmi.c:461:19: note: directive argument in the > range [0, 2147483646] >sprintf(buffer, "zone%02X", i); >^~ > drivers/platform/x86/alienware-wmi.c:461:3: note: 'sprintf' output between 7 > and > 13 bytes into a destination of size 10 > > While the zone should never be that large, it's easy to make the > buffer a few bytes longer so gcc can prove this to be safe. > > Signed-off-by: Arnd Bergmann > --- > drivers/platform/x86/alienware-wmi.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/platform/x86/alienware-wmi.c > b/drivers/platform/x86/alienware-wmi.c > index 0831b428c217..acc01242da82 100644 > --- a/drivers/platform/x86/alienware-wmi.c > +++ b/drivers/platform/x86/alienware-wmi.c > @@ -421,7 +421,7 @@ static DEVICE_ATTR(lighting_control_state, 0644, > show_control_state, > static int alienware_zone_init(struct platform_device *dev) > { > int i; > - char buffer[10]; > + char buffer[13]; > char *name; > > if (interface == WMAX) { > -- > 2.9.0 LGTM, Thanks. Signed-off-by: Mario Limonciello
Re: [PATCH net] openvswitch: Fix for force/commit action failures
On 14 July 2017 at 09:10, Greg Rose wrote: > When there is an established connection in direction A->B, it is > possible to receive a packet on port B which then executes > ct(commit,force) without first performing ct() - ie, a lookup. > In this case, we would expect that this packet can delete the existing > entry so that we can commit a connection with direction B->A. However, > currently we only perform a check in skb_nfct_cached() for whether > OVS_CS_F_TRACKED is set and OVS_CS_F_INVALID is not set, ie that a > lookup previously occurred. In the above scenario, a lookup has not > occurred but we should still be able to statelessly look up the > existing entry and potentially delete the entry if it is in the > opposite direction. > > This patch extends the check to also hint that if the action has the > force flag set, then we will lookup the existing entry so that the > force check at the end of skb_nfct_cached has the ability to delete > the connection. > > Fixes: dd41d330b03 ("openvswitch: Add force commit.") > CC: Pravin Shelar > CC: d...@openvswitch.org > Signed-off-by: Joe Stringer > Signed-off-by: Greg Rose > --- Thanks for the fix. Reviewed-by: Joe Stringer
[PATCH net-next] tcp: adjust tail loss probe timeout
This patch adjusts the timeout formula to schedule the TCP loss probe (TLP). The previous formula uses 2*SRTT or 1.5*RTT + DelayACKMax if only one packet is in flight. It keeps a lower bound of 10 msec which is too large for short RTT connections (e.g. within a data-center). The new formula = 2*RTT + (inflight == 1 ? 200ms : 2ticks) which performs better for short and fast connections. Signed-off-by: Yuchung Cheng Signed-off-by: Neal Cardwell --- include/net/tcp.h | 3 +-- net/ipv4/tcp_output.c | 17 ++--- net/ipv4/tcp_recovery.c | 2 +- 3 files changed, 12 insertions(+), 10 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index 70483296157f..4f056ea79df2 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -139,6 +139,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo); #endif #define TCP_RTO_MAX((unsigned)(120*HZ)) #define TCP_RTO_MIN((unsigned)(HZ/5)) +#define TCP_TIMEOUT_MIN(2U) /* Min timeout for TCP timers in jiffies */ #define TCP_TIMEOUT_INIT ((unsigned)(1*HZ))/* RFC6298 2.1 initial RTO value*/ #define TCP_TIMEOUT_FALLBACK ((unsigned)(3*HZ))/* RFC 1122 initial RTO value, now * used as a fallback RTO for the @@ -150,8 +151,6 @@ void tcp_time_wait(struct sock *sk, int state, int timeo); #define TCP_RESOURCE_PROBE_INTERVAL ((unsigned)(HZ/2U)) /* Maximal interval between probes * for local resources. */ -#define TCP_REO_TIMEOUT_MIN(2000) /* Min RACK reordering timeout in usec */ - #define TCP_KEEPALIVE_TIME (120*60*HZ) /* two hours */ #define TCP_KEEPALIVE_PROBES 9 /* Max of 9 keepalive probes */ #define TCP_KEEPALIVE_INTVL(75*HZ) diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 4e985dea1dd2..886d874775df 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -2377,7 +2377,6 @@ bool tcp_schedule_loss_probe(struct sock *sk) struct inet_connection_sock *icsk = inet_csk(sk); struct tcp_sock *tp = tcp_sk(sk); u32 timeout, tlp_time_stamp, rto_time_stamp; - u32 rtt = usecs_to_jiffies(tp->srtt_us >> 3); /* No consecutive loss probes. */ if (WARN_ON(icsk->icsk_pending == ICSK_TIME_LOSS_PROBE)) { @@ -2406,15 +2405,19 @@ bool tcp_schedule_loss_probe(struct sock *sk) tcp_send_head(sk)) return false; - /* Probe timeout is at least 1.5*rtt + TCP_DELACK_MAX to account + /* Probe timeout is 2*rtt. Add minimum RTO to account * for delayed ack when there's one outstanding packet. If no RTT * sample is available then probe after TCP_TIMEOUT_INIT. */ - timeout = rtt << 1 ? : TCP_TIMEOUT_INIT; - if (tp->packets_out == 1) - timeout = max_t(u32, timeout, - (rtt + (rtt >> 1) + TCP_DELACK_MAX)); - timeout = max_t(u32, timeout, msecs_to_jiffies(10)); + if (tp->srtt_us) { + timeout = usecs_to_jiffies(tp->srtt_us >> 2); + if (tp->packets_out == 1) + timeout += TCP_RTO_MIN; + else + timeout += TCP_TIMEOUT_MIN; + } else { + timeout = TCP_TIMEOUT_INIT; + } /* If RTO is shorter, just schedule TLP in its place. */ tlp_time_stamp = tcp_jiffies32 + timeout; diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c index fe9a493d0208..449cd914d58e 100644 --- a/net/ipv4/tcp_recovery.c +++ b/net/ipv4/tcp_recovery.c @@ -113,7 +113,7 @@ void tcp_rack_mark_lost(struct sock *sk) tp->rack.advanced = 0; tcp_rack_detect_loss(sk, &timeout); if (timeout) { - timeout = usecs_to_jiffies(timeout + TCP_REO_TIMEOUT_MIN); + timeout = usecs_to_jiffies(timeout) + TCP_TIMEOUT_MIN; inet_csk_reset_xmit_timer(sk, ICSK_TIME_REO_TIMEOUT, timeout, inet_csk(sk)->icsk_rto); } -- 2.13.2.932.g7449e964c-goog
[PATCH net-next,1/1] tools: hv: ignore a NIC if it has been configured
Let bondvf.sh ignore this NIC if it has been configured, to prevent user configuration from being overwritten unexpectly. Signed-off-by: Simon Xiao --- tools/hv/bondvf.sh | 27 +-- 1 file changed, 25 insertions(+), 2 deletions(-) diff --git a/tools/hv/bondvf.sh b/tools/hv/bondvf.sh index 89b2506..80f1028 100755 --- a/tools/hv/bondvf.sh +++ b/tools/hv/bondvf.sh @@ -211,6 +211,30 @@ function create_bond { echo $'\nBond name:' $bondname + if [ $distro == ubuntu ] + then + local mainfn=$cfgdir/interfaces + local s="^[ \t]*(auto|iface|mapping|allow-.*)[ \t]+${bondname}" + + grep -E "$s" $mainfn + if [ $? -eq 0 ] + then + echo "WARNING: ${bondname} has been configured already" + return + fi + elif [ $distro == redhat ] || [ $distro == suse ] + then + local fn=$cfgdir/ifcfg-$bondname + if [ -f $fn ] + then + echo "WARNING: ${bondname} has been configured already" + return + fi + else + echo "Unsupported Distro: ${distro}" + return + fi + echo configuring $primary create_eth_cfg_pri_$distro $primary $bondname @@ -219,8 +243,6 @@ function create_bond { echo creating: $bondname with primary slave: $primary create_bond_cfg_$distro $bondname $primary $secondary - - let bondcnt=bondcnt+1 } for (( i=0; i < $eth_cnt-1; i++ )) @@ -228,5 +250,6 @@ do if [ -n "${list_match[$i]}" ] then create_bond ${list_eth[$i]} ${list_match[$i]} + let bondcnt=bondcnt+1 fi done -- 2.7.4
Re: [PATCH v2] sctp: don't dereference ptr before leaving _sctp_walk_{params,errors}()
From: Alexander Potapenko Date: Fri, 14 Jul 2017 19:33:54 +0200 > On Fri, Jul 14, 2017 at 7:23 PM, David Miller wrote: >> From: Alexander Potapenko >> Date: Fri, 14 Jul 2017 18:33:01 +0200 >> >>> On Fri, Jul 14, 2017 at 5:58 PM, David Miller wrote: From: Alexander Potapenko Date: Fri, 14 Jul 2017 12:03:29 +0200 > v2: per comment from David Miller, make sure the whole iterator->length > fits into the remaining buffer. Please compile and functionally test your changes: In file included from ./include/linux/compiler.h:58:0, from ./include/uapi/linux/stddef.h:1, from ./include/linux/stddef.h:4, from ./include/uapi/linux/posix_types.h:4, from ./include/uapi/linux/types.h:13, from ./include/linux/types.h:5, from net/sctp/sm_statefuns.c:48: net/sctp/sm_statefuns.c: In function ‘sctp_sf_do_reconf’: ./include/net/sctp/sctp.h:472:24: error: unknown type name ‘sctp_paramhdr_t’ (pos.v + offsetof(sctp_paramhdr_t, length) + sizeof(pos.p->length) <\ ^ >>> Oops. Fixed. >> >> Did you functionally test the new version or just do a quick compile >> check and resubmit? > I've checked that the kernel still works, but unfortunately I couldn't > check whether or not this affected the uninit memory, as KMSAN > currently works on a fixed kernel revision. The compilation error was > actually caused by me failing to test the kernel when porting the fix > from that revision to upstream. > >> I really want you to test this if the logic has been changed. > Do you mean any specific tests in addition to, say, running the > reproducer on which the uninit use was reported? I mean the reproducer.
Re: [PATCH v7 3/3] net/cxgb4: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
Reviewed-by: Casey Leedom
Re: [PATCH] ipv6: initialize treq->txhash in cookie_v6_check()
On Fri, Jul 14, 2017 at 7:04 PM, Neal Cardwell wrote: > On Fri, Jul 14, 2017 at 12:54 PM, Alexander Potapenko > wrote: >> KMSAN reported use of uninitialized memory in skb_set_hash_from_sk(), >> which originated from the TCP request socket created in >> cookie_v6_check(): > ... >> --- a/net/ipv6/syncookies.c >> +++ b/net/ipv6/syncookies.c >> @@ -216,6 +216,7 @@ struct sock *cookie_v6_check(struct sock *sk, struct >> sk_buff *skb) >> treq->rcv_isn = ntohl(th->seq) - 1; >> treq->snt_isn = cookie; >> treq->ts_off = 0; >> + treq->txhash = 0; >> >> /* >> * We need to lookup the dst_entry to get the correct window size. > > I would have thought that the same fix is needed in the corresponding > line in cookie_v4_check() in net/ipv4/syncookies.c? (I do not see > txhash being initialized for the IPv4 side.) If it's not needed for > some reason, then it would be worth a comment in the commit > description to explain why not. Most certainly it is needed. I haven't seen reports for that in the wild and couldn't forge a repro triggering the bug in IPv4, but I'll give it another shot. > thanks, > neal -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg
Re: [PATCH v2] sctp: don't dereference ptr before leaving _sctp_walk_{params,errors}()
On Fri, Jul 14, 2017 at 7:23 PM, David Miller wrote: > From: Alexander Potapenko > Date: Fri, 14 Jul 2017 18:33:01 +0200 > >> On Fri, Jul 14, 2017 at 5:58 PM, David Miller wrote: >>> From: Alexander Potapenko >>> Date: Fri, 14 Jul 2017 12:03:29 +0200 >>> v2: per comment from David Miller, make sure the whole iterator->length fits into the remaining buffer. >>> >>> Please compile and functionally test your changes: >>> >>> In file included from ./include/linux/compiler.h:58:0, >>> from ./include/uapi/linux/stddef.h:1, >>> from ./include/linux/stddef.h:4, >>> from ./include/uapi/linux/posix_types.h:4, >>> from ./include/uapi/linux/types.h:13, >>> from ./include/linux/types.h:5, >>> from net/sctp/sm_statefuns.c:48: >>> net/sctp/sm_statefuns.c: In function ‘sctp_sf_do_reconf’: >>> ./include/net/sctp/sctp.h:472:24: error: unknown type name ‘sctp_paramhdr_t’ >>> (pos.v + offsetof(sctp_paramhdr_t, length) + sizeof(pos.p->length) <\ >>> ^ >> Oops. Fixed. > > Did you functionally test the new version or just do a quick compile > check and resubmit? I've checked that the kernel still works, but unfortunately I couldn't check whether or not this affected the uninit memory, as KMSAN currently works on a fixed kernel revision. The compilation error was actually caused by me failing to test the kernel when porting the fix from that revision to upstream. > I really want you to test this if the logic has been changed. Do you mean any specific tests in addition to, say, running the reproducer on which the uninit use was reported? Thanks -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg
Re: [PATCH v2] sctp: don't dereference ptr before leaving _sctp_walk_{params,errors}()
From: Alexander Potapenko Date: Fri, 14 Jul 2017 18:33:01 +0200 > On Fri, Jul 14, 2017 at 5:58 PM, David Miller wrote: >> From: Alexander Potapenko >> Date: Fri, 14 Jul 2017 12:03:29 +0200 >> >>> v2: per comment from David Miller, make sure the whole iterator->length >>> fits into the remaining buffer. >> >> Please compile and functionally test your changes: >> >> In file included from ./include/linux/compiler.h:58:0, >> from ./include/uapi/linux/stddef.h:1, >> from ./include/linux/stddef.h:4, >> from ./include/uapi/linux/posix_types.h:4, >> from ./include/uapi/linux/types.h:13, >> from ./include/linux/types.h:5, >> from net/sctp/sm_statefuns.c:48: >> net/sctp/sm_statefuns.c: In function ‘sctp_sf_do_reconf’: >> ./include/net/sctp/sctp.h:472:24: error: unknown type name ‘sctp_paramhdr_t’ >> (pos.v + offsetof(sctp_paramhdr_t, length) + sizeof(pos.p->length) <\ >> ^ > Oops. Fixed. Did you functionally test the new version or just do a quick compile check and resubmit? I really want you to test this if the logic has been changed.
Re: [PATCH] ipv6: initialize treq->txhash in cookie_v6_check()
On Fri, Jul 14, 2017 at 12:54 PM, Alexander Potapenko wrote: > KMSAN reported use of uninitialized memory in skb_set_hash_from_sk(), > which originated from the TCP request socket created in > cookie_v6_check(): ... > --- a/net/ipv6/syncookies.c > +++ b/net/ipv6/syncookies.c > @@ -216,6 +216,7 @@ struct sock *cookie_v6_check(struct sock *sk, struct > sk_buff *skb) > treq->rcv_isn = ntohl(th->seq) - 1; > treq->snt_isn = cookie; > treq->ts_off = 0; > + treq->txhash = 0; > > /* > * We need to lookup the dst_entry to get the correct window size. I would have thought that the same fix is needed in the corresponding line in cookie_v4_check() in net/ipv4/syncookies.c? (I do not see txhash being initialized for the IPv4 side.) If it's not needed for some reason, then it would be worth a comment in the commit description to explain why not. thanks, neal
[PATCH] ipv6: initialize treq->txhash in cookie_v6_check()
KMSAN reported use of uninitialized memory in skb_set_hash_from_sk(), which originated from the TCP request socket created in cookie_v6_check(): == BUG: KMSAN: use of uninitialized memory in tcp_transmit_skb+0xf77/0x3ec0 CPU: 1 PID: 2949 Comm: syz-execprog Not tainted 4.11.0-rc5+ #2931 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 TCP: request_sock_TCPv6: Possible SYN flooding on port 20028. Sending cookies. Check SNMP counters. Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x172/0x1c0 lib/dump_stack.c:52 kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:927 __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:469 skb_set_hash_from_sk ./include/net/sock.h:2011 tcp_transmit_skb+0xf77/0x3ec0 net/ipv4/tcp_output.c:983 tcp_send_ack+0x75b/0x830 net/ipv4/tcp_output.c:3493 tcp_delack_timer_handler+0x9a6/0xb90 net/ipv4/tcp_timer.c:284 tcp_delack_timer+0x1b0/0x310 net/ipv4/tcp_timer.c:309 call_timer_fn+0x240/0x520 kernel/time/timer.c:1268 expire_timers kernel/time/timer.c:1307 __run_timers+0xc13/0xf10 kernel/time/timer.c:1601 run_timer_softirq+0x36/0xa0 kernel/time/timer.c:1614 __do_softirq+0x485/0x942 kernel/softirq.c:284 invoke_softirq kernel/softirq.c:364 irq_exit+0x1fa/0x230 kernel/softirq.c:405 exiting_irq+0xe/0x10 ./arch/x86/include/asm/apic.h:657 smp_apic_timer_interrupt+0x5a/0x80 arch/x86/kernel/apic/apic.c:966 apic_timer_interrupt+0x86/0x90 arch/x86/entry/entry_64.S:489 RIP: 0010:native_restore_fl ./arch/x86/include/asm/irqflags.h:36 RIP: 0010:arch_local_irq_restore ./arch/x86/include/asm/irqflags.h:77 RIP: 0010:__msan_poison_alloca+0xed/0x120 mm/kmsan/kmsan_instr.c:440 RSP: 0018:880024917cd8 EFLAGS: 0246 ORIG_RAX: ff10 RAX: 0246 RBX: 8800224c RCX: 0005 RDX: 0004 RSI: 8800 RDI: eab6d770 RBP: 880024917d58 R08: 0dd8 R09: 0004 R10: 1600 R11: R12: 85abf810 R13: 880024917dd8 R14: 0010 R15: 81cabde4 poll_select_copy_remaining+0xac/0x6b0 fs/select.c:293 SYSC_select+0x4b4/0x4e0 fs/select.c:653 SyS_select+0x76/0xa0 fs/select.c:634 entry_SYSCALL_64_fastpath+0x13/0x94 arch/x86/entry/entry_64.S:204 RIP: 0033:0x4597e7 RSP: 002b:00c420037ee0 EFLAGS: 0246 ORIG_RAX: 0017 RAX: ffda RBX: RCX: 004597e7 RDX: RSI: RDI: RBP: 00c420037ef0 R08: 00c420037ee0 R09: 0059 R10: R11: 0246 R12: 0042dc20 R13: 00f3 R14: 0030 R15: 0003 chained origin: save_stack_trace+0x37/0x40 arch/x86/kernel/stacktrace.c:59 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302 kmsan_save_stack mm/kmsan/kmsan.c:317 kmsan_internal_chain_origin+0x12a/0x1f0 mm/kmsan/kmsan.c:547 __msan_store_shadow_origin_4+0xac/0x110 mm/kmsan/kmsan_instr.c:259 tcp_create_openreq_child+0x709/0x1ae0 net/ipv4/tcp_minisocks.c:472 tcp_v6_syn_recv_sock+0x7eb/0x2a30 net/ipv6/tcp_ipv6.c:1103 tcp_get_cookie_sock+0x136/0x5f0 net/ipv4/syncookies.c:212 cookie_v6_check+0x17a9/0x1b50 net/ipv6/syncookies.c:245 tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:989 tcp_v6_do_rcv+0xdd8/0x1c60 net/ipv6/tcp_ipv6.c:1298 tcp_v6_rcv+0x41a3/0x4f00 net/ipv6/tcp_ipv6.c:1487 ip6_input_finish+0x82f/0x1ee0 net/ipv6/ip6_input.c:279 NF_HOOK ./include/linux/netfilter.h:257 ip6_input+0x239/0x290 net/ipv6/ip6_input.c:322 dst_input ./include/net/dst.h:492 ip6_rcv_finish net/ipv6/ip6_input.c:69 NF_HOOK ./include/linux/netfilter.h:257 ipv6_rcv+0x1dbd/0x22e0 net/ipv6/ip6_input.c:203 __netif_receive_skb_core+0x2f6f/0x3a20 net/core/dev.c:4208 __netif_receive_skb net/core/dev.c:4246 process_backlog+0x667/0xba0 net/core/dev.c:4866 napi_poll net/core/dev.c:5268 net_rx_action+0xc95/0x1590 net/core/dev.c:5333 __do_softirq+0x485/0x942 kernel/softirq.c:284 origin: save_stack_trace+0x37/0x40 arch/x86/kernel/stacktrace.c:59 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302 kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:198 kmsan_kmalloc+0x7f/0xe0 mm/kmsan/kmsan.c:337 kmem_cache_alloc+0x1c2/0x1e0 mm/slub.c:2766 reqsk_alloc ./include/net/request_sock.h:87 inet_reqsk_alloc+0xa4/0x5b0 net/ipv4/tcp_input.c:6200 cookie_v6_check+0x4f4/0x1b50 net/ipv6/syncookies.c:169 tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:989 tcp_v6_do_rcv+0xdd8/0x1c60 net/ipv6/tcp_ipv6.c:1298 tcp_v6_rcv+0x41a3/0x4f00 net/ipv6/tcp_ipv6.c:1487 ip6_input_finish+0x82f/0x1ee0 net/ipv6/ip6_input.c:279 NF_HOOK ./include/linux/netfilter.h:257 ip6_input+0x239/0x290 net/ipv6/ip6_input.c:322 dst_input ./include/net/dst.h:492 ip6_rcv_finish net/ipv6/ip6_input.c:69 NF_HOOK ./include/linux/netfilter.h:257 ipv6_rcv+0x1dbd/0x22e0 net/ipv6/ip6_input.c:203 __net
[PATCH v3] sctp: don't dereference ptr before leaving _sctp_walk_{params,errors}()
If the length field of the iterator (|pos.p| or |err|) is past the end of the chunk, we shouldn't access it. This bug has been detected by KMSAN. For the following pair of system calls: socket(PF_INET6, SOCK_STREAM, 0x84 /* IPPROTO_??? */) = 3 sendto(3, "A", 1, MSG_OOB, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 1 the tool has reported a use of uninitialized memory: == BUG: KMSAN: use of uninitialized memory in sctp_rcv+0x17b8/0x43b0 CPU: 1 PID: 2940 Comm: probe Not tainted 4.11.0-rc5+ #2926 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x172/0x1c0 lib/dump_stack.c:52 kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:927 __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:469 __sctp_rcv_init_lookup net/sctp/input.c:1074 __sctp_rcv_lookup_harder net/sctp/input.c:1233 __sctp_rcv_lookup net/sctp/input.c:1255 sctp_rcv+0x17b8/0x43b0 net/sctp/input.c:170 sctp6_rcv+0x32/0x70 net/sctp/ipv6.c:984 ip6_input_finish+0x82f/0x1ee0 net/ipv6/ip6_input.c:279 NF_HOOK ./include/linux/netfilter.h:257 ip6_input+0x239/0x290 net/ipv6/ip6_input.c:322 dst_input ./include/net/dst.h:492 ip6_rcv_finish net/ipv6/ip6_input.c:69 NF_HOOK ./include/linux/netfilter.h:257 ipv6_rcv+0x1dbd/0x22e0 net/ipv6/ip6_input.c:203 __netif_receive_skb_core+0x2f6f/0x3a20 net/core/dev.c:4208 __netif_receive_skb net/core/dev.c:4246 process_backlog+0x667/0xba0 net/core/dev.c:4866 napi_poll net/core/dev.c:5268 net_rx_action+0xc95/0x1590 net/core/dev.c:5333 __do_softirq+0x485/0x942 kernel/softirq.c:284 do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902 do_softirq kernel/softirq.c:328 __local_bh_enable_ip+0x25b/0x290 kernel/softirq.c:181 local_bh_enable+0x37/0x40 ./include/linux/bottom_half.h:31 rcu_read_unlock_bh ./include/linux/rcupdate.h:931 ip6_finish_output2+0x19b2/0x1cf0 net/ipv6/ip6_output.c:124 ip6_finish_output+0x764/0x970 net/ipv6/ip6_output.c:149 NF_HOOK_COND ./include/linux/netfilter.h:246 ip6_output+0x456/0x520 net/ipv6/ip6_output.c:163 dst_output ./include/net/dst.h:486 NF_HOOK ./include/linux/netfilter.h:257 ip6_xmit+0x1841/0x1c00 net/ipv6/ip6_output.c:261 sctp_v6_xmit+0x3b7/0x470 net/sctp/ipv6.c:225 sctp_packet_transmit+0x38cb/0x3a20 net/sctp/output.c:632 sctp_outq_flush+0xeb3/0x46e0 net/sctp/outqueue.c:885 sctp_outq_uncork+0xb2/0xd0 net/sctp/outqueue.c:750 sctp_side_effects net/sctp/sm_sideeffect.c:1773 sctp_do_sm+0x6962/0x6ec0 net/sctp/sm_sideeffect.c:1147 sctp_primitive_ASSOCIATE+0x12c/0x160 net/sctp/primitive.c:88 sctp_sendmsg+0x43e5/0x4f90 net/sctp/socket.c:1954 inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 sock_sendmsg net/socket.c:643 SYSC_sendto+0x608/0x710 net/socket.c:1696 SyS_sendto+0x8a/0xb0 net/socket.c:1664 do_syscall_64+0xe6/0x130 arch/x86/entry/common.c:285 entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:246 RIP: 0033:0x401133 RSP: 002b:7fff6d99cd38 EFLAGS: 0246 ORIG_RAX: 002c RAX: ffda RBX: 004002b0 RCX: 00401133 RDX: 0001 RSI: 00494088 RDI: 0003 RBP: 7fff6d99cd90 R08: 7fff6d99cd50 R09: 001c R10: 0001 R11: 0246 R12: R13: 004063d0 R14: 00406460 R15: origin: save_stack_trace+0x37/0x40 arch/x86/kernel/stacktrace.c:59 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302 kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:198 kmsan_poison_shadow+0x6d/0xc0 mm/kmsan/kmsan.c:211 slab_alloc_node mm/slub.c:2743 __kmalloc_node_track_caller+0x200/0x360 mm/slub.c:4351 __kmalloc_reserve net/core/skbuff.c:138 __alloc_skb+0x26b/0x840 net/core/skbuff.c:231 alloc_skb ./include/linux/skbuff.h:933 sctp_packet_transmit+0x31e/0x3a20 net/sctp/output.c:570 sctp_outq_flush+0xeb3/0x46e0 net/sctp/outqueue.c:885 sctp_outq_uncork+0xb2/0xd0 net/sctp/outqueue.c:750 sctp_side_effects net/sctp/sm_sideeffect.c:1773 sctp_do_sm+0x6962/0x6ec0 net/sctp/sm_sideeffect.c:1147 sctp_primitive_ASSOCIATE+0x12c/0x160 net/sctp/primitive.c:88 sctp_sendmsg+0x43e5/0x4f90 net/sctp/socket.c:1954 inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 sock_sendmsg net/socket.c:643 SYSC_sendto+0x608/0x710 net/socket.c:1696 SyS_sendto+0x8a/0xb0 net/socket.c:1664 do_syscall_64+0xe6/0x130 arch/x86/entry/common.c:285 return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:246 == Signed-off-by: Alexander Potapenko --- v3: fix compilation v2: per comment from David M
Re: [PATCH v2] sctp: don't dereference ptr before leaving _sctp_walk_{params,errors}()
On Fri, Jul 14, 2017 at 5:58 PM, David Miller wrote: > From: Alexander Potapenko > Date: Fri, 14 Jul 2017 12:03:29 +0200 > >> v2: per comment from David Miller, make sure the whole iterator->length >> fits into the remaining buffer. > > Please compile and functionally test your changes: > > In file included from ./include/linux/compiler.h:58:0, > from ./include/uapi/linux/stddef.h:1, > from ./include/linux/stddef.h:4, > from ./include/uapi/linux/posix_types.h:4, > from ./include/uapi/linux/types.h:13, > from ./include/linux/types.h:5, > from net/sctp/sm_statefuns.c:48: > net/sctp/sm_statefuns.c: In function ‘sctp_sf_do_reconf’: > ./include/net/sctp/sctp.h:472:24: error: unknown type name ‘sctp_paramhdr_t’ > (pos.v + offsetof(sctp_paramhdr_t, length) + sizeof(pos.p->length) <\ > ^ Oops. Fixed. -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg
[PATCH net] openvswitch: Fix for force/commit action failures
When there is an established connection in direction A->B, it is possible to receive a packet on port B which then executes ct(commit,force) without first performing ct() - ie, a lookup. In this case, we would expect that this packet can delete the existing entry so that we can commit a connection with direction B->A. However, currently we only perform a check in skb_nfct_cached() for whether OVS_CS_F_TRACKED is set and OVS_CS_F_INVALID is not set, ie that a lookup previously occurred. In the above scenario, a lookup has not occurred but we should still be able to statelessly look up the existing entry and potentially delete the entry if it is in the opposite direction. This patch extends the check to also hint that if the action has the force flag set, then we will lookup the existing entry so that the force check at the end of skb_nfct_cached has the ability to delete the connection. Fixes: dd41d330b03 ("openvswitch: Add force commit.") CC: Pravin Shelar CC: d...@openvswitch.org Signed-off-by: Joe Stringer Signed-off-by: Greg Rose --- net/openvswitch/conntrack.c | 50 +++-- 1 file changed, 35 insertions(+), 15 deletions(-) diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c index 08679eb..1260f2b 100644 --- a/net/openvswitch/conntrack.c +++ b/net/openvswitch/conntrack.c @@ -629,6 +629,33 @@ static int handle_fragments(struct net *net, struct sw_flow_key *key, return ct; } +struct nf_conn *ovs_ct_executed(struct net *net, + const struct sw_flow_key *key, + const struct ovs_conntrack_info *info, + struct sk_buff *skb, + bool *ct_executed) +{ + struct nf_conn *ct = NULL; + + /* If no ct, check if we have evidence that an existing conntrack entry +* might be found for this skb. This happens when we lose a skb->_nfct +* due to an upcall, or if the direction is being forced. If the +* connection was not confirmed, it is not cached and needs to be run +* through conntrack again. +*/ + *ct_executed = (key->ct_state & OVS_CS_F_TRACKED) && + !(key->ct_state & OVS_CS_F_INVALID) && + (key->ct_zone == info->zone.id); + + if (*ct_executed || (!key->ct_state && info->force)) { + ct = ovs_ct_find_existing(net, &info->zone, info->family, skb, + !!(key->ct_state & + OVS_CS_F_NAT_MASK)); + } + + return ct; +} + /* Determine whether skb->_nfct is equal to the result of conntrack lookup. */ static bool skb_nfct_cached(struct net *net, const struct sw_flow_key *key, @@ -637,24 +664,17 @@ static bool skb_nfct_cached(struct net *net, { enum ip_conntrack_info ctinfo; struct nf_conn *ct; + bool ct_executed = true; ct = nf_ct_get(skb, &ctinfo); - /* If no ct, check if we have evidence that an existing conntrack entry -* might be found for this skb. This happens when we lose a skb->_nfct -* due to an upcall. If the connection was not confirmed, it is not -* cached and needs to be run through conntrack again. -*/ - if (!ct && key->ct_state & OVS_CS_F_TRACKED && - !(key->ct_state & OVS_CS_F_INVALID) && - key->ct_zone == info->zone.id) { - ct = ovs_ct_find_existing(net, &info->zone, info->family, skb, - !!(key->ct_state -& OVS_CS_F_NAT_MASK)); - if (ct) - nf_ct_get(skb, &ctinfo); - } if (!ct) + ct = ovs_ct_executed(net, key, info, skb, &ct_executed); + + if (ct) + nf_ct_get(skb, &ctinfo); + else return false; + if (!net_eq(net, read_pnet(&ct->ct_net))) return false; if (!nf_ct_zone_equal_any(info->ct, nf_ct_zone(ct))) @@ -679,7 +699,7 @@ static bool skb_nfct_cached(struct net *net, return false; } - return true; + return ct_executed; } #ifdef CONFIG_NF_NAT_NEEDED -- 1.8.3.1
Re: [PATCH 08/22] isdn: divert: fix sprintf buffer overflow warning
From: Arnd Bergmann Date: Fri, 14 Jul 2017 14:07:00 +0200 > One string we pass into the cs->info buffer might be too long, > as pointed out by gcc: > > drivers/isdn/divert/isdn_divert.c: In function 'll_callback': > drivers/isdn/divert/isdn_divert.c:488:22: error: '%d' directive writing > between 1 and 3 bytes into a region of size between 1 and 69 > [-Werror=format-overflow=] > sprintf(cs->info, "%d 0x%lx %s %s %s %s 0x%x 0x%x %d %d %s\n", >^~~ > drivers/isdn/divert/isdn_divert.c:488:22: note: directive argument in the > range [0, 255] > drivers/isdn/divert/isdn_divert.c:488:4: note: 'sprintf' output 25 or more > bytes (assuming 129) into a destination of size 90 > > This is unlikely to actually cause problems, so let's use snprintf > as a simple workaround to shut up the warning and truncate the > buffer instead. > > Signed-off-by: Arnd Bergmann Applied.
Re: [PATCH net] sctp: fix an array overflow when all ext chunks are set
From: Xin Long Date: Fri, 14 Jul 2017 22:07:33 +0800 > Marcelo noticed an array overflow caused by commit c28445c3cb07 > ("sctp: add reconf_enable in asoc ep and netns"), in which sctp > would add SCTP_CID_RECONF into extensions when reconf_enable is > set in sctp_make_init and sctp_make_init_ack. > > Then now when all ext chunks are set, 4 ext chunk ids can be put > into extensions array while extensions array size is 3. It would > cause a kernel panic because of this overflow. > > This patch is to fix it by defining extensions array size is 4 in > both sctp_make_init and sctp_make_init_ack. > > Fixes: c28445c3cb07 ("sctp: add reconf_enable in asoc ep and netns") > Signed-off-by: Xin Long Applied and queued up for -stable, thanks.
Re: [PATCH 11/22] net: thunder_bgx: avoid format string overflow warning
From: Arnd Bergmann Date: Fri, 14 Jul 2017 14:07:03 +0200 > gcc warns that the temporary buffer might be too small here: > > drivers/net/ethernet/cavium/thunder/thunder_bgx.c: In function 'bgx_probe': > drivers/net/ethernet/cavium/thunder/thunder_bgx.c:1020:16: error: '%d' > directive writing between 1 and 10 bytes into a region of size between 9 and > 11 [-Werror=format-overflow=] > sprintf(str, "BGX%d LMAC%d mode", bgx->bgx_id, lmacid); > ^~~ > drivers/net/ethernet/cavium/thunder/thunder_bgx.c:1020:16: note: directive > argument in the range [0, 2147483647] > drivers/net/ethernet/cavium/thunder/thunder_bgx.c:1020:3: note: 'sprintf' > output between 16 and 27 bytes into a destination of size 20 > > This probably can't happen, but it can't hurt to make it long > enough for the theoretical limit. > > Signed-off-by: Arnd Bergmann Applied.
Re: [PATCH 10/22] bnx2x: fix format overflow warning
From: Arnd Bergmann Date: Fri, 14 Jul 2017 14:07:02 +0200 > gcc notices that large queue numbers would overflow the queue name > string: > > drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c: In function > 'bnx2x_get_strings': > drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c:3165:25: error: '%d' > directive writing between 1 and 10 bytes into a region of size 5 > [-Werror=format-overflow=] > drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c:3165:25: note: directive > argument in the range [0, 2147483647] > drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c:3165:5: note: 'sprintf' > output between 2 and 11 bytes into a destination of size 5 > > There is a hard limit in place that makes the number at most two > digits, so the code is fine. This changes it to use snprintf() > to truncate instead of overflowing, which shuts up that warning. > > Signed-off-by: Arnd Bergmann Applied.
Re: [PATCH 13/22] liquidio: fix possible eeprom format string overflow
From: Arnd Bergmann Date: Fri, 14 Jul 2017 14:07:05 +0200 > gcc reports that the temporary buffer for computing the > string length may be too small here: > > drivers/net/ethernet/cavium/liquidio/lio_ethtool.c: In function > 'lio_get_eeprom_len': > /drivers/net/ethernet/cavium/liquidio/lio_ethtool.c:345:21: error: 'sprintf' > may write a terminating nul past the end of the destination > [-Werror=format-overflow=] > len = sprintf(buf, "boardname:%s serialnum:%s maj:%lld min:%lld\n", > ^~~ > drivers/net/ethernet/cavium/liquidio/lio_ethtool.c:345:6: note: 'sprintf' > output between 35 and 167 bytes into a destination of size 128 > len = sprintf(buf, "boardname:%s serialnum:%s maj:%lld min:%lld\n", > > This extends it to 192 bytes, which is certainly enough. As far > as I could tell, there are no other constraints that require a specific > maximum size. > > Signed-off-by: Arnd Bergmann Applied.
Re: [PATCH 12/22] vmxnet3: avoid format strint overflow warning
From: Arnd Bergmann Date: Fri, 14 Jul 2017 14:07:04 +0200 > gcc-7 notices that "-event-%d" could be more than 11 characters long > if we had larger 'vector' numbers: > > drivers/net/vmxnet3/vmxnet3_drv.c: In function 'vmxnet3_activate_dev': > drivers/net/vmxnet3/vmxnet3_drv.c:2095:40: error: 'sprintf' may write a > terminating nul past the end of the destination [-Werror=format-overflow=] > sprintf(intr->event_msi_vector_name, "%s-event-%d", > ^ > drivers/net/vmxnet3/vmxnet3_drv.c:2095:3: note: 'sprintf' output between 9 > and 33 bytes into a destination of size 32 > > The current code is safe, but making the string a little longer > is harmless and lets gcc see that it's ok. > > Signed-off-by: Arnd Bergmann Applied.
Re: [PATCH 09/22] net: niu: fix format string overflow warning:
From: Arnd Bergmann Date: Fri, 14 Jul 2017 14:07:01 +0200 > We get a warning for the port_name string that might be longer than > six characters if we had more than 10 ports: > > drivers/net/ethernet/sun/niu.c: In function 'niu_put_parent': > drivers/net/ethernet/sun/niu.c:9563:21: error: '%d' directive writing between > 1 and 3 bytes into a region of size 2 [-Werror=format-overflow=] > sprintf(port_name, "port%d", port); > ^~~~ > drivers/net/ethernet/sun/niu.c:9563:21: note: directive argument in the range > [0, 255] > drivers/net/ethernet/sun/niu.c:9563:2: note: 'sprintf' output between 6 and 8 > bytes into a destination of size 6 > sprintf(port_name, "port%d", port); > ^~ > drivers/net/ethernet/sun/niu.c: In function 'niu_pci_init_one': > drivers/net/ethernet/sun/niu.c:9538:22: error: '%d' directive writing between > 1 and 3 bytes into a region of size 2 [-Werror=format-overflow=] >sprintf(port_name, "port%d", port); > ^~~~ > drivers/net/ethernet/sun/niu.c:9538:22: note: directive argument in the range > [0, 255] > drivers/net/ethernet/sun/niu.c:9538:3: note: 'sprintf' output between 6 and 8 > bytes into a destination of size 6 > > While we know that the port number is small, there is no harm in > making the format string two bytes longer to avoid the warning. > > Signed-off-by: Arnd Bergmann Applied.
Re: [PATCH v2] sctp: don't dereference ptr before leaving _sctp_walk_{params,errors}()
From: Alexander Potapenko Date: Fri, 14 Jul 2017 12:03:29 +0200 > v2: per comment from David Miller, make sure the whole iterator->length > fits into the remaining buffer. Please compile and functionally test your changes: In file included from ./include/linux/compiler.h:58:0, from ./include/uapi/linux/stddef.h:1, from ./include/linux/stddef.h:4, from ./include/uapi/linux/posix_types.h:4, from ./include/uapi/linux/types.h:13, from ./include/linux/types.h:5, from net/sctp/sm_statefuns.c:48: net/sctp/sm_statefuns.c: In function ‘sctp_sf_do_reconf’: ./include/net/sctp/sctp.h:472:24: error: unknown type name ‘sctp_paramhdr_t’ (pos.v + offsetof(sctp_paramhdr_t, length) + sizeof(pos.p->length) <\ ^
Re: [PATCH] [for 4.13] net: qcom/emac: fix double free of SGMII IRQ during shutdown
From: Timur Tabi Date: Thu, 13 Jul 2017 15:45:41 -0500 > If the interface is not up, then don't try to close it during a > shutdown. This avoids possible double free of the IRQ, which > can happen during a shutdown. > > Fixes: 03eb3eb4d4d5 ("net: qcom/emac: add shutdown function") > Signed-off-by: Timur Tabi Applied.
Re: [PATCH] smsc95xx: use ethtool_op_get_ts_info()
From: Petr Kulhavy Date: Thu, 13 Jul 2017 19:40:57 +0200 > This change enables the use of SW timestamping on Raspberry PI. > > smsc95xx uses the usbnet transmit function usbnet_start_xmit(), which > implements software timestamping. However the SOF_TIMESTAMPING_TX_SOFTWARE > capability was missing and only SOF_TIMESTAMPING_RX_SOFTWARE was announced. > By using ethtool_op_get_ts_info() as get_ts_info() also the > SOF_TIMESTAMPING_TX_SOFTWARE is announced. > > Signed-off-by: Petr Kulhavy Applied.
Re: [PATCH net 1/1] net sched actions: rename act_get_notify() to tcf_get_notify()
From: Roman Mashak Date: Thu, 13 Jul 2017 13:12:18 -0400 > Make name consistent with other TC event notification routines, such as > tcf_add_notify() and tcf_del_notify() > > Signed-off-by: Roman Mashak Applied.
Re: IGMP snooping, switchdev and local multicast receiver on br interface
Hi All, Andrew Lunn writes: > I've been testing IGMP snooping support with DSA, putting MDB entries > into the switch so that traffic only goes out ports where there has > been an interest indicated via IGMP. It mostly works, but i've come > across one use case which does not. > > I have a multicast listener running on the host, performing a > setsockopt(IP_ADD_MEMBERSHIP) on the bridge interface. It is not an > unreasonable thing to want to do, e.g. a WiFi access point listening > to mDNS, or running other multicast protocols, a STB wanting to > receive a multicast video stream to display on the set, etc. > > I'm not seeing any switchdev operations when the IP_ADD_MEMBERSHIP is > called. So there is no indication that the switch should add an MDB > entry to forward traffic to the host. > > Im i missing something, or is this not implemented? I follow Andrew's question with another multicast issue I'm having: It seems like there is no way to add a multicast group via its MAC address. All iproute2 and kernel bridge code assumes IP multicast (0x0800 IPv4 and 0x86DD IPv6.) But there are valid cases where you might want to add an L2 multicast group on a specific VLAN ID, e.g. for 0x88F7 PTP, 0x88BA Multicast sampled values, one of the 802.1D reserved 01-80-C2-* addresses, or any proprietary protocol addresses. There is the ip-maddress VLAN-unaware tool using RTM_NEWADDR which isn't bound to switchdev, or bridge-mdb which only accepts a IPv4 or IPv6 grp. I tried to hack a PoC in iproute2 (http://ix.io/yuJ) but the kernel counterpart is not trivial at all. *br_mdb_entry only play with br_ip... Any thoughts on this? Regards, Vivien
Re: [PATCH net] net/packet: Fix Tx queue selection for AF_PACKET
From: Iván Briano Date: Thu, 13 Jul 2017 09:46:58 -0700 > When PACKET_QDISC_BYPASS is not used, Tx queue selection will be done > before the packet is enqueued, taking into account any mappings set by > a queuing discipline such as mqprio without hardware offloading. This > selection may be affected by a previously saved queue_mapping, either on > the Rx path, or done before the packet reaches the device, as it's > currently the case for AF_PACKET. > > In order for queue selection to work as expected when using traffic > control, there can't be another selection done before that point is > reached, so move the call to packet_pick_tx_queue to > packet_direct_xmit, leaving the default xmit path as it was before > PACKET_QDISC_BYPASS was introduced. > > A forward declaration of packet_pick_tx_queue() is introduced to avoid > the need to reorder the functions within the file. > > Fixes: d346a3fae3ff ("packet: introduce PACKET_QDISC_BYPASS socket option") > Signed-off-by: Iván Briano Applied, thanks.
Re: [PATCH net v2] cxgb4: ptp_clock_register() returns error pointers
From: Ganesh Goudar Date: Thu, 13 Jul 2017 18:36:50 +0530 > Check ptp_clock_register() return not only for NULL but > also for error pointers, and also nullify adapter->ptp_clock > if ptp_clock_register() fails. > > Fixes: 9c33e4208bce ("cxgb4: Add PTP Hardware Clock (PHC) support") > Reported-by: Dan Carpenter > Cc: Richard Cochran > Signed-off-by: Ganesh Goudar > --- > v2: nullifying adapter->ptp_clock if ptp_clock_register() fails. Applied.
Re: [PATCH net] net: bridge: fix dest lookup when vlan proto doesn't match
From: Nikolay Aleksandrov Date: Thu, 13 Jul 2017 16:09:10 +0300 > With 802.1ad support the vlan_ingress code started checking for vlan > protocol mismatch which causes the current tag to be inserted and the > bridge vlan protocol & pvid to be set. The vlan tag insertion changes > the skb mac_header and thus the lookup mac dest pointer which was loaded > prior to calling br_allowed_ingress in br_handle_frame_finish is VLAN_HLEN > bytes off now, pointing to the last two bytes of the destination mac and > the first four of the source mac causing lookups to always fail and > broadcasting all such packets to all ports. Same thing happens for locally > originated packets when passing via br_dev_xmit. So load the dest pointer > after the vlan checks and possible skb change. > > Fixes: 8580e2117c06 ("bridge: Prepare for 802.1ad vlan filtering support") > Reported-by: Anitha Narasimha Murthy > Signed-off-by: Nikolay Aleksandrov Applied.
Re: [PATCH net] net: hns: add acpi function of xge led control
From: Date: Thu, 13 Jul 2017 18:57:54 +0800 > From: LiuJian > > The current code only support DT method to control xge led. > This patch is the implementation of acpi method to control xge led. > > Signed-off-by: LiuJian > Reviewed-by: John Garry > Reviewed-by: Yunsheng Lin > Reviewed-by: Daode Huang Applied.
Re: [Patch net] netpoll: shut up a kernel warning on refcount
From: Cong Wang Date: Wed, 12 Jul 2017 15:56:41 -0700 > When we convert atomic_t to refcount_t, a new kernel warning > on "increment on 0" is introduced in the netpoll code, > zap_completion_queue(). In fact for this special case, we know > the refcount is 0 and we just have to set it to 1 to satisfy > the following dev_kfree_skb_any(), so we can just use > refcount_set(..., 1) instead. > > Fixes: 633547973ffc ("net: convert sk_buff.users from atomic_t to refcount_t") > Reported-by: Dave Jones > Cc: Reshetova, Elena > Signed-off-by: Cong Wang Applied, thanks Cong.
Re: [PATCH V3] Set NTB format again after altsetting switch for Huawei devices
From: Enrico Mioso Date: Tue, 11 Jul 2017 17:21:52 +0200 > Some firmwares in Huawei E3372H devices have been observed to switch back > to NTB 32-bit format after altsetting switch. > This patch implements a driver flag to check for the device settings and > set NTB format to 16-bit again if needed. > The flag has been activated for devices controlled by the huawei_cdc_ncm.c > driver. > > V1->V2: > - fixed broken error checks > - some corrections to the commit message > V2->V3: > - variable name changes, to clarify what's happening > - check (and possibly set) the NTB format later in the common bind code path > > Signed-off-by: Enrico Mioso > Reported-and-tested-by: Christian Panton > Reviewed-by: Bjørn Mork Applied, thanks.
Re: [PATCH net] mdio: mux: fix parsing mux registers outside of the PHY address range
From: Martin Blumenstingl Date: Mon, 10 Jul 2017 14:35:23 +0200 > mdio_mux_init parses the child nodes of the MDIO mux. When using > "mdio-mux-mmioreg" the child nodes are describing the register value > that is written to switch between the MDIO busses. > > The change which makes the error messages more verbose changed the > parsing of the "reg" property from a simple of_property_read_u32 call > to of_mdio_parse_addr. On a Khadas VIM (based on the Meson GXL SoC, > which uses mdio-mux-mmioreg) this prevents registering the MDIO mux > (because the "reg" values on the MDIO mux child nodes are 0x2009087f > and 0xe40908ff) and leads to the following errors: > mdio-mux-mmioreg c883455c.eth-phy-mux: > /soc/periphs@c8834000/eth-phy-mux/mdio@e40908ff PHY address -469169921 is too > large > mdio-mux-mmioreg c883455c.eth-phy-mux: Error: Failed to find reg for child > /soc/periphs@c8834000/eth-phy-mux/mdio@e40908ff > mdio-mux-mmioreg c883455c.eth-phy-mux: > /soc/periphs@c8834000/eth-phy-mux/mdio@2009087f PHY address 537462911 is too > large > mdio-mux-mmioreg c883455c.eth-phy-mux: Error: Failed to find reg for child > /soc/periphs@c8834000/eth-phy-mux/mdio@2009087f > mdio-mux-mmioreg c883455c.eth-phy-mux: Error: No acceptable child buses > found > mdio-mux-mmioreg c883455c.eth-phy-mux: failed to register mdio-mux bus > /soc/periphs@c8834000/eth-phy-mux > (as a result of that ethernet is not working, because the PHY which is > connected through the mux' child MDIO bus, which is not being > registered). > > Fix this by reverting the change from of_mdio_parse_addr to > of_mdio_parse_addr. > > Fixes: 342fa1964439 ("mdio: mux: make child bus walking more permissive and > errors more verbose") > Signed-off-by: Martin Blumenstingl Applied, thanks.
IGMP snooping, switchdev and local multicast receiver on br interface
Hi Folks I've been testing IGMP snooping support with DSA, putting MDB entries into the switch so that traffic only goes out ports where there has been an interest indicated via IGMP. It mostly works, but i've come across one use case which does not. I have a multicast listener running on the host, performing a setsockopt(IP_ADD_MEMBERSHIP) on the bridge interface. It is not an unreasonable thing to want to do, e.g. a WiFi access point listening to mDNS, or running other multicast protocols, a STB wanting to receive a multicast video stream to display on the set, etc. I'm not seeing any switchdev operations when the IP_ADD_MEMBERSHIP is called. So there is no indication that the switch should add an MDB entry to forward traffic to the host. Im i missing something, or is this not implemented? Thanks Andrew
[PATCH net] sctp: fix an array overflow when all ext chunks are set
Marcelo noticed an array overflow caused by commit c28445c3cb07 ("sctp: add reconf_enable in asoc ep and netns"), in which sctp would add SCTP_CID_RECONF into extensions when reconf_enable is set in sctp_make_init and sctp_make_init_ack. Then now when all ext chunks are set, 4 ext chunk ids can be put into extensions array while extensions array size is 3. It would cause a kernel panic because of this overflow. This patch is to fix it by defining extensions array size is 4 in both sctp_make_init and sctp_make_init_ack. Fixes: c28445c3cb07 ("sctp: add reconf_enable in asoc ep and netns") Signed-off-by: Xin Long --- net/sctp/sm_make_chunk.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c index 4e16b02..6110447 100644 --- a/net/sctp/sm_make_chunk.c +++ b/net/sctp/sm_make_chunk.c @@ -228,7 +228,7 @@ struct sctp_chunk *sctp_make_init(const struct sctp_association *asoc, sctp_adaptation_ind_param_t aiparam; sctp_supported_ext_param_t ext_param; int num_ext = 0; - __u8 extensions[3]; + __u8 extensions[4]; struct sctp_paramhdr *auth_chunks = NULL, *auth_hmacs = NULL; @@ -396,7 +396,7 @@ struct sctp_chunk *sctp_make_init_ack(const struct sctp_association *asoc, sctp_adaptation_ind_param_t aiparam; sctp_supported_ext_param_t ext_param; int num_ext = 0; - __u8 extensions[3]; + __u8 extensions[4]; struct sctp_paramhdr *auth_chunks = NULL, *auth_hmacs = NULL, *auth_random = NULL; -- 2.1.0
Re: [PATCH 15/22] hwmon: applesmc: fix format string overflow
On 07/14/2017 05:07 AM, Arnd Bergmann wrote: gcc-7 warns that the key might exceed five bytes for lage index values: drivers/hwmon/applesmc.c: In function 'applesmc_show_fan_position': drivers/hwmon/applesmc.c:906:18: error: '%d' directive writing between 1 and 5 bytes into a region of size 4 [-Werror=format-overflow=] sprintf(newkey, FAN_ID_FMT, to_index(attr)); ^~~ drivers/hwmon/applesmc.c:906:18: note: directive argument in the range [0, 65535] drivers/hwmon/applesmc.c:906:2: note: 'sprintf' output between 5 and 9 bytes into a destination of size 5 As the key is required to be four characters plus trailing zero, we know that the index has to be small here. I'm using snprintf() to avoid the warning. This would truncate the string instead of overflowing. Signed-off-by: Arnd Bergmann I submitted a more comprehensive patch a couple of days ago. There are other similar sprintf() calls in the driver which gcc doesn't report. Guenter --- drivers/hwmon/applesmc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 0af7fd311979..515163b9a89f 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -903,7 +903,7 @@ static ssize_t applesmc_show_fan_position(struct device *dev, char newkey[5]; u8 buffer[17]; - sprintf(newkey, FAN_ID_FMT, to_index(attr)); + snprintf(newkey, sizeof(newkey), FAN_ID_FMT, to_index(attr)); ret = applesmc_read_key(newkey, buffer, 16); buffer[16] = 0;
Re: [PATCH 19/22] block: DAC960: shut up format-overflow warning
On 07/14/2017 06:07 AM, Arnd Bergmann wrote: > gcc-7 points out that a large controller number would overflow the > string length for the procfs name and the firmware version string: > > drivers/block/DAC960.c: In function 'DAC960_Probe': > drivers/block/DAC960.c:6591:38: warning: 'sprintf' may write a terminating > nul past the end of the destination [-Wformat-overflow=] > drivers/block/DAC960.c: In function 'DAC960_V1_ReadControllerConfiguration': > drivers/block/DAC960.c:1681:40: error: '%02d' directive writing between 2 and > 3 bytes into a region of size between 2 and 5 [-Werror=format-overflow=] > drivers/block/DAC960.c:1681:40: note: directive argument in the range [0, 255] > drivers/block/DAC960.c:1681:3: note: 'sprintf' output between 10 and 14 bytes > into a destination of size 12 > > Both of these seem appropriately sized, and using snprintf() > instead of sprintf() improves this by ensuring that even > incorrect data won't cause undefined behavior here. Thanks Arnd, added for 4.14. -- Jens Axboe
Re: [PATCH v7 2/3] PCI: Enable PCIe Relaxed Ordering if supported
On 7/13/2017 9:26 PM, Ding Tianhong wrote: > There is no code to enable the PCIe Relaxed Ordering bit in the configuration > space, > it is only be enable by default according to the PCIe Standard Specification, > what we > do is to distinguish the RC problematic platform and clear the Relaxed > Ordering bit > to tell the PCIe EP don't send any TLPs with Relaxed Ordering Attributes to > the Root > Complex. Maybe, you should change the patch commit as "Disable PCIe Relaxed Ordering if not supported"... -- Sinan Kaya Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH 22/22] IB/mlx4: fix sprintf format warning
On Fri, Jul 14, 2017 at 02:07:14PM +0200, Arnd Bergmann wrote: > gcc-7 points out that a negative port_num value would overflow > the string buffer: > > drivers/infiniband/hw/mlx4/sysfs.c: In function > 'mlx4_ib_device_register_sysfs': > drivers/infiniband/hw/mlx4/sysfs.c:251:16: error: 'sprintf' may write a > terminating nul past the end of the destination [-Werror=format-overflow=] > drivers/infiniband/hw/mlx4/sysfs.c:251:2: note: 'sprintf' output between 2 > and 11 bytes into a destination of size 10 > drivers/infiniband/hw/mlx4/sysfs.c:303:17: error: 'sprintf' may write a > terminating nul past the end of the destination [-Werror=format-overflow=] > drivers/infiniband/hw/mlx4/sysfs.c:303:3: note: 'sprintf' output between 2 > and 11 bytes into a destination of size 10 > > While we should be able to assume that port_num is positive here, > making the buffer one byte longer has no downsides and avoids the > warning. > > Fixes: c1e7e466120b ("IB/mlx4: Add iov directory in sysfs under the ib > device") > Signed-off-by: Arnd Bergmann > --- > drivers/infiniband/hw/mlx4/sysfs.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > Thanks, Reviewed-by: Leon Romanovsky signature.asc Description: PGP signature
Re: [PATCH 18/22] gpio: acpi: fix string overflow for large pin numbers
On Fri, 2017-07-14 at 14:07 +0200, Arnd Bergmann wrote: > gcc-7 notices that the pin_table is an array of 16-bit numbers, > but we assume it can be printed as a two-character hexadecimal > string: > > drivers/gpio/gpiolib-acpi.c: In function > 'acpi_gpiochip_request_interrupt': > drivers/gpio/gpiolib-acpi.c:206:24: warning: '%02X' directive writing > between 2 and 4 bytes into a region of size 3 [-Wformat-overflow=] > sprintf(ev_name, "_%c%02X", > ^~~~ > drivers/gpio/gpiolib-acpi.c:206:20: note: directive argument in the > range [0, 65535] > sprintf(ev_name, "_%c%02X", > ^ > drivers/gpio/gpiolib-acpi.c:206:3: note: 'sprintf' output between 5 > and 7 bytes into a destination of size 5 > sprintf(ev_name, "_%c%02X", > ^~~ > agpio->triggering == ACPI_EDGE_SENSITIVE ? 'E' : 'L', > ~ > pin); > This is obviously a false positive warning. Here we have int pin = u16 pin_table[0] <= 255 (implying >= 0). I see few options how to make it more clear 1) your proposal; 2) use "%02hhX" instead; 3) use if (ret >= 0 && ret <= 255) condition. I would choose one of the 2-3. In case gcc will complain about 3), file a bug to gcc crazy warning. > > This can't be right, so this changes it to truncate the number to > an 8-bit pin number. > > Fixes: 0d1c28a449c6 ("gpiolib-acpi: Add ACPI5 event model support to > gpio.") > Signed-off-by: Arnd Bergmann > --- > drivers/gpio/gpiolib-acpi.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpio/gpiolib-acpi.c b/drivers/gpio/gpiolib-acpi.c > index c9b42dd12dfa..c3faea724af8 100644 > --- a/drivers/gpio/gpiolib-acpi.c > +++ b/drivers/gpio/gpiolib-acpi.c > @@ -205,7 +205,7 @@ static acpi_status > acpi_gpiochip_request_interrupt(struct acpi_resource *ares, > char ev_name[5]; > sprintf(ev_name, "_%c%02X", > agpio->triggering == ACPI_EDGE_SENSITIVE ? > 'E' : 'L', > - pin); > + (u8)pin); > if (ACPI_SUCCESS(acpi_get_handle(handle, ev_name, > &evt_handle))) > handler = acpi_gpio_irq_handler; > } -- Andy Shevchenko Intel Finland Oy
Re: [PATCH 11/22] net: thunder_bgx: avoid format string overflow warning
On 14/07/17 13:07, Arnd Bergmann wrote: > gcc warns that the temporary buffer might be too small here: > > drivers/net/ethernet/cavium/thunder/thunder_bgx.c: In function 'bgx_probe': > drivers/net/ethernet/cavium/thunder/thunder_bgx.c:1020:16: error: '%d' > directive writing between 1 and 10 bytes into a region of size between 9 and > 11 [-Werror=format-overflow=] > sprintf(str, "BGX%d LMAC%d mode", bgx->bgx_id, lmacid); > ^~~ > drivers/net/ethernet/cavium/thunder/thunder_bgx.c:1020:16: note: directive > argument in the range [0, 2147483647] > drivers/net/ethernet/cavium/thunder/thunder_bgx.c:1020:3: note: 'sprintf' > output between 16 and 27 bytes into a destination of size 20 > > This probably can't happen, but it can't hurt to make it long > enough for the theoretical limit. Probably indeed - both bgx_id and lmacid are u8 here, which would make the maximum length of that string, including null terminator, exactly 20 characters. So in this case the warning is not only silly, it's actively wrong; sure, the arguments themselves are being promoted to ints at that point, but GCC *knows* the original type, or it couldn't have generated the correct code for the call :/ Robin. > Signed-off-by: Arnd Bergmann > --- > drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c > b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c > index a0ca68ce3fbb..79112563a25a 100644 > --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c > +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c > @@ -1008,7 +1008,7 @@ static void bgx_print_qlm_mode(struct bgx *bgx, u8 > lmacid) > { > struct device *dev = &bgx->pdev->dev; > struct lmac *lmac; > - char str[20]; > + char str[27]; > > if (!bgx->is_dlm && lmacid) > return; >
Re: [PATCH 20/22] sound: pci: avoid string overflow warnings
On Fri, 14 Jul 2017 14:07:12 +0200, Arnd Bergmann wrote: > > With gcc-7, we get various warnings about a possible string overflow: > > sound/pci/rme9652/hdspm.c: In function 'snd_hdspm_create_alsa_devices': > sound/pci/rme9652/hdspm.c:2123:17: error: ' MIDIoverMADI' directive writing > 13 bytes into a region of size between 1 and 32 [-Werror=format-overflow=] > sound/pci/pcxhr/pcxhr.c: In function 'pcxhr_probe': > sound/pci/pcxhr/pcxhr.c:1647:28: error: ' [PCM #' directive writing 7 bytes > into a region of size between 1 and 32 [-Werror=format-overflow=] > sound/pci/mixart/mixart.c: In function 'snd_mixart_probe': > sound/pci/mixart/mixart.c:1353:28: error: ' [PCM #' directive writing 7 bytes > into a region of size between 1 and 32 [-Werror=format-overflow=] >sprintf(card->shortname, "%s [PCM #%d]", mgr->shortname, i); > ^~ > sound/pci/mixart/mixart.c:1353:28: note: using the range [-2147483648, > 2147483647] for directive argument > sound/pci/mixart/mixart.c:1353:3: note: 'sprintf' output between 10 and 51 > bytes into a destination of size 32 >sprintf(card->shortname, "%s [PCM #%d]", mgr->shortname, i); >^~~ > sound/pci/mixart/mixart.c:1354:27: error: ' [PCM #' directive writing 7 bytes > into a region of size between 1 and 80 [-Werror=format-overflow=] >sprintf(card->longname, "%s [PCM #%d]", mgr->longname, i); >^~ > sound/pci/mixart/mixart.c:1354:27: note: using the range [-2147483648, > 2147483647] for directive argument > sound/pci/mixart/mixart.c:1354:3: note: 'sprintf' output between 10 and 99 > bytes into a destination of size 80 > > I have checked these all and found that the driver-private > shortname strings for mixart and pcxhr are longer than necessary, > and making them shorter will be safe while also making it clear > that no overflow can happen when they get passed as a substring > into the card shortname. > > For hdspm, we have a local buffer of the same size as its substring. > In this case, making the buffer a little longer is safe as the > functions that take it as an argument all use length checking and > the strings we pass into it are actually short enough. > > Signed-off-by: Arnd Bergmann Thanks for the patch. I have seen it but ignored, so far, as not sure which action is the best. An alternative solution is to use snprintf() blindly, for example. For mixart, it's even better to drop mgr->shortname[] and longname[] assignment. The shortname is the fixed string, and the longname is used only at copying to card->longname, so we can create a string there from the scratch. Takashi > --- > sound/pci/mixart/mixart.h | 4 ++-- > sound/pci/pcxhr/pcxhr.h | 4 ++-- > sound/pci/rme9652/hdspm.c | 2 +- > 3 files changed, 5 insertions(+), 5 deletions(-) > > diff --git a/sound/pci/mixart/mixart.h b/sound/pci/mixart/mixart.h > index 426743871540..c8309e327663 100644 > --- a/sound/pci/mixart/mixart.h > +++ b/sound/pci/mixart/mixart.h > @@ -75,8 +75,8 @@ struct mixart_mgr { > struct mem_area mem[2]; > > /* share the name */ > - char shortname[32]; /* short name of this soundcard */ > - char longname[80]; /* name of this soundcard */ > + char shortname[16]; /* short name of this soundcard */ > + char longname[40]; /* name of this soundcard */ > > /* one and only blocking message or notification may be pending */ > u32 pending_event; > diff --git a/sound/pci/pcxhr/pcxhr.h b/sound/pci/pcxhr/pcxhr.h > index 9e39e509a3ef..4909a43ce3d9 100644 > --- a/sound/pci/pcxhr/pcxhr.h > +++ b/sound/pci/pcxhr/pcxhr.h > @@ -75,8 +75,8 @@ struct pcxhr_mgr { > unsigned long port[3]; > > /* share the name */ > - char shortname[32]; /* short name of this soundcard */ > - char longname[96]; /* name of this soundcard */ > + char shortname[16]; /* short name of this soundcard */ > + char longname[40]; /* name of this soundcard */ > > struct pcxhr_rmh *prmh; > > diff --git a/sound/pci/rme9652/hdspm.c b/sound/pci/rme9652/hdspm.c > index 254c3d040118..a1cbf5938a0e 100644 > --- a/sound/pci/rme9652/hdspm.c > +++ b/sound/pci/rme9652/hdspm.c > @@ -2061,7 +2061,7 @@ static int snd_hdspm_create_midi(struct snd_card *card, >struct hdspm *hdspm, int id) > { > int err; > - char buf[32]; > + char buf[64]; > > hdspm->midi[id].id = id; > hdspm->midi[id].hdspm = hdspm; > -- > 2.9.0 > >
[PATCH 07/22] scsi: gdth: increase the procfs event buffer size
We print a 256 byte event string into a buffer that is only 161 bytes long, this is clearly wrong: drivers/scsi/gdth_proc.c: In function 'gdth_show_info': drivers/scsi/gdth.c:3660:41: error: '%s' directive writing up to 255 bytes into a region of size between 141 and 150 [-Werror=format-overflow=] sprintf(buffer,"Adapter %d: %s\n", ^~ /git/arm-soc/drivers/scsi/gdth.c:3660:13: note: 'sprintf' output between 13 and 277 bytes into a destination of size 161 sprintf(buffer,"Adapter %d: %s\n", ^~ dvr->eu.async.ionode,dvr->event_string); ~~~ gcc calculates that the worst case buffer size would be 277 bytes, so we can use that. Signed-off-by: Arnd Bergmann --- drivers/scsi/gdth_proc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/gdth_proc.c b/drivers/scsi/gdth_proc.c index be609db66807..d08b2716752c 100644 --- a/drivers/scsi/gdth_proc.c +++ b/drivers/scsi/gdth_proc.c @@ -147,7 +147,7 @@ int gdth_show_info(struct seq_file *m, struct Scsi_Host *host) gdth_cmd_str *gdtcmd; gdth_evt_str *estr; -char hrec[161]; +char hrec[277]; char *buf; gdth_dskstat_str *pds; -- 2.9.0
[PATCH 05/22] scsi: gdth: avoid buffer overflow warning
gcc notices that we would overflow the buffer for the inquiry of the product name if we have too many adapters: drivers/scsi/gdth.c: In function 'gdth_next': drivers/scsi/gdth.c:2357:29: warning: 'sprintf' may write a terminating nul past the end of the destination [-Wformat-overflow=] sprintf(inq.product,"Host Drive #%02d",t); ^~~ drivers/scsi/gdth.c:2357:9: note: 'sprintf' output between 16 and 17 bytes into a destination of size 16 sprintf(inq.product,"Host Drive #%02d",t); This won't happen in practice, so just use snprintf to truncate the string. Signed-off-by: Arnd Bergmann --- drivers/scsi/gdth.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c index facc7271f932..a4473356a9dc 100644 --- a/drivers/scsi/gdth.c +++ b/drivers/scsi/gdth.c @@ -2354,7 +2354,7 @@ static int gdth_internal_cache_cmd(gdth_ha_str *ha, Scsi_Cmnd *scp) inq.resp_aenc = 2; inq.add_length= 32; strcpy(inq.vendor,ha->oem_name); -sprintf(inq.product,"Host Drive #%02d",t); +snprintf(inq.product, sizeof(inq.product), "Host Drive #%02d",t); strcpy(inq.revision," "); gdth_copy_internal_data(ha, scp, (char*)&inq, sizeof(gdth_inq_data)); break; -- 2.9.0
[PATCH 14/22] [media] usbvision-i2c: fix format overflow warning
gcc-7 notices that we copy a fixed length string into another string of the same size, with additional characters: drivers/media/usb/usbvision/usbvision-i2c.c: In function 'usbvision_i2c_register': drivers/media/usb/usbvision/usbvision-i2c.c:190:36: error: '%d' directive writing between 1 and 11 bytes into a region of size between 0 and 47 [-Werror=format-overflow=] sprintf(usbvision->i2c_adap.name, "%s-%d-%s", i2c_adap_template.name, ^~ drivers/media/usb/usbvision/usbvision-i2c.c:190:2: note: 'sprintf' output between 4 and 76 bytes into a destination of size 48 We know this is fine as the template name is always "usbvision", so we can easily avoid the warning by using this as the format string directly. Signed-off-by: Arnd Bergmann --- drivers/media/usb/usbvision/usbvision-i2c.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/media/usb/usbvision/usbvision-i2c.c b/drivers/media/usb/usbvision/usbvision-i2c.c index fdf6b6e285da..aae9f69884da 100644 --- a/drivers/media/usb/usbvision/usbvision-i2c.c +++ b/drivers/media/usb/usbvision/usbvision-i2c.c @@ -187,8 +187,8 @@ int usbvision_i2c_register(struct usb_usbvision *usbvision) usbvision->i2c_adap = i2c_adap_template; - sprintf(usbvision->i2c_adap.name, "%s-%d-%s", i2c_adap_template.name, - usbvision->dev->bus->busnum, usbvision->dev->devpath); + sprintf(usbvision->i2c_adap.name, "usbvision-%d-%s", +usbvision->dev->bus->busnum, usbvision->dev->devpath); PDEBUG(DBG_I2C, "Adaptername: %s", usbvision->i2c_adap.name); usbvision->i2c_adap.dev.parent = &usbvision->dev->dev; -- 2.9.0
[PATCH 20/22] sound: pci: avoid string overflow warnings
With gcc-7, we get various warnings about a possible string overflow: sound/pci/rme9652/hdspm.c: In function 'snd_hdspm_create_alsa_devices': sound/pci/rme9652/hdspm.c:2123:17: error: ' MIDIoverMADI' directive writing 13 bytes into a region of size between 1 and 32 [-Werror=format-overflow=] sound/pci/pcxhr/pcxhr.c: In function 'pcxhr_probe': sound/pci/pcxhr/pcxhr.c:1647:28: error: ' [PCM #' directive writing 7 bytes into a region of size between 1 and 32 [-Werror=format-overflow=] sound/pci/mixart/mixart.c: In function 'snd_mixart_probe': sound/pci/mixart/mixart.c:1353:28: error: ' [PCM #' directive writing 7 bytes into a region of size between 1 and 32 [-Werror=format-overflow=] sprintf(card->shortname, "%s [PCM #%d]", mgr->shortname, i); ^~ sound/pci/mixart/mixart.c:1353:28: note: using the range [-2147483648, 2147483647] for directive argument sound/pci/mixart/mixart.c:1353:3: note: 'sprintf' output between 10 and 51 bytes into a destination of size 32 sprintf(card->shortname, "%s [PCM #%d]", mgr->shortname, i); ^~~ sound/pci/mixart/mixart.c:1354:27: error: ' [PCM #' directive writing 7 bytes into a region of size between 1 and 80 [-Werror=format-overflow=] sprintf(card->longname, "%s [PCM #%d]", mgr->longname, i); ^~ sound/pci/mixart/mixart.c:1354:27: note: using the range [-2147483648, 2147483647] for directive argument sound/pci/mixart/mixart.c:1354:3: note: 'sprintf' output between 10 and 99 bytes into a destination of size 80 I have checked these all and found that the driver-private shortname strings for mixart and pcxhr are longer than necessary, and making them shorter will be safe while also making it clear that no overflow can happen when they get passed as a substring into the card shortname. For hdspm, we have a local buffer of the same size as its substring. In this case, making the buffer a little longer is safe as the functions that take it as an argument all use length checking and the strings we pass into it are actually short enough. Signed-off-by: Arnd Bergmann --- sound/pci/mixart/mixart.h | 4 ++-- sound/pci/pcxhr/pcxhr.h | 4 ++-- sound/pci/rme9652/hdspm.c | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/sound/pci/mixart/mixart.h b/sound/pci/mixart/mixart.h index 426743871540..c8309e327663 100644 --- a/sound/pci/mixart/mixart.h +++ b/sound/pci/mixart/mixart.h @@ -75,8 +75,8 @@ struct mixart_mgr { struct mem_area mem[2]; /* share the name */ - char shortname[32]; /* short name of this soundcard */ - char longname[80]; /* name of this soundcard */ + char shortname[16]; /* short name of this soundcard */ + char longname[40]; /* name of this soundcard */ /* one and only blocking message or notification may be pending */ u32 pending_event; diff --git a/sound/pci/pcxhr/pcxhr.h b/sound/pci/pcxhr/pcxhr.h index 9e39e509a3ef..4909a43ce3d9 100644 --- a/sound/pci/pcxhr/pcxhr.h +++ b/sound/pci/pcxhr/pcxhr.h @@ -75,8 +75,8 @@ struct pcxhr_mgr { unsigned long port[3]; /* share the name */ - char shortname[32]; /* short name of this soundcard */ - char longname[96]; /* name of this soundcard */ + char shortname[16]; /* short name of this soundcard */ + char longname[40]; /* name of this soundcard */ struct pcxhr_rmh *prmh; diff --git a/sound/pci/rme9652/hdspm.c b/sound/pci/rme9652/hdspm.c index 254c3d040118..a1cbf5938a0e 100644 --- a/sound/pci/rme9652/hdspm.c +++ b/sound/pci/rme9652/hdspm.c @@ -2061,7 +2061,7 @@ static int snd_hdspm_create_midi(struct snd_card *card, struct hdspm *hdspm, int id) { int err; - char buf[32]; + char buf[64]; hdspm->midi[id].id = id; hdspm->midi[id].hdspm = hdspm; -- 2.9.0
[PATCH 22/22] IB/mlx4: fix sprintf format warning
gcc-7 points out that a negative port_num value would overflow the string buffer: drivers/infiniband/hw/mlx4/sysfs.c: In function 'mlx4_ib_device_register_sysfs': drivers/infiniband/hw/mlx4/sysfs.c:251:16: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=] drivers/infiniband/hw/mlx4/sysfs.c:251:2: note: 'sprintf' output between 2 and 11 bytes into a destination of size 10 drivers/infiniband/hw/mlx4/sysfs.c:303:17: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=] drivers/infiniband/hw/mlx4/sysfs.c:303:3: note: 'sprintf' output between 2 and 11 bytes into a destination of size 10 While we should be able to assume that port_num is positive here, making the buffer one byte longer has no downsides and avoids the warning. Fixes: c1e7e466120b ("IB/mlx4: Add iov directory in sysfs under the ib device") Signed-off-by: Arnd Bergmann --- drivers/infiniband/hw/mlx4/sysfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/infiniband/hw/mlx4/sysfs.c b/drivers/infiniband/hw/mlx4/sysfs.c index 0ba5ba7540c8..e219093d2764 100644 --- a/drivers/infiniband/hw/mlx4/sysfs.c +++ b/drivers/infiniband/hw/mlx4/sysfs.c @@ -221,7 +221,7 @@ void del_sysfs_port_mcg_attr(struct mlx4_ib_dev *device, int port_num, static int add_port_entries(struct mlx4_ib_dev *device, int port_num) { int i; - char buff[10]; + char buff[11]; struct mlx4_ib_iov_port *port = NULL; int ret = 0 ; struct ib_port_attr attr; -- 2.9.0
[PATCH 21/22] fscache: fix fscache_objlist_show format processing
gcc points out a minor bug in the handling of unknown cookie types, which could result in a string overflow when the integer is copied into a 3-byte string: fs/fscache/object-list.c: In function 'fscache_objlist_show': fs/fscache/object-list.c:265:19: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=] sprintf(_type, "%02u", cookie->def->type); ^~ fs/fscache/object-list.c:265:4: note: 'sprintf' output between 3 and 4 bytes into a destination of size 3 This is currently harmless as no code sets a type other than 0 or 1, but it makes sense to use snprintf() here to avoid overflowing the array if that changes. Signed-off-by: Arnd Bergmann --- fs/fscache/object-list.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/fscache/object-list.c b/fs/fscache/object-list.c index 67f940892ef8..b5ab06fabc60 100644 --- a/fs/fscache/object-list.c +++ b/fs/fscache/object-list.c @@ -262,7 +262,8 @@ static int fscache_objlist_show(struct seq_file *m, void *v) type = "DT"; break; default: - sprintf(_type, "%02u", cookie->def->type); + snprintf(_type, sizeof(_type), "%02u", +cookie->def->type); type = _type; break; } -- 2.9.0
[PATCH 16/22] x86: intel-mid: fix a format string overflow warning
We have space for exactly one character for the index in "max7315_%d_base", but as gcc points out having more would cause an string overflow: arch/x86/platform/intel-mid/device_libs/platform_max7315.c: In function 'max7315_platform_data': arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:26: error: '%d' directive writing between 1 and 11 bytes into a region of size 9 [-Werror=format-overflow=] sprintf(base_pin_name, "max7315_%d_base", nr); ^ arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:26: note: directive argument in the range [-2147483647, 2147483647] arm-soc/arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:3: note: 'sprintf' output between 15 and 25 bytes into a destination of size 17 sprintf(base_pin_name, "max7315_%d_base", nr); This makes it use an snprintf() to truncate the string if that happened rather than overflowing the stack. Signed-off-by: Arnd Bergmann --- arch/x86/platform/intel-mid/device_libs/platform_max7315.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/platform/intel-mid/device_libs/platform_max7315.c b/arch/x86/platform/intel-mid/device_libs/platform_max7315.c index 6e075afa7877..58337b2bc682 100644 --- a/arch/x86/platform/intel-mid/device_libs/platform_max7315.c +++ b/arch/x86/platform/intel-mid/device_libs/platform_max7315.c @@ -38,8 +38,10 @@ static void __init *max7315_platform_data(void *info) */ strcpy(i2c_info->type, "max7315"); if (nr++) { - sprintf(base_pin_name, "max7315_%d_base", nr); - sprintf(intr_pin_name, "max7315_%d_int", nr); + snprintf(base_pin_name, sizeof(base_pin_name), +"max7315_%d_base", nr); + snprintf(intr_pin_name, sizeof(intr_pin_name), +"max7315_%d_int", nr); } else { strcpy(base_pin_name, "max7315_base"); strcpy(intr_pin_name, "max7315_int"); -- 2.9.0
[PATCH 19/22] block: DAC960: shut up format-overflow warning
gcc-7 points out that a large controller number would overflow the string length for the procfs name and the firmware version string: drivers/block/DAC960.c: In function 'DAC960_Probe': drivers/block/DAC960.c:6591:38: warning: 'sprintf' may write a terminating nul past the end of the destination [-Wformat-overflow=] drivers/block/DAC960.c: In function 'DAC960_V1_ReadControllerConfiguration': drivers/block/DAC960.c:1681:40: error: '%02d' directive writing between 2 and 3 bytes into a region of size between 2 and 5 [-Werror=format-overflow=] drivers/block/DAC960.c:1681:40: note: directive argument in the range [0, 255] drivers/block/DAC960.c:1681:3: note: 'sprintf' output between 10 and 14 bytes into a destination of size 12 Both of these seem appropriately sized, and using snprintf() instead of sprintf() improves this by ensuring that even incorrect data won't cause undefined behavior here. Signed-off-by: Arnd Bergmann --- drivers/block/DAC960.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/block/DAC960.c b/drivers/block/DAC960.c index 245a879b036e..255591ab3716 100644 --- a/drivers/block/DAC960.c +++ b/drivers/block/DAC960.c @@ -1678,9 +1678,12 @@ static bool DAC960_V1_ReadControllerConfiguration(DAC960_Controller_T Enquiry2->FirmwareID.FirmwareType = '0'; Enquiry2->FirmwareID.TurnID = 0; } - sprintf(Controller->FirmwareVersion, "%d.%02d-%c-%02d", - Enquiry2->FirmwareID.MajorVersion, Enquiry2->FirmwareID.MinorVersion, - Enquiry2->FirmwareID.FirmwareType, Enquiry2->FirmwareID.TurnID); + snprintf(Controller->FirmwareVersion, sizeof(Controller->FirmwareVersion), + "%d.%02d-%c-%02d", + Enquiry2->FirmwareID.MajorVersion, + Enquiry2->FirmwareID.MinorVersion, + Enquiry2->FirmwareID.FirmwareType, + Enquiry2->FirmwareID.TurnID); if (!((Controller->FirmwareVersion[0] == '5' && strcmp(Controller->FirmwareVersion, "5.06") >= 0) || (Controller->FirmwareVersion[0] == '4' && @@ -6588,7 +6591,8 @@ static void DAC960_CreateProcEntries(DAC960_Controller_T *Controller) &dac960_proc_fops); } - sprintf(Controller->ControllerName, "c%d", Controller->ControllerNumber); + snprintf(Controller->ControllerName, sizeof(Controller->ControllerName), +"c%d", Controller->ControllerNumber); ControllerProcEntry = proc_mkdir(Controller->ControllerName, DAC960_ProcDirectoryEntry); proc_create_data("initial_status", 0, ControllerProcEntry, &dac960_initial_status_proc_fops, Controller); -- 2.9.0
[PATCH 18/22] gpio: acpi: fix string overflow for large pin numbers
gcc-7 notices that the pin_table is an array of 16-bit numbers, but we assume it can be printed as a two-character hexadecimal string: drivers/gpio/gpiolib-acpi.c: In function 'acpi_gpiochip_request_interrupt': drivers/gpio/gpiolib-acpi.c:206:24: warning: '%02X' directive writing between 2 and 4 bytes into a region of size 3 [-Wformat-overflow=] sprintf(ev_name, "_%c%02X", ^~~~ drivers/gpio/gpiolib-acpi.c:206:20: note: directive argument in the range [0, 65535] sprintf(ev_name, "_%c%02X", ^ drivers/gpio/gpiolib-acpi.c:206:3: note: 'sprintf' output between 5 and 7 bytes into a destination of size 5 sprintf(ev_name, "_%c%02X", ^~~ agpio->triggering == ACPI_EDGE_SENSITIVE ? 'E' : 'L', ~ pin); This can't be right, so this changes it to truncate the number to an 8-bit pin number. Fixes: 0d1c28a449c6 ("gpiolib-acpi: Add ACPI5 event model support to gpio.") Signed-off-by: Arnd Bergmann --- drivers/gpio/gpiolib-acpi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpio/gpiolib-acpi.c b/drivers/gpio/gpiolib-acpi.c index c9b42dd12dfa..c3faea724af8 100644 --- a/drivers/gpio/gpiolib-acpi.c +++ b/drivers/gpio/gpiolib-acpi.c @@ -205,7 +205,7 @@ static acpi_status acpi_gpiochip_request_interrupt(struct acpi_resource *ares, char ev_name[5]; sprintf(ev_name, "_%c%02X", agpio->triggering == ACPI_EDGE_SENSITIVE ? 'E' : 'L', - pin); + (u8)pin); if (ACPI_SUCCESS(acpi_get_handle(handle, ev_name, &evt_handle))) handler = acpi_gpio_irq_handler; } -- 2.9.0
[PATCH 17/22] platform/x86: alienware-wmi: fix format string overflow warning
gcc points out a possible format string overflow for a large value of 'zone': drivers/platform/x86/alienware-wmi.c: In function 'alienware_wmi_init': drivers/platform/x86/alienware-wmi.c:461:24: error: '%02X' directive writing between 2 and 8 bytes into a region of size 6 [-Werror=format-overflow=] sprintf(buffer, "zone%02X", i); ^~~~ drivers/platform/x86/alienware-wmi.c:461:19: note: directive argument in the range [0, 2147483646] sprintf(buffer, "zone%02X", i); ^~ drivers/platform/x86/alienware-wmi.c:461:3: note: 'sprintf' output between 7 and 13 bytes into a destination of size 10 While the zone should never be that large, it's easy to make the buffer a few bytes longer so gcc can prove this to be safe. Signed-off-by: Arnd Bergmann --- drivers/platform/x86/alienware-wmi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/platform/x86/alienware-wmi.c b/drivers/platform/x86/alienware-wmi.c index 0831b428c217..acc01242da82 100644 --- a/drivers/platform/x86/alienware-wmi.c +++ b/drivers/platform/x86/alienware-wmi.c @@ -421,7 +421,7 @@ static DEVICE_ATTR(lighting_control_state, 0644, show_control_state, static int alienware_zone_init(struct platform_device *dev) { int i; - char buffer[10]; + char buffer[13]; char *name; if (interface == WMAX) { -- 2.9.0
[PATCH 15/22] hwmon: applesmc: fix format string overflow
gcc-7 warns that the key might exceed five bytes for lage index values: drivers/hwmon/applesmc.c: In function 'applesmc_show_fan_position': drivers/hwmon/applesmc.c:906:18: error: '%d' directive writing between 1 and 5 bytes into a region of size 4 [-Werror=format-overflow=] sprintf(newkey, FAN_ID_FMT, to_index(attr)); ^~~ drivers/hwmon/applesmc.c:906:18: note: directive argument in the range [0, 65535] drivers/hwmon/applesmc.c:906:2: note: 'sprintf' output between 5 and 9 bytes into a destination of size 5 As the key is required to be four characters plus trailing zero, we know that the index has to be small here. I'm using snprintf() to avoid the warning. This would truncate the string instead of overflowing. Signed-off-by: Arnd Bergmann --- drivers/hwmon/applesmc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 0af7fd311979..515163b9a89f 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -903,7 +903,7 @@ static ssize_t applesmc_show_fan_position(struct device *dev, char newkey[5]; u8 buffer[17]; - sprintf(newkey, FAN_ID_FMT, to_index(attr)); + snprintf(newkey, sizeof(newkey), FAN_ID_FMT, to_index(attr)); ret = applesmc_read_key(newkey, buffer, 16); buffer[16] = 0; -- 2.9.0