date:20170710

Re: [patch net-next RFC 0/4] net: sched: allow qdiscs to share filter block instances

2017-07-10 Thread Or Gerlitz

On Mon, Jul 10, 2017 at 9:51 PM, Jiri Pirko  wrote:
> From: Jiri Pirko 
>
> Currently the filters added to qdiscs are independent. So for example if you
> have 2 netdevices and you create ingress qdisc on both and you want to add
> identical filter rules to both, you need to add them twice. This patchset
> makes this easier and mainly saves resources by allowing all filters within
> a qdisc to be shared - I call it a "filter block". This also helps to save
> resources when we offload to hw, for example to expensive TCAM.

[...]


> Now we can add filter to any of qdiscs sharing the same block:
> $ tc filter add dev ens7 parent : protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop

> We will see the same output if we list filters for ens7 and ens8, including stats:

Yeah, but the stats belong to the action, not the filter, right? So are you
actually creating a shared action here? Or did this series also introduce
stats for filters? I am confused...

Or.

Re: [PATCH v2] mrf24j40: Fix en error handling path in 'mrf24j40_probe()'

2017-07-10 Thread Marcel Holtmann

Hi Christophe,

> If this check fails, we must release some resources as done everywhere
> else in this function before returning an error code.
> 
> Signed-off-by: Christophe JAILLET 
> ---
> V2: initialization of ret in this error path was missing. Stupid me!
> ---
> drivers/net/ieee802154/mrf24j40.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

patch has been applied to bluetooth-next tree.

Regards

Marcel

Re: [PATCH iproute2 V3 3/4] rdma: Add link object

2017-07-10 Thread Leon Romanovsky

On Mon, Jul 10, 2017 at 08:28:28PM +0200, Jiri Pirko wrote:
> Mon, Jul 10, 2017 at 06:22:23PM CEST, l...@kernel.org wrote:
> >On Mon, Jul 10, 2017 at 10:13:07AM +0200, Jiri Pirko wrote:
> >> Tue, Jul 04, 2017 at 09:55:40AM CEST, l...@kernel.org wrote:
> >> >From: Leon Romanovsky 
> >> >
> >> >Link (port) object represents struct ib_port to the user space.
> >> >
> >> >Link properties:
> >> > * Port capabilities
> >> > * IB subnet prefix
> >> > * LID, SM_LID and LMC
> >> > * Port state
> >> > * Physical state
> >> >
> >> >Signed-off-by: Leon Romanovsky 
> >> >---
> >> > rdma/Makefile |   2 +-
> >> > rdma/link.c   | 280 ++
> >> > rdma/rdma.c   |   3 +-
> >> > rdma/utils.c  |   5 ++
> >> > 4 files changed, 288 insertions(+), 2 deletions(-)
> >> > create mode 100644 rdma/link.c
> >> >
> >> >diff --git a/rdma/Makefile b/rdma/Makefile
> >> >index 123d7ac5..1a9e4b1a 100644
> >> >--- a/rdma/Makefile
> >> >+++ b/rdma/Makefile
> >> >@@ -2,7 +2,7 @@ include ../Config
> >> >
> >> > ifeq ($(HAVE_MNL),y)
> >> >
> >> >-RDMA_OBJ = rdma.o utils.o dev.o
> >> >+RDMA_OBJ = rdma.o utils.o dev.o link.o
> >> >
> >> > TARGETS=rdma
> >> > CFLAGS += $(shell $(PKG_CONFIG) libmnl --cflags)
> >> >diff --git a/rdma/link.c b/rdma/link.c
> >> >new file mode 100644
> >> >index ..f92b4cef
> >> >--- /dev/null
> >> >+++ b/rdma/link.c
> >> >@@ -0,0 +1,280 @@
> >> >+/*
> >> >+ * link.c    RDMA tool
> >> >+ *
> >> >+ *  This program is free software; you can redistribute it and/or
> >> >+ *  modify it under the terms of the GNU General Public License
> >> >+ *  as published by the Free Software Foundation; either version
> >> >+ *  2 of the License, or (at your option) any later version.
> >> >+ *
> >> >+ * Authors: Leon Romanovsky 
> >> >+ */
> >> >+
> >> >+#include "rdma.h"
> >> >+
> >> >+static int link_help(struct rdma *rd)
> >> >+{
> >> >+ pr_out("Usage: %s link show [DEV/PORT_INDEX]\n", rd->filename);
> >> >+ return 0;
> >> >+}
> >> >+
> >> >+static void link_print_caps(struct nlattr **tb)
> >> >+{
> >> >+ uint64_t caps;
> >> >+ uint32_t idx;
> >> >+
> >> >+ /*
> >> >+  * FIXME: move to indexes when kernel will start exporting them.
> >>
> >> Not exported yet?
> >
> >Not yet, I want to minimize the UAPI export from kernel before user-space
> >part is accepted.
>
> I don't get it. If you need it in userspace, you should expose it. Why
> to wait? What am I missing?

Mainly my attempt to avoid constant rebasing for four series at the
same time. One for rdmatool, one for RDMA netlink, one for RDMA UAPI changes
and one for rdma-core [1] which should reuse those exported structures too.

[1] http://github.com/linux-rdma/rdma-core

Thanks



Re: [PATCH v3 net-next 3/4] tls: kernel TLS support

2017-07-10 Thread Steffen Klassert

Sorry for replying to old mail...

On Wed, Jun 14, 2017 at 11:37:39AM -0700, Dave Watson wrote:
> +static int tls_do_encryption(struct tls_context *tls_ctx,
> +  struct tls_sw_context *ctx, size_t data_len,
> +  gfp_t flags)
> +{
> + unsigned int req_size = sizeof(struct aead_request) +
> + crypto_aead_reqsize(ctx->aead_send);
> + struct aead_request *aead_req;
> + int rc;
> +
> + aead_req = kmalloc(req_size, flags);
> + if (!aead_req)
> + return -ENOMEM;
> +
> + ctx->sg_encrypted_data[0].offset += tls_ctx->prepend_size;
> + ctx->sg_encrypted_data[0].length -= tls_ctx->prepend_size;
> +
> + aead_request_set_tfm(aead_req, ctx->aead_send);
> + aead_request_set_ad(aead_req, TLS_AAD_SPACE_SIZE);
> + aead_request_set_crypt(aead_req, ctx->sg_aead_in, ctx->sg_aead_out,
> +data_len, tls_ctx->iv);
> + rc = crypto_aead_encrypt(aead_req);
> +
> + ctx->sg_encrypted_data[0].offset -= tls_ctx->prepend_size;
> + ctx->sg_encrypted_data[0].length += tls_ctx->prepend_size;
> +
> + kfree(aead_req);
> + return rc;
> +}

...

> +int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx)
> +{

...

> +
> + if (!sw_ctx->aead_send) {
> + sw_ctx->aead_send = crypto_alloc_aead("gcm(aes)", 0, 0);
> + if (IS_ERR(sw_ctx->aead_send)) {
> + rc = PTR_ERR(sw_ctx->aead_send);
> + sw_ctx->aead_send = NULL;
> + goto free_rec_seq;
> + }
> + }
> +

Looking at how you allocate the aead transformation, it seems
that you should either register an asynchronous callback with
aead_request_set_callback() or request a synchronous algorithm.

Otherwise you will crash on an asynchronous crypto return, no?

Also, it seems that you have your scatterlists on a per crypto
transformation basis instead of per crypto request. Is this intentional?
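
The concern about asynchronous returns can be illustrated with a toy userspace model. This is only a sketch under stated assumptions: every `sketch_*` name is hypothetical, the "engine" is a stand-in that completes one queued request when driven by hand, and nothing here is the kernel crypto API. In the kernel the callback would be registered with aead_request_set_callback() and the caller would sleep on a completion rather than spin.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sketch_req;
typedef void (*sketch_complete_t)(struct sketch_req *req, int err);

/* Hypothetical request: carries the completion callback and its context. */
struct sketch_req {
	sketch_complete_t complete;
	void *data;
};

/* Context the callback fills in so the submitter can wait for the result. */
struct sketch_wait {
	int done;
	int err;
};

static void sketch_wait_complete(struct sketch_req *req, int err)
{
	struct sketch_wait *w = req->data;

	w->err = err;
	w->done = 1;
}

/* Toy engine state: at most one request is queued at a time. */
static struct sketch_req *sketch_pending;

/* Returns 0 when finished synchronously, -EINPROGRESS when queued. */
static int sketch_encrypt(struct sketch_req *req, int engine_is_async)
{
	if (!engine_is_async)
		return 0;
	sketch_pending = req;
	return -EINPROGRESS;
}

/* Drives the toy engine: completes the queued request, if any. */
static void sketch_engine_run(void)
{
	struct sketch_req *req = sketch_pending;

	sketch_pending = NULL;
	if (req && req->complete)
		req->complete(req, 0);
}

/* Submit and wait; the request may only be freed after this returns. */
static int sketch_encrypt_and_wait(struct sketch_req *req, int engine_is_async)
{
	struct sketch_wait w = { 0, 0 };
	int rc;

	assert(req != NULL);
	req->complete = sketch_wait_complete;
	req->data = &w;

	rc = sketch_encrypt(req, engine_is_async);
	if (rc == -EINPROGRESS) {
		while (!w.done)
			sketch_engine_run();
		rc = w.err;
	}
	return rc;
}
```

Freeing the request right after `sketch_encrypt()` returns -EINPROGRESS would leave the engine holding a dangling pointer, which is the failure mode the question above is about.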

Re: [PATCH] net: chelsio: cxgb3: constify attribute_group structures.

2017-07-10 Thread Arvind Yadav


Hi Joe,


On Tuesday 11 July 2017 11:17 AM, Joe Perches wrote:

On Tue, 2017-07-11 at 11:11 +0530, Arvind Yadav wrote:

Hi Joe,


On Monday 10 July 2017 10:30 PM, Joe Perches wrote:

On Mon, 2017-07-10 at 16:04 +0530, Arvind Yadav wrote:

attribute_groups are not supposed to change at runtime. All functions
working with attribute_groups provided by  work
with const attribute_group. So mark the non-const structs as const.

I think it's good you are doing all of these.

Instead of individually sending these patches, could you
please send a patch series for all of these attribute_group
patches with a cover letter at the same time?

That could make it easier for a trivial maintainer to apply
all of these at once and not get some applied and others
ignored or dropped on the floor.


I will resend all of the patches together, but I have a doubt. The patches
have different maintainers; for example, the 'net:' subsystem alone has
several different maintainers. So do I need to add all the maintainers to a
single list? That could cause confusion about which patch is for which
maintainer. Please advise.

Please send individual patches, one per maintainer/subsystem
as a series with a cover letter like:

[PATCH 0/N] treewide: constify attribute_group structures
[PATCH 1/N] chelsio: cxgb3: constify attribute_group
[PATCH 2/N] chelsio: cxgb4: constify attribute_group
...
[PATCH N/N] subsystem: constify attribute_group


Thank you, I will follow as per your suggestion.

Regards,
~arvind

Re: [PATCH] net: chelsio: cxgb3: constify attribute_group structures.

2017-07-10 Thread Joe Perches

On Tue, 2017-07-11 at 11:11 +0530, Arvind Yadav wrote:
> Hi Joe,
> 
> 
> On Monday 10 July 2017 10:30 PM, Joe Perches wrote:
> > On Mon, 2017-07-10 at 16:04 +0530, Arvind Yadav wrote:
> > > attribute_groups are not supposed to change at runtime. All functions
> > > working with attribute_groups provided by  work
> > > with const attribute_group. So mark the non-const structs as const.
> > 
> > I think it's good you are doing all of these.
> > 
> > Instead of individually sending these patches, could you
> > please send a patch series for all of these attribute_group
> > patches with a cover letter at the same time?
> > 
> > That could make it easier for a trivial maintainer to apply
> > all of these at once and not get some applied and others
> > ignored or dropped on the floor.
> > 
> 
> I will resend all of the patches together, but I have a doubt. The patches
> have different maintainers; for example, the 'net:' subsystem alone has
> several different maintainers. So do I need to add all the maintainers to a
> single list? That could cause confusion about which patch is for which
> maintainer. Please advise.

Please send individual patches, one per maintainer/subsystem
as a series with a cover letter like:

[PATCH 0/N] treewide: constify attribute_group structures
[PATCH 1/N] chelsio: cxgb3: constify attribute_group
[PATCH 2/N] chelsio: cxgb4: constify attribute_group
...
[PATCH N/N] subsystem: constify attribute_group

Re: [PATCH] net: chelsio: cxgb3: constify attribute_group structures.

2017-07-10 Thread Arvind Yadav


Hi Joe,


On Monday 10 July 2017 10:30 PM, Joe Perches wrote:

On Mon, 2017-07-10 at 16:04 +0530, Arvind Yadav wrote:

attribute_groups are not supposed to change at runtime. All functions
working with attribute_groups provided by  work
with const attribute_group. So mark the non-const structs as const.

I think it's good you are doing all of these.

Instead of individually sending these patches, could you
please send a patch series for all of these attribute_group
patches with a cover letter at the same time?

That could make it easier for a trivial maintainer to apply
all of these at once and not get some applied and others
ignored or dropped on the floor.

I will resend all of the patches together, but I have a doubt. The patches
have different maintainers; for example, the 'net:' subsystem alone has
several different maintainers. So do I need to add all the maintainers to a
single list? That could cause confusion about which patch is for which
maintainer. Please advise.

Thanks ,
~arvind
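
A minimal userspace sketch of what constifying such a structure looks like. The `sketch_*` types below are hypothetical stand-ins and do not reproduce the layout of the kernel's struct attribute_group; the point is only that a group which never changes at runtime can be declared const and still be passed to functions that take a pointer to const.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's definitions. */
struct sketch_attribute {
	const char *name;
};

struct sketch_attribute_group {
	const char *name;
	const struct sketch_attribute *const *attrs; /* NULL-terminated */
};

static const struct sketch_attribute sketch_attr_speed = { "speed" };
static const struct sketch_attribute sketch_attr_duplex = { "duplex" };

static const struct sketch_attribute *const sketch_attrs[] = {
	&sketch_attr_speed,
	&sketch_attr_duplex,
	NULL,
};

/* Declared const: the compiler rejects any later write to the group. */
static const struct sketch_attribute_group sketch_group = {
	.name = "link",
	.attrs = sketch_attrs,
};

/* Consumers only read the group, so they accept a pointer to const. */
static size_t sketch_group_count(const struct sketch_attribute_group *grp)
{
	size_t n = 0;

	assert(grp != NULL);
	while (grp->attrs[n] != NULL)
		n++;
	return n;
}
```

An assignment such as `sketch_group.name = "x";` fails to compile with this declaration, which is the property the patch description relies on.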

Re: [RFC PATCH 00/12] Implement XDP bpf_redirect vairants

2017-07-10 Thread John Fastabend

On 07/10/2017 11:30 AM, Jesper Dangaard Brouer wrote:
> On Sat, 8 Jul 2017 21:06:17 +0200
> Jesper Dangaard Brouer  wrote:
> 
>> On Sat, 08 Jul 2017 10:46:18 +0100 (WEST)
>> David Miller  wrote:
>>
>>> From: John Fastabend 
>>> Date: Fri, 07 Jul 2017 10:48:36 -0700
>>>   
>>>> On 07/07/2017 10:34 AM, John Fastabend wrote:
>>>>> This series adds two new XDP helper routines bpf_redirect() and
>>>>> bpf_redirect_map(). The first variant bpf_redirect() is meant
>>>>> to be used the same way it is currently being used by the cls_bpf
>>>>> classifier. An xdp packet will be redirected immediately when this
>>>>> is called.
>>>>
>>>> Also other than the typo in the title there ;) I'm going to CC
>>>> the driver maintainers working on XDP (makes for a long CC list but)
>>>> because we would want to try and get support in as many as possible in
>>>> the next merge window.
>>>>
>>>> For this rev I just implemented on ixgbe because I wrote the
>>>> original XDP support there. I'll volunteer to do virtio as well.
>>>
>>> I went over this series a few times and it looks great to me.
>>> You didn't even give me some coding style issues to pick on :-)  
>>
>> We (Daniel, Andy and I) have been reviewing and improving on this
>> patchset the last couple of weeks ;-).  We had some stability issues,
>> which is why it wasn't published earlier. My plan is to test this
>> latest patchset again, Monday and Tuesday. I'll try to assess stability
>> and provide some performance numbers.
> 
> 
> Damn, I thought it was stable, I have been running a lot of performance
> tests, and then this just happened :-(

Thanks, I'll take a look through the code and see if I can come up with
why this might happen. I haven't hit it on my tests yet though.

.John

Re: meson-gxbb-p200 4.12.0 network TCP transfer stalls

2017-07-10 Thread Marc Duponcheel

I use NBD for fast building and saving the SD card and, sadly, the TV box
crashes after a few minutes of building.

On 2017 Jul 11, Marc Duponcheel wrote:
>  FYI
> 
> I changed 4.12.0 drivers/net/phy/realtek.c RTL8211F code to equal the
> 3.14.29 amlogic/ethernet/phy/am_realtek.c code and now the networking
> works without TCP stalls.
> 
> I am not saying the patch below should be considered for mainline (as maybe
> some 4.12.0 rtl8211f_config_init code should not be removed), but surely
> someone can figure out what the best patch is...

--
 Marc Duponcheel
 Velodroomstraat 74 - 2600 Berchem - Belgium
 +32 (0)478 68 10 91 - m...@offline.be

Re: [PATCH v6 0/3] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag

2017-07-10 Thread Casey Leedom


Hey Alexander,

  Okay, I understand your point regarding the "most likely scenario" being
TLPs directed upstream to the Root Complex.  But I'd still like to make sure
that we have an agreed upon API/methodology for doing Peer-to-Peer with
Relaxed Ordering and no Relaxed Ordering to the Root Complex.  I don't see
how the proposed APIs can be used in that fashion.
 
  Right now the proposed change for cxgb4 is for it to test its own PCIe
Capability Device Control[Relaxed Ordering Enable] in order to use that
information to program the Chelsio Hardware to emit/not emit upstream TLPs
with the Relaxed Ordering Attribute set.  But if we're going to have the
mixed mode situation I describe, the PCIe Capability Device Control[Relaxed
Ordering Enable] will have to be set which means that we'll be programming
the Chelsio Hardware to send upstream TLPs with Relaxed Ordering Enable to
the Root Complex which is what we were trying to avoid in the first place ...

  [[ And, as I noted on Friday evening, the current cxgb4 Driver hardwires
 the Relaxed Ordering Enable on early during device probe, so that
 would minimally need to be addressed even if we decide that we don't
 ever want to support mixed mode Relaxed Ordering. ]]

  We need some method of telling the Chelsio Driver that it should/shouldn't
use Relaxed Ordering with TLPs directed at the Root Complex.  And the same
is true for a Peer PCIe Device.
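
The per-destination decision being asked for here could look roughly like the sketch below. It is a userspace illustration only: the enable bit value matches PCI_EXP_DEVCTL_RELAX_EN in the kernel's pci_regs.h, but the enum, the policy structure and the function are hypothetical and are not part of any API proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as PCI_EXP_DEVCTL_RELAX_EN (Device Control, bit 4). */
#define SKETCH_DEVCTL_RELAX_EN 0x0010

/* Hypothetical: where an outgoing TLP is headed. */
enum sketch_tlp_dest {
	SKETCH_DEST_ROOT_COMPLEX,
	SKETCH_DEST_PEER,
};

/* Hypothetical per-device policy a driver would need to be told about. */
struct sketch_ro_policy {
	uint16_t devctl;       /* sender's PCIe Device Control register */
	bool root_accepts_ro;  /* Relaxed Ordering is fine toward the root */
	bool peer_accepts_ro;  /* Relaxed Ordering is fine toward the peer */
};

static bool sketch_use_relaxed_ordering(const struct sketch_ro_policy *p,
					enum sketch_tlp_dest dest)
{
	assert(p != NULL);

	/* The enable bit is a hard gate for the sending device. */
	if (!(p->devctl & SKETCH_DEVCTL_RELAX_EN))
		return false;

	/* With the bit set, the choice still depends on the destination. */
	return dest == SKETCH_DEST_ROOT_COMPLEX ? p->root_accepts_ro
						: p->peer_accepts_ro;
}
```

The mixed mode case corresponds to the enable bit being set while `root_accepts_ro` is false and `peer_accepts_ro` is true, which a test of the enable bit alone cannot express.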

  It may be that we should approach this from the completely opposite
direction and instead of having quirks which identify problematic devices,
have quirks which identify devices which would benefit from the use of
Relaxed Ordering (if the sending device supports that).  That is, assume
that Relaxed Ordering shouldn't be used unless the target device says "I
love Relaxed Ordering TLPs" ...  In such a world, an NVMe or a Graphics
device might declare love of Relaxed Ordering and the same for a SPARC Root
Complex (I think that was your example).

  By the way, the sole example of Data Corruption with Relaxed Ordering is
the AMD A1100 ARM SoC and AMD appears to have given up on that almost as
soon as it was released.  So what we're left with currently is a performance
problem on modern Intel CPUs ...  (And hopefully we'll get a Technical
Publication on that issue fairly soon.)

Casey

Re: meson-gxbb-p200 4.12.0 network TCP transfer stalls

2017-07-10 Thread Marc Duponcheel

 FYI

I changed 4.12.0 drivers/net/phy/realtek.c RTL8211F code to equal the
3.14.29 amlogic/ethernet/phy/am_realtek.c code and now the networking
works without TCP stalls.

I am not saying the patch below should be considered for mainline (as maybe
some 4.12.0 rtl8211f_config_init code should not be removed), but surely
someone can figure out what the best patch is...

 --
# diff -U2 realtek.c.orig realtek.c
--- realtek.c.orig  2017-07-03 01:07:02.0 +0200
+++ realtek.c   2017-07-11 01:32:20.273445023 +0200
@@ -94,28 +94,27 @@
 }
 
+#define RTL8211F_MMD_CTRL   0x0D
+#define RTL8211F_MMD_DATA   0x0E
+#define RTL8211E_INER_LINK_STAT 0x10
+
 static int rtl8211f_config_init(struct phy_device *phydev)
 {
-   int ret;
-   u16 reg;
-
-   ret = genphy_config_init(phydev);
-   if (ret < 0)
-   return ret;
-
-   phy_write(phydev, RTL8211F_PAGE_SELECT, 0xd08);
-   reg = phy_read(phydev, 0x11);
-
-   /* enable TX-delay for rgmii-id and rgmii-txid, otherwise disable it */
-   if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID ||
-   phydev->interface == PHY_INTERFACE_MODE_RGMII_TXID)
-   reg |= RTL8211F_TX_DELAY;
-   else
-   reg &= ~RTL8211F_TX_DELAY;
-
-   phy_write(phydev, 0x11, reg);
-   /* restore to default page 0 */
-   phy_write(phydev, RTL8211F_PAGE_SELECT, 0x0);
-
+   int val;
+/* we want to disable eee */
+   phy_write(phydev, RTL8211F_MMD_CTRL, 0x7);
+   phy_write(phydev, RTL8211F_MMD_DATA, 0x3c);
+   phy_write(phydev, RTL8211F_MMD_CTRL, 0x4007);
+   phy_write(phydev, RTL8211F_MMD_DATA, 0x0);
+/* disable 1000m adv*/
+   val = phy_read(phydev, 0x9);
+   phy_write(phydev, 0x9, val&(~(1<<9)));
+/* rx reg 21 bit 3 tx reg 17 bit 8 */
+/*
+   phy_write(phydev, 0x1f, 0xd08);
+   val =  phy_read(phydev, 0x15);
+   phy_write(phydev, 0x15,val| 1<<21);
+ */
return 0;
+/* Enable Auto Power Saving mode */
 }
 
@@ -167,6 +166,6 @@
.name   = "RTL8211F Gigabit Ethernet",
.phy_id_mask= 0x001f,
-   .features   = PHY_GBIT_FEATURES,
-   .flags  = PHY_HAS_INTERRUPT,
+   .features   = PHY_GBIT_FEATURES | SUPPORTED_Pause | SUPPORTED_Asym_Pause,
+   .flags  = PHY_HAS_INTERRUPT | PHY_HAS_MAGICANEG,
.config_aneg= &genphy_config_aneg,
 --


On 2017 Jul 08, Marc Duponcheel wrote:
> Hi all
> 
> Similar to 'crow'
> 
> ARM GLX Khadas VIM Pro - Ethernet detected as only 10Mbps and stalled after some traffic
> 
> I also have the situation where copying a large file over LAN stalls.
> 
> 3.14.29 performance is good, 4.12.0 performance is not.
> 
> 
>  Perhaps one must compare 3.14.29 am_realtek.c and realtek.c with 4.12.0 realtek.c
> 
> 
>  thanks

have a nice day

Re: [PATCH net v2] samples/bpf: fix a build issue

2017-07-10 Thread Lawrence Brakmo



On 7/10/17, 2:13 PM, "Daniel Borkmann"  wrote:

On 07/10/2017 11:04 PM, Yonghong Song wrote:
> With latest net-next:
> 
> clang  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include -I./arch/x86/include -I./arch/x86/include/generated/uapi -I./arch/x86/include/generated  -I./include -I./arch/x86/include/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h  -Isamples/bpf \
>  -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
>  -Wno-compare-distinct-pointer-types \
>  -Wno-gnu-variable-sized-type-not-at-end \
>  -Wno-address-of-packed-member -Wno-tautological-compare \
>  -Wno-unknown-warning-option \
>  -O2 -emit-llvm -c samples/bpf/tcp_synrto_kern.c -o -| llc -march=bpf -filetype=obj -o samples/bpf/tcp_synrto_kern.o
> samples/bpf/tcp_synrto_kern.c:20:10: fatal error: 'bpf_endian.h' file not found
>^~
> 1 error generated.
> 
>
> net has the same issue.
>
> Add support for ntohl and htonl in tools/testing/selftests/bpf/bpf_endian.h.
> Also move bpf_helpers.h from samples/bpf to selftests/bpf and change
> compiler include logic so that programs in samples/bpf can access the headers
> in selftests/bpf, but not the other way around.
>
> Signed-off-by: Yonghong Song 

LGTM, thanks!

Acked-by: Daniel Borkmann dan...@iogearbox.net

Acked-by: Lawrence Brakmo

Re: WARN_ON_ONCE(work > weight) in napi_poll()

2017-07-10 Thread Ryan Hsu

On 07/04/2017 08:59 AM, Andrey Ryabinin wrote:

> On 07/04/2017 04:49 PM, Kalle Valo wrote:
>> Andrey Ryabinin  writes:
>>
>>> I occasionally hit WARN_ON_ONCE(work > weight); in napi_poll() on a
>>> laptop with ath10k card.
>>>
>>>
>>> [37207.593370] ------------[ cut here ]------------
>>> [37207.593380] WARNING: CPU: 0 PID: 7 at ../net/core/dev.c:5274
>>> net_rx_action+0x258/0x360
>>> [37207.593381] Modules linked in: snd_hda_codec_realtek snd_soc_skl
>>> snd_hda_codec_generic snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp
>>> snd_hda_ext_core snd_soc_sst_match snd_soc_core rtsx_pci_sdmmc
>>> mmc_core snd_pcm_dmaengine x86_pkg_temp_thermal snd_hda_intel uvcvideo
>>> kvm_intel videobuf2_vmalloc kvm snd_hda_codec snd_hwdep btusb
>>> videobuf2_memops btintel snd_hda_core videobuf2_v4l2 bluetooth
>>> irqbypass snd_pcm videobuf2_core crc32c_intel videodev mei_me mei
>>> rtsx_pci intel_lpss_pci intel_lpss_acpi intel_vbtn intel_lpss mfd_core
>>> tpm_tis intel_hid tpm_tis_core tpm efivarfs
>>> [37207.593430] CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.11.7 #28
>>> [37207.593432] Hardware name: Dell Inc. XPS 13 9360/0839Y6, BIOS 1.3.5 05/08/2017
>> What firmware and hardware versions exactly? The dmesg output when
>> ath10k loads is preferred. As you are using XPS 13 I'm guessing it's one
>> of QCA6174 family.
>>
> Yes it's qca6174.
>
> $ dmesg |grep ath10
> [0.624828] ath10k_pci :3a:00.0: enabling device ( -> 0002)
> [0.626370] ath10k_pci :3a:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
> [0.837862] ath10k_pci :3a:00.0: qca6174 hw3.2 target 0x0503 chip_id 0x00340aff sub 1a56:1535
> [0.837863] ath10k_pci :3a:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 0 testmode 0
> [0.838388] ath10k_pci :3a:00.0: firmware ver WLAN.RM.2.0-00180-QCARMSWPZ-1 api 4 features wowlan,ignore-otp,no-4addr-pad crc32 75dee6c5
> [0.900606] ath10k_pci :3a:00.0: board_file api 2 bmi_id N/A crc32 19644295
> [3.020681] ath10k_pci :3a:00.0: htt-ver 3.26 wmi-op 4 htt-op 3 cal otp max-sta 32 raw 0 hwcrypto 1
> [9.574087] ath10k_pci :3a:00.0 wlp58s0: renamed from wlan0
>

I haven't been able to reproduce this over the past few days. I'm not sure
whether this is due to the amsdu packets received from the AP; would you mind
sharing which AP you're using, and whether there are any specific steps
you're doing?

Also, the WLAN.RM.2.0-00180-QCARMSWPZ-1 firmware is a bit old; could you also
update the firmware and give it a try?
https://github.com/kvalo/ath10k-firmware/tree/master/QCA6174/hw3.0/4.4

-- 
Ryan Hsu

Re: [PATCH net v2] samples/bpf: fix a build issue

2017-07-10 Thread Daniel Borkmann


On 07/10/2017 11:04 PM, Yonghong Song wrote:

With latest net-next:

clang  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include 
-I./arch/x86/include -I./arch/x86/include/generated/uapi 
-I./arch/x86/include/generated  -I./include -I./arch/x86/include/uapi 
-I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h  
-Isamples/bpf \
 -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
 -Wno-compare-distinct-pointer-types \
 -Wno-gnu-variable-sized-type-not-at-end \
 -Wno-address-of-packed-member -Wno-tautological-compare \
 -Wno-unknown-warning-option \
 -O2 -emit-llvm -c samples/bpf/tcp_synrto_kern.c -o -| llc -march=bpf 
-filetype=obj -o samples/bpf/tcp_synrto_kern.o
samples/bpf/tcp_synrto_kern.c:20:10: fatal error: 'bpf_endian.h' file not found
   ^~
1 error generated.


net has the same issue.

Add support for ntohl and htonl in tools/testing/selftests/bpf/bpf_endian.h.
Also move bpf_helpers.h from samples/bpf to selftests/bpf and change
compiler include logic so that programs in samples/bpf can access the headers
in selftests/bpf, but not the other way around.

Signed-off-by: Yonghong Song 


LGTM, thanks!

Acked-by: Daniel Borkmann

[PATCH net v2] samples/bpf: fix a build issue

2017-07-10 Thread Yonghong Song

With latest net-next:

clang  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include 
-I./arch/x86/include -I./arch/x86/include/generated/uapi 
-I./arch/x86/include/generated  -I./include -I./arch/x86/include/uapi 
-I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h  
-Isamples/bpf \
-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
-Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member -Wno-tautological-compare \
-Wno-unknown-warning-option \
-O2 -emit-llvm -c samples/bpf/tcp_synrto_kern.c -o -| llc -march=bpf 
-filetype=obj -o samples/bpf/tcp_synrto_kern.o
samples/bpf/tcp_synrto_kern.c:20:10: fatal error: 'bpf_endian.h' file not found
  ^~
1 error generated.


net has the same issue.

Add support for ntohl and htonl in tools/testing/selftests/bpf/bpf_endian.h.
Also move bpf_helpers.h from samples/bpf to selftests/bpf and change
compiler include logic so that programs in samples/bpf can access the headers
in selftests/bpf, but not the other way around.

Signed-off-by: Yonghong Song 
---
 samples/bpf/Makefile   |  1 +
 tools/testing/selftests/bpf/Makefile   |  1 -
 tools/testing/selftests/bpf/bpf_endian.h   | 14 ++
 {samples => tools/testing/selftests}/bpf/bpf_helpers.h |  0
 4 files changed, 15 insertions(+), 1 deletion(-)
 rename {samples => tools/testing/selftests}/bpf/bpf_helpers.h (100%)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 9c65058..87246be 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -207,6 +207,7 @@ $(obj)/tracex5_kern.o: $(obj)/syscall_nrs.h
 # useless for BPF samples.
 $(obj)/%.o: $(src)/%.c
$(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) -I$(obj) \
+   -I$(srctree)/tools/testing/selftests/bpf/ \
-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
-Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 2ca51a8..153c3a1 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -37,6 +37,5 @@ CLANG ?= clang
 
 %.o: %.c
$(CLANG) -I. -I./include/uapi -I../../../include/uapi \
-   -I../../../../samples/bpf/ \
-Wno-compare-distinct-pointer-types \
-O2 -target bpf -c $< -o $@
diff --git a/tools/testing/selftests/bpf/bpf_endian.h b/tools/testing/selftests/bpf/bpf_endian.h
index 487cbfb..74af266 100644
--- a/tools/testing/selftests/bpf/bpf_endian.h
+++ b/tools/testing/selftests/bpf/bpf_endian.h
@@ -23,11 +23,19 @@
 # define __bpf_htons(x)__builtin_bswap16(x)
 # define __bpf_constant_ntohs(x)   ___constant_swab16(x)
 # define __bpf_constant_htons(x)   ___constant_swab16(x)
+# define __bpf_ntohl(x)__builtin_bswap32(x)
+# define __bpf_htonl(x)__builtin_bswap32(x)
+# define __bpf_constant_ntohl(x)   ___constant_swab32(x)
+# define __bpf_constant_htonl(x)   ___constant_swab32(x)
 #elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
 # define __bpf_ntohs(x)(x)
 # define __bpf_htons(x)(x)
 # define __bpf_constant_ntohs(x)   (x)
 # define __bpf_constant_htons(x)   (x)
+# define __bpf_ntohl(x)(x)
+# define __bpf_htonl(x)(x)
+# define __bpf_constant_ntohl(x)   (x)
+# define __bpf_constant_htonl(x)   (x)
 #else
 # error "Fix your compiler's __BYTE_ORDER__?!"
 #endif
@@ -38,5 +46,11 @@
 #define bpf_ntohs(x)   \
(__builtin_constant_p(x) ?  \
 __bpf_constant_ntohs(x) : __bpf_ntohs(x))
+#define bpf_htonl(x)   \
+   (__builtin_constant_p(x) ?  \
+__bpf_constant_htonl(x) : __bpf_htonl(x))
+#define bpf_ntohl(x)   \
+   (__builtin_constant_p(x) ?  \
+__bpf_constant_ntohl(x) : __bpf_ntohl(x))
 
 #endif /* __BPF_ENDIAN__ */
diff --git a/samples/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
similarity index 100%
rename from samples/bpf/bpf_helpers.h
rename to tools/testing/selftests/bpf/bpf_helpers.h
-- 
2.9.3
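
For trying the 32-bit helpers outside the kernel tree, the following is a minimal userspace sketch of the same pattern the patch adds: a constant-expression branch chosen by __builtin_constant_p() and a builtin byte swap for runtime values. It assumes GCC or Clang. All `sketch_*` and `SKETCH_*` names are hypothetical, and `SKETCH_CONSTANT_SWAB32` is a local stand-in for `___constant_swab32`, which comes from kernel headers and is not available here.

```c
#include <assert.h>
#include <stdint.h>

/* Pure-expression swap, usable in constant expressions (local stand-in). */
#define SKETCH_CONSTANT_SWAB32(x)                           \
	((uint32_t)((((uint32_t)(x) & 0x000000ffU) << 24) | \
		    (((uint32_t)(x) & 0x0000ff00U) <<  8) | \
		    (((uint32_t)(x) & 0x00ff0000U) >>  8) | \
		    (((uint32_t)(x) & 0xff000000U) >> 24)))

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
# define SKETCH_HTONL_RT(x)	__builtin_bswap32(x)
# define SKETCH_HTONL_CONST(x)	SKETCH_CONSTANT_SWAB32(x)
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
# define SKETCH_HTONL_RT(x)	(x)
# define SKETCH_HTONL_CONST(x)	(x)
#else
# error "Fix your compiler's __BYTE_ORDER__?!"
#endif

/* Constants take the foldable branch, runtime values use the builtin. */
#define sketch_htonl(x) \
	(__builtin_constant_p(x) ? SKETCH_HTONL_CONST(x) : SKETCH_HTONL_RT(x))
/* The conversion is its own inverse, so ntohl is the same operation. */
#define sketch_ntohl(x) sketch_htonl(x)

/* First byte in memory of a value converted to network byte order. */
static inline uint8_t sketch_first_wire_byte(uint32_t host_value)
{
	uint32_t wire = sketch_htonl(host_value);

	return *(const uint8_t *)&wire;
}

/* Round-trip sanity check for an arbitrary runtime value. */
static inline void sketch_check_roundtrip(uint32_t v)
{
	assert(sketch_ntohl(sketch_htonl(v)) == v);
}
```

On either byte order, `sketch_first_wire_byte(0xC0A80000)` yields the most significant byte, 0xC0, because network byte order is big-endian.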

Re: [PATCH net] samples/bpf: fix a build issue

2017-07-10 Thread Daniel Borkmann


On 07/10/2017 10:51 PM, Yonghong Song wrote:

On 7/10/17 1:27 PM, Daniel Borkmann wrote:

On 07/10/2017 10:12 PM, Yonghong Song wrote:

From: Yonghong Song 

With latest net-next:

clang  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include 
-I./arch/x86/include -I./arch/x86/include/generated/uapi 
-I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi 
-I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h 
-Isamples/bpf \
-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
-Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member -Wno-tautological-compare \
-Wno-unknown-warning-option \
-O2 -emit-llvm -c samples/bpf/tcp_synrto_kern.c -o -| llc -march=bpf 
-filetype=obj -o samples/bpf/tcp_synrto_kern.o
samples/bpf/tcp_synrto_kern.c:20:10: fatal error: 'bpf_endian.h' file not found
  ^~
1 error generated.


net has the same issue.

Add support for ntohl and htonl in tools/testing/selftests/bpf/bpf_endian.h
and move it to samples/bpf/ directory so that it can be used by
both selftests/bpf and samples/bpf. The existing samples/bpf/bpf_helpers.h
is already used by both.

Signed-off-by: Yonghong Song 
---
  {tools/testing/selftests => samples}/bpf/bpf_endian.h | 14 ++


samples/bpf/Makefile already does:

   HOSTCFLAGS += -I$(srctree)/tools/testing/selftests/bpf/

If needed we should rather extend the sample's Makefile to pull in
bpf_endian.h also for clang generated files, but not the other way
round. It's both kind of messy, but kernel selftests should not need
to include or depend upon some sample code.


Sounds good. I will then move bpf_helpers.h to selftests as well so
samples/bpf will depend on selftests/bpf, but not the other way around.


Okay, that's fine by me. (We can later also move some of the test
cases over to selftests, so that they get generally more coverage.)
Thanks, Yonghong!

Re: [PATCH net-next RFC 05/12] net: dsa: Add support for learning FDB through notification

2017-07-10 Thread Vivien Didelot

Hi Arkadi,

Arkadi Sharshevsky  writes:

>>> +   err = dsa_port_fdb_add(p->dp, fdb_info->addr, fdb_info->vid);
>>> +   if (err) {
>>> +   netdev_dbg(dev, "fdb add failed err=%d\n", err);
>>> +   break;
>>> +   }
>>> +   call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED, dev,
>>> +&fdb_info->info);
>>> +   break;
>>> +
>>> +   case SWITCHDEV_FDB_DEL_TO_DEVICE:
>>> +   fdb_info = &switchdev_work->fdb_info;
>>> +   err = dsa_port_fdb_del(p->dp, fdb_info->addr, fdb_info->vid);
>>> +   if (err)
>>> +   netdev_dbg(dev, "fdb del failed err=%d\n", err);
>> 
>> OK I must have missed from the off-list discussion why we are not
>> calling the switchdev notifier here?
>
> We do not agree on it actually, that is why it was moved to the list.
> I think that delete should succeed, you should retry until it succeeds.
>
> The deletion is done under spinlock in the bridge so you cannot block,
> thus delete cannot fail due to hardware failure. Calling it here doesn't
> make sense because the bridge probably already deleted this FDB.

So as we discussed, the problem here is that if dsa_port_fdb_del fails
for some plausible reason (MDIO timeout, weak GPIO lines, etc.), the Linux
bridge will delete the entry in software and dumping the bridge fdb will
show nothing, but the entry will still be programmed in hardware. The
network can thus be inconsistent, switching frames it is not supposed to.

IMHO the correct way for bridge to use the notification chain is to make
SWITCHDEV_FDB_DEL_TO_DEVICE symmetrical to SWITCHDEV_FDB_ADD_TO_DEVICE:
if an entry has been marked as offloaded, bridge must mark the entry as
to-be-deleted and not delete the software entry until the driver
notifies back the successful deletion.

If that is hardly feasible due to some bridge limitations, we must
explain this in a comment and use something more explosive than a simple
netdev_dbg to warn the user about the broken network setup...
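
The symmetrical scheme described above can be sketched as a small state machine. This is a userspace illustration under stated assumptions, not bridge or DSA code: the states, the structure and the three functions are all hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_fdb_state {
	SKETCH_FDB_SOFTWARE,       /* present in the software table only */
	SKETCH_FDB_OFFLOADED,      /* also programmed in hardware */
	SKETCH_FDB_DELETE_PENDING, /* delete requested, hardware not confirmed */
	SKETCH_FDB_GONE,           /* removed from the software table */
};

struct sketch_fdb_entry {
	enum sketch_fdb_state state;
};

/* Bridge side: software-only entries go away at once, offloaded entries
 * are only marked and stay until the driver reports back. */
static void sketch_fdb_request_delete(struct sketch_fdb_entry *e)
{
	assert(e != NULL);
	if (e->state == SKETCH_FDB_OFFLOADED)
		e->state = SKETCH_FDB_DELETE_PENDING;
	else if (e->state == SKETCH_FDB_SOFTWARE)
		e->state = SKETCH_FDB_GONE;
}

/* Driver side: report the hardware result, 0 meaning success. */
static void sketch_fdb_delete_done(struct sketch_fdb_entry *e, int hw_err)
{
	assert(e != NULL);
	if (e->state != SKETCH_FDB_DELETE_PENDING)
		return;
	if (hw_err == 0)
		e->state = SKETCH_FDB_GONE;
	/* On failure the entry stays pending, so a dump still shows it. */
}

/* What a dump of the software table would report for this entry. */
static bool sketch_fdb_visible(const struct sketch_fdb_entry *e)
{
	assert(e != NULL);
	return e->state != SKETCH_FDB_GONE;
}
```

With this ordering a failed hardware delete leaves the entry visible in software, so the software and hardware tables cannot silently diverge.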

Thanks,

Vivien

Re: [PATCH net] samples/bpf: fix a build issue

2017-07-10 Thread Yonghong Song




On 7/10/17 1:27 PM, Daniel Borkmann wrote:
> On 07/10/2017 10:12 PM, Yonghong Song wrote:
>> From: Yonghong Song
>>
>> With latest net-next:
>>
>> clang  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include
>> -I./arch/x86/include -I./arch/x86/include/generated/uapi
>> -I./arch/x86/include/generated  -I./include -I./arch/x86/include/uapi
>> -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h
>> -Isamples/bpf \
>> -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
>> -Wno-compare-distinct-pointer-types \
>> -Wno-gnu-variable-sized-type-not-at-end \
>> -Wno-address-of-packed-member -Wno-tautological-compare \
>> -Wno-unknown-warning-option \
>> -O2 -emit-llvm -c samples/bpf/tcp_synrto_kern.c -o -| llc -march=bpf
>> -filetype=obj -o samples/bpf/tcp_synrto_kern.o
>> samples/bpf/tcp_synrto_kern.c:20:10: fatal error: 'bpf_endian.h' file not found
>>   ^~
>> 1 error generated.
>>
>> net has the same issue.
>>
>> Add support for ntohl and htonl in tools/testing/selftests/bpf/bpf_endian.h
>> and move it to samples/bpf/ directory so that it can be used by
>> both selftests/bpf and samples/bpf. The existing samples/bpf/bpf_helpers.h
>> is already used by both.
>>
>> Signed-off-by: Yonghong Song
>> ---
>>   {tools/testing/selftests => samples}/bpf/bpf_endian.h | 14 ++
>
> samples/bpf/Makefile already does:
>
>    HOSTCFLAGS += -I$(srctree)/tools/testing/selftests/bpf/
>
> If needed we should rather extend the sample's Makefile to pull in
> bpf_endian.h also for clang generated files, but not the other way
> round. It's both kind of messy, but kernel selftests should not need
> to include or depend upon some sample code.

Sounds good. I will then move bpf_helpers.h to selftests as well so
samples/bpf will depend on selftests/bpf, but not the other way around.

> Other than that looks good to me.



Re: [PATCH net] samples/bpf: fix a build issue

2017-07-10 Thread Daniel Borkmann


On 07/10/2017 10:12 PM, Yonghong Song wrote:
> From: Yonghong Song
>
> With latest net-next:
>
> clang  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include
> -I./arch/x86/include -I./arch/x86/include/generated/uapi
> -I./arch/x86/include/generated  -I./include -I./arch/x86/include/uapi
> -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h
> -Isamples/bpf \
> -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
> -Wno-compare-distinct-pointer-types \
> -Wno-gnu-variable-sized-type-not-at-end \
> -Wno-address-of-packed-member -Wno-tautological-compare \
> -Wno-unknown-warning-option \
> -O2 -emit-llvm -c samples/bpf/tcp_synrto_kern.c -o -| llc -march=bpf
> -filetype=obj -o samples/bpf/tcp_synrto_kern.o
> samples/bpf/tcp_synrto_kern.c:20:10: fatal error: 'bpf_endian.h' file not found
>   ^~
> 1 error generated.
>
> net has the same issue.
>
> Add support for ntohl and htonl in tools/testing/selftests/bpf/bpf_endian.h
> and move it to samples/bpf/ directory so that it can be used by
> both selftests/bpf and samples/bpf. The existing samples/bpf/bpf_helpers.h
> is already used by both.
>
> Signed-off-by: Yonghong Song
> ---
>   {tools/testing/selftests => samples}/bpf/bpf_endian.h | 14 ++

samples/bpf/Makefile already does:

  HOSTCFLAGS += -I$(srctree)/tools/testing/selftests/bpf/

If needed we should rather extend the sample's Makefile to pull in
bpf_endian.h also for clang generated files, but not the other way
round. It's both kind of messy, but kernel selftests should not need
to include or depend upon some sample code.

Other than that looks good to me.



Re: [PATCH net] samples/bpf: fix a build issue

2017-07-10 Thread Lawrence Brakmo

Thank you for fixing it.

On 7/10/17, 1:12 PM, "Yonghong Song"  wrote:
> From: Yonghong Song
>
> With latest net-next:
>
> clang  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include
> -I./arch/x86/include -I./arch/x86/include/generated/uapi
> -I./arch/x86/include/generated  -I./include -I./arch/x86/include/uapi
> -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h
> -Isamples/bpf \
> -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
> -Wno-compare-distinct-pointer-types \
> -Wno-gnu-variable-sized-type-not-at-end \
> -Wno-address-of-packed-member -Wno-tautological-compare \
> -Wno-unknown-warning-option \
> -O2 -emit-llvm -c samples/bpf/tcp_synrto_kern.c -o -| llc -march=bpf
> -filetype=obj -o samples/bpf/tcp_synrto_kern.o
> samples/bpf/tcp_synrto_kern.c:20:10: fatal error: 'bpf_endian.h' file not found
>  ^~
> 1 error generated.
>
> net has the same issue.
>
> Add support for ntohl and htonl in tools/testing/selftests/bpf/bpf_endian.h
> and move it to samples/bpf/ directory so that it can be used by
> both selftests/bpf and samples/bpf. The existing samples/bpf/bpf_helpers.h
> is already used by both.
>
> Signed-off-by: Yonghong Song y...@fb.com

Acked-by: Lawrence Brakmo


[PATCH net] samples/bpf: fix a build issue

2017-07-10 Thread Yonghong Song

From: Yonghong Song 

With latest net-next:

clang  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include 
-I./arch/x86/include -I./arch/x86/include/generated/uapi 
-I./arch/x86/include/generated  -I./include -I./arch/x86/include/uapi 
-I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h  
-Isamples/bpf \
-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
-Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member -Wno-tautological-compare \
-Wno-unknown-warning-option \
-O2 -emit-llvm -c samples/bpf/tcp_synrto_kern.c -o -| llc -march=bpf 
-filetype=obj -o samples/bpf/tcp_synrto_kern.o
samples/bpf/tcp_synrto_kern.c:20:10: fatal error: 'bpf_endian.h' file not found
 ^~
1 error generated.


net has the same issue.

Add support for ntohl and htonl in tools/testing/selftests/bpf/bpf_endian.h
and move it to samples/bpf/ directory so that it can be used by
both selftests/bpf and samples/bpf. The existing samples/bpf/bpf_helpers.h
is already used by both.
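
For readers unfamiliar with the macro pattern the patch extends to 32 bits,
here is a minimal userspace sketch of it, assuming GCC or clang. The
sketch_* names and SKETCH_CONST_SWAB32 are hypothetical stand-ins (the patch
itself uses ___constant_swab32 for the constant case); the point is only that
a compile-time constant gets a foldable swap expression while a runtime value
goes through the builtin.

```c
#include <stdint.h>

/* Hand-written constant swap, standing in for ___constant_swab32(). */
#define SKETCH_CONST_SWAB32(x) ((uint32_t)(			\
	(((uint32_t)(x) & 0x000000ffU) << 24) |			\
	(((uint32_t)(x) & 0x0000ff00U) <<  8) |			\
	(((uint32_t)(x) & 0x00ff0000U) >>  8) |			\
	(((uint32_t)(x) & 0xff000000U) >> 24)))

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
/* Little endian: swap, picking the constant form when x is constant. */
# define sketch_htonl(x)					\
	(__builtin_constant_p(x) ?				\
	 SKETCH_CONST_SWAB32(x) : __builtin_bswap32(x))
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
/* Big endian: host order already is network order. */
# define sketch_htonl(x)	((uint32_t)(x))
#else
# error "Fix your compiler's __BYTE_ORDER__?!"
#endif

/* The 32-bit swap is its own inverse, so ntohl is the same operation. */
#define sketch_ntohl(x)	sketch_htonl(x)

/* Example use: test a network-order address against 192.168.0.0/16. */
static int sketch_is_192_168(uint32_t daddr_be)
{
	return (daddr_be & sketch_htonl(0xffff0000U)) ==
	       sketch_htonl(0xc0a80000U);
}
```

Both masks in sketch_is_192_168 are constants, so on either byte order the
comparison compiles down to a single AND and compare with no runtime swap.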

Signed-off-by: Yonghong Song 
---
 {tools/testing/selftests => samples}/bpf/bpf_endian.h | 14 ++
 tools/testing/selftests/bpf/Makefile  |  3 ++-
 2 files changed, 16 insertions(+), 1 deletion(-)
 rename {tools/testing/selftests => samples}/bpf/bpf_endian.h (73%)

diff --git a/tools/testing/selftests/bpf/bpf_endian.h b/samples/bpf/bpf_endian.h
similarity index 73%
rename from tools/testing/selftests/bpf/bpf_endian.h
rename to samples/bpf/bpf_endian.h
index 487cbfb..74af266 100644
--- a/tools/testing/selftests/bpf/bpf_endian.h
+++ b/samples/bpf/bpf_endian.h
@@ -23,11 +23,19 @@
 # define __bpf_htons(x)__builtin_bswap16(x)
 # define __bpf_constant_ntohs(x)   ___constant_swab16(x)
 # define __bpf_constant_htons(x)   ___constant_swab16(x)
+# define __bpf_ntohl(x)__builtin_bswap32(x)
+# define __bpf_htonl(x)__builtin_bswap32(x)
+# define __bpf_constant_ntohl(x)   ___constant_swab32(x)
+# define __bpf_constant_htonl(x)   ___constant_swab32(x)
 #elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
 # define __bpf_ntohs(x)(x)
 # define __bpf_htons(x)(x)
 # define __bpf_constant_ntohs(x)   (x)
 # define __bpf_constant_htons(x)   (x)
+# define __bpf_ntohl(x)(x)
+# define __bpf_htonl(x)(x)
+# define __bpf_constant_ntohl(x)   (x)
+# define __bpf_constant_htonl(x)   (x)
 #else
 # error "Fix your compiler's __BYTE_ORDER__?!"
 #endif
@@ -38,5 +46,11 @@
 #define bpf_ntohs(x)   \
(__builtin_constant_p(x) ?  \
 __bpf_constant_ntohs(x) : __bpf_ntohs(x))
+#define bpf_htonl(x)   \
+   (__builtin_constant_p(x) ?  \
+__bpf_constant_htonl(x) : __bpf_htonl(x))
+#define bpf_ntohl(x)   \
+   (__builtin_constant_p(x) ?  \
+__bpf_constant_ntohl(x) : __bpf_ntohl(x))
 
 #endif /* __BPF_ENDIAN__ */
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 2ca51a8..f263c6b 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -8,7 +8,8 @@ ifneq ($(wildcard $(GENHDR)),)
   GENFLAGS := -DHAVE_GENHDR
 endif
 
-CFLAGS += -Wall -O2 -I$(APIDIR) -I$(LIBDIR) -I$(GENDIR) $(GENFLAGS) -I../../../include
+CFLAGS += -Wall -O2 -I$(APIDIR) -I$(LIBDIR) -I$(GENDIR) $(GENFLAGS) -I../../../include \
+   -I../../../../samples/bpf
 LDLIBS += -lcap -lelf
 
 TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test_progs \
-- 
2.9.4

Re: [PATCH net-next RFC 10/12] net: dsa: Move FDB dump implementation inside DSA

2017-07-10 Thread Vivien Didelot

Hi Arkadi,

Arkadi Sharshevsky  writes:

> +struct dsa_slave_dump_ctx {
> + struct net_device *dev;
> + struct sk_buff *skb;
> + struct netlink_callback *cb;
> + int idx;
> +};
> +
>  struct dsa_switch_ops {
>   /*
>* Legacy probing.
> @@ -392,9 +399,7 @@ struct dsa_switch_ops {
>   int (*port_fdb_del)(struct dsa_switch *ds, int port,
>   const unsigned char *addr, u16 vid);
>   int (*port_fdb_dump)(struct dsa_switch *ds, int port,
> -  struct switchdev_obj_port_fdb *fdb,
> -   switchdev_obj_dump_cb_t *cb);
> -
> +  struct dsa_slave_dump_ctx *dump);

I would prefer to pass a callback to the driver, so that we can call
port_fdb_dump for other interfaces like debugfs, and for non exposed
switch ports (CPU and DSA links) as well. Something like:

typedef int dsa_fdb_dump_cb_t(const unsigned char *addr, u16 vid,
  bool is_static, void *data);

int (*port_fdb_dump)(struct dsa_switch *ds, int port,
 dsa_fdb_dump_cb_t *cb, void *data);

Then dsa_slave_dump_ctx and dsa_slave_port_fdb_do_dump can be static to
net/dsa/slave.c.
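
To illustrate why a callback plus an opaque data pointer keeps the driver
independent of its consumers, here is a rough userspace sketch. Only the
callback typedef follows the suggestion above (with uint16_t standing in for
u16); mock_port_fdb_dump, mock_table, count_cb and print_cb are hypothetical
and stand in for a real driver and real consumers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Callback type as suggested above; uint16_t replaces the kernel's u16. */
typedef int dsa_fdb_dump_cb_t(const unsigned char *addr, uint16_t vid,
			      bool is_static, void *data);

/* Hypothetical table standing in for the switch hardware database. */
struct mock_fdb {
	unsigned char addr[6];
	uint16_t vid;
	bool is_static;
};

static const struct mock_fdb mock_table[] = {
	{ { 0x00, 0x11, 0x22, 0x33, 0x44, 0x55 }, 1, true },
	{ { 0x00, 0x11, 0x22, 0x33, 0x44, 0x66 }, 2, false },
	{ { 0x00, 0x11, 0x22, 0x33, 0x44, 0x77 }, 1, false },
};

/* Hypothetical driver op: walks the table, knows nothing about netlink. */
static int mock_port_fdb_dump(int port, dsa_fdb_dump_cb_t *cb, void *data)
{
	size_t i;
	int err;

	(void)port;	/* the mock has a single port */
	for (i = 0; i < sizeof(mock_table) / sizeof(mock_table[0]); i++) {
		err = cb(mock_table[i].addr, mock_table[i].vid,
			 mock_table[i].is_static, data);
		if (err)
			return err;	/* a consumer may stop the walk */
	}
	return 0;
}

/* Consumer 1: count entries, context private to the caller. */
struct count_ctx {
	int total;
	int statics;
};

static int count_cb(const unsigned char *addr, uint16_t vid,
		    bool is_static, void *data)
{
	struct count_ctx *ctx = data;

	(void)addr;
	(void)vid;
	ctx->total++;
	if (is_static)
		ctx->statics++;
	return 0;
}

/* Consumer 2: print entries to a FILE, e.g. for a debug interface. */
static int print_cb(const unsigned char *addr, uint16_t vid,
		    bool is_static, void *data)
{
	FILE *f = data;

	fprintf(f, "%02x:%02x:%02x:%02x:%02x:%02x vid %u%s\n",
		addr[0], addr[1], addr[2], addr[3], addr[4], addr[5],
		(unsigned int)vid, is_static ? " static" : "");
	return 0;
}
```

The same mock_port_fdb_dump serves both consumers: pass count_cb with a
struct count_ctx, or print_cb with stdout, and the driver side is unchanged.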

> +int dsa_slave_port_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb,
> + struct net_device *dev,
> + struct net_device *filter_dev, int *idx)
> +{
> + struct dsa_slave_dump_ctx dump = {
> + .dev = dev,
> + .skb = skb,
> + .cb = cb,
> + .idx = *idx,
> + };
> + struct dsa_slave_priv *p = netdev_priv(dev);
> + struct dsa_port *dp = p->dp;
> + struct dsa_switch *ds = dp->ds;
> + int err;
> +
> + if (!ds->ops->port_fdb_dump) {
> + err = -EOPNOTSUPP;
> + goto out;
> + }
> +
> + err = ds->ops->port_fdb_dump(ds, dp->index, &dump);
> +out:
> + *idx = dump.idx;
> + return err;

There is no need to reassign *idx in case of error:

	if (!ds->ops->port_fdb_dump)
		return -EOPNOTSUPP;

	err = ds->ops->port_fdb_dump(ds, dp->index, &dump);
	*idx = dump.idx;

	return err;

> + .ndo_fdb_dump   = dsa_slave_port_fdb_dump,

And s/dsa_slave_port_fdb_dump/dsa_slave_fdb_dump/ here will be even
better ;-)


Thanks,

Vivien

Re: [PATCH v2] mrf24j40: Fix en error handling path in 'mrf24j40_probe()'

2017-07-10 Thread Alan Ott


On 07/08/2017 04:40 AM, Christophe JAILLET wrote:
> If this check fails, we must release some resources as done everywhere
> else in this function before returning an error code.
>
> Signed-off-by: Christophe JAILLET
> ---
> V2: initialization of ret in this error path was missing. Stupid me!
> ---
>  drivers/net/ieee802154/mrf24j40.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ieee802154/mrf24j40.c b/drivers/net/ieee802154/mrf24j40.c
> index 7d334963dc08..da8683782ffc 100644
> --- a/drivers/net/ieee802154/mrf24j40.c
> +++ b/drivers/net/ieee802154/mrf24j40.c
> @@ -1330,7 +1330,8 @@ static int mrf24j40_probe(struct spi_device *spi)
> if (spi->max_speed_hz > MAX_SPI_SPEED_HZ) {
> dev_warn(&spi->dev, "spi clock above possible maximum: %d",
>  MAX_SPI_SPEED_HZ);
> -   return -EINVAL;
> +   ret = -EINVAL;
> +   goto err_register_device;
> }
>
> ret = mrf24j40_hw_init(devrec);



Acked-by: Alan Ott

Re: [PATCH net-next 2/3] cxgb4: Add PTP Hardware Clock (PHC) support

2017-07-10 Thread Richard Cochran

On Tue, Jul 04, 2017 at 04:46:21PM +0530, Atul Gupta wrote:
> +/**
> + * cxgb4_ptp_init - initialize PTP for devices which support it
> + * @adapter: board private structure
> + *
> + * This function performs the required steps for enabling PTP support.
> + */
> +void cxgb4_ptp_init(struct adapter *adapter)
> +{
> + struct timespec64 now;
> +  /* no need to create a clock device if we already have one */
> + if (!IS_ERR_OR_NULL(adapter->ptp_clock))
> + return;
> +
> + adapter->ptp_tx_skb = NULL;
> + adapter->ptp_clock_info = cxgb4_ptp_clock_info;
> + spin_lock_init(&adapter->ptp_lock);
> +
> + adapter->ptp_clock = ptp_clock_register(&adapter->ptp_clock_info,
> + &adapter->pdev->dev);
> + if (!adapter->ptp_clock) {
> + dev_err(adapter->pdev_dev,
> + "PTP %s Clock registration has failed\n", __func__);
> + return;
> + }

This is wrong.  To quote the header file:

/**
 * ptp_clock_register() - register a PTP hardware clock driver
 *
 * @info:   Structure describing the new clock.
 * @parent: Pointer to the parent device of the new clock.
 *
 * Returns a valid pointer on success or PTR_ERR on failure.  If PHC
 * support is missing at the configuration level, this function
 * returns NULL, and drivers are expected to gracefully handle that
 * case separately.
 */

As this has already been merged, please submit a patch to properly
handle both PTR_ERR and NULL.
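
As a rough illustration of the two cases the kernel-doc distinguishes, the
following userspace sketch handles an error pointer and NULL separately. The
ERR_PTR/PTR_ERR/IS_ERR helpers here are simplified stand-ins for the kernel's
include/linux/err.h, and mock_adapter and mock_ptp_init are hypothetical; this
is not the cxgb4 fix itself.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(intptr_t error)
{
	return (void *)error;
}

static inline intptr_t PTR_ERR(const void *ptr)
{
	return (intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical adapter holding the result of a clock registration. */
struct mock_adapter {
	void *ptp_clock;
};

/*
 * 'registered' plays the role of the ptp_clock_register() return value.
 * Returns 0 if the clock is usable, 1 if PHC support is not configured
 * (NULL, not an error), or a negative errno on registration failure.
 */
static int mock_ptp_init(struct mock_adapter *a, void *registered)
{
	a->ptp_clock = registered;

	if (IS_ERR(a->ptp_clock)) {
		int err = (int)PTR_ERR(a->ptp_clock);

		a->ptp_clock = NULL;	/* never keep an error pointer around */
		return err;
	}
	if (!a->ptp_clock)
		return 1;		/* PHC support missing: carry on without it */
	return 0;
}
```

Checking only for NULL, as in the quoted hunk, would treat an error pointer as
a valid clock; checking IS_ERR first and NULL second covers both outcomes.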

Thanks,
Richard

Re: [RFC] get_compat_bpf_fprog(): don't copyin field-by-field

2017-07-10 Thread Daniel Borkmann


On 07/08/2017 08:22 PM, Al Viro wrote:
> Signed-off-by: Al Viro

Acked-by: Daniel Borkmann

(Looks like ppp_sock_fprog_ioctl_trans() is another such candidate.)

[patch iproute2/net-next RFC] tc: Implement filter block sharing to ingress and clsact qdiscs

2017-07-10 Thread Jiri Pirko

From: Jiri Pirko 

Signed-off-by: Jiri Pirko 
---
 include/linux/pkt_sched.h | 12 +++
 tc/q_clsact.c | 54 ++-
 tc/q_ingress.c| 32 +---
 3 files changed, 90 insertions(+), 8 deletions(-)

diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 099bf55..a684087 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -871,4 +871,16 @@ struct tc_pie_xstats {
__u32 maxq; /* maximum queue size */
__u32 ecn_mark; /* packets marked with ecn*/
 };
+
+/* Ingress/clsact */
+
+enum {
+   TCA_CLSACT_UNSPEC,
+   TCA_CLSACT_INGRESS_BLOCK,
+   TCA_CLSACT_EGRESS_BLOCK,
+   __TCA_CLSACT_MAX
+};
+
+#define TCA_CLSACT_MAX (__TCA_CLSACT_MAX - 1)
+
 #endif
diff --git a/tc/q_clsact.c b/tc/q_clsact.c
index e2a1a71..0ecaa63 100644
--- a/tc/q_clsact.c
+++ b/tc/q_clsact.c
@@ -6,23 +6,67 @@
 
 static void explain(void)
 {
-   fprintf(stderr, "Usage: ... clsact\n");
+   fprintf(stderr, "Usage: ... clsact [ingress_block BLOCK_INDEX] [egress_block BLOCK_INDEX]\n");
 }
 
 static int clsact_parse_opt(struct qdisc_util *qu, int argc, char **argv,
struct nlmsghdr *n)
 {
-   if (argc > 0) {
-   fprintf(stderr, "What is \"%s\"?\n", *argv);
-   explain();
-   return -1;
+   struct rtattr *tail;
+   unsigned int ingress_block;
+   unsigned int egress_block;
+
+   while (argc > 0) {
+   if (strcmp(*argv, "ingress_block") == 0) {
+   NEXT_ARG();
+   if (get_unsigned(&ingress_block, *argv, 0)) {
+   fprintf(stderr, "Illegal \"ingress_block\"\n");
+   return -1;
+   }
+   } else if (strcmp(*argv, "egress_block") == 0) {
+   NEXT_ARG();
+   if (get_unsigned(&egress_block, *argv, 0)) {
+   fprintf(stderr, "Illegal \"egress_block\"\n");
+   return -1;
+   }
+   } else {
+   fprintf(stderr, "What is \"%s\"?\n", *argv);
+   explain();
+   return -1;
+   }
+   NEXT_ARG_FWD();
}
 
+   tail = NLMSG_TAIL(n);
+   addattr_l(n, 1024, TCA_OPTIONS, NULL, 0);
+   if (ingress_block)
+   addattr32(n, 1024, TCA_CLSACT_INGRESS_BLOCK, ingress_block);
+   if (egress_block)
+   addattr32(n, 1024, TCA_CLSACT_EGRESS_BLOCK, egress_block);
+   tail->rta_len = (void *) NLMSG_TAIL(n) - (void *) tail;
return 0;
 }
 
 static int clsact_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
 {
+   struct rtattr *tb[TCA_CLSACT_MAX + 1];
+   unsigned int block;
+
+   if (!opt)
+   return 0;
+
+   parse_rtattr_nested(tb, TCA_CLSACT_MAX, opt);
+
+   if (tb[TCA_CLSACT_INGRESS_BLOCK] &&
+   RTA_PAYLOAD(tb[TCA_CLSACT_INGRESS_BLOCK]) >= sizeof(__u32)) {
+   block = rta_getattr_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+   fprintf(f, "ingress_block %u ", block);
+   }
+   if (tb[TCA_CLSACT_EGRESS_BLOCK] &&
+   RTA_PAYLOAD(tb[TCA_CLSACT_EGRESS_BLOCK]) >= sizeof(__u32)) {
+   block = rta_getattr_u32(tb[TCA_CLSACT_EGRESS_BLOCK]);
+   fprintf(f, "egress_block %u ", block);
+   }
return 0;
 }
 
diff --git a/tc/q_ingress.c b/tc/q_ingress.c
index 31699a8..b973e1b 100644
--- a/tc/q_ingress.c
+++ b/tc/q_ingress.c
@@ -17,30 +17,56 @@
 
 static void explain(void)
 {
-   fprintf(stderr, "Usage: ... ingress\n");
+   fprintf(stderr, "Usage: ... ingress [block BLOCK_INDEX]\n");
 }
 
 static int ingress_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 struct nlmsghdr *n)
 {
+   struct rtattr *tail;
+   unsigned int block;
+
while (argc > 0) {
if (strcmp(*argv, "handle") == 0) {
NEXT_ARG();
-   argc--; argv++;
+   } else if (strcmp(*argv, "block") == 0) {
+   NEXT_ARG();
+   if (get_unsigned(&block, *argv, 0)) {
+   fprintf(stderr, "Illegal \"block\"\n");
+   return -1;
+   }
} else {
fprintf(stderr, "What is \"%s\"?\n", *argv);
explain();
return -1;
}
+   NEXT_ARG_FWD();
}
 
+   tail = NLMSG_TAIL(n);
+   addattr_l(n, 1024, TCA_OPTIONS, NULL, 0);
+   if (block)
+   addattr32(n, 1024, TCA_CLSACT_INGRESS_BLOCK, block);
+   tail->rta_len = (void *) NLMSG_TAIL(n) - (void *) tail;
return 0;
 }
 
 static int ingress_p

[patch net-next RFC 3/4] net: sched: introduce shared filter blocks infrastructure

2017-07-10 Thread Jiri Pirko

From: Jiri Pirko 

Allow qdiscs to share filter blocks among them. Each qdisc type has to
use the block get/put variants that enable sharing. Shared blocks are
tracked within each net namespace and identified by a u32 value. This
value is auto-generated if the user did not pass one from userspace. If
the user passes a value that is not in use, a new block is created. If the
user passes a value that is already in use, the existing block is re-used.
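
The index semantics described above can be sketched in userspace as follows.
This is only an illustration: the patch tracks blocks in a per-namespace IDR,
whereas this sketch assumes a small fixed array, and mock_block,
mock_block_get_shared and mock_block_put_shared are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_BLOCKS 32	/* arbitrary table size for the sketch */

struct mock_block {
	uint32_t index;
	unsigned int refcnt;
};

/* Slot i holds the block with index i; slot 0 is never used. */
static struct mock_block *blocks[MAX_BLOCKS];

/*
 * index == 0:   pick the first free index (auto-generated).
 * unused index: create a new block with that index.
 * used index:   take another reference on the existing block.
 */
static struct mock_block *mock_block_get_shared(uint32_t index)
{
	struct mock_block *b;
	uint32_t i;

	if (index >= MAX_BLOCKS)
		return NULL;
	if (index && blocks[index]) {
		blocks[index]->refcnt++;
		return blocks[index];
	}
	if (!index) {
		for (i = 1; i < MAX_BLOCKS && blocks[i]; i++)
			;
		if (i == MAX_BLOCKS)
			return NULL;	/* no free index left */
		index = i;
	}
	b = calloc(1, sizeof(*b));
	if (!b)
		return NULL;
	b->index = index;
	b->refcnt = 1;
	blocks[index] = b;
	return b;
}

/* Drop one reference; free the block when the last user goes away. */
static void mock_block_put_shared(struct mock_block *b)
{
	if (--b->refcnt)
		return;
	blocks[b->index] = NULL;
	free(b);
}
```

Two calls with index 22 return the same block with refcnt 2, which mirrors
two qdiscs attached to "block 22" sharing one set of filters.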

Signed-off-by: Jiri Pirko 
---
 include/net/pkt_cls.h |  22 +-
 include/net/sch_generic.h |   2 +
 net/sched/cls_api.c   | 180 ++
 3 files changed, 189 insertions(+), 15 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 537d0a0..4381cbc 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -23,7 +23,12 @@ struct tcf_chain *tcf_chain_get(struct tcf_block *block, u32 chain_index,
 void tcf_chain_put(struct tcf_chain *chain);
 int tcf_block_get(struct tcf_block **p_block,
  struct tcf_proto __rcu **p_filter_chain);
+int tcf_block_get_shared(struct tcf_block **p_block,
+struct net *net, u32 block_index,
+struct tcf_proto __rcu **p_filter_chain);
 void tcf_block_put(struct tcf_block *block);
+void tcf_block_put_shared(struct tcf_block *block, struct net *net,
+ struct tcf_proto __rcu **p_filter_chain);
 int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 struct tcf_result *res, bool compat_mode);
 
@@ -35,7 +40,22 @@ int tcf_block_get(struct tcf_block **p_block,
return 0;
 }
 
-static inline void tcf_block_put(struct tcf_block *block)
+static inline
+int tcf_block_get_shared(struct tcf_block **p_block,
+struct net *net, u32 block_index,
+struct tcf_proto __rcu **p_filter_chain)
+{
+   return 0;
+}
+
+static inline
+void tcf_block_put(struct tcf_block *block)
+{
+}
+
+static inline
+void tcf_block_put_shared(struct tcf_block *block, struct net *net,
+ struct tcf_proto __rcu **p_filter_chain)
 {
 }
 
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 7396de8..cbc7313 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -266,6 +266,8 @@ struct tcf_chain {
 
 struct tcf_block {
struct list_head chain_list;
+   u32 index; /* block index for shared blocks */
+   unsigned int refcnt;
 };
 
 static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 411f5577..098b9a2 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -286,52 +287,175 @@ tcf_chain_filter_chain_ptr_del(struct tcf_chain *chain,
WARN_ON(1);
 }
 
-static struct tcf_chain *tcf_block_chain_zero(struct tcf_block *block)
+struct tcf_net {
+   struct idr idr;
+};
+
+static unsigned int tcf_net_id;
+
+static int tcf_block_insert(struct tcf_block *block, struct net *net,
+   u32 block_index)
 {
-   return list_first_entry(&block->chain_list, struct tcf_chain, list);
+   struct tcf_net *tn = net_generic(net, tcf_net_id);
+   int idr_start;
+   int idr_end;
+   int index;
+
+   if (block_index >= INT_MAX)
+   return -EINVAL;
+   idr_start = block_index ? block_index : 1;
+   idr_end = block_index ? block_index + 1 : INT_MAX;
+
+   index = idr_alloc(&tn->idr, block, idr_start, idr_end, GFP_KERNEL);
+   if (index < 0)
+   return index;
+   block->index = index;
+   return 0;
 }
 
-int tcf_block_get(struct tcf_block **p_block,
- struct tcf_proto __rcu **p_filter_chain)
+static void tcf_block_remove(struct tcf_block *block, struct net *net)
+{
+   struct tcf_net *tn = net_generic(net, tcf_net_id);
+
+   idr_remove(&tn->idr, block->index);
+}
+
+static struct tcf_block *tcf_block_create(void)
 {
-   struct tcf_block *block = kzalloc(sizeof(*block), GFP_KERNEL);
+   struct tcf_block *block;
struct tcf_chain *chain;
int err;
 
+   block = kzalloc(sizeof(*block), GFP_KERNEL);
if (!block)
-   return -ENOMEM;
+   return ERR_PTR(-ENOMEM);
INIT_LIST_HEAD(&block->chain_list);
+   block->refcnt = 1;
+
/* Create chain 0 by default, it has to be always present. */
chain = tcf_chain_create(block, 0);
if (!chain) {
err = -ENOMEM;
goto err_chain_create;
}
-   tcf_chain_filter_chain_ptr_add(chain, p_filter_chain);
-   *p_block = block;
-   return 0;
+   return block;
 
 err_chain_create:
kfree(block);
+   return ERR_PTR(err);
+}
+
+static void tcf_block_destroy(struct tcf_block *block)
+{
+   struct tcf_chain *chain, *tmp;

[patch net-next RFC 2/4] net: sched: intruduce qdisc_net helper

2017-07-10 Thread Jiri Pirko

From: Jiri Pirko 

There is going to be a need to obtain the net structure for
a qdisc. So introduce a helper to do it.

Signed-off-by: Jiri Pirko 
---
 include/net/pkt_sched.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index 2579c20..da85ad0 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -4,7 +4,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 
 #define DEFAULT_TX_QUEUE_LEN   1000
 
@@ -132,4 +134,9 @@ static inline unsigned int psched_mtu(const struct net_device *dev)
return dev->mtu + dev->hard_header_len;
 }
 
+static inline struct net *qdisc_net(struct Qdisc *q)
+{
+   return dev_net(q->dev_queue->dev);
+}
+
 #endif
-- 
2.9.3

[patch net-next RFC 0/4] net: sched: allow qdiscs to share filter block instances

2017-07-10 Thread Jiri Pirko

From: Jiri Pirko 

Currently the filters added to qdiscs are independent. So for example, if you
have 2 netdevices, create an ingress qdisc on both, and want to add
identical filter rules to both, you need to add them twice. This patchset
makes this easier and, mainly, saves resources by allowing qdiscs to share
all their filters - I call the shared set a "filter block". This also helps
to save resources when we offload to hw, for example to expensive TCAM.

So back to the example. First, we create 2 qdiscs. Both will share
block number 22. "22" is just an identifier. If we don't pass any
block number, a new one will be generated by the kernel:

$ tc qdisc add dev ens7 ingress block 22

$ tc qdisc add dev ens8 ingress block 22


Now if we list the qdiscs, we will see the block index in the output:
qdisc fq_codel 0: dev ens7 root refcnt 2 limit 10240p flows 1024 quantum 1514 
target 5.0ms interval 100.0ms memory_limit 32Mb ecn 
 Sent 9014 bytes 99 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc ingress : dev ens7 parent :fff1 block 22 
  
 Sent 4592 bytes 58 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
qdisc fq_codel 0: dev ens8 root refcnt 2 limit 10240p flows 1024 quantum 1514 
target 5.0ms interval 100.0ms memory_limit 32Mb ecn 
 Sent 17022 bytes 307 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc ingress : dev ens8 parent :fff1 block 22 
  
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 


Now we can add a filter to any of the qdiscs sharing the same block:

$ tc filter add dev ens7 parent : protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop


We will see the same output if we list filters for ens7 and ens8, including stats:

$ tc -s filter show dev ens7 root
filter parent : protocol ip pref 25 flower 
filter parent : protocol ip pref 25 flower handle 0x1 
  eth_type ipv4
  dst_ip 192.168.1.0/24
action order 1: gact action drop
 random type none pass val 0
 index 3 ref 1 bind 1 installed 10201 sec used 10150 sec
Action statistics:
Sent 4200 bytes 50 pkt (dropped 50, overlimits 0 requeues 0) 
backlog 0b 0p requeues 0 

$ tc -s filter show dev ens8 root
filter dev ens7 parent : protocol ip pref 25 flower 
filter dev ens7 parent : protocol ip pref 25 flower handle 0x1 
  eth_type ipv4
  dst_ip 192.168.1.0/24
action order 1: gact action drop
 random type none pass val 0
 index 3 ref 1 bind 1 installed 10202 sec used 10152 sec
Action statistics:
Sent 4200 bytes 50 pkt (dropped 50, overlimits 0 requeues 0) 
backlog 0b 0p requeues 0 


Issues:
- tp->q is set by the device used to add the filter. That has to be resolved.
  Impacts the dump (as you can see above)

Jiri Pirko (4):
  net: sched: introduce support for multiple filter chain pointers
registration
  net: sched: intruduce qdisc_net helper
  net: sched: introduce shared filter blocks infrastructure
  net: sched: allow ingress and clsact qdiscs to share filter blocks

 include/net/pkt_cls.h  |  22 +++-
 include/net/pkt_sched.h|   7 ++
 include/net/sch_generic.h  |   4 +-
 include/uapi/linux/pkt_sched.h |  12 +++
 net/sched/cls_api.c| 240 +
 net/sched/sch_ingress.c| 107 --
 6 files changed, 362 insertions(+), 30 deletions(-)

-- 
2.9.3

[patch net-next RFC 4/4] net: sched: allow ingress and clsact qdiscs to share filter blocks

2017-07-10 Thread Jiri Pirko

From: Jiri Pirko 

Benefit from the previously introduced shared filter blocks
infrastructure and allow ingress and clsact qdisc instances to share
filter blocks. The block index comes from userspace as a qdisc option.

Signed-off-by: Jiri Pirko 
---
 include/uapi/linux/pkt_sched.h |  12 +
 net/sched/sch_ingress.c| 107 ++---
 2 files changed, 112 insertions(+), 7 deletions(-)

diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index 099bf55..a684087 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -871,4 +871,16 @@ struct tc_pie_xstats {
__u32 maxq; /* maximum queue size */
__u32 ecn_mark; /* packets marked with ecn*/
 };
+
+/* Ingress/clsact */
+
+enum {
+   TCA_CLSACT_UNSPEC,
+   TCA_CLSACT_INGRESS_BLOCK,
+   TCA_CLSACT_EGRESS_BLOCK,
+   __TCA_CLSACT_MAX
+};
+
+#define TCA_CLSACT_MAX (__TCA_CLSACT_MAX - 1)
+
 #endif
diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
index d8a9beb..f18b257 100644
--- a/net/sched/sch_ingress.c
+++ b/net/sched/sch_ingress.c
@@ -58,13 +58,42 @@ static struct tcf_block *ingress_tcf_block(struct Qdisc 
*sch, unsigned long cl)
return q->block;
 }
 
+static const struct nla_policy ingress_policy[TCA_CLSACT_MAX + 1] = {
+   [TCA_CLSACT_INGRESS_BLOCK]  = { .type = NLA_U32 },
+};
+
+static int ingress_parse_opt(struct nlattr *opt, u32 *p_ingress_block_index)
+{
+   struct nlattr *tb[TCA_CLSACT_MAX + 1];
+   int err;
+
+   *p_ingress_block_index = 0;
+
+   if (!opt)
+   return 0;
+   err = nla_parse_nested(tb, TCA_CLSACT_MAX, opt, ingress_policy, NULL);
+   if (err)
+   return err;
+
+   if (tb[TCA_CLSACT_INGRESS_BLOCK])
+   *p_ingress_block_index =
+   nla_get_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+   return 0;
+}
+
 static int ingress_init(struct Qdisc *sch, struct nlattr *opt)
 {
struct ingress_sched_data *q = qdisc_priv(sch);
struct net_device *dev = qdisc_dev(sch);
+   u32 ingress_block_index;
int err;
 
-   err = tcf_block_get(&q->block, &dev->ingress_cl_list);
+   err = ingress_parse_opt(opt, &ingress_block_index);
+   if (err)
+   return err;
+
+   err = tcf_block_get_shared(&q->block, qdisc_net(sch),
+  ingress_block_index, &dev->ingress_cl_list);
if (err)
return err;
 
@@ -77,18 +106,22 @@ static int ingress_init(struct Qdisc *sch, struct nlattr 
*opt)
 static void ingress_destroy(struct Qdisc *sch)
 {
struct ingress_sched_data *q = qdisc_priv(sch);
+   struct net_device *dev = qdisc_dev(sch);
 
-   tcf_block_put(q->block);
+   tcf_block_put_shared(q->block, qdisc_net(sch), &dev->ingress_cl_list);
net_dec_ingress_queue();
 }
 
 static int ingress_dump(struct Qdisc *sch, struct sk_buff *skb)
 {
+   struct ingress_sched_data *q = qdisc_priv(sch);
struct nlattr *nest;
 
nest = nla_nest_start(skb, TCA_OPTIONS);
if (nest == NULL)
goto nla_put_failure;
+   if (nla_put_u32(skb, TCA_CLSACT_INGRESS_BLOCK, q->block->index))
+   goto nla_put_failure;
 
return nla_nest_end(skb, nest);
 
@@ -159,17 +192,54 @@ static struct tcf_block *clsact_tcf_block(struct Qdisc 
*sch, unsigned long cl)
}
 }
 
+static const struct nla_policy clsact_policy[TCA_CLSACT_MAX + 1] = {
+   [TCA_CLSACT_INGRESS_BLOCK]  = { .type = NLA_U32 },
+   [TCA_CLSACT_EGRESS_BLOCK]   = { .type = NLA_U32 },
+};
+
+static int clsact_parse_opt(struct nlattr *opt, u32 *p_ingress_block_index,
+   u32 *p_egress_block_index)
+{
+   struct nlattr *tb[TCA_CLSACT_MAX + 1];
+   int err;
+
+   *p_ingress_block_index = 0;
+   *p_egress_block_index = 0;
+
+   if (!opt)
+   return 0;
+   err = nla_parse_nested(tb, TCA_CLSACT_MAX, opt, clsact_policy, NULL);
+   if (err)
+   return err;
+
+   if (tb[TCA_CLSACT_INGRESS_BLOCK])
+   *p_ingress_block_index =
+   nla_get_u32(tb[TCA_CLSACT_INGRESS_BLOCK]);
+   if (tb[TCA_CLSACT_EGRESS_BLOCK])
+   *p_egress_block_index =
+   nla_get_u32(tb[TCA_CLSACT_EGRESS_BLOCK]);
+   return 0;
+}
+
 static int clsact_init(struct Qdisc *sch, struct nlattr *opt)
 {
struct clsact_sched_data *q = qdisc_priv(sch);
struct net_device *dev = qdisc_dev(sch);
+   u32 ingress_block_index;
+   u32 egress_block_index;
int err;
 
-   err = tcf_block_get(&q->ingress_block, &dev->ingress_cl_list);
+   err = clsact_parse_opt(opt, &ingress_block_index, &egress_block_index);
if (err)
return err;
 
-   err = tcf_block_get(&q->egress_block, &dev->egress_cl_list);
+   err = tcf_block_get_shared(&q->ingress_block

[patch net-next RFC 1/4] net: sched: introduce support for multiple filter chain pointers registration

2017-07-10 Thread Jiri Pirko

From: Jiri Pirko 

So far, it was only possible to register a single filter chain
pointer to block->chain[0]. However, once blocks become shareable,
we need to allow registration of multiple filter chain pointers.

Signed-off-by: Jiri Pirko 
---
 include/net/sch_generic.h |  2 +-
 net/sched/cls_api.c   | 66 ---
 2 files changed, 57 insertions(+), 11 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 1c123e2..7396de8 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -257,7 +257,7 @@ struct qdisc_skb_cb {
 
 struct tcf_chain {
struct tcf_proto __rcu *filter_chain;
-   struct tcf_proto __rcu **p_filter_chain;
+   struct list_head filter_chain_list;
struct list_head list;
struct tcf_block *block;
u32 index; /* chain index */
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 39da0c5..411f5577 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -186,6 +186,11 @@ static void tcf_proto_destroy(struct tcf_proto *tp)
kfree_rcu(tp, rcu);
 }
 
+struct tfc_filter_chain_list_item {
+   struct list_head list;
+   struct tcf_proto __rcu **p_filter_chain;
+};
+
 static struct tcf_chain *tcf_chain_create(struct tcf_block *block,
  u32 chain_index)
 {
@@ -194,6 +199,7 @@ static struct tcf_chain *tcf_chain_create(struct tcf_block 
*block,
chain = kzalloc(sizeof(*chain), GFP_KERNEL);
if (!chain)
return NULL;
+   INIT_LIST_HEAD(&chain->filter_chain_list);
list_add_tail(&chain->list, &block->chain_list);
chain->block = block;
chain->index = chain_index;
@@ -203,10 +209,11 @@ static struct tcf_chain *tcf_chain_create(struct 
tcf_block *block,
 
 static void tcf_chain_flush(struct tcf_chain *chain)
 {
+   struct tfc_filter_chain_list_item *item;
struct tcf_proto *tp;
 
-   if (*chain->p_filter_chain)
-   RCU_INIT_POINTER(*chain->p_filter_chain, NULL);
+   list_for_each_entry(item, &chain->filter_chain_list, list)
+   RCU_INIT_POINTER(*item->p_filter_chain, NULL);
while ((tp = rtnl_dereference(chain->filter_chain)) != NULL) {
RCU_INIT_POINTER(chain->filter_chain, tp->next);
tcf_proto_destroy(tp);
@@ -248,11 +255,40 @@ void tcf_chain_put(struct tcf_chain *chain)
 }
 EXPORT_SYMBOL(tcf_chain_put);
 
+static int
+tcf_chain_filter_chain_ptr_add(struct tcf_chain *chain,
+  struct tcf_proto __rcu **p_filter_chain)
+{
+   struct tfc_filter_chain_list_item *item;
+
+   item = kmalloc(sizeof(*item), GFP_KERNEL);
+   if (!item)
+   return -ENOMEM;
+   item->p_filter_chain = p_filter_chain;
+   list_add(&item->list, &chain->filter_chain_list);
+   return 0;
+}
+
 static void
-tcf_chain_filter_chain_ptr_set(struct tcf_chain *chain,
+tcf_chain_filter_chain_ptr_del(struct tcf_chain *chain,
   struct tcf_proto __rcu **p_filter_chain)
 {
-   chain->p_filter_chain = p_filter_chain;
+   struct tfc_filter_chain_list_item *item;
+
+   list_for_each_entry(item, &chain->filter_chain_list, list) {
+   if (!p_filter_chain ||
+   item->p_filter_chain == p_filter_chain) {
+   list_del(&item->list);
+   kfree(item);
+   return;
+   }
+   }
+   WARN_ON(1);
+}
+
+static struct tcf_chain *tcf_block_chain_zero(struct tcf_block *block)
+{
+   return list_first_entry(&block->chain_list, struct tcf_chain, list);
 }
 
 int tcf_block_get(struct tcf_block **p_block,
@@ -271,7 +307,7 @@ int tcf_block_get(struct tcf_block **p_block,
err = -ENOMEM;
goto err_chain_create;
}
-   tcf_chain_filter_chain_ptr_set(chain, p_filter_chain);
+   tcf_chain_filter_chain_ptr_add(chain, p_filter_chain);
*p_block = block;
return 0;
 
@@ -288,6 +324,8 @@ void tcf_block_put(struct tcf_block *block)
if (!block)
return;
 
+   tcf_chain_filter_chain_ptr_del(tcf_block_chain_zero(block), NULL);
+
list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
tcf_chain_destroy(chain);
kfree(block);
@@ -362,9 +400,13 @@ static void tcf_chain_tp_insert(struct tcf_chain *chain,
struct tcf_chain_info *chain_info,
struct tcf_proto *tp)
 {
-   if (chain->p_filter_chain &&
-   *chain_info->pprev == chain->filter_chain)
-   rcu_assign_pointer(*chain->p_filter_chain, tp);
+   if (*chain_info->pprev == chain->filter_chain) {
+   struct tfc_filter_chain_list_item *item;
+
+   list_for_each_entry(item, &chain->filter_chain_list, list)
+   rcu_assign_pointer(*item->p_filt

Re: [PATCH] net/mlx5: IPSec, fix 64-bit division correctly

2017-07-10 Thread David Miller

From: Arnd Bergmann 
Date: Mon, 10 Jul 2017 11:37:51 +0200

> The new IPSec offload code introduced a build error:
> 
> drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.o: In function 
> `mlx5e_ipsec_build_inverse_table':
> ipsec_rxtx.c:(.text+0x556): undefined reference
> 
> Another patch was added on top to fix the build error, but
> that introduced a new bug, as we now use the remainder of
> the division rather than the result.
> 
> This makes it use the correct helper function instead.
> 
> Fixes: 5dfd87b67cd9 ("net/mlx5: IPSec, Fix 64-bit division on 32-bit builds")
> Fixes: 2ac9cfe78223 ("net/mlx5e: IPSec, Add Innova IPSec offload TX data 
> path")
> Signed-off-by: Arnd Bergmann 

Applied, thanks.
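
For readers unfamiliar with the two helpers, the remainder-versus-quotient
mix-up described above can be sketched in userspace. The functions below are
stand-ins written for this illustration (all names are hypothetical); they are
not the kernel definitions and not the mlx5 code. They only mimic the calling
conventions as I understand them: do_div() divides its first argument in place
and returns the remainder, while div_u64() returns the quotient.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for do_div(): divides *n in place and returns the REMAINDER.
 * Hypothetical userspace helper, for illustration only. */
uint32_t sketch_do_div(uint64_t *n, uint32_t base)
{
	uint32_t rem;

	assert(base != 0);
	rem = (uint32_t)(*n % base);
	*n /= base;
	return rem;
}

/* Stand-in for div_u64(): returns the QUOTIENT. */
uint64_t sketch_div_u64(uint64_t dividend, uint32_t divisor)
{
	assert(divisor != 0);
	return dividend / divisor;
}

/* Buggy pattern: the return value of the do_div-style helper is taken
 * to be the result of the division, but it is the remainder. */
uint64_t ratio_buggy(uint64_t numerator, uint32_t denominator)
{
	return sketch_do_div(&numerator, denominator);
}

/* Fixed pattern: use the helper that returns the quotient. */
uint64_t ratio_fixed(uint64_t numerator, uint32_t denominator)
{
	return sketch_div_u64(numerator, denominator);
}
```

With 100 and 7 as inputs, the buggy variant yields the remainder and the fixed
variant yields the quotient, which is the difference the patch description
points at.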

Re: [RFC PATCH 00/12] Implement XDP bpf_redirect vairants

2017-07-10 Thread Jesper Dangaard Brouer

On Sat, 8 Jul 2017 21:06:17 +0200
Jesper Dangaard Brouer  wrote:

> On Sat, 08 Jul 2017 10:46:18 +0100 (WEST)
> David Miller  wrote:
> 
> > From: John Fastabend 
> > Date: Fri, 07 Jul 2017 10:48:36 -0700
> >   
> > > On 07/07/2017 10:34 AM, John Fastabend wrote:
> > >> This series adds two new XDP helper routines bpf_redirect() and
> > >> bpf_redirect_map(). The first variant bpf_redirect() is meant
> > >> to be used the same way it is currently being used by the cls_bpf
> > >> classifier. An xdp packet will be redirected immediately when this
> > >> is called.
> > > 
> > > Also other than the typo in the title there ;) I'm going to CC
> > > the driver maintainers working on XDP (makes for a long CC list but)
> > > because we would want to try and get support in as many as possible in
> > > the next merge window.
> > > 
> > > For this rev I just implemented on ixgbe because I wrote the
> > > original XDP support there. I'll volunteer to do virtio as well.
> > 
> > I went over this series a few times and it looks great to me.
> > You didn't even give me some coding style issues to pick on :-)  
> 
> We (Daniel, Andy and I) have been reviewing and improving on this
> patchset the last couple of weeks ;-).  We had some stability issues,
> which is why it wasn't published earlier. My plan is to test this
> latest patchset again, Monday and Tuesday. I'll try to assess stability
> and provide some performance numbers.


Damn, I thought it was stable, I have been running a lot of performance
tests, and then this just happened :-(

[11357.149486] BUG: unable to handle kernel NULL pointer dereference at 
0210
[11357.157393] IP: __dev_map_flush+0x58/0x90
[11357.161446] PGD 3ff685067 
[11357.161446] P4D 3ff685067 
[11357.164199] PUD 3ff684067 
[11357.166947] PMD 0 
[11357.170396] 
[11357.173981] Oops:  [#1] PREEMPT SMP
[11357.177859] Modules linked in: coretemp kvm_intel kvm irqbypass intel_cstate 
intel_uncore intel_rapl_perf mxm_wmi i2c_i801 pcspkr sg i2c_co]
[11357.203021] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
4.12.0-net-next-xdp_redirect02-rfc+ #135
[11357.211706] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z97 
Extreme4, BIOS P2.10 05/12/2015
[11357.221606] task: 81c0e480 task.stack: 81c0
[11357.227568] RIP: 0010:__dev_map_flush+0x58/0x90
[11357.232138] RSP: 0018:88041fa03de0 EFLAGS: 00010286
[11357.237409] RAX:  RBX: 8803fc996600 RCX: 0003
[11357.244589] RDX: 88040d0bf480 RSI:  RDI: 81d901d8
[11357.251767] RBP: 88041fa03df8 R08: fffc R09: 00070008
[11357.258940] R10: 813f11d0 R11: 0040 R12: e8c014a0
[11357.266119] R13: 0003 R14: 003c R15: 8803fc9c26c0
[11357.273294] FS:  () GS:88041fa0() 
knlGS:
[11357.281454] CS:  0010 DS:  ES:  CR0: 80050033
[11357.287244] CR2: 0210 CR3: 0003fc41e000 CR4: 001406f0
[11357.294423] Call Trace:
[11357.296912]  
[11357.298967]  xdp_do_flush_map+0x36/0x40
[11357.302847]  ixgbe_poll+0x7ea/0x1370 [ixgbe]
[11357.307160]  net_rx_action+0x247/0x3e0
[11357.310957]  __do_softirq+0x106/0x2cb
[11357.314664]  irq_exit+0xbe/0xd0
[11357.317851]  do_IRQ+0x4f/0xd0
[11357.320858]  common_interrupt+0x86/0x86
[11357.324733] RIP: 0010:poll_idle+0x2f/0x5a
[11357.328781] RSP: 0018:81c03dd0 EFLAGS: 0246 ORIG_RAX: 
ff8e
[11357.336426] RAX:  RBX: 81d689e0 RCX: 0020
[11357.343605] RDX:  RSI: 81c0e480 RDI: 88041fa22800
[11357.350783] RBP: 81c03dd0 R08: 03c5 R09: 0018
[11357.357958] R10: 0327 R11: 0390 R12: 88041fa22800
[11357.365135] R13: 81d689f8 R14:  R15: 81d689e0
[11357.372311]  
[11357.374453]  cpuidle_enter_state+0xf2/0x2e0
[11357.378678]  cpuidle_enter+0x17/0x20
[11357.382299]  call_cpuidle+0x23/0x40
[11357.385834]  do_idle+0xe8/0x190
[11357.389024]  cpu_startup_entry+0x1d/0x20
[11357.392993]  rest_init+0xd5/0xe0
[11357.396268]  start_kernel+0x3d7/0x3e4
[11357.399979]  x86_64_start_reservations+0x2a/0x2c
[11357.404641]  x86_64_start_kernel+0x178/0x18b
[11357.408959]  secondary_startup_64+0x9f/0x9f
[11357.413186]  ? secondary_startup_64+0x9f/0x9f
[11357.417589] Code: 41 89 c5 48 8b 53 60 44 89 e8 48 8d 14 c2 48 8b 12 48 85 
d2 74 23 48 8b 3a f0 49 0f b3 04 24 48 85 ff 74 15 48 8b 87 e0 0 
[11357.436565] RIP: __dev_map_flush+0x58/0x90 RSP: 88041fa03de0
[11357.442613] CR2: 0210
[11357.445972] ---[ end trace f7ed232095169a98 ]---
[11357.450637] Kernel panic - not syncing: Fatal exception in interrupt
[11357.457038] Kernel Offset: disabled
[11357.460566] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

(gdb) list *(__dev_map_flush)+0x58
0x811422a8 is in __dev_map_flush (kernel/bpf/devmap.c:2

Re: [PATCH iproute2 V3 0/4] RDMAtool

2017-07-10 Thread Jiri Pirko

Mon, Jul 10, 2017 at 06:01:44PM CEST, l...@kernel.org wrote:
>On Mon, Jul 10, 2017 at 10:02:30AM +0200, Jiri Pirko wrote:
>> Tue, Jul 04, 2017 at 09:55:37AM CEST, l...@kernel.org wrote:
>> >Hi,
>> >
>> >This is third version of series implementing the RDAMtool -  the tool
>> >to configure RDMA devices. The initial proposal was sent as RFC [1] and
>> >was based on sysfs entries as POC.
>> >
>> >The current series was rewritten completely to work with RDMA netlinks as
>> >a source of user<->kernel communications. In order to achieve that, the
>> >RDMA netlinks were extensively refactored and modernized [2, 3, 4 and 5].
>> >
>> >The following is an example of various runs on my machine with 5 devices
>> >(4 in IB mode and one in Ethernet mode)
>> >
>> >### Without parameters
>> >$ rdma
>> >Usage: rdma [ OPTIONS ] OBJECT { COMMAND | help }
>> >where  OBJECT := { dev | link | help }
>> >   OPTIONS := { -V[ersion] | -d[etails]}
>>
>> What about json output? You will need it sooner than later. It will
>> prevent you from a lot of headaches if you implement it right away.
>> Lesson learned...
>
>I'm planning to do it in the coming kernel cycle.

Yeah, just consider pushing it in this initial patchset. Makes sense and
saves you trouble. Up to you.

Re: [PATCH iproute2 V3 3/4] rdma: Add link object

2017-07-10 Thread Jiri Pirko

Mon, Jul 10, 2017 at 06:22:23PM CEST, l...@kernel.org wrote:
>On Mon, Jul 10, 2017 at 10:13:07AM +0200, Jiri Pirko wrote:
>> Tue, Jul 04, 2017 at 09:55:40AM CEST, l...@kernel.org wrote:
>> >From: Leon Romanovsky 
>> >
>> >Link (port) object represent struct ib_port to the user space.
>> >
>> >Link properties:
>> > * Port capabilities
>> > * IB subnet prefix
>> > * LID, SM_LID and LMC
>> > * Port state
>> > * Physical state
>> >
>> >Signed-off-by: Leon Romanovsky 
>> >---
>> > rdma/Makefile |   2 +-
>> > rdma/link.c   | 280 
>> > ++
>> > rdma/rdma.c   |   3 +-
>> > rdma/utils.c  |   5 ++
>> > 4 files changed, 288 insertions(+), 2 deletions(-)
>> > create mode 100644 rdma/link.c
>> >
>> >diff --git a/rdma/Makefile b/rdma/Makefile
>> >index 123d7ac5..1a9e4b1a 100644
>> >--- a/rdma/Makefile
>> >+++ b/rdma/Makefile
>> >@@ -2,7 +2,7 @@ include ../Config
>> >
>> > ifeq ($(HAVE_MNL),y)
>> >
>> >-RDMA_OBJ = rdma.o utils.o dev.o
>> >+RDMA_OBJ = rdma.o utils.o dev.o link.o
>> >
>> > TARGETS=rdma
>> > CFLAGS += $(shell $(PKG_CONFIG) libmnl --cflags)
>> >diff --git a/rdma/link.c b/rdma/link.c
>> >new file mode 100644
>> >index ..f92b4cef
>> >--- /dev/null
>> >+++ b/rdma/link.c
>> >@@ -0,0 +1,280 @@
>> >+/*
>> >+ * link.c  RDMA tool
>> >+ *
>> >+ *  This program is free software; you can redistribute it 
>> >and/or
>> >+ *  modify it under the terms of the GNU General Public License
>> >+ *  as published by the Free Software Foundation; either 
>> >version
>> >+ *  2 of the License, or (at your option) any later version.
>> >+ *
>> >+ * Authors: Leon Romanovsky 
>> >+ */
>> >+
>> >+#include "rdma.h"
>> >+
>> >+static int link_help(struct rdma *rd)
>> >+{
>> >+   pr_out("Usage: %s link show [DEV/PORT_INDEX]\n", rd->filename);
>> >+   return 0;
>> >+}
>> >+
>> >+static void link_print_caps(struct nlattr **tb)
>> >+{
>> >+   uint64_t caps;
>> >+   uint32_t idx;
>> >+
>> >+   /*
>> >+* FIXME: move to indexes when kernel will start exporting them.
>>
>> Not exported yet?
>
>Not yet, I want to minimize the UAPI export from kernel before user-space
>part is accepted.

I don't get it. If you need it in userspace, you should expose it. Why
wait? What am I missing?

[...]


>> >+   rd->port_idx = port ? :1;
>>
>> "port ? : 1"
>
>Yeah, legal C, the same as (port) ? port : 1
>ihttps://en.wikipedia.org/wiki/%3F:#C


I was referring to the missing " ". I'm a nitpicker :)
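
As a side note on the construct itself: the two-operand form with the middle
operand omitted is a GNU C extension accepted by gcc and clang, not ISO C. A
minimal sketch of the two spellings follows; the function names are made up
for this example.

```c
#include <assert.h>
#include <stdint.h>

/* ISO C spelling: the tested expression is written twice. */
uint32_t port_or_default_std(uint32_t port)
{
	return port ? port : 1;
}

/* GNU C spelling: the middle operand is omitted, so the value of the
 * condition itself is used when it is non-zero. */
uint32_t port_or_default_gnu(uint32_t port)
{
	return port ? : 1;
}

/* Both spellings agree for every input tried here. */
void check_port_defaults(void)
{
	for (uint32_t port = 0; port < 8; port++)
		assert(port_or_default_std(port) == port_or_default_gnu(port));
}
```

The only behavioural difference between the two forms shows up when the tested
expression has side effects, because the GNU form evaluates it once.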

Re: [Patch] mqueue: fix netlink sock refcnt and skb refcnt

2017-07-10 Thread Linus Torvalds

This thing is definitely not cc'd to the right people:

On Sun, Jul 9, 2017 at 10:08 PM, Cong Wang  wrote:
>
> Cc: Linus Torvalds 
> Cc: Andrew Morton 
> Cc: Manfred Spraul 
> Signed-off-by: Cong Wang 

Unlike your previous patch, this seems to be more of a generic netlink
interface issue, so you should primarily talk to the networking
people, I'm not sure it makes much sense to cc me/Andrew/Manfred.

 Linus

Re: [RFC PATCH 10/12] xdp: Add batching support to redirect map

2017-07-10 Thread John Fastabend

On 07/10/2017 10:53 AM, Jesper Dangaard Brouer wrote:
> On Fri, 07 Jul 2017 10:37:59 -0700
> John Fastabend  wrote:
> 
>> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
>> index 36dc13de..656e334 100644
>> --- a/kernel/bpf/devmap.c
>> +++ b/kernel/bpf/devmap.c
> [...]
>>  
>> +void __dev_map_insert_ctx(struct bpf_map *map, u32 key)
>> +{
>> +struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map);
>> +unsigned long *bitmap = this_cpu_ptr(dtab->flush_needed);
>> +
>> +set_bit(key, bitmap);
>> +}
> 
> I don't like that this adds an atomic op (set_bit) per packet on a fast-path.
> It shows up on a perf top #6 with xdp_redirect_map.
> 

It's a per-CPU bitmap, so __set_bit() should be fine here.

Thanks,
John

Re: [RFC PATCH 10/12] xdp: Add batching support to redirect map

2017-07-10 Thread Jesper Dangaard Brouer

On Fri, 07 Jul 2017 10:37:59 -0700
John Fastabend  wrote:

> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
> index 36dc13de..656e334 100644
> --- a/kernel/bpf/devmap.c
> +++ b/kernel/bpf/devmap.c
[...]
>  
> +void __dev_map_insert_ctx(struct bpf_map *map, u32 key)
> +{
> + struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map);
> + unsigned long *bitmap = this_cpu_ptr(dtab->flush_needed);
> +
> + set_bit(key, bitmap);
> +}

I don't like that this adds an atomic op (set_bit) per packet on a fast-path.
It shows up on a perf top #6 with xdp_redirect_map.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

Re: [RFC PATCH 03/12] xdp: add bpf_redirect helper function

2017-07-10 Thread John Fastabend

On 07/09/2017 06:37 AM, Saeed Mahameed wrote:
> 
> 
> On 7/7/2017 8:35 PM, John Fastabend wrote:
>> This adds support for a bpf_redirect helper function to the XDP
>> infrastructure. For now this only supports redirecting to the egress
>> path of a port.
>>
>> In order to support drivers handling a xdp_buff natively this patches
>> uses a new ndo operation ndo_xdp_xmit() that takes pushes a xdp_buff
>> to the specified device.
>>
>> If the program specifies either (a) an unknown device or (b) a device
>> that does not support the operation a BPF warning is thrown and the
>> XDP_ABORTED error code is returned.
>>
>> Signed-off-by: John Fastabend 
>> Acked-by: Daniel Borkmann 
>> ---

[...]

>>
>> +static int __bpf_tx_xdp(struct net_device *dev, struct xdp_buff *xdp)
>> +{
>> +if (dev->netdev_ops->ndo_xdp_xmit) {
>> +dev->netdev_ops->ndo_xdp_xmit(dev, xdp);
> 
> Hi John,
> 
> I have some concern here regarding synchronizing between the
> redirecting device and the target device:
> 
> if the target device's NAPI is also doing XDP_TX on the same XDP TX
> ring which this NDO might be redirecting xdp packets into the same
> ring, there would be a race accessing this ring resources (buffers
> and descriptors). Maybe you addressed this issue in the device driver
> implementation of this ndo or with some NAPI tricks/assumptions, I
> guess we have the same issue for if you run the same program to
> redirect traffic from multiple netdevices into one netdevice, how do
> you synchronize accessing this TX ring ?

The implementation uses a per cpu TX ring to resolve these races. And
the pair of driver interface API calls, xdp_do_redirect() and xdp_do_flush_map()
must be completed in a single poll() handler.

This comment was included in the header file to document this,

/* The pair of xdp_do_redirect and xdp_do_flush_map MUST be called in the
 * same cpu context. Further for best results no more than a single map
 * for the do_redirect/do_flush pair should be used. This limitation is
 * because we only track one map and force a flush when the map changes.
 * This does not appear to be a real limitation for existing software.
 */

In general some documentation about implementing XDP would probably be
useful to add in Documentation/networking but this IMO goes beyond just
this patch series.
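
To make the "one tracked map, forced flush on change" rule from the comment
concrete, here is a small userspace model. It is only a sketch of the behaviour
the comment describes; every type and function name is hypothetical, and it
does not reproduce the actual xdp_do_redirect()/xdp_do_flush_map() code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a redirect map; counts how often it was flushed. */
struct sketch_map {
	int id;
	unsigned int flushes;
};

/* Per-context (think: per-CPU, per-poll) state. Only ONE map is tracked. */
struct sketch_redirect_ctx {
	struct sketch_map *map_to_flush;
	unsigned int pending;
};

/* Flush whatever map is currently tracked, if any. */
void sketch_flush(struct sketch_redirect_ctx *ctx)
{
	if (!ctx->map_to_flush)
		return;
	ctx->map_to_flush->flushes++;
	ctx->map_to_flush = NULL;
	ctx->pending = 0;
}

/* Queue one packet for 'map'. If a different map is already tracked,
 * it has to be flushed first, which is why mixing maps costs batching. */
void sketch_redirect(struct sketch_redirect_ctx *ctx, struct sketch_map *map)
{
	assert(map != NULL);
	if (ctx->map_to_flush && ctx->map_to_flush != map)
		sketch_flush(ctx);
	ctx->map_to_flush = map;
	ctx->pending++;
}
```

In this model, three redirects to the same map followed by one flush cost a
single flush, while alternating between two maps forces a flush on every
switch.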

> 
> Maybe we need some clear guidelines in this ndo documentation stating
> how to implement this ndo and what are the assumptions on those XDP
> TX redirect rings or from which context this ndo can run.
> 
> can you please elaborate.

I think the best implementation is to use a per cpu TX ring as I did in
this series. If your device is limited by the number of queues for some
reason some other scheme would need to be devised. Unfortunately, the only
thing I've come up with for this case (using only this series) would both
impact performance and make the code complex.

A nice solution might be to constrain networking "tasks" to only a subset
of cores. For 64+ core systems this might be a good idea. It would allow
avoiding locking using per_cpu logic but also avoid networking consuming
slices of every core in the system. As core count goes up I think we will
eventually need to address this. I believe Eric was thinking along these
lines with his netconf talk iirc. Obviously this work is way outside the
scope of this series though.

> Thanks,
> Saeed.
>

Re: [PATCH v2 0/9] Remove spin_unlock_wait()

2017-07-10 Thread Manfred Spraul


Hi Alan,

On 07/08/2017 06:21 PM, Alan Stern wrote:

Pardon me for barging in, but I found this whole interchange extremely
confusing...

On Sat, 8 Jul 2017, Ingo Molnar wrote:


* Paul E. McKenney  wrote:


On Sat, Jul 08, 2017 at 10:35:43AM +0200, Ingo Molnar wrote:

* Manfred Spraul  wrote:


Hi Ingo,

On 07/07/2017 10:31 AM, Ingo Molnar wrote:

There's another, probably just as significant advantage: 
queued_spin_unlock_wait()
is 'read-only', while spin_lock()+spin_unlock() dirties the lock cache line. On
any bigger system this should make a very measurable difference - if
spin_unlock_wait() is ever used in a performance critical code path.

At least for ipc/sem:
Dirtying the cacheline (in the slow path) allows to remove a smp_mb() in the
hot path.
So for sem_lock(), I either need a primitive that dirties the cacheline or
sem_lock() must continue to use spin_lock()/spin_unlock().

This statement doesn't seem to make sense.  Did Manfred mean to write
"smp_mb()" instead of "spin_lock()/spin_unlock()"?

Option 1:
fastpath:
spin_lock(local_lock)
smp_mb(); [[1]]
smp_load_acquire(global_flag);
slow path:
global_flag = 1;
smp_mb();


Option 2:
fastpath:
spin_lock(local_lock);
smp_load_acquire(global_flag)
slow path:
global_flag = 1;
spin_lock(local_lock);spin_unlock(local_lock).

Rationale:
The ACQUIRE from spin_lock is at the read of local_lock, not at the write.
i.e.: Without the smp_mb() at [[1]], the CPU can do:
read local_lock;
read global_flag;
write local_lock;
For Option 2, the smp_mb() is not required, because fast path and slow 
path acquire the same lock.



Technically you could use spin_trylock()+spin_unlock() and avoid the lock 
acquire
spinning on spin_unlock() and get very close to the slow path performance of a
pure cacheline-dirtying behavior.

This is even more confusing.  Did Ingo mean to suggest using
"spin_trylock()+spin_unlock()" in place of "spin_lock()+spin_unlock()"
could provide the desired ordering guarantee without delaying other
CPUs that may try to acquire the lock?  That seems highly questionable.

I agree :-)

--
Manfred

Re: [Patch] mqueue: fix netlink sock refcnt and skb refcnt

2017-07-10 Thread Cong Wang

On Sun, Jul 9, 2017 at 10:08 PM, Cong Wang  wrote:
> netlink_sendskb() is problematic, it releases sock refcnt
> silently which could cause troubles we can call it multiple
> times. info->notify_sock is a good example where we
> setup once and use it to send netlink skb's for many times.
> It should not hold or release any refcnt, but needs to rely
> on netlink_attachskb()/netlink_detachskb() to hold/release
> the corresponding refcnt.
>
> Same for the skb attached to this sock, it is allocated once
> and used for multiple times, so we should hold its refcnt
> in netlink_attachskb().
>
> At last, we need to call netlink_detachskb() to release
> both refcnt's after we remove the notification.

Hmm, the info->notify_owner is NULL'ed after sending
the notification, so probably we don't put the sock refcnt
repeatedly. Not sure about the skb though...

Re: [PATCH net-next RFC 08/12] net: dsa: Remove support for bypass bridge port attributes/vlan set

2017-07-10 Thread Vivien Didelot

Hi Florian, Arkadi,

Florian Fainelli  writes:

> I would be more comfortable if we still had a way to dump HW entries by
> calling into drivers, because it's useful for debugging, and doing that
> using standard tools plus an additional flag for instance:
>
> bridge fdb show
>
> would dump the SW-maintained VLANs, but:
>
> bridge fdb show hw

I don't think this makes much sense because the network hardware
architecture must be abstracted to the user, who only deals with user
network interfaces. This is also only needed for development and debug.

> would dump the HW-maintained VLANs, or any flag name really, but the
> point is that we can keep using a standard tool to expose debugging
> features. debugfs (or something else) could be used, but was rejected by
> David.

IIRC, what David is opposed to is having a *driver specific* debugfs
interface, i.e. debugfs for only mv88e6xxx. But he was open to having
a *generic* debugfs interface in net/dsa/ from which any DSA chip
(e.g. lan9303, b53, qca8k, etc.) would benefit.

That makes even more sense for CPU and DSA ports which are not exposed
to userspace. Andrew played with dpipe but that wasn't quite satisfying
if I'm not mistaken. What we need for development and debug is a compile
time enableable interface to simply call into dsa_switch_ops directly.

So far here's what I have: port's VLAN, FDB, MDB, regs, stats, and
switch fabric topology (symlinks between interfaces): http://ix.io/yq0

I can split and submit this as RFC and see if I get slapped.

Thanks,

Vivien

[RESEND PATCH net] tap: convert a mutex to a spinlock

2017-07-10 Thread Cong Wang

We are not allowed to block on the RCU reader side, so can't
just hold the mutex as before. As a quick fix, convert it to
a spinlock.

Fixes: d9f1f61c0801 ("tap: Extending tap device create/destroy APIs")
Reported-by: Christian Borntraeger 
Tested-by: Christian Borntraeger 
Cc: Sainath Grandhi 
Signed-off-by: Cong Wang 
---
 drivers/net/tap.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 9af3239..3570c75 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -106,7 +106,7 @@ struct major_info {
struct rcu_head rcu;
dev_t major;
struct idr minor_idr;
-   struct mutex minor_lock;
+   spinlock_t minor_lock;
const char *device_name;
struct list_head next;
 };
@@ -416,15 +416,15 @@ int tap_get_minor(dev_t major, struct tap_dev *tap)
goto unlock;
}
 
-   mutex_lock(&tap_major->minor_lock);
-   retval = idr_alloc(&tap_major->minor_idr, tap, 1, TAP_NUM_DEVS, GFP_KERNEL);
+   spin_lock(&tap_major->minor_lock);
+   retval = idr_alloc(&tap_major->minor_idr, tap, 1, TAP_NUM_DEVS, GFP_ATOMIC);
if (retval >= 0) {
tap->minor = retval;
} else if (retval == -ENOSPC) {
netdev_err(tap->dev, "Too many tap devices\n");
retval = -EINVAL;
}
-   mutex_unlock(&tap_major->minor_lock);
+   spin_unlock(&tap_major->minor_lock);
 
 unlock:
rcu_read_unlock();
@@ -442,12 +442,12 @@ void tap_free_minor(dev_t major, struct tap_dev *tap)
goto unlock;
}
 
-   mutex_lock(&tap_major->minor_lock);
+   spin_lock(&tap_major->minor_lock);
if (tap->minor) {
idr_remove(&tap_major->minor_idr, tap->minor);
tap->minor = 0;
}
-   mutex_unlock(&tap_major->minor_lock);
+   spin_unlock(&tap_major->minor_lock);
 
 unlock:
rcu_read_unlock();
@@ -467,13 +467,13 @@ static struct tap_dev *dev_get_by_tap_file(int major, int 
minor)
goto unlock;
}
 
-   mutex_lock(&tap_major->minor_lock);
+   spin_lock(&tap_major->minor_lock);
tap = idr_find(&tap_major->minor_idr, minor);
if (tap) {
dev = tap->dev;
dev_hold(dev);
}
-   mutex_unlock(&tap_major->minor_lock);
+   spin_unlock(&tap_major->minor_lock);
 
 unlock:
rcu_read_unlock();
@@ -1244,7 +1244,7 @@ static int tap_list_add(dev_t major, const char 
*device_name)
tap_major->major = MAJOR(major);
 
idr_init(&tap_major->minor_idr);
-   mutex_init(&tap_major->minor_lock);
+   spin_lock_init(&tap_major->minor_lock);
 
tap_major->device_name = device_name;
 
-- 
2.5.5

Re: [PATCH] brcmfmac: added LED triggers for transmit/receive

2017-07-10 Thread Russell Joyce

> 1) I think most of it should be some cfg80211 shareable code.

I’m not sure exactly what you mean by this, could you please clarify?

> 2) This "rxtx" while surely present in other places sounds like a
> workaround for LED subsystem limitation. Maybe it's time to finally
> rework LED triggers.

I agree that it’s not an ideal way to do things, but I couldn’t think of a
better alternative. I think that having a combined trigger is useful though, for
situations like using the single LED on a Raspberry Pi to show Wi-Fi activity.


> On 10 Jul 2017, at 10:48, Rafał Miłecki  wrote:
> 
> On 7 July 2017 at 16:09, Russell Joyce  wrote:
>> Add three basic LED triggers to brcmfmac, based on those in mac80211: one
>> for transmit, one for receive, and one for combined transmit/receive.
>> 
>> Signed-off-by: Russell Joyce 
> 
> 1) I think most of it should be some cfg80211 shareable code.
> 2) This "rxtx" while surely present in other places sounds like a
> workaround for LED subsystem limitation. Maybe it's time to finally
> rework LED triggers.

Re: [PATCH] net: chelsio: cxgb3: constify attribute_group structures.

2017-07-10 Thread Joe Perches

On Mon, 2017-07-10 at 16:04 +0530, Arvind Yadav wrote:
> attribute_groups are not supposed to change at runtime. All functions
> working with attribute_groups provided by  work
> with const attribute_group. So mark the non-const structs as const.

I think it's good you are doing all of these.

Instead of individually sending these patches, could you
please send a patch series for all of these attribute_group
patches with a cover letter at the same time?

That could make it easier for a trivial maintainer to apply
all of these at once and not get some applied and others
ignored or dropped on the floor.

bug in "PCI: Support INTx masking on ConnectX-4 with firmware x.14.1100+"?

2017-07-10 Thread Denys Vlasenko

+   /* Reading from resource space should be 32b aligned */
+   fw_maj_min = ioread32be(fw_ver);
+   fw_sub_min = ioread32be(fw_ver + 1);
+   fw_major = fw_maj_min & 0x;
+   fw_minor = fw_maj_min >> 16;
+   fw_subminor = fw_sub_min & 0x;

Maybe second read should be ioread32be(fw_ver + 4)?
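
Whether `fw_ver + 1` or `fw_ver + 4` is right depends on the declared type of fw_ver, which the quoted hunk does not show. A minimal userspace sketch of the C rule involved (the helper names are hypothetical, not kernel code): arithmetic on a pointer to 32-bit words is scaled by the word size, while arithmetic on a byte pointer is not.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Byte distance between p and p + 1 for a pointer to 32-bit words.
 * C scales pointer arithmetic by the pointee size, so this is 4.
 */
static size_t stride_u32(const uint32_t *p)
{
    return (size_t)((const unsigned char *)(p + 1) - (const unsigned char *)p);
}

/* Same question for a byte pointer: here + 1 moves a single byte. */
static size_t stride_u8(const uint8_t *p)
{
    return (size_t)((p + 1) - p);
}
```

So if fw_ver points to 32-bit words, `+ 1` already reaches the next word; if it is a byte or void pointer, `+ 4` would be needed.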

Re: net: Fix inconsistent teardown and release of private netdev state.

2017-07-10 Thread Cong Wang

On Sun, Jul 9, 2017 at 7:07 PM, Jason A. Donenfeld  wrote:
> On Sat, Jul 8, 2017 at 12:39 AM, Cong Wang  wrote:
>> On Thu, Jul 6, 2017 at 7:24 AM, Jason A. Donenfeld  wrote:
>>> list_add(&priv->list, &list_of_things);
>>>
>>> ret = register_netdevice(); // if ret is < 0, then destruct above 
>>> is automatically called
>>>
>>> // RACE WITH LIST_ADD/LIST_DEL!! It's impossible to call list_add 
>>> only after
>>> // things are brought up successfully. This is problematic.
>>>
>>> if (!ret)
>>> pr_info("Yay it worked!\n");
>>
>> I fail to understand what you mean by RACE here.
>>
>> Here you should already have RTNL lock, so it can't race with any other
>> newlink() calls. In fact you can't acquire RTNL lock in your destructor
>> since register_netdevice() already gets it. Perhaps you mean
>> netdev_run_todo() calls it without RTNL lock?
>>
>> I don't know why you reorder the above list_add(), you can order it
>> as it was before, aka, call it after register_netdevice(), but you have to
>> init the priv->list now for the list_del() on error path.
>
> The race is that there's a state in which priv->list is part of
> list_of_things before the interface is actually successfully setup and
> ready to go.
>
> And no, it's not possible to order it _after_ register_netdevice,
> since register_netdevice might call priv_destructor, and
> priv_destructor calls list_del, so if it's not already on the list,
> we'll OOPS. In other words, API problem.


As I said, you have to initialize it, list_del() on an empty head
is literally a nop, why oops?
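
The point about an initialized list head can be illustrated with a small userspace stand-in for the kernel list. This is only a sketch: the type and function names are hypothetical, and the delete here re-initializes the node (closer to list_del_init()) rather than poisoning it as the kernel's list_del() does.

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head (illustration only). */
struct list_node {
    struct list_node *next, *prev;
};

/* Like INIT_LIST_HEAD(): an empty node points at itself. */
static void node_init(struct list_node *n)
{
    n->next = n;
    n->prev = n;
}

/* Like list_add(): insert n right after head. */
static void node_add(struct list_node *n, struct list_node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink n from its neighbours, then leave it self-linked. */
static void node_del(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    node_init(n);
}

/* A head with no entries points back at itself. */
static int node_empty(const struct list_node *head)
{
    return head->next == head;
}
```

With a self-linked node, the two unlink assignments only write the node's own pointers back to itself, so deleting a node that was initialized but never added changes nothing. Without the initialization, the same assignments would dereference whatever happened to be in next and prev.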

Re: [PATCH iproute2 V3 3/4] rdma: Add link object

2017-07-10 Thread Leon Romanovsky

On Mon, Jul 10, 2017 at 10:13:07AM +0200, Jiri Pirko wrote:
> Tue, Jul 04, 2017 at 09:55:40AM CEST, l...@kernel.org wrote:
> >From: Leon Romanovsky 
> >
> >Link (port) object represent struct ib_port to the user space.
> >
> >Link properties:
> > * Port capabilities
> > * IB subnet prefix
> > * LID, SM_LID and LMC
> > * Port state
> > * Physical state
> >
> >Signed-off-by: Leon Romanovsky 
> >---
> > rdma/Makefile |   2 +-
> > rdma/link.c   | 280 
> > ++
> > rdma/rdma.c   |   3 +-
> > rdma/utils.c  |   5 ++
> > 4 files changed, 288 insertions(+), 2 deletions(-)
> > create mode 100644 rdma/link.c
> >
> >diff --git a/rdma/Makefile b/rdma/Makefile
> >index 123d7ac5..1a9e4b1a 100644
> >--- a/rdma/Makefile
> >+++ b/rdma/Makefile
> >@@ -2,7 +2,7 @@ include ../Config
> >
> > ifeq ($(HAVE_MNL),y)
> >
> >-RDMA_OBJ = rdma.o utils.o dev.o
> >+RDMA_OBJ = rdma.o utils.o dev.o link.o
> >
> > TARGETS=rdma
> > CFLAGS += $(shell $(PKG_CONFIG) libmnl --cflags)
> >diff --git a/rdma/link.c b/rdma/link.c
> >new file mode 100644
> >index ..f92b4cef
> >--- /dev/null
> >+++ b/rdma/link.c
> >@@ -0,0 +1,280 @@
> >+/*
> >+ * link.c   RDMA tool
> >+ *
> >+ *  This program is free software; you can redistribute it 
> >and/or
> >+ *  modify it under the terms of the GNU General Public License
> >+ *  as published by the Free Software Foundation; either version
> >+ *  2 of the License, or (at your option) any later version.
> >+ *
> >+ * Authors: Leon Romanovsky 
> >+ */
> >+
> >+#include "rdma.h"
> >+
> >+static int link_help(struct rdma *rd)
> >+{
> >+pr_out("Usage: %s link show [DEV/PORT_INDEX]\n", rd->filename);
> >+return 0;
> >+}
> >+
> >+static void link_print_caps(struct nlattr **tb)
> >+{
> >+uint64_t caps;
> >+uint32_t idx;
> >+
> >+/*
> >+ * FIXME: move to indexes when kernel will start exporting them.
>
> Not exported yet?

Not yet, I want to minimize the UAPI export from kernel before user-space
part is accepted.

>
>
>
> >+ */
> >+static const char *link_caps[64] = {
>
> []

That would require me to fill in all 64 fields.
In the current version, I'm relying on the fact that static storage is
initialized to zero (NULL).
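
A small sketch of the behaviour being relied on, with a shortened, hypothetical table (cap_names and cap_name are illustration names, not the patch's): a static array with an explicit size and fewer initializers has its remaining elements zero-initialized, so unnamed slots read as NULL. With an unsized `[]` declaration the array would only be as long as the initializer list, and the 64-iteration loop would read past its end.

```c
#include <stddef.h>

#define CAP_SLOTS 64

/* Only the first three slots are named; the C standard zero-initializes
 * the rest of a static array, so they are null pointers. */
static const char *cap_names[CAP_SLOTS] = {
    "UNKNOWN",
    "SM",
    "NOTICE",
};

/* Return the name for a bit index, falling back for unnamed or
 * out-of-range slots. */
static const char *cap_name(unsigned int idx)
{
    if (idx >= CAP_SLOTS || cap_names[idx] == NULL)
        return "UNKNOWN";
    return cap_names[idx];
}
```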

>
>
> >+"UNKNOWN",
> >+"SM",
> >+"NOTICE",
> >+"TRAP",
> >+"OPT_IPD",
> >+"AUTO_MIGR",
> >+"SL_MAP",
> >+"MKEY_NVRAM",
> >+"PKEY_NVRAM",
> >+"LED_INFO",
> >+"SM_DISABLED",
> >+"SYS_IMAGE_GUID",
> >+"PKEY_SW_EXT_PORT_TRAP",
> >+"UNKNOWN",
> >+"EXTENDED_SPEEDS",
> >+"UNKNOWN",
> >+"CM",
> >+"SNMP_TUNNEL",
> >+"REINIT",
> >+"DEVICE_MGMT",
> >+"VENDOR_CLASS",
> >+"DR_NOTICE",
> >+"CAP_MASK_NOTICE",
> >+"BOOT_MGMT",
> >+"LINK_LATENCY",
> >+"CLIENT_REG",
> >+"IP_BASED_GIDS",
> >+};
> >+
> >+if (!tb[RDMA_NLDEV_ATTR_CAP_FLAGS])
> >+return;
> >+
> >+caps = mnl_attr_get_u64(tb[RDMA_NLDEV_ATTR_CAP_FLAGS]);
> >+
> >+pr_out("\ncaps: <");
> >+for (idx = 0; idx < 64; idx++) {
> >+if (caps & 0x1) {
> >+pr_out("%s", link_caps[idx]?link_caps[idx]:"UNKNONW");
>
> "link_caps[idx] ? link_caps[idx] : "UNKNOWN""
>
> note the s/UNKNONW/UNKNOWN/

Right

>
>
> >+if (caps >> 0x1)
> >+pr_out(", ");
> >+}
> >+caps >>= 0x1;
>
> Interesting.
>
>
> >+}
> >+
> >+pr_out(">");
> >+}
> >+
> >+static void link_print_subnet_prefix(struct nlattr **tb)
> >+{
> >+uint64_t subnet_prefix;
> >+uint16_t sp[4];
> >+
> >+if (!tb[RDMA_NLDEV_ATTR_SUBNET_PREFIX])
> >+return;
> >+
> >+subnet_prefix = mnl_attr_get_u64(tb[RDMA_NLDEV_ATTR_SUBNET_PREFIX]);
> >+memcpy(sp, &subnet_prefix, sizeof(uint64_t));
> >+pr_out("subnet_prefix %04x:%04x:%04x:%04x ", sp[3], sp[2], sp[1], 
> >sp[0]);
>
> You have similar pr_out helper in the previous patch. Perhaps you can
> re-use it?

Sure
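
One possible shape for such a shared helper, as a hedged sketch only (format_prefix is a hypothetical name, and it is an assumption that the attribute value is already in host order): extracting the four 16-bit groups by shifting gives the same text on any host, whereas the memcpy() into a uint16_t array in the quoted code depends on the host byte order.

```c
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit prefix as four 16-bit hex groups, most significant first.
 * Shifting works on the value, so the result does not depend on byte order. */
static int format_prefix(char *buf, size_t len, uint64_t prefix)
{
    return snprintf(buf, len, "%04x:%04x:%04x:%04x",
                    (unsigned int)((prefix >> 48) & 0xffff),
                    (unsigned int)((prefix >> 32) & 0xffff),
                    (unsigned int)((prefix >> 16) & 0xffff),
                    (unsigned int)(prefix & 0xffff));
}
```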

>
>
> >+}
> >+
> >+static void link_print_lid(struct nlattr **tb)
> >+{
> >+if (!tb[RDMA_NLDEV_ATTR_LID])
> >+return;
> >+
> >+pr_out("lid %u ",
> >+   mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_LID]));
> >+}
> >+
> >+static void link_print_sm_lid(struct nlattr **tb)
> >+{
> >+
>
> Avoid the extra empty line.
>

Will remove

>
> >+if (!tb[RDMA_NLDEV_ATTR_SM_LID])
> >+return;
> >+
> >+pr_out("sm_lid %u ",
> >+   mnl_attr_get_u32(tb[RDMA_NLDEV_ATTR_SM_LID]));
> >+}
> >+
> >+static void link_print_lmc(struct nlattr **tb)
> >+{
> >+if (!tb[RDMA_NLDEV_ATTR_LMC])
> >+return;
> >+
> >+pr_out("lmc %u ", mnl_attr_get_u8(tb[RDMA_NLDEV_ATTR_LMC]));
> >+}
> >+
> >+static void link_print_state(struct

Re: [PATCH iproute2 V3 0/4] RDMAtool

2017-07-10 Thread Leon Romanovsky

On Mon, Jul 10, 2017 at 10:02:30AM +0200, Jiri Pirko wrote:
> Tue, Jul 04, 2017 at 09:55:37AM CEST, l...@kernel.org wrote:
> >Hi,
> >
> >This is the third version of the series implementing RDMAtool, the tool
> >to configure RDMA devices. The initial proposal was sent as RFC [1] and
> >was based on sysfs entries as POC.
> >
> >The current series was rewritten completely to work with RDMA netlinks as
> >a source of user<->kernel communications. In order to achieve that, the
> >RDMA netlinks were extensively refactored and modernized [2, 3, 4 and 5].
> >
> >The following is an example of various runs on my machine with 5 devices
> >(4 in IB mode and one in Ethernet mode)
> >
> >### Without parameters
> >$ rdma
> >Usage: rdma [ OPTIONS ] OBJECT { COMMAND | help }
> >where  OBJECT := { dev | link | help }
> >   OPTIONS := { -V[ersion] | -d[etails]}
>
> What about json output? You will need it sooner than later. It will
> prevent you from a lot of headaches if you implement it right away.
> Lesson learned...

I'm planning to do it in the coming kernel cycle.

Thanks


Re: [PATCH] dpaa_eth: use correct device for DMA mapping API

2017-07-10 Thread Robin Murphy

On 10/07/17 16:14, Arnd Bergmann wrote:
> Geert Uytterhoeven ran into a build error without CONFIG_HAS_DMA,
> as a result of the driver calling set_dma_ops(). While we can
> fix the build error in the dma-mapping implementation, there is
> another problem in this driver:
> 
> The configuration for the DMA is done by the platform code,
> looking up information about the system from the device tree.
> This copies the information only in an incomplete way, setting
> the dma_map_ops and forcing a specific mask, but ignoring all
> settings regarding IOMMU, coherence etc.
> 
> A better way to avoid the problem is to only ever pass a device
> into the dma_mapping implementation that has been setup by the
> platform code. In this case, that is the parent device, so we
> can get that pointer at probe time. Fortunately, we already have
> a pointer in the device specific structure for that, so we only
> need to modify that.
> 
> Fixes: fb52728a9294 ("dpaa_eth: reuse the dma_ops provided by the FMan MAC 
> device")
> Signed-off-by: Arnd Bergmann 

Acked-by: Robin Murphy 

Assuming all the DPAA subcomponents have the same DMA capabilities
(which to the best of my knowledge they probably do), this is a neater
approach than what I started a while back in the context of cleaning up
arch_setup_dma_ops() abuse. FWIW, that looked like this:

->8-
diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
index e2ca107f9d94..8eef0db5db30 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
@@ -2539,10 +2539,9 @@ static int dpaa_eth_probe(struct platform_device
*pdev)
priv->buf_layout[TX].priv_data_size = DPAA_TX_PRIV_DATA_SIZE; /*
Tx */

/* device used for DMA mapping */
-   arch_setup_dma_ops(dev, 0, 0, NULL, false);
-   err = dma_coerce_mask_and_coherent(dev, DMA_BIT_MASK(40));
+   err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
if (err) {
-   dev_err(dev, "dma_coerce_mask_and_coherent() failed\n");
+   dev_err(dev, "dma_set_mask_and_coherent() failed\n");
goto dev_mask_failed;
}

diff --git a/drivers/net/ethernet/freescale/fman/mac.c
b/drivers/net/ethernet/freescale/fman/mac.c
index 0b31f8502ada..c81efbfa99c2 100644
--- a/drivers/net/ethernet/freescale/fman/mac.c
+++ b/drivers/net/ethernet/freescale/fman/mac.c
@@ -627,6 +627,10 @@ static struct platform_device
*dpaa_eth_add_device(int fman_id,
if (ret)
goto err;

+   ret = of_dma_configure(&pdev->dev, node);
+   if (ret)
+   goto err;
+
ret = platform_device_add(pdev);
if (ret)
goto err;
-8<-

We might possibly need to revisit this if and when the question of IOMMU
support in the fsl-mc driver comes up (all this stuff is completely
paged out of my head at the moment, so I'm not certain), but for now I
don't see any reason not to go with your patch.

Robin.

> ---
> Not tested, please see if this works before applying!
> ---
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 11 ++-
>  1 file changed, 2 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c 
> b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> index 757b873735a5..f7b0b928cd53 100644
> --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> @@ -2646,14 +2646,6 @@ static int dpaa_eth_probe(struct platform_device *pdev)
>   priv->buf_layout[RX].priv_data_size = DPAA_RX_PRIV_DATA_SIZE; /* Rx */
>   priv->buf_layout[TX].priv_data_size = DPAA_TX_PRIV_DATA_SIZE; /* Tx */
>  
> - /* device used for DMA mapping */
> - set_dma_ops(dev, get_dma_ops(&pdev->dev));
> - err = dma_coerce_mask_and_coherent(dev, DMA_BIT_MASK(40));
> - if (err) {
> - dev_err(dev, "dma_coerce_mask_and_coherent() failed\n");
> - goto dev_mask_failed;
> - }
> -
>   /* bp init */
>   for (i = 0; i < DPAA_BPS_NUM; i++) {
>   int err;
> @@ -2665,7 +2657,8 @@ static int dpaa_eth_probe(struct platform_device *pdev)
>   dpaa_bps[i]->raw_size = bpool_buffer_raw_size(i, DPAA_BPS_NUM);
>   /* avoid runtime computations by keeping the usable size here */
>   dpaa_bps[i]->size = dpaa_bp_size(dpaa_bps[i]->raw_size);
> - dpaa_bps[i]->dev = dev;
> + /* DMA operations are done on the platform-provided device */
> + dpaa_bps[i]->dev = dev->parent;
>  
>   err = dpaa_bp_alloc_pool(dpaa_bps[i]);
>   if (err < 0) {
>

Re: [PATCH v2] dt-bindings: net: ravb : Add support for r8a7743 SoC

2017-07-10 Thread Geert Uytterhoeven

On Mon, Jul 10, 2017 at 5:32 PM, Biju Das  wrote:
> Add a new compatible string for the RZ/G1M (R8A7743) SoC.
>
> Signed-off-by: Biju Das 

Reviewed-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

[PATCH v2] dt-bindings: net: ravb : Add support for r8a7743 SoC

2017-07-10 Thread Biju Das

Add a new compatible string for the RZ/G1M (R8A7743) SoC.

Signed-off-by: Biju Das 
---
v1->v2
* Changed the subject
* re-formatted the required properties

 .../devicetree/bindings/net/renesas,ravb.txt   | 29 +-
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/renesas,ravb.txt 
b/Documentation/devicetree/bindings/net/renesas,ravb.txt
index b519503..4717bc2 100644
--- a/Documentation/devicetree/bindings/net/renesas,ravb.txt
+++ b/Documentation/devicetree/bindings/net/renesas,ravb.txt
@@ -4,19 +4,24 @@ This file provides information on what the device node for 
the Ethernet AVB
 interface contains.
 
 Required properties:
-- compatible: "renesas,etheravb-r8a7790" if the device is a part of R8A7790 
SoC.
- "renesas,etheravb-r8a7791" if the device is a part of R8A7791 SoC.
- "renesas,etheravb-r8a7792" if the device is a part of R8A7792 SoC.
- "renesas,etheravb-r8a7793" if the device is a part of R8A7793 SoC.
- "renesas,etheravb-r8a7794" if the device is a part of R8A7794 SoC.
- "renesas,etheravb-r8a7795" if the device is a part of R8A7795 SoC.
- "renesas,etheravb-r8a7796" if the device is a part of R8A7796 SoC.
- "renesas,etheravb-rcar-gen2" for generic R-Car Gen 2 compatible 
interface.
- "renesas,etheravb-rcar-gen3" for generic R-Car Gen 3 compatible 
interface.
+- compatible: Must contain one or more of the following:
+  - "renesas,etheravb-r8a7743" for the R8A7743 SoC.
+  - "renesas,etheravb-r8a7790" for the R8A7790 SoC.
+  - "renesas,etheravb-r8a7791" for the R8A7791 SoC.
+  - "renesas,etheravb-r8a7792" for the R8A7792 SoC.
+  - "renesas,etheravb-r8a7793" for the R8A7793 SoC.
+  - "renesas,etheravb-r8a7794" for the R8A7794 SoC.
+  - "renesas,etheravb-rcar-gen2" as a fallback for the above
+   R-Car Gen2 and RZ/G1 devices.
 
- When compatible with the generic version, nodes must list the
- SoC-specific version corresponding to the platform first
- followed by the generic version.
+  - "renesas,etheravb-r8a7795" for the R8A7795 SoC.
+  - "renesas,etheravb-r8a7796" for the R8A7796 SoC.
+  - "renesas,etheravb-rcar-gen3" as a fallback for the above
+   R-Car Gen3 devices.
+
+   When compatible with the generic version, nodes must list the
+   SoC-specific version corresponding to the platform first followed by
+   the generic version.
 
 - reg: offset and length of (1) the register block and (2) the stream buffer.
 - interrupts: A list of interrupt-specifiers, one for each entry in
-- 
1.9.1

Re: [PATCH] net/mlx5: IPSec, fix 64-bit division correctly

2017-07-10 Thread Joe Perches

On Mon, 2017-07-10 at 10:24 +, Ilan Tayari wrote:
> > -Original Message-
> > From: Arnd Bergmann [mailto:a...@arndb.de]
> > Subject: [PATCH] net/mlx5: IPSec, fix 64-bit division correctly
> > 
> > The new IPSec offload code introduced a build error:
> > 
> > drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.o: In function
> > `mlx5e_ipsec_build_inverse_table':
> > ipsec_rxtx.c:(.text+0x556): undefined reference
> > 
> > Another patch was added on top to fix the build error, but
> > that introduced a new bug, as we now use the remainder of
> > the division rather than the result.

Is it possible to return noise in mlx5e_ipsec_mss_inv ?

What clamps skb_shinfo(skb)->gso_size to MAX_LSO_MSS
(the size of mlx5e_ipsec_inverse_table)?

#define MAX_LSO_MSS 2048
static __be16 mlx5e_ipsec_inverse_table[MAX_LSO_MSS];

static inline __be16 mlx5e_ipsec_mss_inv(struct sk_buff *skb)
{
return mlx5e_ipsec_inverse_table[skb_shinfo(skb)->gso_size];
}
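
To make the concern concrete, here is a hedged userspace sketch of a lookup that cannot read past the table. The names inverse_table and mss_inv_checked are hypothetical stand-ins, and returning 0 for an out-of-range size is an arbitrary choice for the sketch, not something the driver is known to do.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_LSO_MSS 2048

/* Stand-in for mlx5e_ipsec_inverse_table; contents are not modeled here. */
static uint16_t inverse_table[MAX_LSO_MSS];

/* Look up the entry for an MSS value, returning 0 when the index would
 * fall outside the table. The fallback value is an arbitrary choice for
 * this sketch. */
static uint16_t mss_inv_checked(unsigned int gso_size)
{
    if (gso_size >= MAX_LSO_MSS)
        return 0;
    return inverse_table[gso_size];
}
```

Without such a check, or a guarantee elsewhere that gso_size stays below MAX_LSO_MSS, the unchecked index would read whatever memory follows the table.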

Re: [PATCH] dpaa_eth: use correct device for DMA mapping API

2017-07-10 Thread Geert Uytterhoeven

Hi Arnd,

On Mon, Jul 10, 2017 at 5:14 PM, Arnd Bergmann  wrote:
> Geert Uytterhoeven ran into a build error without CONFIG_HAS_DMA,
> as a result of the driver calling set_dma_ops(). While we can
> fix the build error in the dma-mapping implementation, there is
> another problem in this driver:
>
> The configuration for the DMA is done by the platform code,
> looking up information about the system from the device tree.
> This copies the information only in an incomplete way, setting
> the dma_map_ops and forcing a specific mask, but ignoring all
> settings regarding IOMMU, coherence etc.
>
> A better way to avoid the problem is to only ever pass a device
> into the dma_mapping implementation that has been setup by the
> platform code. In this case, that is the parent device, so we
> can get that pointer at probe time. Fortunately, we already have
> a pointer in the device specific structure for that, so we only
> need to modify that.

Thank you, that looks like a much better solution!

> Fixes: fb52728a9294 ("dpaa_eth: reuse the dma_ops provided by the FMan MAC 
> device")
> Signed-off-by: Arnd Bergmann 

Acked-by: Geert Uytterhoeven 

> Not tested, please see if this works before applying!

Indeed, please test first.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCH/RFC] dma-mapping: Provide dummy set_dma_ops() for NO_DMA=y

2017-07-10 Thread Robin Murphy

On 10/07/17 15:56, Christoph Hellwig wrote:
> This looks reasonable to me, I'd be happy to pick it up.  Can you send
> it as a series with the reverts?

The fact remains that the FSL driver is still doing the wrong thing
though - set_dma_ops(dev1, get_dma_ops(dev2)) is just a hack which
happens to work on platforms which don't keep other arch-internal DMA
info as well. I did start writing a patch somewhere to fix this thing to
actually do proper DMA configuration (originally in the context of not
abusing arch_setup_dma_ops()), but Arnd's fix is probably simpler.

I don't think it makes an awful lot of sense for code without a DMA API
dependency to be calling set_dma_ops() - AFAIU the only reason it's
available to drivers at all is for particular RDMA cases which know that
their "DMA" is actually done by CPU threads, and want to optimise for that.

Robin.

Re: [iproute PATCH] ip netns: Make sure netns name is sane

2017-07-10 Thread Stephen Hemminger

On Mon, 10 Jul 2017 13:19:12 +0200
Phil Sutter  wrote:

> +static bool is_basename(const char *name)
> +{
> + char *name_dup = strdup(name);
> + bool rc = true;
> +
> + if (!name_dup)
> + return false;
> +
> + if (strcmp(basename(name_dup), name))
> + rc = false;
> +
> + free(name_dup);
> + return rc;
> +}

Why not just:

static bool is_basename(const char *name)
{
return strchr(name, '/') == NULL;
}
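
The two approaches can be compared side by side in a small userspace sketch. The function names below are hypothetical, and the copy helper replaces strdup() only to keep the example portable; the first variant follows the logic of the quoted patch, the second the strchr() suggestion.

```c
#include <libgen.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Private copy of a string, since basename() may modify its argument. */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);

    if (p)
        memcpy(p, s, n);
    return p;
}

/* Variant from the patch: compare the name with its own basename(). */
static bool is_basename_dup(const char *name)
{
    char *name_dup = copy_string(name);
    bool rc;

    if (!name_dup)
        return false;
    rc = strcmp(basename(name_dup), name) == 0;
    free(name_dup);
    return rc;
}

/* Suggested variant: a name is a plain basename if it has no '/'. */
static bool is_basename_chr(const char *name)
{
    return strchr(name, '/') == NULL;
}
```

For ordinary names and for names containing a slash the two agree. They may differ on edge cases such as the empty string, depending on which basename() implementation is in use, so that case would need an explicit decision either way.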

[PATCH] dpaa_eth: use correct device for DMA mapping API

2017-07-10 Thread Arnd Bergmann

Geert Uytterhoeven ran into a build error without CONFIG_HAS_DMA,
as a result of the driver calling set_dma_ops(). While we can
fix the build error in the dma-mapping implementation, there is
another problem in this driver:

The configuration for the DMA is done by the platform code,
looking up information about the system from the device tree.
This copies the information only in an incomplete way, setting
the dma_map_ops and forcing a specific mask, but ignoring all
settings regarding IOMMU, coherence etc.

A better way to avoid the problem is to only ever pass a device
into the dma_mapping implementation that has been setup by the
platform code. In this case, that is the parent device, so we
can get that pointer at probe time. Fortunately, we already have
a pointer in the device specific structure for that, so we only
need to modify that.

Fixes: fb52728a9294 ("dpaa_eth: reuse the dma_ops provided by the FMan MAC 
device")
Signed-off-by: Arnd Bergmann 
---
Not tested, please see if this works before applying!
---
 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 11 ++-
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c 
b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
index 757b873735a5..f7b0b928cd53 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
@@ -2646,14 +2646,6 @@ static int dpaa_eth_probe(struct platform_device *pdev)
priv->buf_layout[RX].priv_data_size = DPAA_RX_PRIV_DATA_SIZE; /* Rx */
priv->buf_layout[TX].priv_data_size = DPAA_TX_PRIV_DATA_SIZE; /* Tx */
 
-   /* device used for DMA mapping */
-   set_dma_ops(dev, get_dma_ops(&pdev->dev));
-   err = dma_coerce_mask_and_coherent(dev, DMA_BIT_MASK(40));
-   if (err) {
-   dev_err(dev, "dma_coerce_mask_and_coherent() failed\n");
-   goto dev_mask_failed;
-   }
-
/* bp init */
for (i = 0; i < DPAA_BPS_NUM; i++) {
int err;
@@ -2665,7 +2657,8 @@ static int dpaa_eth_probe(struct platform_device *pdev)
dpaa_bps[i]->raw_size = bpool_buffer_raw_size(i, DPAA_BPS_NUM);
/* avoid runtime computations by keeping the usable size here */
dpaa_bps[i]->size = dpaa_bp_size(dpaa_bps[i]->raw_size);
-   dpaa_bps[i]->dev = dev;
+   /* DMA operations are done on the platform-provided device */
+   dpaa_bps[i]->dev = dev->parent;
 
err = dpaa_bp_alloc_pool(dpaa_bps[i]);
if (err < 0) {
-- 
2.9.0

Re: CAN-FD Transceiver Limitations

2017-07-10 Thread Marc Kleine-Budde

On 06/29/2017 05:41 PM, Andrew Lunn wrote:
>> Transceivers for CAN are not a part of any model. Traditional CAN didn't
>> have a problem because all transceivers from my understanding supported
>> the maximum speed of 1 Mbps defined by the spec. However, with the
>> introduction of CAN Flexible Datarate mode it seems that for
>> transceivers that supported CAN-FD the maximum supported speeds vary.
> 
> So transceivers are dumb devices, nothing to configure, so no need to
> have a driver for them.

Yes and no.

CAN transceivers are usually quite dumb, but most of them have some sort
of "enable" pin. This pin is currently modelled as a regulator, which
fits nicely, as there are dual transceivers with only one enable pin.

However there are more complicated transceivers with two pins, that
implement a state machine, where you can query the chip for various
error conditions and can configure remote wakeup, etc... So in the
future a proper driver might be implemented.

Marc

-- 
Pengutronix e.K.  | Marc Kleine-Budde   |
Industrial Linux Solutions| Phone: +49-231-2826-924 |
Vertretung West/Dortmund  | Fax:   +49-5121-206917- |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |

Re: [PATCH/RFC] dma-mapping: Provide dummy set_dma_ops() for NO_DMA=y

2017-07-10 Thread Christoph Hellwig

This looks reasonable to me, I'd be happy to pick it up.  Can you send
it as a series with the reverts?

Re: [PATCH net] mdio: mux: fix parsing mux registers outside of the PHY address range

2017-07-10 Thread Andrew Lunn

> To be clear, are you suggesting that we add an additional property to
> of_mdio_parse_addr() that specifies the limit to check against, or
> remove the check and add it to each additional caller?

Hi Jon

Probably the simplest is to add an extra parameter to mdio_mux_init()
which is the maximum allowed reg value.

We should not touch of_mdio_parse_addr(). reg is not an mdio
address. It is a count of gpios, or a value to be poked into an
register.

  Andrew
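
A hedged sketch of what a per-driver limit check might look like, and of why the reported values appear as they do in the log. The helper names are hypothetical, the way the limit would be passed to mdio_mux_init() is an assumption, and PHY_MAX_ADDR is defined locally to mirror the usual 32-address MDIO bus range.

```c
#include <stdbool.h>
#include <stdint.h>

#define PHY_MAX_ADDR 32 /* number of addresses on one MDIO bus */

/* Check a mux child's "reg" value against a limit chosen by the specific
 * mux driver rather than against the PHY address range. */
static bool mux_reg_valid(uint32_t reg, uint32_t max_reg)
{
    return reg <= max_reg;
}

/* How a u32 "reg" looks when it is handled as a signed 32-bit int. Done
 * with explicit arithmetic so the result does not rely on
 * implementation-defined conversion. */
static int64_t reg_as_signed(uint32_t reg)
{
    return reg > INT32_MAX ? (int64_t)reg - 4294967296LL : (int64_t)reg;
}
```

Read as a signed int, 0xe40908ff becomes the -469169921 seen in the error message, and both register values are far outside the PHY address range, which is why a mux whose "reg" is a register value needs its own limit.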

Re: [PATCH net] mdio: mux: fix parsing mux registers outside of the PHY address range

2017-07-10 Thread Jon Mason

On Mon, Jul 10, 2017 at 8:56 AM, Andrew Lunn  wrote:
> On Mon, Jul 10, 2017 at 02:35:23PM +0200, Martin Blumenstingl wrote:
>> mdio_mux_init parses the child nodes of the MDIO mux. When using
>> "mdio-mux-mmioreg" the child nodes are describing the register value
>> that is written to switch between the MDIO busses.
>>
>> The change which makes the error messages more verbose changed the
>> parsing of the "reg" property from a simple of_property_read_u32 call
>> to of_mdio_parse_addr. On a Khadas VIM (based on the Meson GXL SoC,
>> which uses mdio-mux-mmioreg) this prevents registering the MDIO mux
>> (because the "reg" values on the MDIO mux child nodes are 0x2009087f
>> and 0xe40908ff) and leads to the following errors:
>>   mdio-mux-mmioreg c883455c.eth-phy-mux: 
>> /soc/periphs@c8834000/eth-phy-mux/mdio@e40908ff PHY address -469169921 is 
>> too large
>>   mdio-mux-mmioreg c883455c.eth-phy-mux: Error: Failed to find reg for child 
>> /soc/periphs@c8834000/eth-phy-mux/mdio@e40908ff
>>   mdio-mux-mmioreg c883455c.eth-phy-mux: 
>> /soc/periphs@c8834000/eth-phy-mux/mdio@2009087f PHY address 537462911 is too 
>> large
>>   mdio-mux-mmioreg c883455c.eth-phy-mux: Error: Failed to find reg for child 
>> /soc/periphs@c8834000/eth-phy-mux/mdio@2009087f
>>   mdio-mux-mmioreg c883455c.eth-phy-mux: Error: No acceptable child buses 
>> found
>>   mdio-mux-mmioreg c883455c.eth-phy-mux: failed to register mdio-mux bus 
>> /soc/periphs@c8834000/eth-phy-mux
>> (as a result of that ethernet is not working, because the PHY which is
>> connected through the mux' child MDIO bus, which is not being
>> registered).
>>
>> Fix this by reverting the change from of_property_read_u32 to
>> of_mdio_parse_addr.
>
> Reviewed-by: Andrew Lunn 
>
> Yes, validating the reg property needs to be done separately in each
> user of the generic mdio-mux code. The reg for the gpio mux must be <=
> number of gpios, mmioreg must be somewhere within the address space,
> bcm-iproc < 1024?
>
> Jon, please feel free to add such code.

To be clear, are you suggesting that we add an additional property to
of_mdio_parse_addr() that specifies the limit to check against, or
remove the check and add it to each additional caller?

Thanks,
Jon

>
> Andrew

Re: [GIT] Networking

2017-07-10 Thread Herbert Xu

On Mon, Jul 10, 2017 at 08:10:02AM -0400, Sowmini Varadhan wrote:
>
> The reason that the WARN_ON is triggered is that af_alg_accept() calls
> sock_init_data() which does 
> 
>2636 if (sock) {
> :
>2639 sock->sk=   sk;

Oh yes.  This started out with just sock_init_data which would
not have triggered your warning.  Then someone came along and
added sock_graft which basically duplicates work already done in
sock_init_data.

In fact the reason they wanted sock_graft was purely to call
security_sock_graft to initialise some SELinux state.  So we
could avoid your warning by calling that directly.

---8<---
crypto: af_alg - Avoid sock_graft call warning

The newly added sock_graft warning triggers in af_alg_accept.
It's harmless as we're essentially doing sock->sk = sock->sk.

The sock_graft call is actually redundant because all the work
it does is subsumed by sock_init_data.  However, it was added
to placate SELinux as it uses it to initialise its internal state.

This patch avoids the warning by making the SELinux call directly.

Reported-by: Linus Torvalds 
Signed-off-by: Herbert Xu 

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 3556d8e..92a3d54 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -287,7 +287,7 @@ int af_alg_accept(struct sock *sk, struct socket *newsock, 
bool kern)
goto unlock;

sock_init_data(newsock, sk2);
-   sock_graft(sk2, newsock);
+   security_sock_graft(sk2, newsock);
security_sk_clone(sk, sk2);

err = type->accept(ask->private, sk2);
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

[PATCH net] cxgb4: fix BUG() on interrupt deallocating path of ULD

2017-07-10 Thread Guilherme G. Piccoli

Since the introduction of ULD (Upper-Layer Drivers), the MSI-X
deallocating path changed in cxgb4: the driver frees the interrupts
of ULD when unregistering it or on shutdown PCI handler.

Problem is that if a MSI-X is not freed before deallocated in the PCI
layer, it will trigger a BUG() due to still "alive" interrupt being
tentatively quiesced.

The below trace was observed when doing a simple unbind of Chelsio's
adapter PCI function, like:
  "echo 001e:80:00.4 > /sys/bus/pci/drivers/cxgb4/unbind"

Trace:

  kernel BUG at drivers/pci/msi.c:352!
  Oops: Exception in kernel mode, sig: 5 [#1]
  ...
  NIP [c05a5e60] free_msi_irqs+0xa0/0x250
  LR [c05a5e50] free_msi_irqs+0x90/0x250
  Call Trace:
  [c05a5e50] free_msi_irqs+0x90/0x250 (unreliable)
  [c05a72c4] pci_disable_msix+0x124/0x180
  [d00011e06708] disable_msi+0x88/0xb0 [cxgb4]
  [d00011e06948] free_some_resources+0xa8/0x160 [cxgb4]
  [d00011e06d60] remove_one+0x170/0x3c0 [cxgb4]
  [c058a910] pci_device_remove+0x70/0x110
  [c064ef04] device_release_driver_internal+0x1f4/0x2c0
  ...

This patch fixes the issue by refactoring the shutdown path of ULD on
cxgb4 driver, by properly freeing and disabling interrupts on PCI
remove handler too.

Fixes: 0fbc81b3ad51 ("Allocate resources dynamically for all cxgb4 ULD's")
Reported-by: Harsha Thyagaraja 
Signed-off-by: Guilherme G. Piccoli 
---

It might be important to add this patch to stable too if possible, v4.9+ .

 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 16 +++---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c  | 42 +++--
 2 files changed, 36 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c 
b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 86f92e31e8aa..e403fa18f1b1 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -2083,12 +2083,12 @@ static void detach_ulds(struct adapter *adap)
 
mutex_lock(&uld_mutex);
list_del(&adap->list_node);
+
for (i = 0; i < CXGB4_ULD_MAX; i++)
-   if (adap->uld && adap->uld[i].handle) {
+   if (adap->uld && adap->uld[i].handle)
adap->uld[i].state_change(adap->uld[i].handle,
 CXGB4_STATE_DETACH);
-   adap->uld[i].handle = NULL;
-   }
+
if (netevent_registered && list_empty(&adapter_list)) {
unregister_netevent_notifier(&cxgb4_netevent_nb);
netevent_registered = false;
@@ -5303,8 +5303,10 @@ static void remove_one(struct pci_dev *pdev)
 */
destroy_workqueue(adapter->workq);
 
-   if (is_uld(adapter))
+   if (is_uld(adapter)) {
detach_ulds(adapter);
+   t4_uld_clean_up(adapter);
+   }
 
disable_interrupts(adapter);
 
@@ -5385,7 +5387,11 @@ static void shutdown_one(struct pci_dev *pdev)
if (adapter->port[i]->reg_state == NETREG_REGISTERED)
cxgb_close(adapter->port[i]);
 
-   t4_uld_clean_up(adapter);
+   if (is_uld(adapter)) {
+   detach_ulds(adapter);
+   t4_uld_clean_up(adapter);
+   }
+
disable_interrupts(adapter);
disable_msi(adapter);
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c
index ec53fe9dec68..71a315bc1409 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c
@@ -589,22 +589,37 @@ void t4_uld_mem_free(struct adapter *adap)
kfree(adap->uld);
 }
 
+/* This function should be called with uld_mutex taken. */
+static void cxgb4_shutdown_uld_adapter(struct adapter *adap, enum cxgb4_uld type)
+{
+   if (adap->uld[type].handle) {
+   adap->uld[type].handle = NULL;
+   adap->uld[type].add = NULL;
+   release_sge_txq_uld(adap, type);
+
+   if (adap->flags & FULL_INIT_DONE)
+   quiesce_rx_uld(adap, type);
+
+   if (adap->flags & USING_MSIX)
+   free_msix_queue_irqs_uld(adap, type);
+
+   free_sge_queues_uld(adap, type);
+   free_queues_uld(adap, type);
+   }
+}
+
 void t4_uld_clean_up(struct adapter *adap)
 {
unsigned int i;
 
-   if (!adap->uld)
-   return;
+   mutex_lock(&uld_mutex);
for (i = 0; i < CXGB4_ULD_MAX; i++) {
if (!adap->uld[i].handle)
continue;
-   if (adap->flags & FULL_INIT_DONE)
-   quiesce_rx_uld(adap, i);
-   if (adap->flags & USING_MSIX)
-   free_msix_queue_irqs_uld(adap, i);
-

Re: [PATCH v1 1/1] dt-binding: ptp: Add SoC compatibility strings for dte ptp clock

2017-07-10 Thread Rob Herring

On Thu, Jul 06, 2017 at 10:37:57AM -0700, Arun Parameswaran wrote:
> Add SoC specific compatibility strings to the Broadcom DTE
> based PTP clock binding document.
> 
> Fixed the document heading and node name.
> 
> Fixes: 80d6076140b2 ("dt-binding: ptp: add bindings document for dte based ptp clock")
> Signed-off-by: Arun Parameswaran 
> ---
>  Documentation/devicetree/bindings/ptp/brcm,ptp-dte.txt | 15 +++
>  1 file changed, 11 insertions(+), 4 deletions(-)

Acked-by: Rob Herring

Re: [PATCH 2/2] net: ethernet: fsl: add phy reset after clk enable option

2017-07-10 Thread Rob Herring

On Thu, Jul 06, 2017 at 03:05:30PM +0200, Richard Leitner wrote:
> Some PHYs (for example the LAN8710) don't allow turning the clocks off
> and on again without reset (according to their datasheet). Exactly this
> behaviour was introduced for power saving reasons by commit e8fcfcd5684a
> ("net: fec: optimize the clock management to save power").
> Therefore add a devicetree option to perform a PHY reset and
> configuration after every clock enable.
> 
> For a better understanding here's an outline of the time response of the
> clock and reset lines before and after this patch:
> 
> v--fec_probe()  v--fec_enet_open()
> v   v
>   w/o patch eCLK: ___|
>   w/o patch nRST: __--
>   w/o patch CONF: ___XX___
> 
>   w/  patch eCLK: ___|
>   w/  patch nRST: __--__--
>   w/  patch CONF: ___XX__XX___
> ^   ^
> ^--fec_probe()  ^--fec_enet_open()
> 
> In our case this problem does occur at about every 10th to 50th POR of
> an LAN8710 connected to an i.MX6 SoC. The typical symptom of this
> problem is a "swinging" ethernet link. Similar issues were experienced
> by users of the NXP forum:
>   https://community.nxp.com/thread/389902
>   https://community.nxp.com/message/309354
> With this patch applied the issue didn't occur for at least a few
> hundred PORs of our board.
> 
> Fixes: e8fcfcd5684a ("net: fec: optimize the clock management to sa...")
> Signed-off-by: Richard Leitner 
> ---
>  Documentation/devicetree/bindings/net/fsl-fec.txt |  3 +++
>  drivers/net/ethernet/freescale/fec.h  |  1 +
>  drivers/net/ethernet/freescale/fec_main.c | 16 
>  3 files changed, 20 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/net/fsl-fec.txt 
> b/Documentation/devicetree/bindings/net/fsl-fec.txt
> index 6f55bdd..1766579 100644
> --- a/Documentation/devicetree/bindings/net/fsl-fec.txt
> +++ b/Documentation/devicetree/bindings/net/fsl-fec.txt
> @@ -23,6 +23,9 @@ Optional properties:
>  - phy-handle : phandle to the PHY device connected to this device.
>  - fixed-link : Assume a fixed link. See fixed-link.txt in the same directory.
>Use instead of phy-handle.
> +- phy-reset-after-clk-enable : If present then a phy reset and configuration
> +  will be performed everytime after the clocks are enabled. This is required
> +  for some PHYs to work properly.

Maybe this is not needed based on the discussion, but just to make 
sure. Since this is a property of the phy, it should be implied from the 
phy's compatible string.

Rob

[PATCH 2/2] openvswitch: Optimize operations for OvS flow_stats.

2017-07-10 Thread Tonghao Zhang

When flow_free() frees a flow, it calls cpumask_next() once for every
CPU in cpu_possible_mask (e.g. 128 by default). That costs noticeable
CPU time if flow_free() is called frequently, for example when all
packets go to userspace via upcall and OvS sends them back via netlink
to ovs_packet_cmd_execute(), which calls flow_free().

The test topology is shown below. VM01 sends TCP packets to VM02,
and OvS forwards the packets. During the test, we use perf to report
the system performance.

VM01 --- OvS-VM --- VM02

Without this patch, perf-top shows the following; flow_free()
accounts for 3.02% of CPU usage.

4.23%  [kernel][k] _raw_spin_unlock_irqrestore
3.62%  [kernel][k] __do_softirq
3.16%  [kernel][k] __memcpy
3.02%  [kernel][k] flow_free
2.42%  libc-2.17.so[.] __memcpy_ssse3_back
2.18%  [kernel][k] copy_user_generic_unrolled
2.17%  [kernel][k] find_next_bit

With this patch applied, perf-top shows the following; flow_free()
is no longer on the list.

4.11%  [kernel][k] _raw_spin_unlock_irqrestore
3.79%  [kernel][k] __do_softirq
3.46%  [kernel][k] __memcpy
2.73%  libc-2.17.so[.] __memcpy_ssse3_back
2.25%  [kernel][k] copy_user_generic_unrolled
1.89%  libc-2.17.so[.] _int_malloc
1.53%  ovs-vswitchd[.] xlate_actions

With this patch, the TCP throughput between VMs (without the Megaflow
Cache and Microflow Cache) goes from 1.18 Gb/s up to 1.30 Gb/s
(roughly a 10% performance improvement).

This patch adds a cpumask struct: cpu_used_mask stores the ids of the
CPUs that the flow has used. We then only check the flow_stats on the
CPUs actually used, since it is unnecessary to check all possible CPUs
when getting, clearing, and updating the flow_stats. Adding
cpu_used_mask to the sw_flow struct doesn't increase the number of
cachelines.
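
The idea can be illustrated outside the kernel with a small userspace
model. This is only a sketch under stated assumptions: the model_* names,
the 64-CPU limit and the plain uint64_t bitmask are hypothetical stand-ins
for struct cpumask and the per-CPU flow_stats pointers, and the bit is set
on every update here rather than only when a new stats block is allocated.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_NR_CPUS 64u	/* hypothetical limit so one uint64_t is enough */

/* Hypothetical stand-in for sw_flow: one counter per CPU plus a used mask. */
struct model_flow {
	uint64_t cpu_used_mask;			/* bit n set: CPU n touched this flow */
	uint64_t packet_count[MODEL_NR_CPUS];
};

static void model_flow_init(struct model_flow *flow)
{
	memset(flow, 0, sizeof(*flow));
	flow->cpu_used_mask = UINT64_C(1);	/* CPU 0 is always considered */
}

static void model_stats_update(struct model_flow *flow, unsigned int cpu)
{
	flow->packet_count[cpu]++;
	flow->cpu_used_mask |= UINT64_C(1) << cpu;
}

/*
 * Sum the counters of the CPUs marked in the mask only. *visited reports
 * how many per-CPU slots were actually read. The kernel would step with
 * cpumask_next(); this model simply tests each bit in turn.
 */
static uint64_t model_stats_get(const struct model_flow *flow,
				unsigned int *visited)
{
	uint64_t total = 0;
	unsigned int cpu;

	*visited = 0;
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!(flow->cpu_used_mask & (UINT64_C(1) << cpu)))
			continue;
		total += flow->packet_count[cpu];
		(*visited)++;
	}
	return total;
}
```

In this model, a flow updated only on CPUs 3 and 40 has three per-CPU
slots read (0, 3 and 40) instead of all 64.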

Signed-off-by: Tonghao Zhang 
---
 net/openvswitch/flow.c   | 7 ---
 net/openvswitch/flow.h   | 2 ++
 net/openvswitch/flow_table.c | 5 -
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index 89aeb32..cfb652a 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -72,7 +72,7 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 tcp_flags,
   const struct sk_buff *skb)
 {
struct flow_stats *stats;
-   int cpu = smp_processor_id();
+   unsigned int cpu = smp_processor_id();
int len = skb->len + (skb_vlan_tag_present(skb) ? VLAN_HLEN : 0);
 
stats = rcu_dereference(flow->stats[cpu]);
@@ -117,6 +117,7 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 tcp_flags,
 
rcu_assign_pointer(flow->stats[cpu],
   new_stats);
+   cpumask_set_cpu(cpu, &flow->cpu_used_mask);
goto unlock;
}
}
@@ -144,7 +145,7 @@ void ovs_flow_stats_get(const struct sw_flow *flow,
memset(ovs_stats, 0, sizeof(*ovs_stats));
 
/* We open code this to make sure cpu 0 is always considered */
-   for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, cpu_possible_mask)) {
+   for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, &flow->cpu_used_mask)) {
struct flow_stats *stats = rcu_dereference_ovsl(flow->stats[cpu]);
 
if (stats) {
@@ -168,7 +169,7 @@ void ovs_flow_stats_clear(struct sw_flow *flow)
int cpu;
 
/* We open code this to make sure cpu 0 is always considered */
-   for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, cpu_possible_mask)) {
+   for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, &flow->cpu_used_mask)) {
struct flow_stats *stats = ovsl_dereference(flow->stats[cpu]);
 
if (stats) {
diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
index a9bc1c8..1875bba 100644
--- a/net/openvswitch/flow.h
+++ b/net/openvswitch/flow.h
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -219,6 +220,7 @@ struct sw_flow {
 */
struct sw_flow_key key;
struct sw_flow_id id;
+   struct cpumask cpu_used_mask;
struct sw_flow_mask *mask;
struct sw_flow_actions __rcu *sf_acts;
struct flow_stats __rcu *stats[]; /* One for each CPU.  First one
diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index ea7a807..324a246 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -98,6 +98,9 @@ struct sw_flow *ovs_flow_alloc(void)
 
RCU_INIT_POINTER(flow->stats[0], stats);
 
+   cpumask_clear(&flow->cpu_used_mask);
+   cpumask_set_cpu(0, &flo

[PATCH 1/2] openvswitch: Optimize updating for OvS flow_stats.

2017-07-10 Thread Tonghao Zhang

In ovs_flow_stats_update(), the node variable is only used when
allocating a flow_stats struct. Since that is not the common case,
it is unnecessary to call numa_node_id() every time. This patch is
not a bugfix, but there may be a small performance increase.

Signed-off-by: Tonghao Zhang 
---
 net/openvswitch/flow.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index 3f76cb7..89aeb32 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -72,7 +72,6 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 tcp_flags,
   const struct sk_buff *skb)
 {
struct flow_stats *stats;
-   int node = numa_node_id();
int cpu = smp_processor_id();
int len = skb->len + (skb_vlan_tag_present(skb) ? VLAN_HLEN : 0);
 
@@ -108,7 +107,7 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 tcp_flags,
  __GFP_THISNODE |
  __GFP_NOWARN |
  __GFP_NOMEMALLOC,
- node);
+ numa_node_id());
if (likely(new_stats)) {
new_stats->used = jiffies;
new_stats->packet_count = 1;
-- 
1.8.3.1

Re: [PATCH] netns: avoid directory traversal (was: ip netns: Make sure netns name is sane)

2017-07-10 Thread Phil Sutter

Hi Matteo,

On Mon, Jul 10, 2017 at 02:08:31PM +0200, Matteo Croce wrote:
> I noticed that your patch still leaves an uncovered scenario, the one where
> the namespace name is "." or "..".
> Calling 'ip netns del ..' will remove /var/run which is a symlink to /run on
> most systems causing some daemons, eg. dbus, to fail.

Oh, indeed. My patch even checks the name for 'ip netns add' only!

> ip netns doesn't validate input, allowing creation and deletion of files
> relative to /var/run/netns.
> This patch denies creation or deletion of namespaces with names containing
> "/" or that match exactly "." or "..".

You might want to have a look at --scissors option to git-am for a more
elegant way of sending a reply with attached patch.

[...]
>  int do_netns(int argc, char **argv)
>  {
>   netns_nsid_socket_init();
> @@ -775,6 +780,11 @@ int do_netns(int argc, char **argv)
>   return netns_list(0, NULL);
>   }
>  
> + if (argc > 1 && invalid_name(argv[1])) {
> + fprintf(stderr, "Invalid netns name \"%s\"\n", argv[1]);
> + exit(-1);
> + }

Maybe worth noting, this assumes argv[1] will be used for the netns name
which doesn't hold for 'ip netns identify' command. It doesn't matter
though, since netns_identify() enforces the parameter to be either
"self" or a decimal number. Yet, in future a new subcommand might be
added which requires this check to be refactored.

Thanks, Phil

Re: [PATCH net] mdio: mux: fix parsing mux registers outside of the PHY address range

2017-07-10 Thread Andrew Lunn

On Mon, Jul 10, 2017 at 02:35:23PM +0200, Martin Blumenstingl wrote:
> mdio_mux_init parses the child nodes of the MDIO mux. When using
> "mdio-mux-mmioreg" the child nodes are describing the register value
> that is written to switch between the MDIO busses.
> 
> The change which makes the error messages more verbose changed the
> parsing of the "reg" property from a simple of_property_read_u32 call
> to of_mdio_parse_addr. On a Khadas VIM (based on the Meson GXL SoC,
> which uses mdio-mux-mmioreg) this prevents registering the MDIO mux
> (because the "reg" values on the MDIO mux child nodes are 0x2009087f
> and 0xe40908ff) and leads to the following errors:
>   mdio-mux-mmioreg c883455c.eth-phy-mux: 
> /soc/periphs@c8834000/eth-phy-mux/mdio@e40908ff PHY address -469169921 is too 
> large
>   mdio-mux-mmioreg c883455c.eth-phy-mux: Error: Failed to find reg for child 
> /soc/periphs@c8834000/eth-phy-mux/mdio@e40908ff
>   mdio-mux-mmioreg c883455c.eth-phy-mux: 
> /soc/periphs@c8834000/eth-phy-mux/mdio@2009087f PHY address 537462911 is too 
> large
>   mdio-mux-mmioreg c883455c.eth-phy-mux: Error: Failed to find reg for child 
> /soc/periphs@c8834000/eth-phy-mux/mdio@2009087f
>   mdio-mux-mmioreg c883455c.eth-phy-mux: Error: No acceptable child buses 
> found
>   mdio-mux-mmioreg c883455c.eth-phy-mux: failed to register mdio-mux bus 
> /soc/periphs@c8834000/eth-phy-mux
> (as a result of that ethernet is not working, because the PHY which is
> connected through the mux' child MDIO bus, which is not being
> registered).
> 
> Fix this by reverting the change from of_property_read_u32 to
> of_mdio_parse_addr.

Reviewed-by: Andrew Lunn 

Yes, validating the reg property needs to be done separately in each
user of the generic mdio-mux code. The reg for the gpio mux must be <=
number of gpios, mmioreg must be somewhere within the address space,
bcm-iproc < 1024?

Jon, please feel free to add such code.

Andrew

Re: [PATCH net] mdio: mux: fix parsing mux registers outside of the PHY address range

2017-07-10 Thread Neil Armstrong

On 07/10/2017 02:35 PM, Martin Blumenstingl wrote:
> mdio_mux_init parses the child nodes of the MDIO mux. When using
> "mdio-mux-mmioreg" the child nodes are describing the register value
> that is written to switch between the MDIO busses.
> 
> The change which makes the error messages more verbose changed the
> parsing of the "reg" property from a simple of_property_read_u32 call
> to of_mdio_parse_addr. On a Khadas VIM (based on the Meson GXL SoC,
> which uses mdio-mux-mmioreg) this prevents registering the MDIO mux
> (because the "reg" values on the MDIO mux child nodes are 0x2009087f
> and 0xe40908ff) and leads to the following errors:
>   mdio-mux-mmioreg c883455c.eth-phy-mux: 
> /soc/periphs@c8834000/eth-phy-mux/mdio@e40908ff PHY address -469169921 is too 
> large
>   mdio-mux-mmioreg c883455c.eth-phy-mux: Error: Failed to find reg for child 
> /soc/periphs@c8834000/eth-phy-mux/mdio@e40908ff
>   mdio-mux-mmioreg c883455c.eth-phy-mux: 
> /soc/periphs@c8834000/eth-phy-mux/mdio@2009087f PHY address 537462911 is too 
> large
>   mdio-mux-mmioreg c883455c.eth-phy-mux: Error: Failed to find reg for child 
> /soc/periphs@c8834000/eth-phy-mux/mdio@2009087f
>   mdio-mux-mmioreg c883455c.eth-phy-mux: Error: No acceptable child buses 
> found
>   mdio-mux-mmioreg c883455c.eth-phy-mux: failed to register mdio-mux bus 
> /soc/periphs@c8834000/eth-phy-mux
> (as a result of that ethernet is not working, because the PHY which is
> connected through the mux' child MDIO bus, which is not being
> registered).
> 
> Fix this by reverting the change from of_property_read_u32 to
> of_mdio_parse_addr.
> 
> Fixes: 342fa1964439 ("mdio: mux: make child bus walking more permissive and errors more verbose")
> Signed-off-by: Martin Blumenstingl 
> ---
>  drivers/net/phy/mdio-mux.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/phy/mdio-mux.c b/drivers/net/phy/mdio-mux.c
> index 00755b6a42cf..c608e1dfaf09 100644
> --- a/drivers/net/phy/mdio-mux.c
> +++ b/drivers/net/phy/mdio-mux.c
> @@ -135,8 +135,8 @@ int mdio_mux_init(struct device *dev,
>   for_each_available_child_of_node(dev->of_node, child_bus_node) {
>   int v;
>  
> - v = of_mdio_parse_addr(dev, child_bus_node);
> - if (v < 0) {
> + r = of_property_read_u32(child_bus_node, "reg", &v);
> + if (r) {
>   dev_err(dev,
>   "Error: Failed to find reg for child %s\n",
>   of_node_full_name(child_bus_node));
> 

I was going to push this, thanks martin !!

Acked-by: Neil Armstrong 

Neil

[PATCH net] mdio: mux: fix parsing mux registers outside of the PHY address range

2017-07-10 Thread Martin Blumenstingl

mdio_mux_init parses the child nodes of the MDIO mux. When using
"mdio-mux-mmioreg" the child nodes are describing the register value
that is written to switch between the MDIO busses.

The change which makes the error messages more verbose changed the
parsing of the "reg" property from a simple of_property_read_u32 call
to of_mdio_parse_addr. On a Khadas VIM (based on the Meson GXL SoC,
which uses mdio-mux-mmioreg) this prevents registering the MDIO mux
(because the "reg" values on the MDIO mux child nodes are 0x2009087f
and 0xe40908ff) and leads to the following errors:
  mdio-mux-mmioreg c883455c.eth-phy-mux: /soc/periphs@c8834000/eth-phy-mux/mdio@e40908ff PHY address -469169921 is too large
  mdio-mux-mmioreg c883455c.eth-phy-mux: Error: Failed to find reg for child /soc/periphs@c8834000/eth-phy-mux/mdio@e40908ff
  mdio-mux-mmioreg c883455c.eth-phy-mux: /soc/periphs@c8834000/eth-phy-mux/mdio@2009087f PHY address 537462911 is too large
  mdio-mux-mmioreg c883455c.eth-phy-mux: Error: Failed to find reg for child /soc/periphs@c8834000/eth-phy-mux/mdio@2009087f
  mdio-mux-mmioreg c883455c.eth-phy-mux: Error: No acceptable child buses found
  mdio-mux-mmioreg c883455c.eth-phy-mux: failed to register mdio-mux bus /soc/periphs@c8834000/eth-phy-mux
(as a result, ethernet is not working, because the PHY is connected
through the mux's child MDIO bus, which is not being registered).
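
To see why these particular "reg" values trip the PHY address check, here
is a rough userspace model. It is a sketch only: the model_* names are
hypothetical, MODEL_PHY_MAX_ADDR assumes the usual limit of 32 PHY
addresses per MDIO bus, and the real check lives in of_mdio_parse_addr().

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_PHY_MAX_ADDR 32	/* assumed: PHY addresses 0..31 are valid */

/*
 * Hypothetical model of a PHY-address style parser: any "reg" value
 * outside the PHY address range is rejected.
 */
static int model_parse_phy_addr(uint32_t reg)
{
	if (reg >= MODEL_PHY_MAX_ADDR)
		return -EINVAL;
	return (int)reg;
}

/*
 * How a 32-bit register value looks when printed as a signed integer.
 * The out-of-range conversion is implementation-defined in C; on common
 * two's complement platforms it simply reinterprets the bits.
 */
static int32_t model_as_signed(uint32_t reg)
{
	return (int32_t)reg;
}
```

With this model, 0xe40908ff and 0x2009087f are both rejected as PHY
addresses, and viewed as signed values they correspond to the
-469169921 and 537462911 seen in the log, even though both are
perfectly valid mux register values.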

Fix this by reverting the change from of_property_read_u32 to
of_mdio_parse_addr.

Fixes: 342fa1964439 ("mdio: mux: make child bus walking more permissive and errors more verbose")
Signed-off-by: Martin Blumenstingl 
---
 drivers/net/phy/mdio-mux.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/mdio-mux.c b/drivers/net/phy/mdio-mux.c
index 00755b6a42cf..c608e1dfaf09 100644
--- a/drivers/net/phy/mdio-mux.c
+++ b/drivers/net/phy/mdio-mux.c
@@ -135,8 +135,8 @@ int mdio_mux_init(struct device *dev,
for_each_available_child_of_node(dev->of_node, child_bus_node) {
int v;
 
-   v = of_mdio_parse_addr(dev, child_bus_node);
-   if (v < 0) {
+   r = of_property_read_u32(child_bus_node, "reg", &v);
+   if (r) {
dev_err(dev,
"Error: Failed to find reg for child %s\n",
of_node_full_name(child_bus_node));
-- 
2.13.2

Re: [GIT] Networking

2017-07-10 Thread Sowmini Varadhan

On (07/10/17 18:05), Herbert Xu wrote:
> 
> Hmm, I can't see the problem in af_alg_accept.  The struct socket
> comes directly from sys_accept() which creates it using sock_alloc.
> 
> So the only thing I can think of is that the memory returned by
> sock_alloc is not zeroed and therefore the WARN_ON is just reading
> garbage.

Then it is odd that this WARN_ON is not triggered for other sockets
(e.g., for TCP sockets), though it happens easily with AF_ALG. 

But it's not sock_alloc() - that function is returning a properly
zeroed ->sk.

The reason that the WARN_ON is triggered is that af_alg_accept() calls
sock_init_data() which does 

   2636 if (sock) {
:
   2639 sock->sk=   sk;

So we can do one of the following:

1. drop the WARN_ON(), which makes true leaks hard to detect
2. change the WARN_ON() to WARN_ON(parent->sk && parent->sk != sk)

#2 assumes that all the refcount book-keeping is being done
correctly (there is the danger that we end up taking 2 refs on the sk) 

--Sowmini

[PATCH] netns: avoid directory traversal (was: ip netns: Make sure netns name is sane)

2017-07-10 Thread Matteo Croce

Hi Phil,

I noticed that your patch still leaves an uncovered scenario, the one where the
namespace name is "." or "..".
Calling 'ip netns del ..' will remove /var/run which is a symlink to /run on
most systems causing some daemons, eg. dbus, to fail.

ip netns doesn't validate input, allowing creation and deletion of files
relative to /var/run/netns.
This patch denies creation or deletion of namespaces with names containing
"/" or that match exactly "." or "..".
---
 ip/ipnetns.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/ip/ipnetns.c b/ip/ipnetns.c
index 0b0378a..4254994 100644
--- a/ip/ipnetns.c
+++ b/ip/ipnetns.c
@@ -766,6 +766,11 @@ static int netns_monitor(int argc, char **argv)
return 0;
 }
 
+static int invalid_name(const char *name)
+{
+   return strchr(name, '/') || !strcmp(name, ".") || !strcmp(name, "..");
+}
+
 int do_netns(int argc, char **argv)
 {
netns_nsid_socket_init();
@@ -775,6 +780,11 @@ int do_netns(int argc, char **argv)
return netns_list(0, NULL);
}
 
+   if (argc > 1 && invalid_name(argv[1])) {
+   fprintf(stderr, "Invalid netns name \"%s\"\n", argv[1]);
+   exit(-1);
+   }
+
if ((matches(*argv, "list") == 0) || (matches(*argv, "show") == 0) ||
(matches(*argv, "lst") == 0)) {
netns_map_init();
-- 
2.9.4

[PATCH] ioc3-eth: store pointer to net_device for private area

2017-07-10 Thread Jason A. Donenfeld

Computing the alignment manually for going from priv to pub is probably
not such a good idea, and in general the assumption that going from priv
to pub is possible trivially could change, so rather than relying on
that, we change things to just store a pointer to pub. This was suggested
by DaveM in [1].

[1] http://www.spinics.net/lists/netdev/msg443992.html
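
The difference between the two approaches can be sketched in userspace
as follows. Everything here is a hypothetical model: the model_* types,
the 32-byte alignment and the single allocation only mimic how the netdev
and the driver private area happen to be laid out today.

```c
#include <stddef.h>
#include <stdlib.h>

#define MODEL_ALIGN 32	/* assumed alignment of the private area */

/* Hypothetical stand-ins for struct net_device and struct ioc3_private. */
struct model_netdev {
	char name[16];
	unsigned char dev_addr[6];
};

struct model_priv {
	struct model_netdev *dev;	/* back pointer, as in this patch */
	int phy;
};

/* Offset of the private area behind the public struct, rounded up. */
static size_t model_priv_offset(void)
{
	return (sizeof(struct model_netdev) + MODEL_ALIGN - 1) &
	       ~(size_t)(MODEL_ALIGN - 1);
}

/* pub -> priv: the direction the allocator is meant to support. */
static struct model_priv *model_netdev_priv(struct model_netdev *dev)
{
	return (struct model_priv *)((char *)dev + model_priv_offset());
}

/*
 * Allocate pub and priv in one block and record the back pointer, so
 * that priv -> pub never has to redo the alignment arithmetic.
 */
static struct model_priv *model_probe(void)
{
	struct model_netdev *dev;
	struct model_priv *ip;

	dev = calloc(1, model_priv_offset() + sizeof(struct model_priv));
	if (!dev)
		return NULL;

	ip = model_netdev_priv(dev);
	ip->dev = dev;
	return ip;
}
```

Code holding only the private pointer then uses ip->dev, which keeps
working even if the layout behind model_netdev_priv() changes later.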

Signed-off-by: Jason A. Donenfeld 
---
Ralf - I don't have the platform to test this out, so you might want to
briefly put it through the paces before giving it your sign-off.

 drivers/net/ethernet/sgi/ioc3-eth.c | 14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/sgi/ioc3-eth.c b/drivers/net/ethernet/sgi/ioc3-eth.c
index b607936e1b3e..9c0488e0f08e 100644
--- a/drivers/net/ethernet/sgi/ioc3-eth.c
+++ b/drivers/net/ethernet/sgi/ioc3-eth.c
@@ -90,17 +90,13 @@ struct ioc3_private {
spinlock_t ioc3_lock;
struct mii_if_info mii;
 
+   struct net_device *dev;
struct pci_dev *pdev;
 
/* Members used by autonegotiation  */
struct timer_list ioc3_timer;
 };
 
-static inline struct net_device *priv_netdev(struct ioc3_private *dev)
-{
-   return (void *)dev - ((sizeof(struct net_device) + 31) & ~31);
-}
-
 static int ioc3_ioctl(struct net_device *dev, struct ifreq *rq, int cmd);
 static void ioc3_set_multicast_list(struct net_device *dev);
 static int ioc3_start_xmit(struct sk_buff *skb, struct net_device *dev);
@@ -427,7 +423,7 @@ static void ioc3_get_eaddr_nic(struct ioc3_private *ip)
nic[i] = nic_read_byte(ioc3);
 
for (i = 2; i < 8; i++)
-   priv_netdev(ip)->dev_addr[i - 2] = nic[i];
+   ip->dev->dev_addr[i - 2] = nic[i];
 }
 
 /*
@@ -439,7 +435,7 @@ static void ioc3_get_eaddr(struct ioc3_private *ip)
 {
ioc3_get_eaddr_nic(ip);
 
-   printk("Ethernet address is %pM.\n", priv_netdev(ip)->dev_addr);
+   printk("Ethernet address is %pM.\n", ip->dev->dev_addr);
 }
 
 static void __ioc3_set_mac_address(struct net_device *dev)
@@ -790,13 +786,12 @@ static void ioc3_timer(unsigned long data)
  */
 static int ioc3_mii_init(struct ioc3_private *ip)
 {
-   struct net_device *dev = priv_netdev(ip);
int i, found = 0, res = 0;
int ioc3_phy_workaround = 1;
u16 word;
 
for (i = 0; i < 32; i++) {
-   word = ioc3_mdio_read(dev, i, MII_PHYSID1);
+   word = ioc3_mdio_read(ip->dev, i, MII_PHYSID1);
 
if (word != 0x && word != 0x) {
found = 1;
@@ -1276,6 +1271,7 @@ static int ioc3_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
SET_NETDEV_DEV(dev, &pdev->dev);
 
ip = netdev_priv(dev);
+   ip->dev = dev;
 
dev->irq = pdev->irq;
 
-- 
2.13.2

Re: [PATCH 1/2] netdevice: add netdev_pub helper function

2017-07-10 Thread Jason A. Donenfeld

On Mon, Jul 10, 2017 at 10:04 AM, David Miller  wrote:
> I disagree.  Assuming one can go from the driver private to the netdev
> object trivially is a worse assumption than the other way around, and
> locks us into the current implementation of how the netdev and driver
> private memory is allocated.
>
> If you want to style your driver such that the private is passed
> around instead of the netdev, put a pointer back to the netdev object
> in your private data structure.

I'm surprised you're okay with the memory waste of that, but you bring
up the ability to change the interface later, which is a great point.
I'll submit a patch for that random driver, and I'll also refactor
WireGuard to do the same. Thanks for the guidance.

Jason

[iproute PATCH] ip netns: Make sure netns name is sane

2017-07-10 Thread Phil Sutter

In order to keep track of created netns, 'ip netns' creates a mount
point inside NETNS_RUN_DIR. Because the user-specified name was not
checked, it was possible to create that mount point outside of
NETNS_RUN_DIR and hence lose track of it afterwards:

| # ip netns add ../../tmp/foobar
| # ip netns list
| # mount | grep foobar
| nsfs on /tmp/foobar type nsfs (rw)

Prevent this by making sure basename() does not see a need to alter the
given netns name.

Fixes: 0dc34c7713bb7 ("iproute2: Add processless network namespace support")
Signed-off-by: Phil Sutter 
---
 ip/ipnetns.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/ip/ipnetns.c b/ip/ipnetns.c
index 0b0378ab6560c..4eee85e146b3d 100644
--- a/ip/ipnetns.c
+++ b/ip/ipnetns.c
@@ -595,6 +595,21 @@ static int create_netns_dir(void)
return 0;
 }
 
+static bool is_basename(const char *name)
+{
+   char *name_dup = strdup(name);
+   bool rc = true;
+
+   if (!name_dup)
+   return false;
+
+   if (strcmp(basename(name_dup), name))
+   rc = false;
+
+   free(name_dup);
+   return rc;
+}
+
 static int netns_add(int argc, char **argv)
 {
/* This function creates a new network namespace and
@@ -616,6 +631,11 @@ static int netns_add(int argc, char **argv)
}
name = argv[0];
 
+   if (!is_basename(name)) {
+   fprintf(stderr, "Invalid netns name: contains path components\n");
+   return -1;
+   }
+
snprintf(netns_path, sizeof(netns_path), "%s/%s", NETNS_RUN_DIR, name);
 
if (create_netns_dir())
-- 
2.13.1

Re: [PATCH 1/2 net] ptp: fix error codes in ptp_clock_register()

2017-07-10 Thread Richard Cochran

On Mon, Jul 10, 2017 at 01:29:20PM +0300, Dan Carpenter wrote:
> The "goto no_pps" was a bug you introduced.

I took a second look, and yes, the original commit should not have
returned NULL, and the original callers did not expect NULL either.

> But I feel like you're being rude, so I'm not going to resend these
> patches.  Please fix them yourself.

Sorry you feel that way.  Your first patch, although now more or less
cosmetic, would be okay.  The second patch is still wrong.

Thanks,
Richard

Re: [PATCH v6 0/3] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag

2017-07-10 Thread Ding Tianhong

Hi Casey:

On 2017/7/8 10:04, Casey Leedom wrote:
>   Okay, thanks for the note Alexander.  I'll have to look more closely at
> the patch on Monday and try it out on one of the targeted systems to verify
> the semantics you describe.
> 

All the modification does is clear the device's Device Control[Relaxed
Ordering Enable] bit when it determines that the platform should not
support RO; it does nothing to the RC configuration, so I don't think it
will break anything compared to your first version.
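
As a rough illustration of what clearing that bit means, here is a
userspace sketch operating on a copy of the Device Control register
value. The model_* names are hypothetical, and the mask assumes that
Enable Relaxed Ordering is bit 4 (0x0010) of Device Control; no real
config space access is performed.

```c
#include <stdint.h>

#define MODEL_DEVCTL_RELAX_EN 0x0010u	/* assumed: Enable Relaxed Ordering bit */

/* Return the Device Control value with Relaxed Ordering Enable cleared. */
static uint16_t model_clear_relaxed_ordering(uint16_t devctl)
{
	return (uint16_t)(devctl & ~MODEL_DEVCTL_RELAX_EN);
}

/* Non-zero if the device is currently allowed to set RO in its TLPs. */
static int model_relaxed_ordering_enabled(uint16_t devctl)
{
	return (devctl & MODEL_DEVCTL_RELAX_EN) != 0;
}
```

All other Device Control bits are left untouched; only the device's
permission to emit Relaxed Ordering TLPs changes.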

>   However, that said, there is no way to tell a priori where a device will
> send TLPs.  To simply assume that all TLPs will be directed towards the Root
> Complex is a big assumption.  Only the device and the code controlling it
> know where the TLPs will be directed.  That's why there are changes required
> in the cxgb4 driver.  For instance, the code in
> drivers/net/ethernet/chelsio/cxgb4/sge.c: t4_sge_alloc_rxq() knows that
> it's allocating Free List Buffers in Host Memory and that the RX Queues that
> it's allocating in the Hardware will eventually send Ingress Data to those
> Free List Buffers.  (And similarly for the Free List Buffer Pointer Queue
> with respect to DMA Reads from the host.)  In that routine we explicitly
> configure the Hardware to use/not-use the Relaxed Ordering Attribute via the
> FW_IQ_CMD_FL0FETCHRO and FW_IQ_CMD_FL0DATARO flags.  Basically we're
> conditionally setting them based on the desirability of sending Relaxed
> Ordering TLPs to the Root Complex.  (And we would perform the same kind of
> check for an nVME application ... which brings us to ...)
> 
>   And what would be the code using these patch APIs to set up a Peer-to-Peer
> nVME-style application?  In that case we'd need the Chelsio adapter's PCIe
> Capability Device Control[Relaxed Ordering Enable] set for the nVME
> application ... and we would avoid programming the Chelsio Hardware to use
> Relaxed Ordering for TLPs directed at the Root Complex.  Thus we would be in
> a position where some TLPs being emitted by the device to Peer devices would
> have Relaxed Ordering set and some directed at the Root Complex would not.
> And the only way for that to work is if the source device's Device
> Control[Relaxed Ordering Enable] is set ...
> 
>   Finally, setting aside my disagreements with the patch, we still have the
> code in the cxgb4 driver which explicitly turns on its own Device
> Control[Relaxed Ordering Enable] in cxgb4_main.c:
> enable_pcie_relaxed_ordering().  So the patch is something of a loop if all
> we're doing is testing our own Relaxed Ordering Enable state ...
>  
> Casey
> 
> .
>

[PATCH] net: chelsio: cxgb3: constify attribute_group structures.

2017-07-10 Thread Arvind Yadav

attribute_groups are not supposed to change at runtime. All functions
working with attribute_groups provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
  28720     985      12   29717    7415 drivers/net/.../cxgb3/cxgb3_main.o

File size After adding 'const':
   text    data     bss     dec     hex filename
  28848     857      12   29717    7415 drivers/net/.../cxgb3/cxgb3_main.o

Signed-off-by: Arvind Yadav 
---
 drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
index 0bc6a4f..6a01536 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
@@ -793,7 +793,9 @@ static DEVICE_ATTR(name, S_IRUGO | S_IWUSR, show_##name, store_method)
NULL
 };
 
-static struct attribute_group cxgb3_attr_group = {.attrs = cxgb3_attrs };
+static const struct attribute_group cxgb3_attr_group = {
+   .attrs = cxgb3_attrs,
+};
 
 static ssize_t tm_attr_show(struct device *d,
char *buf, int sched)
@@ -880,7 +882,9 @@ static DEVICE_ATTR(name, S_IRUGO | S_IWUSR, show_##name, store_##name)
NULL
 };
 
-static struct attribute_group offload_attr_group = {.attrs = offload_attrs };
+static const struct attribute_group offload_attr_group = {
+   .attrs = offload_attrs,
+};
 
 /*
  * Sends an sk_buff to an offload queue driver
-- 
1.9.1

Re: [PATCH 1/2 net] ptp: fix error codes in ptp_clock_register()

2017-07-10 Thread Dan Carpenter

On Mon, Jul 10, 2017 at 11:48:06AM +0200, Richard Cochran wrote:
> On Mon, Jul 10, 2017 at 12:38:16PM +0300, Dan Carpenter wrote:
> > There were two buggy commits so I chose the earlier one.  The other buggy
> 
> No, you are mistaken.  In the original patch, NULL or PTR_ERR were
> returned on error, and that was not a bug.
> 

The "goto no_pps" was a bug you introduced.

But I feel like you're being rude, so I'm not going to resend these
patches.  Please fix them yourself.

regards,
dan carpenter

RE: [PATCH] net/mlx5: IPSec, fix 64-bit division correctly

2017-07-10 Thread Ilan Tayari

> -Original Message-
> From: Arnd Bergmann [mailto:a...@arndb.de]
> Subject: [PATCH] net/mlx5: IPSec, fix 64-bit division correctly
> 
> The new IPSec offload code introduced a build error:
> 
> drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.o: In function
> `mlx5e_ipsec_build_inverse_table':
> ipsec_rxtx.c:(.text+0x556): undefined reference
> 
> Another patch was added on top to fix the build error, but
> that introduced a new bug, as we now use the remainder of
> the division rather than the result.
> 
> This makes it use the correct helper function instead.
> 
> Fixes: 5dfd87b67cd9 ("net/mlx5: IPSec, Fix 64-bit division on 32-bit
> builds")
> Fixes: 2ac9cfe78223 ("net/mlx5e: IPSec, Add Innova IPSec offload TX data
> path")
> Signed-off-by: Arnd Bergmann 

This is a good fix. Values are now generated correctly.
Sorry for the mixup.

Thank you Arnd!

Reviewed-by: Ilan Tayari

[PATCH] net: bonding: constify attribute_group structures.

2017-07-10 Thread Arvind Yadav

attribute_groups are not supposed to change at runtime. All functions
working with attribute_groups provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
   4512    1472       0    5984    1760 drivers/net/bonding/bond_sysfs.o

File size After adding 'const':
   text    data     bss     dec     hex filename
   4576    1408       0    5984    1760 drivers/net/bonding/bond_sysfs.o

Signed-off-by: Arvind Yadav 
---
 drivers/net/bonding/bond_sysfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 770623a..040b493 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -759,7 +759,7 @@ static DEVICE_ATTR(ad_user_port_key, S_IRUGO | S_IWUSR,
NULL,
 };
 
-static struct attribute_group bonding_group = {
+static const struct attribute_group bonding_group = {
.name = "bonding",
.attrs = per_bond_attrs,
 };
-- 
1.9.1
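
The size figures in the patch description show 64 bytes moving from data to
text (4512 to 4576, 1472 to 1408) while the total stays at 5984. That is
consistent with the structure landing in a read-only section, which size(1)
counts under text in its default output. A minimal userspace sketch of the
pattern follows; the sketch_* types and names are hypothetical stand-ins, not
the kernel's definitions from <linux/sysfs.h>.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct attribute and struct attribute_group. */
struct sketch_attribute {
    const char *name;
};

struct sketch_attribute_group {
    const char *name;
    struct sketch_attribute **attrs;   /* NULL-terminated array */
};

/*
 * The consumer only reads the group, so it takes a const pointer.
 * That is what allows the group object itself to be declared const.
 */
size_t sketch_count_attrs(const struct sketch_attribute_group *grp)
{
    size_t n = 0;

    assert(grp != NULL);
    while (grp->attrs && grp->attrs[n])
        n++;
    return n;
}

static struct sketch_attribute sketch_attr_alpha = { "alpha" };
static struct sketch_attribute sketch_attr_beta = { "beta" };

static struct sketch_attribute *sketch_attrs[] = {
    &sketch_attr_alpha,
    &sketch_attr_beta,
    NULL,
};

/* const: the group is never modified after initialization. */
const struct sketch_attribute_group sketch_group = {
    .name = "sketch",
    .attrs = sketch_attrs,
};
```

Calling sketch_count_attrs(&sketch_group) walks the two entries; an attempt to
assign to sketch_group.name afterwards would be rejected by the compiler.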

RE: [PATCH iproute2] ip: change flag names to an array

2017-07-10 Thread David Laight

From:  Stephen Hemminger
> Sent: 07 July 2017 16:40
> For the most of the address flags, use a table of bit values rather
> than open coding every value.  This allows for easier inevitable
> expansion of flags.
> 
> This also fixes the missing stable-privacy flag.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  ip/ipaddress.c | 152 
> -
>  1 file changed, 65 insertions(+), 87 deletions(-)
> 
> diff --git a/ip/ipaddress.c b/ip/ipaddress.c
> index f06f5829fb61..b4efb9fedcd2 100644
> --- a/ip/ipaddress.c
> +++ b/ip/ipaddress.c
> @@ -1013,14 +1013,67 @@ static unsigned int get_ifa_flags(struct ifaddrmsg 
> *ifa,
>   ifa->ifa_flags;
>  }
> 
> +
> +static const char *ifa_flag_names[] = {
> + "secondary",
> + "nodad",
> + "optimistic",
> + "dadfailed",
> + "home",
> + "deprecated",
> + "tentative",
> + "permanent",
> + "mngtmpaddr",
> + "noprefixroute",
> + "autojoin",
> + "stable-privacy",
> +};

It would be safer to set up a table of the constant - string pairs
instead of relying on the table being in the right order.

David
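
The suggestion above could look roughly like the following sketch, which pairs
each bit with its name so the table no longer depends on ordering. The
SKETCH_IFA_F_* constants are local stand-ins assumed to mirror the IFA_F_*
values in <linux/if_addr.h>, and sketch_format_flags() is a hypothetical
helper, not code from iproute2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Local stand-ins; values assumed to match IFA_F_* in <linux/if_addr.h>. */
#define SKETCH_IFA_F_SECONDARY      0x001u
#define SKETCH_IFA_F_NODAD          0x002u
#define SKETCH_IFA_F_OPTIMISTIC     0x004u
#define SKETCH_IFA_F_DADFAILED      0x008u
#define SKETCH_IFA_F_HOMEADDRESS    0x010u
#define SKETCH_IFA_F_DEPRECATED     0x020u
#define SKETCH_IFA_F_TENTATIVE      0x040u
#define SKETCH_IFA_F_PERMANENT      0x080u
#define SKETCH_IFA_F_MANAGETEMPADDR 0x100u
#define SKETCH_IFA_F_NOPREFIXROUTE  0x200u
#define SKETCH_IFA_F_MCAUTOJOIN     0x400u
#define SKETCH_IFA_F_STABLE_PRIVACY 0x800u

struct sketch_flag_name {
    unsigned int mask;
    const char *name;
};

/* Each row carries its own bit, so rows can be reordered freely. */
static const struct sketch_flag_name sketch_ifa_flags[] = {
    { SKETCH_IFA_F_SECONDARY,      "secondary" },
    { SKETCH_IFA_F_NODAD,          "nodad" },
    { SKETCH_IFA_F_OPTIMISTIC,     "optimistic" },
    { SKETCH_IFA_F_DADFAILED,      "dadfailed" },
    { SKETCH_IFA_F_HOMEADDRESS,    "home" },
    { SKETCH_IFA_F_DEPRECATED,     "deprecated" },
    { SKETCH_IFA_F_TENTATIVE,      "tentative" },
    { SKETCH_IFA_F_PERMANENT,      "permanent" },
    { SKETCH_IFA_F_MANAGETEMPADDR, "mngtmpaddr" },
    { SKETCH_IFA_F_NOPREFIXROUTE,  "noprefixroute" },
    { SKETCH_IFA_F_MCAUTOJOIN,     "autojoin" },
    { SKETCH_IFA_F_STABLE_PRIVACY, "stable-privacy" },
};

/*
 * Write the space-separated names of all known set flags into buf.
 * Returns the bits that were not written (unknown, or cut off by len).
 */
unsigned int sketch_format_flags(unsigned int flags, char *buf, size_t len)
{
    size_t used = 0;
    size_t i;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (i = 0; i < sizeof(sketch_ifa_flags) / sizeof(sketch_ifa_flags[0]); i++) {
        const struct sketch_flag_name *f = &sketch_ifa_flags[i];
        int n;

        if (!(flags & f->mask))
            continue;
        n = snprintf(buf + used, len - used, "%s%s", used ? " " : "", f->name);
        if (n < 0 || (size_t)n >= len - used) {
            buf[used] = '\0';   /* drop the partially written name */
            break;
        }
        used += (size_t)n;
        flags &= ~f->mask;
    }
    return flags;
}
```

Returning the leftover bits lets a caller print them in hex, so a flag added
to the kernel later is still visible even before the table learns its name.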

[PATCH] arcnet: com20020-pci: constify attribute_group structures.

2017-07-10 Thread Arvind Yadav

attribute_groups are not supposed to change at runtime. All functions
working with attribute_groups provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
   3409     948      28    4385    1121 drivers/net/arcnet/com20020-pci.o

File size After adding 'const':
   text    data     bss     dec     hex filename
   3473     884      28    4385    1121 drivers/net/arcnet/com20020-pci.o

Signed-off-by: Arvind Yadav 
---
 drivers/net/arcnet/com20020-pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/arcnet/com20020-pci.c 
b/drivers/net/arcnet/com20020-pci.c
index 01cab95..eb7f767 100644
--- a/drivers/net/arcnet/com20020-pci.c
+++ b/drivers/net/arcnet/com20020-pci.c
@@ -109,7 +109,7 @@ static ssize_t backplane_mode_show(struct device *dev,
NULL,
 };
 
-static struct attribute_group com20020_state_group = {
+static const struct attribute_group com20020_state_group = {
.name = NULL,
.attrs = com20020_state_attrs,
 };
-- 
1.9.1

Re: [GIT] Networking

2017-07-10 Thread Herbert Xu

On Sun, Jul 09, 2017 at 09:40:43PM +0100, David Miller wrote:
>
> > It looks like PF_ALG sets up a ->sk in alg_create() (but this
> > would get over-written in alg_accept()?) 

No it does not.  The struct socket comes from sys_accept() and
AFAICS it doesn't carry a struct sock with it.

> > Cc'ing Herbert to see if this is expected behavior (and PF_ALG
> > somehow does the right thing with the refcount for the ->sk
> > set up in alg_create) in which case I suppose we should drop the 
> > WARN_ON. 
> 
> I think we've found yet another socket leak, this time in
> AF_ALG.

Hmm, I can't see the problem in af_alg_accept.  The struct socket
comes directly from sys_accept() which creates it using sock_alloc.

So the only thing I can think of is that the memory returned by
sock_alloc is not zeroed and therefore the WARN_ON is just reading
garbage.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

RE: [PATCH] net/mlx5: IPSec, fix 64-bit division correctly

2017-07-10 Thread Ilan Tayari

> -----Original Message-----
> From: Arnd Bergmann [mailto:a...@arndb.de]
> Subject: [PATCH] net/mlx5: IPSec, fix 64-bit division correctly
> 
> The new IPSec offload code introduced a build error:
> 
> drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.o: In function
> `mlx5e_ipsec_build_inverse_table':
> ipsec_rxtx.c:(.text+0x556): undefined reference
> 
> Another patch was added on top to fix the build error, but
> that introduced a new bug, as we now use the remainder of
> the division rather than the result.
> 
> This makes it use the correct helper function instead.
> 
> Fixes: 5dfd87b67cd9 ("net/mlx5: IPSec, Fix 64-bit division on 32-bit
> builds")
> Fixes: 2ac9cfe78223 ("net/mlx5e: IPSec, Add Innova IPSec offload TX data
> path")
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
> b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
> index 7d06c673851a..4614ddfa91eb 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
> @@ -363,7 +363,6 @@ void mlx5e_ipsec_build_inverse_table(void)
>  {
>   u16 mss_inv;
>   u32 mss;
> - u64 n;
> 
>   /* Calculate 1/x inverse table for use in GSO data path.
>* Using this table, we provide the IPSec accelerator with the value
> of
> @@ -373,8 +372,7 @@ void mlx5e_ipsec_build_inverse_table(void)
>*/
>   mlx5e_ipsec_inverse_table[1] = htons(0x);
>   for (mss = 2; mss < MAX_LSO_MSS; mss++) {
> - n = 1ULL << 32;
> - mss_inv = do_div(n, mss) >> 16;
> + mss_inv = div_u64(1ULL << 32, mss) >> 16;

Thanks Arnd, 

:o
I'm surprised... let me check this...

>   mlx5e_ipsec_inverse_table[mss] = htons(mss_inv);
>   }
>  }
> --
> 2.9.0

Re: [PATCH 2/2 net] cxgb4: ptp_clock_register() returns error pointers

2017-07-10 Thread Richard Cochran

On Mon, Jul 10, 2017 at 10:16:15AM +0300, Dan Carpenter wrote:
> We're checking ptp_clock_register() for NULL but we should be checking
> for error pointers.

No.

> diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ptp.c 
> b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ptp.c
> index 50517cfd9671..c24313a103c6 100644
> --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ptp.c
> +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ptp.c
> @@ -441,7 +441,7 @@ void cxgb4_ptp_init(struct adapter *adapter)
>  
>   adapter->ptp_clock = ptp_clock_register(&adapter->ptp_clock_info,
>   &adapter->pdev->dev);
> - if (!adapter->ptp_clock) {

Yeah, that is wrong, but the fix is to check for IS_ERR or NULL.

Thanks,
Richard

[PATCH] wireless: iwlegacy: Constify attribute_group structures.

2017-07-10 Thread Arvind Yadav

attribute_groups are not supposed to change at runtime. All functions
working with attribute_groups provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav 
---
 drivers/net/wireless/intel/iwlegacy/4965-mac.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/intel/iwlegacy/4965-mac.c 
b/drivers/net/wireless/intel/iwlegacy/4965-mac.c
index 5b51fba..de9b652 100644
--- a/drivers/net/wireless/intel/iwlegacy/4965-mac.c
+++ b/drivers/net/wireless/intel/iwlegacy/4965-mac.c
@@ -4654,7 +4654,7 @@ static DEVICE_ATTR(tx_power, S_IWUSR | S_IRUGO, 
il4965_show_tx_power,
NULL
 };
 
-static struct attribute_group il_attribute_group = {
+static const struct attribute_group il_attribute_group = {
.name = NULL,   /* put in device directory */
.attrs = il_sysfs_entries,
 };
-- 
1.9.1

Re: [PATCH v2] net: axienet: add support for standard phy-mode binding

2017-07-10 Thread Alvaro Gamez Machado

On Fri, Jul 07, 2017 at 10:16:31AM -0700, Florian Fainelli wrote:
> On 07/06/2017 11:50 PM, Alvaro Gamez Machado wrote:
> > Keep supporting proprietary "xlnx,phy-type" attribute and add support for
> > MII connectivity to the PHY.
> > 
> > Signed-off-by: Alvaro Gamez Machado 
> > ---
> > diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h 
> > b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> > index af27f7d1cbf3..48e939a5fa31 100644
> > --- a/drivers/net/ethernet/xilinx/xilinx_axienet.h
> > +++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> > @@ -389,7 +389,7 @@ struct axidma_bd {
> >   * @dma_err_tasklet: Tasklet structure to process Axi DMA errors
> >   * @tx_irq:Axidma TX IRQ number
> >   * @rx_irq:Axidma RX IRQ number
> > - * @phy_type:  Phy type to identify between MII/GMII/RGMII/SGMII/1000 
> > Base-X
> > + * @phy_mode:  Phy type to identify between MII/GMII/RGMII/SGMII/1000 
> > Base-X
> >   * @options:   AxiEthernet option word
> >   * @last_link: Phy link state in which the PHY was negotiated earlier
> >   * @features:  Stores the extended features supported by the axienet hw
> > @@ -432,7 +432,7 @@ struct axienet_local {
> >  
> > int tx_irq;
> > int rx_irq;
> > -   u32 phy_type;
> > +   u32 phy_mode;
> 
> Can you use phy_interface_t here for storing this value?

Sure!

> 
> > @@ -1539,7 +1532,38 @@ static int axienet_probe(struct platform_device 
> > *pdev)
> >  * the device-tree and accordingly set flags.
> >  */
> > of_property_read_u32(pdev->dev.of_node, "xlnx,rxmem", &lp->rxmem);
> > -   of_property_read_u32(pdev->dev.of_node, "xlnx,phy-type", &lp->phy_type);
> > +
> > +   /* Start with the proprietary, and broken phy_type */
> > +   ret = of_property_read_u32(pdev->dev.of_node, "xlnx,phy-type", &value);
> > +   if (!ret) {
> > +   netdev_warn(ndev, "Please upgrade your device tree binary blob 
> > to use phy-mode");
> > +   switch (value) {
> > +   case XAE_PHY_TYPE_MII:
> > +   lp->phy_mode = PHY_INTERFACE_MODE_MII;
> > +   break;
> > +   case XAE_PHY_TYPE_GMII:
> > +   lp->phy_mode = PHY_INTERFACE_MODE_GMII;
> > +   break;
> > +   case XAE_PHY_TYPE_RGMII_2_0:
> > +   lp->phy_mode = PHY_INTERFACE_MODE_RGMII;
> 
> This should be PHY_INTERFACE_MODE_RGMII_ID.

Yes, thank you, Andrew already noticed that. Sorry I didn't see it myself.

Since Andrew told me that netdev is closed for patches right now, I'll wait
until next week to send these changes.

Thanks to both of you for your assistance on this small patch!

-- 
Alvaro G. M.
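
The legacy-to-standard mapping discussed above, with the review comment about
RGMII applied, might be sketched in userspace as follows. Only the three cases
visible in the quoted hunk are mapped. The sketch_* enums and their numeric
values are placeholders invented for this illustration; they are not the
driver's XAE_PHY_TYPE_* definitions or the kernel's phy_interface_t.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder values for the legacy "xlnx,phy-type" cells (assumed). */
enum sketch_legacy_phy_type {
    SKETCH_PHY_TYPE_MII = 0,
    SKETCH_PHY_TYPE_GMII = 1,
    SKETCH_PHY_TYPE_RGMII_2_0 = 3,
};

/* Placeholder stand-in for phy_interface_t. */
enum sketch_phy_mode {
    SKETCH_MODE_NA,
    SKETCH_MODE_MII,
    SKETCH_MODE_GMII,
    SKETCH_MODE_RGMII_ID,
};

/*
 * Translate a legacy property value into the standard mode.
 * Returns 0 on success or -EINVAL for a value this sketch does not know.
 */
int sketch_legacy_to_phy_mode(unsigned int legacy, enum sketch_phy_mode *mode)
{
    assert(mode != NULL);

    switch (legacy) {
    case SKETCH_PHY_TYPE_MII:
        *mode = SKETCH_MODE_MII;
        return 0;
    case SKETCH_PHY_TYPE_GMII:
        *mode = SKETCH_MODE_GMII;
        return 0;
    case SKETCH_PHY_TYPE_RGMII_2_0:
        /* Per the review: the RGMII_ID variant, not plain RGMII. */
        *mode = SKETCH_MODE_RGMII_ID;
        return 0;
    default:
        *mode = SKETCH_MODE_NA;
        return -EINVAL;
    }
}
```

Storing the result in an enum rather than a u32 matches the request to use
phy_interface_t for the field.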

Re: [PATCH 1/2 net] ptp: fix error codes in ptp_clock_register()

2017-07-10 Thread Richard Cochran

On Mon, Jul 10, 2017 at 12:38:16PM +0300, Dan Carpenter wrote:
> There were two buggy commits so I chose the earlier one.  The other buggy

No, you are mistaken.  In the original patch, NULL or PTR_ERR were
returned on error, and that was not a bug.

If you want to correct the present version of ptp_clock_register() to
always return a valid pointer or PTR_ERR (like the kerneldoc says), be
my guest, but please say that in the change log and reference the
correct commit (namely the one related to disabling POSIX clocks.)

Thanks,
Richard

Re: [PATCH] brcmfmac: added LED triggers for transmit/receive

2017-07-10 Thread Rafał Miłecki

On 7 July 2017 at 16:09, Russell Joyce  wrote:
> Add three basic LED triggers to brcmfmac, based on those in mac80211: one
> for transmit, one for receive, and one for combined transmit/receive.
>
> Signed-off-by: Russell Joyce 

1) I think most of this should be shareable cfg80211 code.
2) This "rxtx" trigger, while surely present in other places, sounds like a
workaround for a limitation of the LED subsystem. Maybe it's time to finally
rework LED triggers.

[PATCH] wireless: iwlegacy: constify attribute_group structures.

2017-07-10 Thread Arvind Yadav

attribute_groups are not supposed to change at runtime. All functions
working with attribute_groups provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav 
---
 drivers/net/wireless/intel/iwlegacy/3945-mac.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/intel/iwlegacy/3945-mac.c 
b/drivers/net/wireless/intel/iwlegacy/3945-mac.c
index 38bf403..329f3a6 100644
--- a/drivers/net/wireless/intel/iwlegacy/3945-mac.c
+++ b/drivers/net/wireless/intel/iwlegacy/3945-mac.c
@@ -3464,7 +3464,7 @@ static DEVICE_ATTR(antenna, S_IWUSR | S_IRUGO, 
il3945_show_antenna,
NULL
 };
 
-static struct attribute_group il3945_attribute_group = {
+static const struct attribute_group il3945_attribute_group = {
.name = NULL,   /* put in device directory */
.attrs = il3945_sysfs_entries,
 };
-- 
1.9.1

[PATCH] wireless: ipw2100: constify attribute_group structures.

2017-07-10 Thread Arvind Yadav

attribute_groups are not supposed to change at runtime. All functions
working with attribute_groups provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav 
---
 drivers/net/wireless/intel/ipw2x00/ipw2100.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/intel/ipw2x00/ipw2100.c 
b/drivers/net/wireless/intel/ipw2x00/ipw2100.c
index aaaca4d..ccbe745 100644
--- a/drivers/net/wireless/intel/ipw2x00/ipw2100.c
+++ b/drivers/net/wireless/intel/ipw2x00/ipw2100.c
@@ -4324,7 +4324,7 @@ static ssize_t store_rf_kill(struct device *d, struct 
device_attribute *attr,
NULL,
 };
 
-static struct attribute_group ipw2100_attribute_group = {
+static const struct attribute_group ipw2100_attribute_group = {
.attrs = ipw2100_sysfs_entries,
 };
 
-- 
1.9.1

[PATCH] net/mlx5: IPSec, fix 64-bit division correctly

2017-07-10 Thread Arnd Bergmann

The new IPSec offload code introduced a build error:

drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.o: In function 
`mlx5e_ipsec_build_inverse_table':
ipsec_rxtx.c:(.text+0x556): undefined reference

Another patch was added on top to fix the build error, but
that introduced a new bug, as we now use the remainder of
the division rather than the result.

This makes it use the correct helper function instead.

Fixes: 5dfd87b67cd9 ("net/mlx5: IPSec, Fix 64-bit division on 32-bit builds")
Fixes: 2ac9cfe78223 ("net/mlx5e: IPSec, Add Innova IPSec offload TX data path")
Signed-off-by: Arnd Bergmann 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
index 7d06c673851a..4614ddfa91eb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
@@ -363,7 +363,6 @@ void mlx5e_ipsec_build_inverse_table(void)
 {
u16 mss_inv;
u32 mss;
-   u64 n;
 
/* Calculate 1/x inverse table for use in GSO data path.
 * Using this table, we provide the IPSec accelerator with the value of
@@ -373,8 +372,7 @@ void mlx5e_ipsec_build_inverse_table(void)
 */
mlx5e_ipsec_inverse_table[1] = htons(0x);
for (mss = 2; mss < MAX_LSO_MSS; mss++) {
-   n = 1ULL << 32;
-   mss_inv = do_div(n, mss) >> 16;
+   mss_inv = div_u64(1ULL << 32, mss) >> 16;
mlx5e_ipsec_inverse_table[mss] = htons(mss_inv);
}
 }
-- 
2.9.0

Re: [PATCH 1/2 net] ptp: fix error codes in ptp_clock_register()

2017-07-10 Thread Dan Carpenter

On Mon, Jul 10, 2017 at 11:21:03AM +0200, Richard Cochran wrote:
> On Mon, Jul 10, 2017 at 10:11:37AM +0300, Dan Carpenter wrote:
> > The ptp_clock_register() function returns NULL when it's #ifdefed out
> > because CONFIG_PTP_1588_CLOCK is disabled.  Otherwise, it's intended to
> > return error pointers.  Unfortunately, there are a couple paths where we
> > forget to set the error code.  It means that we could result in NULL
> > pointer dereferences in the callers.
> > 
> > Fixes: d94ba80ebbea ("ptp: Added a brand new class driver for ptp clocks.")
> 
> This "Fixes" tag references the wrong commit.  Please correct it.
> 

There were two buggy commits so I chose the earlier one.  The other buggy
commit is 85a66e550195 ("ptp: create "pins" together with the rest of
attributes").  I should have CC'd Dmitry as well.  I can resend.

regards,
dan carpenter

Re: [PATCH 1/2 net] ptp: fix error codes in ptp_clock_register()

2017-07-10 Thread Richard Cochran

On Mon, Jul 10, 2017 at 10:11:37AM +0300, Dan Carpenter wrote:
> The ptp_clock_register() function returns NULL when it's #ifdefed out
> because CONFIG_PTP_1588_CLOCK is disabled.  Otherwise, it's intended to
> return error pointers.  Unfortunately, there are a couple paths where we
> forget to set the error code.  It means that we could result in NULL
> pointer dereferences in the callers.

Actually, this description is bogus.  Callers will not dereference
NULL, because they are required to check the returned pointer:

/**
 * ptp_clock_register() - register a PTP hardware clock driver
 *
 * @info:   Structure describing the new clock.
 * @parent: Pointer to the parent device of the new clock.
 *
 * Returns a valid pointer on success or PTR_ERR on failure.  If PHC
 * support is missing at the configuration level, this function
 * returns NULL, and drivers are expected to gracefully handle that
 * case separately.
 */

Thanks,
Richard
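
The three outcomes described in the kerneldoc above (valid pointer, error
pointer, NULL when PHC support is not configured) can be handled with a check
along these lines. This is a userspace sketch: the sketch_* helpers imitate
the idea behind the kernel's ERR_PTR(), IS_ERR() and PTR_ERR() and are not the
kernel macros, and sketch_check_clock() is a hypothetical caller-side helper.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
void *sketch_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* True when the pointer lies in the top SKETCH_MAX_ERRNO addresses. */
int sketch_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Recover the errno value from an error pointer. */
long sketch_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

enum sketch_clock_state {
    SKETCH_CLOCK_OK,           /* valid pointer, clock registered */
    SKETCH_CLOCK_UNSUPPORTED,  /* NULL: support not configured */
    SKETCH_CLOCK_FAILED,       /* error pointer: registration failed */
};

/* Classify a registration result; *err receives the errno on failure. */
enum sketch_clock_state sketch_check_clock(const void *clock, long *err)
{
    assert(err != NULL);
    *err = 0;

    if (!clock)
        return SKETCH_CLOCK_UNSUPPORTED;
    if (sketch_is_err(clock)) {
        *err = sketch_ptr_err(clock);
        return SKETCH_CLOCK_FAILED;
    }
    return SKETCH_CLOCK_OK;
}
```

Keeping NULL and error pointers as separate cases lets a driver continue
without a PTP clock when support is compiled out, and log the errno only when
registration was attempted and failed.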

Re: [PATCH v 1/2] ravb: Add support for r8a7743 SoC

2017-07-10 Thread Sergei Shtylyov


Hello!

On 7/10/2017 4:20 AM, Rob Herring wrote:

>> Add support for Gigabit Ethernet E-MAC on r8a7743 (RZ/G1M) SoC.
>> Renesas RZ/G1M (R8A7743) SoC Ethernet AVB IP is identical to the R-Car Gen2
>> family.
>
> For the subject: "dt-bindings: net: ..."
>
>> Signed-off-by: Biju Das
>> Reviewed-by: Chris Paterson
>> ---
>>   Documentation/devicetree/bindings/net/renesas,ravb.txt | 3 ++-
>>   drivers/net/ethernet/renesas/ravb_main.c   | 1 +
>>   2 files changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/devicetree/bindings/net/renesas,ravb.txt
>> b/Documentation/devicetree/bindings/net/renesas,ravb.txt
>> index b519503..bc692ab 100644
>> --- a/Documentation/devicetree/bindings/net/renesas,ravb.txt
>> +++ b/Documentation/devicetree/bindings/net/renesas,ravb.txt
>> @@ -4,7 +4,8 @@ This file provides information on what the device node for the
>> Ethernet AVB
>>   interface contains.
>>
>>   Required properties:
>> -- compatible: "renesas,etheravb-r8a7790" if the device is a part of R8A7790
>> SoC.
>> +- compatible: "renesas,etheravb-r8a7743" if the device is a part of R8A7743
>> SoC.
>> + "renesas,etheravb-r8a7790" if the device is a part of R8A7790 SoC.
>
> Please re-format like this:
>
> - compatible: Must be one of:
> ...
>
> So it's a one line change to add new compatibles.

   Note that the common gen2/3 values are at the end of this list, so they'll
need different treatment if you add these words.

MBR, Sergei

[PATCH] wireless: ipw2200: constify attribute_group structures.

2017-07-10 Thread Arvind Yadav

attribute_groups are not supposed to change at runtime. All functions
working with attribute_groups provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav 
---
 drivers/net/wireless/intel/ipw2x00/ipw2200.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/intel/ipw2x00/ipw2200.c 
b/drivers/net/wireless/intel/ipw2x00/ipw2200.c
index 9368abd..c311b1a 100644
--- a/drivers/net/wireless/intel/ipw2x00/ipw2200.c
+++ b/drivers/net/wireless/intel/ipw2x00/ipw2200.c
@@ -11500,7 +11500,7 @@ static int ipw_wdev_init(struct net_device *dev)
NULL
 };
 
-static struct attribute_group ipw_attribute_group = {
+static const struct attribute_group ipw_attribute_group = {
.name = NULL,   /* put in device directory */
.attrs = ipw_sysfs_entries,
 };
-- 
1.9.1

Re: [PATCH 1/2 net] ptp: fix error codes in ptp_clock_register()

2017-07-10 Thread Richard Cochran

On Mon, Jul 10, 2017 at 10:11:37AM +0300, Dan Carpenter wrote:
> The ptp_clock_register() function returns NULL when it's #ifdefed out
> because CONFIG_PTP_1588_CLOCK is disabled.  Otherwise, it's intended to
> return error pointers.  Unfortunately, there are a couple paths where we
> forget to set the error code.  It means that we could result in NULL
> pointer dereferences in the callers.
> 
> Fixes: d94ba80ebbea ("ptp: Added a brand new class driver for ptp clocks.")

This "Fixes" tag references the wrong commit.  Please correct it.

Thanks,
Richard
