Re: [PATCH net] xen-netback: make sure that hashes are not send to unaware frontends

2016-10-06 Thread David Miller
From: Paul Durrant 
Date: Thu, 6 Oct 2016 15:47:10 +0100

> In the case when a frontend only negotiates a single queue with xen-
> netback it is possible for a skbuff with a s/w hash to result in a
> hash extra_info segment being sent to the frontend even when no hash
> algorithm has been configured. (The ndo_select_queue() entry point makes
> sure the hash is not set if no algorithm is configured, but this entry
> point is not called when there is only a single queue). This can result
> in a frontend that is unable to handle extra_info segments being given
> such a segment, causing it to crash.
> 
> This patch fixes the problem by gating whether the extra_info is sent
> not only on the presence of a s/w hash, but also on whether the hash
> algorithm has been configured.
> 
> Signed-off-by: Paul Durrant 
> Cc: Wei Liu 

This doesn't apply cleanly to the current 'net' tree, please respin.

Thanks.
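
The gating condition described in the quoted commit message can be sketched in plain C. This is a minimal userspace model and not the netback code: the struct names, the fields, and should_send_hash_extra() are hypothetical stand-ins for the driver's real state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the backend and packet state. */
enum sk_hash_alg {
    SK_HASH_ALG_NONE = 0,   /* frontend never configured a hash algorithm */
    SK_HASH_ALG_SOME = 1    /* any configured algorithm */
};

struct sk_vif {
    enum sk_hash_alg alg;   /* algorithm negotiated with the frontend */
};

struct sk_pkt {
    bool sw_hash;           /* packet carries a software-computed hash */
    uint32_t hash;
};

/*
 * Emit a hash extra_info segment only when the packet has a s/w hash
 * AND the frontend has configured a hash algorithm. Checking the first
 * condition alone is the situation the commit message describes as
 * unsafe for unaware frontends.
 */
static bool should_send_hash_extra(const struct sk_vif *vif,
                                   const struct sk_pkt *pkt)
{
    return pkt->sw_hash && vif->alg != SK_HASH_ALG_NONE;
}
```

With this shape the single-queue case needs no special handling, because the check does not depend on the queue-selection entry point having run.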


Re: [PATCH][RESEND] dt-bindings: net: renesas-ravb: Add support for R8A7796 RAVB

2016-10-06 Thread Simon Horman
On Tue, Oct 04, 2016 at 07:45:46PM +0300, Laurent Pinchart wrote:
> Add a new compatible string for the R8A7796 (M3-W) RAVB.
> 
> Signed-off-by: Laurent Pinchart 
> Reviewed-by: Geert Uytterhoeven 

Acked-by: Simon Horman 

> ---
>  Documentation/devicetree/bindings/net/renesas,ravb.txt | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> The patch has been posted to the linux-renesas-soc mailing list only, adding
> the netdev mailing list to get it upstreamed.
> 
> diff --git a/Documentation/devicetree/bindings/net/renesas,ravb.txt 
> b/Documentation/devicetree/bindings/net/renesas,ravb.txt
> index c8ac222eac67..b519503be51a 100644
> --- a/Documentation/devicetree/bindings/net/renesas,ravb.txt
> +++ b/Documentation/devicetree/bindings/net/renesas,ravb.txt
> @@ -10,6 +10,7 @@ Required properties:
> "renesas,etheravb-r8a7793" if the device is a part of R8A7793 SoC.
> "renesas,etheravb-r8a7794" if the device is a part of R8A7794 SoC.
> "renesas,etheravb-r8a7795" if the device is a part of R8A7795 SoC.
> +   "renesas,etheravb-r8a7796" if the device is a part of R8A7796 SoC.
> "renesas,etheravb-rcar-gen2" for generic R-Car Gen 2 compatible 
> interface.
> "renesas,etheravb-rcar-gen3" for generic R-Car Gen 3 compatible 
> interface.
>  
> @@ -33,7 +34,7 @@ Optional properties:
>  - interrupt-parent: the phandle for the interrupt controller that services
>   interrupts for this device.
>  - interrupt-names: A list of interrupt names.
> -For the R8A7795 SoC this property is mandatory;
> +For the R8A779[56] SoCs this property is mandatory;
>  it should include one entry per channel, named "ch%u",
>  where %u is the channel number ranging from 0 to 24.
>  For other SoCs this property is optional; if present
> -- 
> Regards,
> 
> Laurent Pinchart
> 


Re: [PATCH] drivers: net: phy: Correct duplicate MDIO_XGENE entry

2016-10-06 Thread David Miller
From: Laura Abbott 
Date: Thu,  6 Oct 2016 11:22:51 -0700

> An extra entry for MDIO_XGENE got added during merging.
> Delete it.
> 
> Reviewed-by: Andrew Lunn 
> Signed-off-by: Laura Abbott 

Applied, thanks.


Re: [PATCH v2] ethernet: qualcomm: QCOM_EMAC should depend on HAS_DMA and HAS_IOMEM

2016-10-06 Thread David Miller
From: Geert Uytterhoeven 
Date: Thu,  6 Oct 2016 16:44:53 +0200

> If NO_DMA=y:
> 
> drivers/built-in.o: In function `emac_probe':
> emac.c:(.text+0x3780b8): undefined reference to `bad_dma_ops'
> emac.c:(.text+0x3780e2): undefined reference to `bad_dma_ops'
> emac.c:(.text+0x378112): undefined reference to `bad_dma_ops'
> emac.c:(.text+0x378146): undefined reference to `bad_dma_ops'
> emac.c:(.text+0x37816e): undefined reference to `bad_dma_ops'
> drivers/built-in.o:emac.c:(.text+0x37819a): more undefined references to 
> `bad_dma_ops' follow
> 
> If NO_IOMEM=y:
> 
> drivers/net/ethernet/qualcomm/emac/emac.c: In function ‘emac_remove’:
> drivers/net/ethernet/qualcomm/emac/emac.c:736:3: error: implicit 
> declaration of function ‘iounmap’ [-Werror=implicit-function-declaration]
>iounmap(adpt->phy.digital);
>^
> 
> Add dependencies on HAS_DMA and HAS_IOMEM to fix this.
> 
> Signed-off-by: Geert Uytterhoeven 

Applied.


Re: [PATCH net-next v3 0/3] net: ethernet: mediatek: check the hw lro capability by the chip id instead of the dtsi

2016-10-06 Thread David Miller
From: Nelson Chang 
Date: Thu, 6 Oct 2016 19:44:00 +0800

> This series checks whether hw lro is supported by using the chip id.
> 
> changes since v3:
> - Refine mtk_is_hwlro_supported() function
> 
> changes since v2:
> - Refine mtk_get_chip_id() function
> 
> changes since v1:
> - Because hw lro is supported starting from MT7623, the proper way to check
> if the feature is available is by the chip id instead of by the dtsi.

Series applied, thanks.


Re: [PATCH net-next 00/13] rxrpc: Fixes

2016-10-06 Thread David Miller
From: David Howells 
Date: Thu, 06 Oct 2016 11:03:56 +0100

> This set of patches contains a bunch of fixes:
 ...

Pulled, thanks David.


Re: [PATCH net] netlink: do not enter direct reclaim from netlink_dump()

2016-10-06 Thread David Miller
From: Eric Dumazet 
Date: Thu, 06 Oct 2016 04:13:18 +0900

> From: Eric Dumazet 
> 
> Since linux-3.15, netlink_dump() can use up to 16384 bytes skb
> allocations.
> 
> Due to struct skb_shared_info ~320 bytes overhead, we end up using
> order-3 (on x86) page allocations, that might trigger direct reclaim and
> add stress.
> 
> The intent was really to attempt a large allocation but immediately
> fallback to a smaller one (order-1 on x86) in case of memory stress.
> 
> On recent kernels (linux-4.4), we can remove __GFP_DIRECT_RECLAIM to
> meet the goal. Old kernels would need to remove __GFP_WAIT
> 
> While we are at it, since we do an order-3 allocation, allow to use
> all the allocated bytes instead of 16384 to reduce syscalls during
> large dumps.
> 
> iproute2 already uses 32KB recvmsg() buffer sizes.
> 
> Alexei provided an initial patch downsizing to SKB_WITH_OVERHEAD(16384)
> 
> Fixes: 9063e21fb026 ("netlink: autosize skb lengthes")
> Signed-off-by: Eric Dumazet 
> Reported-by: Alexei Starovoitov 
> Cc: Greg Thelen 
> ---
> Note: This will apply to net tree when it has synced with Linus tree.

Applied.
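
The size arithmetic in the quoted commit message can be checked with a small sketch. The page size and the 320-byte overhead below are assumptions taken from the message (x86, "~320 bytes"); the helper names are hypothetical and the real overhead depends on the kernel configuration.

```c
#include <stddef.h>

/* Assumed values for illustration only: 4 KiB pages (as on x86) and the
 * approximate skb_shared_info overhead quoted in the commit message. */
#define SKETCH_PAGE_SIZE        4096u
#define SKETCH_SHINFO_OVERHEAD   320u

/* Smallest order such that (SKETCH_PAGE_SIZE << order) >= bytes. */
static unsigned int sketch_order(size_t bytes)
{
    unsigned int order = 0;
    size_t span = SKETCH_PAGE_SIZE;

    while (span < bytes) {
        span <<= 1;
        order++;
    }
    return order;
}

/* Payload bytes left in an allocation of the given order once the
 * shared-info overhead is set aside. */
static size_t sketch_usable(unsigned int order)
{
    return ((size_t)SKETCH_PAGE_SIZE << order) - SKETCH_SHINFO_OVERHEAD;
}
```

Under these assumptions, 16384 payload bytes plus the overhead no longer fit in an order-2 block, so the allocation rounds up to order-3, and an order-3 block leaves room for noticeably more than 16384 bytes of payload.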


Re: [PATCH] ipv6 addrconf: disallow rtr_solicits < -1

2016-10-06 Thread Cong Wang
On Mon, Oct 3, 2016 at 11:40 PM, Maciej Żenczykowski
 wrote:
>> Please remove the const qualifier and the casts to be consistent
>> with how we handle this elsewhere.
>>
>> Thanks.
>
> I can of course trivially make that change.
>
> But:
>
> (on net-next/master)
> git grep 'extra[12].*=.*\(void *[*]\)'
>
> currently finds 45 matches, and this patch adds a 46th.

It seems the sysctl layer should make them const, but I never looked
into it; it doesn't look like it needs to modify these min/max consts.


Re: [PATCH net] packet: call fanout_release, while UNREGISTERING a netdev

2016-10-06 Thread David Miller
From: Anoob Soman 
Date: Wed, 5 Oct 2016 15:12:54 +0100

> If a socket has FANOUT sockopt set, a new proto_hook is registered
> as part of fanout_add(). When processing a NETDEV_UNREGISTER event in
> af_packet, __fanout_unlink is called for all sockets, but prot_hook which was
> registered as part of fanout_add is not removed. Call fanout_release, on a
> NETDEV_UNREGISTER, which removes prot_hook and removes fanout from the
> fanout_list.
> 
> This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo()
> 
> Signed-off-by: Anoob Soman 

Applied and queued up for -stable, thanks.


Re: [PATCH v3 net-next 4/4] net/sched: act_mirred: Implement ingress actions

2016-10-06 Thread Cong Wang
On Thu, Oct 6, 2016 at 5:17 PM, Jamal Hadi Salim  wrote:
> I dont believe we need to bother with the return code in  this case.

Why?

For a quick example, STOLEN vs. SHOT:

        result = tc_classify(skb, filter, &res, false);
        if (result >= 0) {
#ifdef CONFIG_NET_CLS_ACT
                switch (result) {
                case TC_ACT_STOLEN:
                case TC_ACT_QUEUED:
                        *qerr = NET_XMIT_SUCCESS | __NET_XMIT_STOLEN;
                case TC_ACT_SHOT:
                        return 0;
                }
#endif

Note, *qerr is the return value to ->enqueue().
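
To make the STOLEN vs. SHOT distinction concrete, here is a minimal userspace sketch of the same switch. All names are hypothetical and the numeric values are arbitrary placeholders, not the kernel's; only the control flow mirrors the snippet above.

```c
/* Placeholder verdicts and status bits; values are arbitrary. */
enum sk_verdict {
    SK_ACT_OK = 0,
    SK_ACT_SHOT,
    SK_ACT_STOLEN,
    SK_ACT_QUEUED
};

#define SK_XMIT_SUCCESS 0x0
#define SK_XMIT_STOLEN  0x100

/*
 * Returns 1 if the caller should go on and enqueue the packet, 0 if the
 * packet is gone. For STOLEN/QUEUED the status in *qerr is rewritten to
 * "success, stolen"; for SHOT it is left as the caller set it, so the
 * two cases report differently even though both return 0.
 */
static int sk_classify_verdict(enum sk_verdict result, int *qerr)
{
    switch (result) {
    case SK_ACT_STOLEN:
    case SK_ACT_QUEUED:
        *qerr = SK_XMIT_SUCCESS | SK_XMIT_STOLEN;
        /* fall through */
    case SK_ACT_SHOT:
        return 0;
    default:
        return 1;
    }
}
```

The point of the sketch is that a verdict turned into SHOT on error changes what the enqueue path reports, compared with a verdict that stays STOLEN.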


Re: [PATCH] devicetree: net: micrel-ksz90x1.txt: Properly explain skew settings

2016-10-06 Thread David Miller
From: Mike Looijmans 
Date: Wed,  5 Oct 2016 16:03:08 +0200

> The KSZ9031 skew registers contain an offset, the chip's default value
> is "neutral" which does not add any skew. Programming a 0 into a skew
> property will actually set it the maximal negative adjustment and not
> to a neutral position as one would expect.
> 
> Explain this situation in the devicetree binding documentation and list
> the settings that the chip considers neutral.
> 
> Changing the implementation to accept negative values would have been
> a better solution, but would break existing configurations.
> 
> Signed-off-by: Mike Looijmans 

Applied.
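
The offset behaviour described in the quoted message can be sketched as simple arithmetic. The helpers below are hypothetical and take the neutral and maximum values as parameters, because the actual per-pad numbers are not given in this thread and should be taken from the binding documentation.

```c
/* Effective skew relative to the chip's neutral setting, in picoseconds.
 * A programmed value of 0 therefore yields the most negative skew. */
static int sk_effective_skew_ps(int programmed_ps, int neutral_ps)
{
    return programmed_ps - neutral_ps;
}

/* Value to program in order to obtain a wanted signed skew, clamped to
 * the range [0, max_ps] that the property accepts. */
static int sk_program_for_skew_ps(int wanted_ps, int neutral_ps, int max_ps)
{
    int v = wanted_ps + neutral_ps;

    if (v < 0)
        v = 0;
    if (v > max_ps)
        v = max_ps;
    return v;
}
```

For example, if a pad's neutral setting were 900 ps, programming 0 would give an effective skew of -900 ps, and asking for no skew would mean programming 900.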


Re: [PATCH v2 net-next] net: phy: Add Wake-on-LAN driver for Microsemi PHYs.

2016-10-06 Thread David Miller
From: Raju Lakkaraju 
Date: Wed, 5 Oct 2016 14:19:27 +0530

> From: Raju Lakkaraju 
> 
> Wake-on-LAN (WoL) is an Ethernet networking standard that allows 
> a computer/device to be turned on or awakened by a network message.
> 
> The VSC8531 PHY supports this feature; it is configured through the driver's
> set function, and the WoL status is read through the driver's get function.
> 
> Tested on Beaglebone Black with VSC 8531 PHY.
> 
> Signed-off-by: Raju Lakkaraju 

Applied.


Re: [net-next PATCH] drivers: net: cpsw-phy-sel: add support to configure rgmii internal delay

2016-10-06 Thread David Miller
From: Mugunthan V N 
Date: Tue, 4 Oct 2016 19:07:29 +0530

> Add support to enable CPSW RGMII internal delay (id mode) bits
> when rgmii internal delay is configured in phy.
> 
> Signed-off-by: Mugunthan V N 

Applied.


Re: [PATCH v3 net-next 4/4] net/sched: act_mirred: Implement ingress actions

2016-10-06 Thread Cong Wang
On Thu, Oct 6, 2016 at 12:38 PM, Eric Dumazet  wrote:
> And another quick grep shows that out of 142 drivers, only one [1] of
> them (incorrectly) checks netif_receive_skb() return value.
>

act_mirred is not a driver, apparently.


> Real question is more like : what is the impact of propagating an error
> at this point ?

_If_ we are going to just propagate the error like egress, then
the difference is m->tcf_action (PIPE or STOLEN) vs TC_ACT_SHOT.
And this error code is propagated from tcf_action_exec() up to
qdisc layer...


Re: [PATCH][V2] net: hns: Add missing \n to end of dev_err messages, tidy up text

2016-10-06 Thread David Miller
From: Colin King 
Date: Tue,  4 Oct 2016 13:57:01 +0100

> From: Colin Ian King 
> 
> Trivial fix, dev_err messages are missing a \n, so add it. Also
> fix grammar, a spelling mistake and add white spaces to various
> error messages.
> 
> Signed-off-by: Colin Ian King 

Applied.


Re: [PATCH] net: axienet: Add missing \n to end of dev_err messages

2016-10-06 Thread David Miller
From: Colin King 
Date: Tue,  4 Oct 2016 12:11:41 +0100

> From: Colin Ian King 
> 
> Trivial fix, dev_err messages are missing a \n, so add it.
> 
> Signed-off-by: Colin Ian King 

Applied.


Re: [PATCH] net: ps3_gelic: Add missing \n to end of deb_dbg message

2016-10-06 Thread David Miller
From: Colin King 
Date: Tue,  4 Oct 2016 12:15:54 +0100

> From: Colin Ian King 
> 
> Trivial fix, dev_dbg message is missing a \n, so add it.
> 
> Signed-off-by: Colin Ian King 

Applied.


Re: [PATCH v2 net-next 0/7] xen-netback: guest rx side refactor

2016-10-06 Thread David Miller
From: Paul Durrant 
Date: Tue, 4 Oct 2016 10:29:11 +0100

> This series refactors the guest rx side of xen-netback:
> 
> - The code is moved into its own source module.
> 
> - The prefix variant of GSO handling is retired (since it is no longer
>   in common use, and alternatives exist).
> 
> - The code is then simplified and modifications made to improve
>   performance.
> 
> v2:
> - Rebased onto refreshed net-next

Series applied, thanks.


Re: [net-next 00/13] fsl/fman: cleanup and small fixes

2016-10-06 Thread David Miller
From: Madalin Bucur 
Date: Tue, 4 Oct 2016 10:30:24 +0300

> This series contains fixes for the DPAA FMan driver.
> Adding myself as maintainer of the driver.
> 
> The following are changes since commit 
> a4cc96d1f0170b779c32c6b2cc58764f5d2cdef0
>  net: phy: Add Edge-rate driver for Microsemi PHYs.
> and are available on the fman-next branch in the git repository at
>  git://git.freescale.com/ppc/upstream/linux.git

Pulled, thanks.


Re: [PATCH v3 net-next 4/4] net/sched: act_mirred: Implement ingress actions

2016-10-06 Thread Jamal Hadi Salim

On 16-10-06 01:30 PM, Cong Wang wrote:

> On Thu, Oct 6, 2016 at 6:30 AM, Shmulik Ladkani
>  wrote:
>> Hi,
>>
>> On Mon, Oct 3, 2016 at 12:45 PM, Cong Wang  wrote:
>>> On Thu, Sep 29, 2016 at 4:03 AM, Shmulik Ladkani
>>>  wrote:
>>>> skb2->skb_iif = skb->dev->ifindex;
>>>> skb2->dev = dev;
>>>> -   err = dev_queue_xmit(skb2);
>>>> +   if (tcf_mirred_act_direction(m_eaction) & AT_EGRESS)
>>>> +   err = dev_queue_xmit(skb2);
>>>> +   else
>>>> +   netif_receive_skb(skb2);
>>>
>>> Any reason why not check the return value here?
>>
>> Rationale: netif_receive_skb returns err if there was no protocol
>> handler to deliver the skb to.
>> If skb is not caught by any protocol handler, this should not be
>> considered an "ingress redirect" error. The redirect action should be
>> considered successful.

I dont believe we need to bother with the return code in  this case.
The core netif_receive_skb() code already increments any necessary
stats.

cheers,
jamal


[PATCH net-next 1/2] drivers: net: xgene: fix: Use GPIO to get link status

2016-10-06 Thread Iyappan Subramanian
The link value reported by the link status register is not
reliable when no SFP module is inserted. This patch fixes this
issue by using GPIO to determine the link status.

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Quan Nguyen 
---
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c  |  6 +-
 drivers/net/ethernet/apm/xgene/xgene_enet_main.h  |  1 +
 drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c | 19 +--
 3 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c 
b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
index 429f18f..f75d955 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
@@ -1381,9 +1381,13 @@ static void xgene_enet_gpiod_get(struct xgene_enet_pdata 
*pdata)
 {
struct device *dev = &pdata->pdev->dev;
 
-   if (pdata->phy_mode != PHY_INTERFACE_MODE_XGMII)
+   pdata->sfp_gpio_en = false;
+   if (pdata->phy_mode != PHY_INTERFACE_MODE_XGMII ||
+   (!device_property_present(dev, "sfp-gpios") &&
+!device_property_present(dev, "rxlos-gpios")))
return;
 
+   pdata->sfp_gpio_en = true;
pdata->sfp_rdy = gpiod_get(dev, "rxlos", GPIOD_IN);
if (IS_ERR(pdata->sfp_rdy))
pdata->sfp_rdy = gpiod_get(dev, "sfp", GPIOD_IN);
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h 
b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
index 0cda58f..011965b 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
@@ -219,6 +219,7 @@ struct xgene_enet_pdata {
u8 rx_delay;
bool mdio_driver;
struct gpio_desc *sfp_rdy;
+   bool sfp_gpio_en;
 };
 
 struct xgene_indirect_ctl {
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c 
b/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c
index 6475f38..d1758b0 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c
@@ -415,16 +415,31 @@ static void xgene_enet_clear(struct xgene_enet_pdata 
*pdata,
xgene_enet_wr_ring_if(pdata, addr, data);
 }
 
+static int xgene_enet_gpio_lookup(struct xgene_enet_pdata *pdata)
+{
+   struct device *dev = &pdata->pdev->dev;
+
+   pdata->sfp_rdy = gpiod_get(dev, "rxlos", GPIOD_IN);
+   if (IS_ERR(pdata->sfp_rdy))
+   pdata->sfp_rdy = gpiod_get(dev, "sfp", GPIOD_IN);
+
+   if (IS_ERR(pdata->sfp_rdy))
+   return -ENODEV;
+
+   return 0;
+}
+
 static void xgene_enet_link_state(struct work_struct *work)
 {
struct xgene_enet_pdata *pdata = container_of(to_delayed_work(work),
 struct xgene_enet_pdata, link_work);
-   struct gpio_desc *sfp_rdy = pdata->sfp_rdy;
struct net_device *ndev = pdata->ndev;
u32 link_status, poll_interval;
 
link_status = xgene_enet_link_status(pdata);
-   if (link_status && !IS_ERR(sfp_rdy) && !gpiod_get_value(sfp_rdy))
+   if (pdata->sfp_gpio_en && link_status &&
+   (!IS_ERR(pdata->sfp_rdy) || !xgene_enet_gpio_lookup(pdata)) &&
+   !gpiod_get_value(pdata->sfp_rdy))
link_status = 0;
 
if (link_status) {
-- 
1.9.1
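
The decision made in xgene_enet_link_state() after this patch can be restated as a pure function. This is a sketch under stated assumptions: struct sk_link_inputs and sk_link_is_up() are hypothetical, and the fields only model the values the patch consults (the register status, sfp_gpio_en, whether a descriptor could be obtained, and the GPIO value).

```c
#include <stdbool.h>

/* Hypothetical model of the inputs consulted by the link-state worker. */
struct sk_link_inputs {
    bool reg_link_up;     /* what the link status register reports */
    bool gpio_enabled;    /* an "sfp-gpios" or "rxlos-gpios" property exists */
    bool gpio_available;  /* a GPIO descriptor could be obtained */
    int  gpio_value;      /* value read from that GPIO */
};

/*
 * The register result is trusted unless GPIO checking is enabled, a
 * descriptor is available, and the GPIO reads 0; in that case the link
 * is reported as down regardless of the register.
 */
static bool sk_link_is_up(const struct sk_link_inputs *in)
{
    bool up = in->reg_link_up;

    if (in->gpio_enabled && up && in->gpio_available && in->gpio_value == 0)
        up = false;
    return up;
}
```

Note that the GPIO can only veto an "up" reading; it never turns a "down" reading from the register into "up".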



[PATCH net-next 2/2] arm64: xgene: defconfig: Enable Standby GPIO

2016-10-06 Thread Iyappan Subramanian
Enable CONFIG_GPIO_XGENE_SB.

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Quan Nguyen 
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index eadf485..be52a00 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -240,6 +240,7 @@ CONFIG_GPIO_DWAPB=y
 CONFIG_GPIO_PL061=y
 CONFIG_GPIO_RCAR=y
 CONFIG_GPIO_XGENE=y
+CONFIG_GPIO_XGENE_SB=y
 CONFIG_GPIO_PCA953X=y
 CONFIG_GPIO_PCA953X_IRQ=y
 CONFIG_GPIO_MAX77620=y
-- 
1.9.1



[PATCH net-next 0/2] drivers: net: xgene: fix: Use GPIO to get link status

2016-10-06 Thread Iyappan Subramanian
Since the link value reported by the link status register is not
reliable if no SFP module is inserted, this patchset fixes the issue by
using GPIO to determine the link status when no module is inserted.

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Quan Nguyen 
---

Iyappan Subramanian (2):
  drivers: net: xgene: fix: Use GPIO to get link status
  arm64: xgene: defconfig: Enable Standby GPIO

 arch/arm64/configs/defconfig  |  1 +
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c  |  6 +-
 drivers/net/ethernet/apm/xgene/xgene_enet_main.h  |  1 +
 drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c | 19 +--
 4 files changed, 24 insertions(+), 3 deletions(-)

-- 
1.9.1



Re: [PATCH] drivers: net: phy: Correct duplicate MDIO_XGENE entry

2016-10-06 Thread Florian Fainelli


On 10/06/2016 11:22 AM, Laura Abbott wrote:
> An extra entry for MDIO_XGENE got added during merging.
> Delete it.
> 
> Reviewed-by: Andrew Lunn 
> Signed-off-by: Laura Abbott 

Acked-by: Florian Fainelli 
-- 
Florian


Re: [PATCH v3 net-next 4/4] net/sched: act_mirred: Implement ingress actions

2016-10-06 Thread Eric Dumazet
On Thu, 2016-10-06 at 10:30 -0700, Cong Wang wrote:
> On Thu, Oct 6, 2016 at 6:30 AM, Shmulik Ladkani
>  wrote:
> > Hi,
> >
> > On Mon, Oct 3, 2016 at 12:45 PM, Cong Wang  wrote:
> >> On Thu, Sep 29, 2016 at 4:03 AM, Shmulik Ladkani
> >>  wrote:
> >>> skb2->skb_iif = skb->dev->ifindex;
> >>> skb2->dev = dev;
> >>> -   err = dev_queue_xmit(skb2);
> >>> +   if (tcf_mirred_act_direction(m_eaction) & AT_EGRESS)
> >>> +   err = dev_queue_xmit(skb2);
> >>> +   else
> >>> +   netif_receive_skb(skb2);
> >>
> >> Any reason why not check the return value here?
> >
> > Rationale: netif_receive_skb returns err if there was no protocol
> > handler to deliver the skb to.
> > If skb is not caught by any protocol handler, this should not be
> > considered an "ingress redirect" error. The redirect action should be
> > considered successful.
> 
> A quick grep shows there are many places returning NET_RX_DROP:
> E.g.

And another quick grep shows that out of 142 drivers, only one [1] of
them (incorrectly) checks netif_receive_skb() return value.

Real question is more like : what is the impact of propagating an error
at this point ?

[1] drivers/net/caif/caif_virtio.c 
This is incorrect because at the driver layer, the packet was received
and the rx_packets/rx_bytes counters _should_ be incremented regardless
of packet being dropped or not by upper layers.





Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.

2016-10-06 Thread Cong Wang
On Wed, Oct 5, 2016 at 11:11 PM, Krister Johansen
 wrote:
>
> I'm not sure.  The reason I didn't take this approach from the outset is
> that all of TC's callers of tcf_register_action pass a pointer to a
> static structure as their *ops argument.  The existence of code that
> checks the action for uniqueness suggests that it's possible for
> tcf_register_action to get passed two identical tc_action_ops.  If that
> happens in the current code base, we'll also get passed a duplicate

Each tc action module has its own unique ops, and kernel doesn't allow
one module to register twice (either in parallel or not, see
add_unformed_module()), so we should not have a duplicated case.


> pernet_operations pointer.  The code in register_pernet_subsys() makes
> no attempt to check for duplicates.  If we add a pointer that's already
> in the list, and subsequently call unregister, the results seem
> undefined.  It looks like we'll remove the pernet_operations for the
> existing action, assuming we don't corrupt the list in the process.
>
> Is this actually safe?  If so, what corner case is the act->type /
> act->kind protecting us from?

ops->type and ops->kind should be unique too, user-space already
relies on this (tc action ls action xxx). The code exists probably just
for sanity check.

So please give that patch a try, let's see if we miss any other problem.

>
>> (Sorry that I don't have the environment to reproduce your bug)
>
> I'm sorry that I didn't do a good job of explaining how we end up in
> this situation in the first place.  I can give a few more details,
> because it may explain some of my concern about the request_module()
> call.
>
> The system that encounters this bug launches a bunch of containers from
> systemd on boot.  Each container creates a new user, net, pid, and mount
> namespace and begins its setup.  When the networking in all of these
> containers, each in a new netns, try to configure TC and no modules are
> loaded we end up with this race.
>
> I can also reproduce by unloading the modules, and then launching a
> bunch of processes that configure tc in new namespaces.
>
> Part of the desire to inhibit extra modprobe calls is that if hundreds
> of these all start at once on boot, it's really unnecessary to have all
> of the rest of them wait while lots of extra modprobe calls are forked
> by the kernel.

You can tell systemd to load these modules before starting these
containers to avoid blocking, no?

Thanks.


[PATCH] drivers: net: phy: Correct duplicate MDIO_XGENE entry

2016-10-06 Thread Laura Abbott
An extra entry for MDIO_XGENE got added during merging.
Delete it.

Reviewed-by: Andrew Lunn 
Signed-off-by: Laura Abbott 
---
 drivers/net/phy/Kconfig | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index 5078a0d..2651c8d 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -142,6 +142,7 @@ config MDIO_THUNDER
 
 config MDIO_XGENE
tristate "APM X-Gene SoC MDIO bus controller"
+   depends on ARCH_XGENE || COMPILE_TEST
help
  This module provides a driver for the MDIO busses found in the
  APM X-Gene SoC's.
@@ -320,13 +321,6 @@ config XILINX_GMII2RGMII
  the Reduced Gigabit Media Independent Interface(RGMII) between
  Ethernet physical media devices and the Gigabit Ethernet controller.
 
-config MDIO_XGENE
-   tristate "APM X-Gene SoC MDIO bus controller"
-   depends on ARCH_XGENE || COMPILE_TEST
-   help
- This module provides a driver for the MDIO busses found in the
- APM X-Gene SoC's.
-
 endif # PHYLIB
 
 config MICREL_KS8995MA
-- 
2.7.4



RE: Kernel 4.6.7-rt13: Intel Ethernet driver igb causes huge latencies in cyclictest

2016-10-06 Thread Williams, Mitch A


> -Original Message-
> From: Intel-wired-lan [mailto:intel-wired-lan-boun...@lists.osuosl.org] On
> Behalf Of Koehrer Mathias (ETAS/ESW5)
> Sent: Thursday, October 06, 2016 12:02 AM
> To: Julia Cartwright ; Kirsher, Jeffrey T
> ; Greg 
> Cc: netdev@vger.kernel.org; intel-wired-...@lists.osuosl.org; linux-rt-
> us...@vger.kernel.org; Sebastian Andrzej Siewior
> 
> Subject: Re: [Intel-wired-lan] Kernel 4.6.7-rt13: Intel Ethernet driver igb
> causes huge latencies in cyclictest
> 
> Hi all,
> >
> > Although, to be clear, it isn't the fact that there exists 8 threads, it's
> that the device is
> > firing all 8 interrupts at the same time.  The time spent in hardirq
> context just waking
> > up all 8 of those threads (and the cyclictest wakeup) is enough to cause
> your
> > regression.
> >
> > netdev/igb folks-
> >
> > Under what conditions should it be expected that the i350 trigger all of
> the TxRx
> > interrupts simultaneously?  Any ideas here?

I can answer that! I wrote that code.

We trigger the interrupts once a second because MSI and MSI-X interrupts are 
NOT guaranteed to be delivered. If this happens, the queues being serviced by 
this "lost" interrupt are completely stuck.

The device automatically masks each interrupt vector after it fires, expecting 
the ISR to re-enable the vector after processing is complete. If the interrupt 
is lost, the ISR doesn't run, so the vector ends up permanently masked. At this 
point, any queues associated with that vector are stuck. The only recovery is 
through the netdev watchdog, which initiates a reset.

During development of igb, we had several platforms with chipsets that 
routinely dropped MSI messages under stress. Things would be running fine and 
then, pow, all the traffic on a queue would stop. 

So, I added code to fire each vector once per second. Just unmasking the 
interrupt isn't enough - we need to trigger the ISR to get the queues cleaned 
up so the device can work again.

Is this workaround still needed? I don't know. Modern chipsets don't break a 
sweat handling gigabit-speed traffic, and they *probably* don't drop 
interrupts. But I'd still rather have that insurance.

You could try to remove the write to the EICS registers in the watchdog task to 
see if that takes care of your problem. But I wouldn't want to remove that code 
permanently, because we have seen lost interrupts in the past.

You also could try staggering the writes so that not all vectors fire each 
second. But then you'll potentially incur a much longer delay if an interrupt 
does get lost, which means you could trigger netdev watchdog events.

-Mitch
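
The failure mode and the workaround described above can be illustrated with a toy model. This is not igb code and does not reflect the real register semantics: struct sk_vector and the three functions are hypothetical, and a "lost" interrupt is modelled simply as an event whose ISR never runs.

```c
#include <stdbool.h>

struct sk_vector {
    bool masked;           /* vector is currently masked */
    unsigned int pending;  /* work waiting on the queues */
    unsigned int cleaned;  /* work the ISR has processed so far */
};

/* ISR: clean the queues, then re-enable (unmask) the vector. */
static void sk_isr(struct sk_vector *v)
{
    v->cleaned += v->pending;
    v->pending = 0;
    v->masked = false;
}

/*
 * The device has new work and tries to interrupt.
 * delivered == false models a dropped MSI/MSI-X message.
 */
static void sk_device_event(struct sk_vector *v, bool delivered)
{
    v->pending++;
    if (v->masked)
        return;            /* masked: no interrupt is raised */
    v->masked = true;      /* auto-mask when the vector fires */
    if (delivered)
        sk_isr(v);
}

/* Periodic watchdog: force the ISR to run regardless of mask state. */
static void sk_watchdog_fire(struct sk_vector *v)
{
    sk_isr(v);
}
```

In this model a single lost message leaves the vector masked and work piling up until the watchdog fires, which is why merely unmasking would not be enough: the queued work also has to be cleaned.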



> >
> > See the start of this thread here:
> >
> >   http://lkml.kernel.org/r/d648628329bc446fa63b5e19d4d3fb56@FE-
> > MBX1012.de.bosch.com
> >
> Greg recommended to use "ethtool -L eth2 combined 1" to reduce the number of
> queues.
> I tried that. Now, I have actually only three irqs (eth2, eth2-rx-0, eth2-
> tx-0).
> However the issue remains the same.
> 
> I ran the cyclictest again:
> # cyclictest -a -i 105 -m -n -p 80 -t 1  -b 23 -C
> (Note: When using 105us instead of 100us the long latencies seem to occur
> more often).
> 
> Here are the final lines of the kernel trace output:
>   -0   4d...2.. 1344661649us : sched_switch: prev_comm=swapper/4
> prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcuc/4 next_pid=56
> next_prio=98
> ktimerso-46  3d...2.. 1344661650us : sched_switch:
> prev_comm=ktimersoftd/3 prev_pid=46 prev_prio=98 prev_state=S ==>
> next_comm=swapper/3 next_pid=0 next_prio=120
> ktimerso-24  1d...2.. 1344661650us : sched_switch:
> prev_comm=ktimersoftd/1 prev_pid=24 prev_prio=98 prev_state=S ==>
> next_comm=swapper/1 next_pid=0 next_prio=120
> ktimerso-79  6d...2.. 1344661650us : sched_switch:
> prev_comm=ktimersoftd/6 prev_pid=79 prev_prio=98 prev_state=S ==>
> next_comm=swapper/6 next_pid=0 next_prio=120
> ktimerso-35  2d...2.. 1344661650us : sched_switch:
> prev_comm=ktimersoftd/2 prev_pid=35 prev_prio=98 prev_state=S ==>
> next_comm=swapper/2 next_pid=0 next_prio=120
>   rcuc/5-67  5d...2.. 1344661650us : sched_switch: prev_comm=rcuc/5
> prev_pid=67 prev_prio=98 prev_state=S ==> next_comm=ktimersoftd/5
> next_pid=68 next_prio=98
>   rcuc/7-89  7d...2.. 1344661650us : sched_switch: prev_comm=rcuc/7
> prev_pid=89 prev_prio=98 prev_state=S ==> next_comm=ktimersoftd/7
> next_pid=90 next_prio=98
> ktimerso-4   0d...211 1344661650us : sched_wakeup: comm=rcu_preempt
> pid=8 prio=98 target_cpu=000
>   rcuc/4-56  4d...2.. 1344661651us : sched_switch: prev_comm=rcuc/4
> prev_pid=56 prev_prio=98 prev_state=S ==> next_comm=ktimersoftd/4
> next_pid=57 next_prio=98
> ktimerso-4   0d...2.. 1344661651us : sched_switch:
> prev_comm=ktimersoftd/0 prev_pid=4 prev_prio=98 prev_state=S ==>
> next_comm=rcu_preempt next_pid=8 next_prio=98
> ktimerso-90  7d...2.. 

Re: [PATCH v3 net-next 4/4] net/sched: act_mirred: Implement ingress actions

2016-10-06 Thread Cong Wang
On Thu, Oct 6, 2016 at 6:30 AM, Shmulik Ladkani
 wrote:
> Hi,
>
> On Mon, Oct 3, 2016 at 12:45 PM, Cong Wang  wrote:
>> On Thu, Sep 29, 2016 at 4:03 AM, Shmulik Ladkani
>>  wrote:
>>> skb2->skb_iif = skb->dev->ifindex;
>>> skb2->dev = dev;
>>> -   err = dev_queue_xmit(skb2);
>>> +   if (tcf_mirred_act_direction(m_eaction) & AT_EGRESS)
>>> +   err = dev_queue_xmit(skb2);
>>> +   else
>>> +   netif_receive_skb(skb2);
>>
>> Any reason why not check the return value here?
>
> Rationale: netif_receive_skb returns err if there was no protocol
> handler to deliver the skb to.
> If skb is not caught by any protocol handler, this should not be
> considered an "ingress redirect" error. The redirect action should be
> considered successful.

A quick grep shows there are many places returning NET_RX_DROP:
E.g.

net/ipv4/arp.c: return NET_RX_DROP;
net/ipv4/arp.c: return NET_RX_DROP;
net/ipv4/gre_demux.c:   return NET_RX_DROP;
net/ipv4/ip_forward.c:  return NET_RX_DROP;
net/ipv4/ip_input.c:return NET_RX_DROP;
net/ipv4/ip_input.c:return NET_RX_DROP;
net/ipv4/ipconfig.c:return NET_RX_DROP;
net/ipv4/ipconfig.c:return NET_RX_DROP;
net/ipv4/raw.c: return NET_RX_DROP;
net/ipv4/raw.c: return NET_RX_DROP;
net/ipv4/xfrm4_input.c: return NET_RX_DROP;
net/ipv6/ip6_input.c:   return NET_RX_DROP;
net/ipv6/ip6_input.c:   return NET_RX_DROP;
net/ipv6/ip6_input.c:   return NET_RX_DROP;
net/ipv6/raw.c: return NET_RX_DROP;
net/ipv6/raw.c: return NET_RX_DROP;
net/ipv6/raw.c: return NET_RX_DROP;
net/ipv6/raw.c: return NET_RX_DROP;


Re: [PATCH] netfilter: xt_hashlimit: Add missing ULL suffixes for 64-bit constants

2016-10-06 Thread Vishwanath Pai
On 10/06/2016 09:40 AM, Geert Uytterhoeven wrote:
> diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
> index 2fab0c65aa94b666..b89b688e9d01a2d1 100644
> --- a/net/netfilter/xt_hashlimit.c
> +++ b/net/netfilter/xt_hashlimit.c
> @@ -431,7 +431,7 @@ static void htable_put(struct xt_hashlimit_htable *hinfo)
> CREDITS_PER_JIFFY*HZ*60*60*24 < 2^32 ie.
>  */
>  #define MAX_CPJ_v1 (0xFFFFFFFF / (HZ*60*60*24))
> -#define MAX_CPJ (0xFFFFFFFFFFFFFFFF / (HZ*60*60*24))
> +#define MAX_CPJ (0xFFFFFFFFFFFFFFFFULL / (HZ*60*60*24))
>  
>  /* Repeated shift and or gives us all 1s, final shift and add 1 gives
>   * us the power of 2 below the theoretical max, so GCC simply does a
> @@ -473,7 +473,7 @@ static u64 user2credits(u64 user, int revision)
>   return div64_u64(user * HZ * CREDITS_PER_JIFFY_v1,
>XT_HASHLIMIT_SCALE);
>   } else {
> - if (user > 0xFFFFFFFFFFFFFFFF / (HZ*CREDITS_PER_JIFFY))
> + if (user > 0xFFFFFFFFFFFFFFFFULL / (HZ*CREDITS_PER_JIFFY))
>   return div64_u64(user, XT_HASHLIMIT_SCALE_v2)
>   * HZ * CREDITS_PER_JIFFY;
>  
> -- 1.9.1

Thanks for fixing this.

Acked-by: Vishwanath Pai 
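
As a standalone illustration of the pattern in the quoted patch, the sketch below uses an explicitly suffixed 64-bit all-ones constant for a limit and for an overflow pre-check. The SK_ names and the tick rate are assumptions for the example; as far as I understand, without the suffix some compilers and language modes warn on 32-bit targets that the constant is too large for long.

```c
#include <stdint.h>

#define SK_HZ 1000u  /* assumed tick rate, for illustration only */

/* All-ones 64-bit constant with an explicit ULL suffix, so its type is
 * unsigned long long on every target. */
#define SK_U64_ALL_ONES 0xFFFFFFFFFFFFFFFFULL

/* Largest per-tick value whose one-day total still fits in 64 bits. */
#define SK_MAX_CPJ (SK_U64_ALL_ONES / (SK_HZ * 60 * 60 * 24))

/* Returns nonzero if user * factor would not fit in 64 bits. */
static int sk_mul_overflows(uint64_t user, uint64_t factor)
{
    return factor != 0 && user > SK_U64_ALL_ONES / factor;
}
```

Dividing the limit by the factor before comparing, as the patched user2credits() branch does, avoids performing the multiplication that might overflow.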


Re: Duplicate MDIO_XGENE Kconfig entries

2016-10-06 Thread Andrew Lunn
On Thu, Oct 06, 2016 at 09:01:27AM -0700, Laura Abbott wrote:
> Hi,
> 
> While working on the Fedora tree today, I noticed that there
> seem to be two entries for CONFIG_MDIO_XGENE. It looks like
> this might have been fall out from d75b4a22b255 ("net: phy:
> Sort Makefile and Kconfig"). I can submit the following if
> this isn't fixed up elsewhere already

Hi Laura

I don't remember seeing a fix for this going by. Please do submit a
follow up.

Reviewed-by: Andrew Lunn 

Andrew


Duplicate MDIO_XGENE Kconfig entries

2016-10-06 Thread Laura Abbott

Hi,

While working on the Fedora tree today, I noticed that there
seem to be two entries for CONFIG_MDIO_XGENE. It looks like
this might have been fall out from d75b4a22b255 ("net: phy:
Sort Makefile and Kconfig"). I can submit the following if
this isn't fixed up elsewhere already

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index 5078a0d..fe064ba 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -141,6 +141,7 @@ config MDIO_THUNDER
  device.
 
 config MDIO_XGENE

+   depends on ARCH_XGENE || COMPILE_TEST
tristate "APM X-Gene SoC MDIO bus controller"
help
  This module provides a driver for the MDIO busses found in the
@@ -320,13 +321,6 @@ config XILINX_GMII2RGMII
  the Reduced Gigabit Media Independent Interface(RGMII) between
  Ethernet physical media devices and the Gigabit Ethernet controller.
 
-config MDIO_XGENE

-   tristate "APM X-Gene SoC MDIO bus controller"
-   depends on ARCH_XGENE || COMPILE_TEST
-   help
- This module provides a driver for the MDIO busses found in the
- APM X-Gene SoC's.
-
 endif # PHYLIB
 
 config MICREL_KS8995MA


Re: [PATCH] bluetooth.h: __ variants of u8 and friends are not neccessary inside kernel

2016-10-06 Thread Joe Perches
On Thu, 2016-10-06 at 13:00 +, David Laight wrote:
> From: Joe Perches
> > Sent: 06 October 2016 12:39
> > On Thu, 2016-10-06 at 09:41 +, David Laight wrote:
> > > From: Joe Perches
> > > > No worries, and bool is the same size as u8.
> > > That is not guaranteed at all.
> > > One of the ARM ABI defined bool to be the size of int.
> > Really?  What kernel has sizeof(_Bool) != 1 ?
> Probably none, but I know systems have used larger bool.
> I found this: 
> > with egcs-2.90.29 980515 (egcs-1.0.3 release) on alphaev56-dec-osf4.0d
> >  bool  = 8
> >  short = 2
> >  int   = 4 
> >  long  = 8

There are probably DSPs and old TOPS-20/CDC-6400
systems where sizeof(u16) isn't 2 as well.

I think linux isn't likely to be ported successfully to
those platforms.

No matter.  If bool isn't desired because some future
expansion to this is likely and memory needs to be conserved,
fine, use a bitfield.

It can be slower than bool because updating a bitfield can require a
read-modify-write (RMW) of the containing word.

cheers, Joe
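
A minimal userspace sketch of the two options discussed above, bool members versus a bitfield. The struct and function names are hypothetical and are not taken from bluetooth.h. The static assertion only records the assumption that bool is one byte on the build target; as the thread notes, the C standard does not guarantee it.

```c
#include <stdbool.h>

/* Documents the assumption under discussion; the build fails on an ABI
 * where bool is wider than one byte. */
_Static_assert(sizeof(bool) == 1, "this sketch assumes a one-byte bool");

/* Hypothetical state kept as two separate bool members. */
struct ex_state_bool {
	bool enabled;
	bool pending;
};

/* The same state packed into single-bit fields. */
struct ex_state_bits {
	unsigned int enabled:1;
	unsigned int pending:1;
};

/* A plain store to one byte-sized member. */
static void ex_bool_set_pending(struct ex_state_bool *s)
{
	s->pending = true;
}

/* Typically compiled as load, OR, store on the containing unit. */
static void ex_bits_set_pending(struct ex_state_bits *s)
{
	s->pending = 1;
}
```

The bitfield form saves memory when many flags are added later, at the cost of the read-modify-write mentioned above.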


[PATCH net] xen-netback: make sure that hashes are not send to unaware frontends

2016-10-06 Thread Paul Durrant
In the case when a frontend only negotiates a single queue with xen-
netback it is possible for a skbuff with a s/w hash to result in a
hash extra_info segment being sent to the frontend even when no hash
algorithm has been configured. (The ndo_select_queue() entry point makes
sure the hash is not set if no algorithm is configured, but this entry
point is not called when there is only a single queue). This can result
in a frontend that is unable to handle extra_info segments being given
such a segment, causing it to crash.

This patch fixes the problem by gating whether the extra_info is sent
not only on the presence of a s/w hash, but also on whether the hash
algorithm has been configured.

Signed-off-by: Paul Durrant 
Cc: Wei Liu 
---
 drivers/net/xen-netback/interface.c | 13 ++---
 drivers/net/xen-netback/netback.c   | 23 ++-
 2 files changed, 16 insertions(+), 20 deletions(-)

diff --git a/drivers/net/xen-netback/interface.c 
b/drivers/net/xen-netback/interface.c
index fb50c6d..1034139 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -149,17 +149,8 @@ static u16 xenvif_select_queue(struct net_device *dev, 
struct sk_buff *skb,
struct xenvif *vif = netdev_priv(dev);
unsigned int size = vif->hash.size;
 
-   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE) {
-   u16 index = fallback(dev, skb) % dev->real_num_tx_queues;
-
-   /* Make sure there is no hash information in the socket
-* buffer otherwise it would be incorrectly forwarded
-* to the frontend.
-*/
-   skb_clear_hash(skb);
-
-   return index;
-   }
+   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE)
+   return fallback(dev, skb) % dev->real_num_tx_queues;
 
xenvif_set_skb_hash(vif, skb);
 
diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index 3d0c989..2cd4a8e 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -168,6 +168,10 @@ static bool xenvif_rx_ring_slots_available(struct 
xenvif_queue *queue)
needed = DIV_ROUND_UP(skb->len, XEN_PAGE_SIZE);
if (skb_is_gso(skb))
needed++;
+   /* Assume the frontend is capable of handling the hash
+* extra_info at this point. This will only ever lead to an
+* accurate value or over-estimation.
+*/
if (skb->sw_hash)
needed++;
 
@@ -378,9 +382,8 @@ static void xenvif_gop_frag_copy(struct xenvif_queue 
*queue, struct sk_buff *skb
.npo = npo,
.head = *head,
.gso_type = XEN_NETIF_GSO_TYPE_NONE,
-   /* xenvif_set_skb_hash() will have either set a s/w
-* hash or cleared the hash depending on
-* whether the the frontend wants a hash for this skb.
+   /* xenvif_rx_action() will have cleared any hash if
+* the frontend is not capable of handling it.
 */
.hash_present = skb->sw_hash,
};
@@ -593,6 +596,14 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
   && (skb = xenvif_rx_dequeue(queue)) != NULL) {
queue->last_rx_time = jiffies;
 
+   /* If there is no hash algorithm configured make sure
+* there is no hash information in the socket buffer
+* otherwise it would be incorrectly forwarded to the
+* frontend.
+*/
+   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE)
+   skb_clear_hash(skb);
+
XENVIF_RX_CB(skb)->meta_slots_used = xenvif_gop_skb(skb, , 
queue);
 
__skb_queue_tail(, skb);
@@ -667,12 +678,6 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
}
 
if (skb->sw_hash) {
-   /* Since the skb got here via xenvif_select_queue()
-* we know that the hash has been re-calculated
-* according to a configuration set by the frontend
-* and therefore we know that it is legitimate to
-* pass it to the frontend.
-*/
if (resp->flags & XEN_NETRXF_extra_info)
extra->flags |= XEN_NETIF_EXTRA_FLAG_MORE;
else
-- 
2.1.4
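
The gating described in the commit message can be modelled outside the kernel. This is only a sketch under stated assumptions: the ex_ types below are hypothetical stand-ins for struct xenvif and struct sk_buff, reduced to the fields that matter here, and the enum values are illustrative rather than the real XEN_NETIF_CTRL_HASH_ALGORITHM_* constants.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the netback structures. */
enum ex_hash_alg {
	EX_HASH_ALG_NONE = 0,
	EX_HASH_ALG_TOEPLITZ = 1,
};

struct ex_skb {
	bool sw_hash;
	unsigned int hash;
};

struct ex_vif {
	enum ex_hash_alg alg;
};

/* Runs on the receive path regardless of how many queues exist:
 * drop any hash when the frontend never configured an algorithm. */
static void ex_prepare_rx(const struct ex_vif *vif, struct ex_skb *skb)
{
	if (vif->alg == EX_HASH_ALG_NONE) {
		skb->sw_hash = false;
		skb->hash = 0;
	}
}

/* A hash extra_info segment is only built if a s/w hash survived. */
static bool ex_needs_hash_extra(const struct ex_skb *skb)
{
	return skb->sw_hash;
}
```

Because the clearing step no longer lives in the queue-selection hook, the single-queue case is covered as well.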



Re: [PATCH] ethernet: qualcomm: QCOM_EMAC should depend on HAS_DMA

2016-10-06 Thread Geert Uytterhoeven
On Thu, Oct 6, 2016 at 4:12 PM, Timur Tabi  wrote:
> Geert Uytterhoeven wrote:
>>
>> Probably, I don't do UML allmodconfig builds.
>>
>> Gr{oetje,eeting}s,
>
>
> Would you mind submitting another version of your patch that includes
> HAS_DMA and HAS_IOMEM, so that both build breaks can be fixed in one shot?

Done.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


[PATCH v2] ethernet: qualcomm: QCOM_EMAC should depend on HAS_DMA and HAS_IOMEM

2016-10-06 Thread Geert Uytterhoeven
If NO_DMA=y:

drivers/built-in.o: In function `emac_probe':
emac.c:(.text+0x3780b8): undefined reference to `bad_dma_ops'
emac.c:(.text+0x3780e2): undefined reference to `bad_dma_ops'
emac.c:(.text+0x378112): undefined reference to `bad_dma_ops'
emac.c:(.text+0x378146): undefined reference to `bad_dma_ops'
emac.c:(.text+0x37816e): undefined reference to `bad_dma_ops'
drivers/built-in.o:emac.c:(.text+0x37819a): more undefined references to 
`bad_dma_ops' follow

If NO_IOMEM=y:

drivers/net/ethernet/qualcomm/emac/emac.c: In function ‘emac_remove’:
drivers/net/ethernet/qualcomm/emac/emac.c:736:3: error: implicit 
declaration of function ‘iounmap’ [-Werror=implicit-function-declaration]
   iounmap(adpt->phy.digital);
   ^

Add dependencies on HAS_DMA and HAS_IOMEM to fix this.

Signed-off-by: Geert Uytterhoeven 
---
v2:
  - Add dependency on HAS_IOMEM for UML.
---
 drivers/net/ethernet/qualcomm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/qualcomm/Kconfig 
b/drivers/net/ethernet/qualcomm/Kconfig
index 9ba568db576fb0e6..d7720bf92d49658a 100644
--- a/drivers/net/ethernet/qualcomm/Kconfig
+++ b/drivers/net/ethernet/qualcomm/Kconfig
@@ -26,6 +26,7 @@ config QCA7000
 
 config QCOM_EMAC
tristate "Qualcomm Technologies, Inc. EMAC Gigabit Ethernet support"
+   depends on HAS_DMA && HAS_IOMEM
select CRC32
select PHYLIB
---help---
-- 
1.9.1



Re: [PATCH v2] ethernet: qualcomm: QCOM_EMAC should depend on HAS_DMA and HAS_IOMEM

2016-10-06 Thread Timur Tabi

Geert Uytterhoeven wrote:
> Add dependencies on HAS_DMA and HAS_IOMEM to fix this.
>
> Signed-off-by: Geert Uytterhoeven


Acked-by: Timur Tabi 

--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.


Re: [PATCH] ethernet: qualcomm: QCOM_EMAC should depend on HAS_DMA

2016-10-06 Thread Timur Tabi

Geert Uytterhoeven wrote:
>  config QCOM_EMAC
>         tristate "Qualcomm Technologies, Inc. EMAC Gigabit Ethernet support"
> +       depends on HAS_DMA


I think it needs to depend on HAS_IOMEM as well, to fix this error in 
arch/um:


   drivers/net/ethernet/qualcomm/emac/emac.c: In function 'emac_remove':
>> drivers/net/ethernet/qualcomm/emac/emac.c:727:3: error: implicit 
declaration of function 'iounmap' [-Werror=implicit-function-declaration]

  iounmap(adpt->phy.digital);



Re: [PATCH] ethernet: qualcomm: QCOM_EMAC should depend on HAS_DMA

2016-10-06 Thread Timur Tabi

Geert Uytterhoeven wrote:
> Probably, I don't do UML allmodconfig builds.
>
> Gr{oetje,eeting}s,


Would you mind submitting another version of your patch that includes 
HAS_DMA and HAS_IOMEM, so that both build breaks can be fixed in one shot?




Re: [PATCH] ethernet: qualcomm: QCOM_EMAC should depend on HAS_DMA

2016-10-06 Thread Geert Uytterhoeven
On Thu, Oct 6, 2016 at 4:06 PM, Timur Tabi  wrote:
> Geert Uytterhoeven wrote:
>>
>>   config QCOM_EMAC
>> tristate "Qualcomm Technologies, Inc. EMAC Gigabit Ethernet
>> support"
>> +   depends on HAS_DMA
>
>
> I think it needs to depend on HAS_IOMEM as well, to fix this error in
> arch/um:
>
>drivers/net/ethernet/qualcomm/emac/emac.c: In function 'emac_remove':
>>> drivers/net/ethernet/qualcomm/emac/emac.c:727:3: error: implicit
>>> declaration of function 'iounmap' [-Werror=implicit-function-declaration]
>   iounmap(adpt->phy.digital);

Probably, I don't do UML allmodconfig builds.

Gr{oetje,eeting}s,

Geert



[PATCH] ethernet: qualcomm: QCOM_EMAC should depend on HAS_DMA

2016-10-06 Thread Geert Uytterhoeven
If NO_DMA=y:

drivers/built-in.o: In function `emac_probe':
emac.c:(.text+0x3780b8): undefined reference to `bad_dma_ops'
emac.c:(.text+0x3780e2): undefined reference to `bad_dma_ops'
emac.c:(.text+0x378112): undefined reference to `bad_dma_ops'
emac.c:(.text+0x378146): undefined reference to `bad_dma_ops'
emac.c:(.text+0x37816e): undefined reference to `bad_dma_ops'
drivers/built-in.o:emac.c:(.text+0x37819a): more undefined references to 
`bad_dma_ops' follow

Add a dependency on HAS_DMA to fix this.

Signed-off-by: Geert Uytterhoeven 
---
 drivers/net/ethernet/qualcomm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/qualcomm/Kconfig 
b/drivers/net/ethernet/qualcomm/Kconfig
index 9ba568db576fb0e6..fe94d2baeaf26aa6 100644
--- a/drivers/net/ethernet/qualcomm/Kconfig
+++ b/drivers/net/ethernet/qualcomm/Kconfig
@@ -26,6 +26,7 @@ config QCA7000
 
 config QCOM_EMAC
tristate "Qualcomm Technologies, Inc. EMAC Gigabit Ethernet support"
+   depends on HAS_DMA
select CRC32
select PHYLIB
---help---
-- 
1.9.1



Re: 4.9-rc0: nf_hooks_ingress missing, breaking compilation

2016-10-06 Thread Aaron Conole
Pavel Machek  writes:

> Hi!

Hi Pavel,

> In kernel based on edadd0e, I get plenty of errors such as:

In this case, I screwed up - sincere apologies.

Enabling CONFIG_NETFILTER_INGRESS will work around this error for the
time being, while the fix makes its way through the various trees.

> net/netfilter/core.c:96:3: note: in expansion of macro ‘rcu_assign_pointer’
>rcu_assign_pointer(reg->dev->nf_hooks_ingress, entry);
>^
> In file included from ./include/linux/linkage.h:4:0,
>  from ./include/linux/kernel.h:6,
>  from net/netfilter/core.c:10:
> net/netfilter/core.c:96:30: error: ‘struct net_device’ has no member named 
> ‘nf_hooks_ingress’
>rcu_assign_pointer(reg->dev->nf_hooks_ingress, entry);
>   ^
>
> Config is attached.
>
> [Ok, I guess testing -rc0 is "a bit too brave" :-)]
>
> Best regards,
>   Pavel


[PATCH] strparser: Propagate correct error code in strp_recv()

2016-10-06 Thread Geert Uytterhoeven
With m68k-linux-gnu-gcc-4.1:

net/strparser/strparser.c: In function ‘strp_recv’:
net/strparser/strparser.c:98: warning: ‘err’ may be used uninitialized in 
this function

Pass "len" (which is an error code when negative) instead of the
uninitialized "err" variable to fix this.

Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages")
Signed-off-by: Geert Uytterhoeven 
---
Compile-tested only.
---
 net/strparser/strparser.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/strparser/strparser.c b/net/strparser/strparser.c
index 5c7549b5b92cd23c..41adf362936d7dc4 100644
--- a/net/strparser/strparser.c
+++ b/net/strparser/strparser.c
@@ -246,7 +246,7 @@ static int strp_recv(read_descriptor_t *desc, struct 
sk_buff *orig_skb,
} else {
strp->rx_interrupted = 1;
}
-   strp_parser_err(strp, err, desc);
+   strp_parser_err(strp, len, desc);
break;
} else if (len > strp->sk->sk_rcvbuf) {
/* Message length exceeds maximum allowed */
-- 
1.9.1
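
A small standalone sketch of the pattern the fix relies on: when a callback returns a length that doubles as a negative errno, the error path can forward that value directly and needs no separate error variable. All names here are hypothetical; this models the idea only and is not the strparser code.

```c
#include <errno.h>

/* Records the last error reported, standing in for an error callback. */
static int ex_last_error;

static void ex_report_error(int err)
{
	ex_last_error = err;
}

/* len > 0: message length, len == 0: need more data,
 * len < 0: negative errno from the parser. */
static int ex_handle_parse_result(int len)
{
	if (len < 0) {
		/* len already carries the error code, so pass it on
		 * instead of a variable that was never assigned. */
		ex_report_error(len);
		return len;
	}
	return 0;
}
```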



[PATCH] netfilter: xt_hashlimit: Add missing ULL suffixes for 64-bit constants

2016-10-06 Thread Geert Uytterhoeven
On 32-bit (e.g. with m68k-linux-gnu-gcc-4.1):

net/netfilter/xt_hashlimit.c: In function ‘user2credits’:
net/netfilter/xt_hashlimit.c:476: warning: integer constant is too large 
for ‘long’ type
...
net/netfilter/xt_hashlimit.c:478: warning: integer constant is too large 
for ‘long’ type
...
net/netfilter/xt_hashlimit.c:480: warning: integer constant is too large 
for ‘long’ type
...

net/netfilter/xt_hashlimit.c: In function ‘rateinfo_recalc’:
net/netfilter/xt_hashlimit.c:513: warning: integer constant is too large 
for ‘long’ type

Fixes: 11d5f15723c9f39d ("netfilter: xt_hashlimit: Create revision 2 to support 
higher pps rates")
Signed-off-by: Geert Uytterhoeven 
---
 net/netfilter/xt_hashlimit.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 2fab0c65aa94b666..b89b688e9d01a2d1 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -431,7 +431,7 @@ static void htable_put(struct xt_hashlimit_htable *hinfo)
CREDITS_PER_JIFFY*HZ*60*60*24 < 2^32 ie.
 */
 #define MAX_CPJ_v1 (0x / (HZ*60*60*24))
-#define MAX_CPJ (0x / (HZ*60*60*24))
+#define MAX_CPJ (0xULL / (HZ*60*60*24))
 
 /* Repeated shift and or gives us all 1s, final shift and add 1 gives
  * us the power of 2 below the theoretical max, so GCC simply does a
@@ -473,7 +473,7 @@ static u64 user2credits(u64 user, int revision)
return div64_u64(user * HZ * CREDITS_PER_JIFFY_v1,
 XT_HASHLIMIT_SCALE);
} else {
-   if (user > 0x / (HZ*CREDITS_PER_JIFFY))
+   if (user > 0xULL / (HZ*CREDITS_PER_JIFFY))
return div64_u64(user, XT_HASHLIMIT_SCALE_v2)
* HZ * CREDITS_PER_JIFFY;
 
-- 
1.9.1
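
For readers unfamiliar with the warning, a minimal standalone sketch of why the suffix matters. Under C90 rules an unsuffixed constant is expected to fit in long or unsigned long, so older compilers on 32-bit targets complain about a 64-bit value; the ULL suffix states the type explicitly. The EX_ values below are illustrative stand-ins, not the kernel's HZ or CREDITS_PER_JIFFY, the all-ones constant is chosen for illustration, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Illustrative stand-ins; the kernel derives the real values elsewhere. */
#define EX_HZ                 100
#define EX_CREDITS_PER_JIFFY  32

/* The ULL suffix makes this unsigned long long on every target, so the
 * division below is always carried out in 64 bits. */
#define EX_U64_MAX  0xFFFFFFFFFFFFFFFFULL

/* Largest input whose product with EX_HZ * EX_CREDITS_PER_JIFFY still
 * fits in 64 bits. */
static uint64_t ex_max_user(void)
{
	return EX_U64_MAX / (EX_HZ * EX_CREDITS_PER_JIFFY);
}

/* Returns 1 if user * EX_HZ * EX_CREDITS_PER_JIFFY would overflow. */
static int ex_would_overflow(uint64_t user)
{
	return user > ex_max_user();
}
```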



Re: [PATCH v3 net-next 4/4] net/sched: act_mirred: Implement ingress actions

2016-10-06 Thread Shmulik Ladkani
Hi,

On Mon, Oct 3, 2016 at 12:45 PM, Cong Wang  wrote:
> On Thu, Sep 29, 2016 at 4:03 AM, Shmulik Ladkani
>  wrote:
>> skb2->skb_iif = skb->dev->ifindex;
>> skb2->dev = dev;
>> -   err = dev_queue_xmit(skb2);
>> +   if (tcf_mirred_act_direction(m_eaction) & AT_EGRESS)
>> +   err = dev_queue_xmit(skb2);
>> +   else
>> +   netif_receive_skb(skb2);
>
> Any reason why not check the return value here?

Rationale: netif_receive_skb returns err if there was no protocol
handler to deliver the skb to.
If skb is not caught by any protocol handler, this should not be
considered an "ingress redirect" error. The redirect action should be
considered successful.
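
The rationale can be illustrated with a tiny model, assuming hypothetical EX_ constants and a function that takes the two possible results as plain integers. It is a sketch of the decision only, not the act_mirred code.

```c
/* Hypothetical direction flags, loosely modelled on AT_EGRESS/AT_INGRESS. */
#define EX_AT_INGRESS  0x1
#define EX_AT_EGRESS   0x2

/* Hypothetical "packet was dropped" receive result. */
#define EX_NET_RX_DROP 1

static int ex_redirect(int direction, int xmit_result, int rx_result)
{
	int err = 0;

	if (direction & EX_AT_EGRESS)
		err = xmit_result;	/* a failed transmit is a real error */
	else
		(void)rx_result;	/* no handler took the skb: not a
					 * redirect failure, so it is ignored */

	return err;
}
```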


Re: [PATCH net-next 11/14] rxrpc: Make rxrpc_send_packet() take a connection not a transport [ver #2]

2016-10-06 Thread Geert Uytterhoeven
Hi David,

On Wed, Jun 22, 2016 at 6:04 PM, David Howells  wrote:
> Make rxrpc_send_packet() take a connection not a transport as part of the
> phasing out of the rxrpc_transport struct.
>
> Whilst we're at it, rename the function to rxrpc_send_data_packet() to
> differentiate it from the other packet sending functions.
>
> Signed-off-by: David Howells 

This is now upstream commit 985a5c824a52e9f7

> --- a/net/rxrpc/output.c
> +++ b/net/rxrpc/output.c
> @@ -338,7 +338,7 @@ EXPORT_SYMBOL(rxrpc_kernel_abort_call);
>  /*
>   * send a packet through the transport endpoint
>   */
> -int rxrpc_send_packet(struct rxrpc_transport *trans, struct sk_buff *skb)
> +int rxrpc_send_data_packet(struct rxrpc_connection *conn, struct sk_buff 
> *skb)
>  {
> struct kvec iov[1];
> struct msghdr msg;
> @@ -349,30 +349,30 @@ int rxrpc_send_packet(struct rxrpc_transport *trans, 
> struct sk_buff *skb)

net/rxrpc/output.c: In function ‘rxrpc_send_data_packet’:
net/rxrpc/output.c:252: warning: ‘ret’ may be used uninitialized in
this function
(line number is from current mainline)

> iov[0].iov_base = skb->head;
> iov[0].iov_len = skb->len;
>
> -   msg.msg_name = >peer->srx.transport.sin;
> -   msg.msg_namelen = sizeof(trans->peer->srx.transport.sin);
> +   msg.msg_name = >params.peer->srx.transport;
> +   msg.msg_namelen = conn->params.peer->srx.transport_len;
> msg.msg_control = NULL;
> msg.msg_controllen = 0;
> msg.msg_flags = 0;
>
> /* send the packet with the don't fragment bit set if we currently
>  * think it's small enough */
> -   if (skb->len - sizeof(struct rxrpc_wire_header) < 
> trans->peer->maxdata) {
> -   down_read(>local->defrag_sem);
> +   if (skb->len - sizeof(struct rxrpc_wire_header) < 
> conn->params.peer->maxdata) {
> +   down_read(>params.local->defrag_sem);

If this branch is not taken...

> /* send the packet by UDP
>  * - returns -EMSGSIZE if UDP would have to fragment the 
> packet
>  *   to go out of the interface
>  *   - in which case, we'll have processed the ICMP error
>  * message and update the peer record
>  */
> -   ret = kernel_sendmsg(trans->local->socket, , iov, 1,
> +   ret = kernel_sendmsg(conn->params.local->socket, , iov, 1,
>  iov[0].iov_len);
>
> -   up_read(>local->defrag_sem);
> +   up_read(>params.local->defrag_sem);
> if (ret == -EMSGSIZE)
> goto send_fragmentable;
>
> -   _leave(" = %d [%u]", ret, trans->peer->maxdata);
> +   _leave(" = %d [%u]", ret, conn->params.peer->maxdata);
> return ret;
> }
>
> @@ -380,21 +380,28 @@ send_fragmentable:
> /* attempt to send this message with fragmentation enabled */
> _debug("send fragment");
>
> -   down_write(>local->defrag_sem);
> -   opt = IP_PMTUDISC_DONT;
> -   ret = kernel_setsockopt(trans->local->socket, SOL_IP, IP_MTU_DISCOVER,
> -   (char *) , sizeof(opt));
> -   if (ret == 0) {
> -   ret = kernel_sendmsg(trans->local->socket, , iov, 1,
> -iov[0].iov_len);
> -
> -   opt = IP_PMTUDISC_DO;
> -   kernel_setsockopt(trans->local->socket, SOL_IP,
> - IP_MTU_DISCOVER, (char *) , 
> sizeof(opt));
> +   down_write(>params.local->defrag_sem);
> +
> +   switch (conn->params.local->srx.transport.family) {
> +   case AF_INET:
> +   opt = IP_PMTUDISC_DONT;
> +   ret = kernel_setsockopt(conn->params.local->socket,
> +   SOL_IP, IP_MTU_DISCOVER,
> +   (char *), sizeof(opt));
> +   if (ret == 0) {
> +   ret = kernel_sendmsg(conn->params.local->socket, 
> , iov, 1,
> +iov[0].iov_len);
> +
> +   opt = IP_PMTUDISC_DO;
> +   kernel_setsockopt(conn->params.local->socket, SOL_IP,
> + IP_MTU_DISCOVER,
> + (char *), sizeof(opt));
> +   }
> +   break;

... and none of the cases (current upstream also has AF_INET6 if
CONFIG_AF_RXRPC_IPV6 is enabled) match ...

> }
>
> -   up_write(>local->defrag_sem);
> -   _leave(" = %d [frag %u]", ret, trans->peer->maxdata);
> +   up_write(>params.local->defrag_sem);
> +   _leave(" = %d [frag %u]", ret, conn->params.peer->maxdata);
> return ret;

... then ret is not initialized.
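
To make the control flow concrete, here is a reduced model of the switch with a default arm added. This is one possible way to give ret a defined value on every path, shown only as a sketch; it is not claimed to be the fix applied upstream, and the ex_ names and family values are hypothetical.

```c
#include <errno.h>

/* Hypothetical address families for the model. */
enum ex_family {
	EX_AF_INET = 2,
	EX_AF_OTHER = 99,
};

static int ex_send_fragmentable(enum ex_family family)
{
	int ret;

	switch (family) {
	case EX_AF_INET:
		/* stands in for the setsockopt + sendmsg sequence */
		ret = 0;
		break;
	default:
		/* without this arm, ret is read uninitialized below */
		ret = -EAFNOSUPPORT;
		break;
	}

	return ret;
}
```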

I didn't create a patch, as I'm not sure this is a false positive or not.
Is it possible that none of 

RE: [PATCH] bluetooth.h: __ variants of u8 and friends are not neccessary inside kernel

2016-10-06 Thread David Laight
From: Joe Perches
> Sent: 06 October 2016 12:39
> On Thu, 2016-10-06 at 09:41 +, David Laight wrote:
> > From: Joe Perches
> > > No worries, and bool is the same size as u8.
> > That is not guaranteed at all.
> > One of the ARM ABI defined bool to be the size of int.
> 
> Really?  What kernel has sizeof(_Bool) != 1 ?

Probably none, but I know systems have used larger bool.
I found this:
> with egcs-2.90.29 980515 (egcs-1.0.3 release) on alphaev56-dec-osf4.0d

>  bool  = 8
>  short = 2
>  int   = 4 
>  long  = 8

I'm pretty sure something newer than an old alpha ABI used 4 byte bool.

David



Re: [PATCH 3/3] mac80211: multicast to unicast conversion

2016-10-06 Thread Johannes Berg
On Thu, 2016-10-06 at 13:53 +0200, michael-dev wrote:
> On 05.10.2016 13:58, Johannes Berg wrote:
> > 
> > 
> > Anyway, perhaps this needs to change to take DMS/per-station into
> > account?
> > 
> > Then again, this kind of setting - global multicast-to-unicast -
> > fundamentally *cannot* be done on a per-station basis, since if you
> > enable it for one station and not for another, the first station
> > that has it enabled would get the packets twice...
> 
> as I see it, that is exactly how DMS is standardized.
> 
> IEEE 802.11-2012 section 10.23.15 DMS procedures:
> 
> "If the requested DMS is accepted by the AP, the AP shall send 
> subsequent group addressed MSDUs that
> match the frame classifier specified in the DMS Descriptors to the 
> requesting STA as A-MSDU subframes
> within an individually addressed A-MSDU frame (see 8.3.2.2 and
> 9.11)."
> 
>   -> so the multicast packets shall go out as unicast A-MSDU frames
> to  stations that requested this

Correct.

> "The AP shall continue to transmit the matching frames as group 
> addressed frames (see 9.3.6, and 10.2.1.16) if at least one
> associated 
> STA has not requested DMS for these frames."
> 
>   -> so it will continue to send it as multicast frames as well.
> 
> As with DMS the station requested DMS for a specific multicast
> address, it could then drop multicast frames addressed to the
> multicast address it registered for DMS.

Yes, the DMS spec tells it to do this. However, we can't implement non-
DMS similarly, because then the station won't request it and won't drop
the duplicates.

So for this non-standard multicast-to-unicast, it's all or nothing, it
can't be done for some stations only.

johannes
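
The DMS behaviour quoted from the standard can be sketched as a small planning function: every station that requested DMS gets an individually addressed copy, and the group-addressed frame is still sent if at least one associated station did not request it. The types and names are hypothetical and only model the rule as quoted in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-station state. */
struct ex_sta {
	bool requested_dms;
};

/* What the AP would transmit for one matching group-addressed frame. */
struct ex_tx_plan {
	size_t unicast_copies;
	bool send_group_frame;
};

static struct ex_tx_plan ex_plan_dms(const struct ex_sta *stas, size_t n)
{
	struct ex_tx_plan plan = { 0, false };

	for (size_t i = 0; i < n; i++) {
		if (stas[i].requested_dms)
			plan.unicast_copies++;
		else
			plan.send_group_frame = true;
	}

	return plan;
}
```

With DMS, a station that asked for unicast copies knows to drop the group frame. A station that never asked has no reason to drop anything, which is why the non-standard conversion cannot be mixed per station.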


Re: [PATCH 3/3] mac80211: multicast to unicast conversion

2016-10-06 Thread michael-dev

On 05.10.2016 13:58, Johannes Berg wrote:
> Anyway, perhaps this needs to change to take DMS/per-station into
> account?
>
> Then again, this kind of setting - global multicast-to-unicast -
> fundamentally *cannot* be done on a per-station basis, since if you
> enable it for one station and not for another, the first station that
> has it enabled would get the packets twice...


as I see it, that is exactly how DMS is standardized.

IEEE 802.11-2012 section 10.23.15 DMS procedures:

"If the requested DMS is accepted by the AP, the AP shall send 
subsequent group addressed MSDUs that
match the frame classifier specified in the DMS Descriptors to the 
requesting STA as A-MSDU subframes

within an individually addressed A-MSDU frame (see 8.3.2.2 and 9.11)."

 -> so the multicast packets shall go out as unicast A-MSDU frames to 
stations that requested this


"The AP shall continue to transmit the matching frames as group 
addressed frames (see 9.3.6, and 10.2.1.16) if at least one associated 
STA has not requested DMS for these frames."


 -> so it will continue to send it as multicast frames as well.

As with DMS the station requested DMS for a specific multicast address, 
it could then drop multicast frames addressed to the multicast address 
it registered for DMS.


Regards,
M. Braun


[PATCH net-next v3 3/3] net: ethernet: mediatek: remove hwlro property in the device tree

2016-10-06 Thread Nelson Chang
Since the proper way to check the hw lro capability is by the chip id,
the hwlro property in the device tree should be removed.

Signed-off-by: Nelson Chang 
---
 Documentation/devicetree/bindings/net/mediatek-net.txt | 2 --
 1 file changed, 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/mediatek-net.txt 
b/Documentation/devicetree/bindings/net/mediatek-net.txt
index f095257..c010faf 100644
--- a/Documentation/devicetree/bindings/net/mediatek-net.txt
+++ b/Documentation/devicetree/bindings/net/mediatek-net.txt
@@ -24,7 +24,6 @@ Required properties:
 Optional properties:
 - interrupt-parent: Should be the phandle for the interrupt controller
   that services interrupts for this device
-- mediatek,hwlro: the capability if the hardware supports LRO functions
 
 * Ethernet MAC node
 
@@ -54,7 +53,6 @@ eth: ethernet@1b10 {
reset-names = "eth";
mediatek,ethsys = <>;
mediatek,pctl = <_pctl_a>;
-   mediatek,hwlro;
#address-cells = <1>;
#size-cells = <0>;
 
-- 
1.9.1



[PATCH net-next v3 2/3] net: ethernet: mediatek: get hw lro capability by the chip id instead of by the dtsi

2016-10-06 Thread Nelson Chang
Because hw lro is supported starting with MT7623, the proper way to check
whether the feature is available is by the chip id instead of by the dtsi.

Signed-off-by: Nelson Chang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 14 --
 drivers/net/ethernet/mediatek/mtk_eth_soc.h |  1 +
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 0c67ab1..4a62ffd 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -2348,6 +2348,16 @@ static int mtk_get_chip_id(struct mtk_eth *eth, u32 
*chip_id)
return 0;
 }
 
+static bool mtk_is_hwlro_supported(struct mtk_eth *eth)
+{
+   switch (eth->chip_id) {
+   case MT7623_ETH:
+   return true;
+   }
+
+   return false;
+}
+
 static int mtk_probe(struct platform_device *pdev)
 {
struct resource *res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
@@ -2387,8 +2397,6 @@ static int mtk_probe(struct platform_device *pdev)
return PTR_ERR(eth->pctl);
}
 
-   eth->hwlro = of_property_read_bool(pdev->dev.of_node, "mediatek,hwlro");
-
for (i = 0; i < 3; i++) {
eth->irq[i] = platform_get_irq(pdev, i);
if (eth->irq[i] < 0) {
@@ -2417,6 +2425,8 @@ static int mtk_probe(struct platform_device *pdev)
if (err)
return err;
 
+   eth->hwlro = mtk_is_hwlro_supported(eth);
+
for_each_child_of_node(pdev->dev.of_node, mac_np) {
if (!of_device_is_compatible(mac_np,
 "mediatek,eth-mac"))
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
index a5b422b..99b1c8e 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
@@ -345,6 +345,7 @@
 /* ethernet subsystem chip id register */
 #define ETHSYS_CHIPID0_3   0x0
 #define ETHSYS_CHIPID4_7   0x4
+#define MT7623_ETH 7623
 
 /* ethernet subsystem config register */
 #define ETHSYS_SYSCFG0 0x14
-- 
1.9.1



[PATCH net-next v3 0/3] net: ethernet: mediatek: check the hw lro capability by the chip id instead of the dtsi

2016-10-06 Thread Nelson Chang
This series changes the driver to determine hw lro support from the chip id.

changes since v3:
- Refine mtk_is_hwlro_supported() function

changes since v2:
- Refine mtk_get_chip_id() function

changes since v1:
- Because hw lro is supported starting with MT7623, the proper way to check
whether the feature is available is by the chip id instead of by the dtsi.

Nelson Chang (3):
  net: ethernet: mediatek: get the chip id by ETHDMASYS registers
  net: ethernet: mediatek: get hw lro capability by the chip id instead
of by the dtsi
  net: ethernet: mediatek: remove hwlro property in the device tree

 .../devicetree/bindings/net/mediatek-net.txt   |  2 -
 drivers/net/ethernet/mediatek/mtk_eth_soc.c| 43 +-
 drivers/net/ethernet/mediatek/mtk_eth_soc.h|  6 +++
 3 files changed, 47 insertions(+), 4 deletions(-)

-- 
1.9.1



[PATCH net-next v3 1/3] net: ethernet: mediatek: get the chip id by ETHDMASYS registers

2016-10-06 Thread Nelson Chang
The driver reads the chip id from the ETHSYS_CHIPID0_3/ETHSYS_CHIPID4_7
registers in mtk_probe().

Signed-off-by: Nelson Chang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 29 +
 drivers/net/ethernet/mediatek/mtk_eth_soc.h |  5 +
 2 files changed, 34 insertions(+)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index ad4ab97..0c67ab1 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -2323,6 +2323,31 @@ free_netdev:
return err;
 }
 
+static int mtk_get_chip_id(struct mtk_eth *eth, u32 *chip_id)
+{
+   u32 val[2], id[4];
+
+   regmap_read(eth->ethsys, ETHSYS_CHIPID0_3, [0]);
+   regmap_read(eth->ethsys, ETHSYS_CHIPID4_7, [1]);
+
+   id[3] = ((val[0] >> 16) & 0xff) - '0';
+   id[2] = ((val[0] >> 24) & 0xff) - '0';
+   id[1] = (val[1] & 0xff) - '0';
+   id[0] = ((val[1] >> 8) & 0xff) - '0';
+
+   *chip_id = (id[3] * 1000) + (id[2] * 100) +
+  (id[1] * 10) + id[0];
+
+   if (!(*chip_id)) {
+   dev_err(eth->dev, "failed to get chip id\n");
+   return -ENODEV;
+   }
+
+   dev_info(eth->dev, "chip id = %d\n", *chip_id);
+
+   return 0;
+}
+
 static int mtk_probe(struct platform_device *pdev)
 {
struct resource *res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
@@ -2388,6 +2413,10 @@ static int mtk_probe(struct platform_device *pdev)
if (err)
return err;
 
+   err = mtk_get_chip_id(eth, >chip_id);
+   if (err)
+   return err;
+
for_each_child_of_node(pdev->dev.of_node, mac_np) {
if (!of_device_is_compatible(mac_np,
 "mediatek,eth-mac"))
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
index 3003195..a5b422b 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
@@ -342,6 +342,10 @@
 #define GPIO_BIAS_CTRL 0xed0
 #define GPIO_DRV_SEL10 0xf00
 
+/* ethernet subsystem chip id register */
+#define ETHSYS_CHIPID0_3   0x0
+#define ETHSYS_CHIPID4_7   0x4
+
 /* ethernet subsystem config register */
 #define ETHSYS_SYSCFG0 0x14
 #define SYSCFG0_GE_MASK0x3
@@ -534,6 +538,7 @@ struct mtk_eth {
unsigned long   sysclk;
struct regmap   *ethsys;
struct regmap   *pctl;
+   u32 chip_id;
bool            hwlro;
atomic_t        dma_refcnt;
struct mtk_tx_ring  tx_ring;
-- 
1.9.1
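
The digit decoding in mtk_get_chip_id() can be tried outside the kernel. The following is only a userspace sketch: decode_chip_id() is a hypothetical helper, and it assumes the two chip-id words carry ASCII digits in the byte lanes the patch reads.

```c
#include <stdint.h>

/*
 * Userspace sketch of the digit decoding done by mtk_get_chip_id().
 * Assumption: each selected byte lane holds one ASCII digit.
 * Returns the decimal chip id; a result of 0 is the case the patch
 * reports as -ENODEV.
 */
uint32_t decode_chip_id(uint32_t chipid0_3, uint32_t chipid4_7)
{
	uint32_t id[4];

	id[3] = ((chipid0_3 >> 16) & 0xff) - '0';	/* thousands */
	id[2] = ((chipid0_3 >> 24) & 0xff) - '0';	/* hundreds */
	id[1] = (chipid4_7 & 0xff) - '0';		/* tens */
	id[0] = ((chipid4_7 >> 8) & 0xff) - '0';	/* units */

	return id[3] * 1000 + id[2] * 100 + id[1] * 10 + id[0];
}
```

With these assumptions, register words 0x36370000 and 0x00003332 decode to 7623. Note that a lane holding a non-digit byte would wrap around in unsigned arithmetic, which the patch does not check for either.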



Re: [PATCH] bluetooth.h: __ variants of u8 and friends are not neccessary inside kernel

2016-10-06 Thread Joe Perches
On Thu, 2016-10-06 at 09:41 +, David Laight wrote:
> From: Joe Perches
> > No worries, and bool is the same size as u8.
> That is not guaranteed at all.
> One of the ARM ABI defined bool to be the size of int.

Really?  What kernel has sizeof(_Bool) != 1 ?
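
For reference, the C standard leaves sizeof(_Bool) implementation-defined, so the question can only be answered per compiler and ABI. A small sketch to check a given toolchain (bool_size() and layouts_match() are hypothetical helper names):

```c
#include <stdbool.h>
#include <stddef.h>

/* Report how many bytes this compiler/ABI gives _Bool. */
size_t bool_size(void)
{
	return sizeof(bool);
}

/* Two layouts that only agree when bool is exactly one byte wide. */
struct flags_as_bool { bool a; bool b; };
struct flags_as_u8 { unsigned char a; unsigned char b; };

/* Returns 1 if swapping u8 for bool would keep the struct size unchanged. */
int layouts_match(void)
{
	return sizeof(struct flags_as_bool) == sizeof(struct flags_as_u8);
}
```

If layouts_match() returns 0 on some target, replacing u8 with bool in a shared structure changes its layout there.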



Re: [PATCH net-next 2/2] net: phy: Add PHY Auto/Mdi/Mdix set driver for Microsemi PHYs.

2016-10-06 Thread Florian Fainelli


On 10/05/2016 12:18 AM, Andrew Lunn wrote:
>>>> +	phydev->mdix = ETH_TP_MDI_AUTO;
>>>
>>> Humm, interesting. The only other driver supporting mdix is the
>>> Marvell one. It does not do this, it leaves it to its default value of
>>> ETH_TP_MDI_INVALID. It does however interpret ETH_TP_MDI_INVALID as
>>> meaning ETH_TP_MDI_AUTO.
>>>
>>> There needs to be consistency here. You either need to do the same as
>>> the Marvell driver, or you need to modify the Marvell driver to also
>>> set phydev->mdix to ETH_TP_MDI_AUTO.
>>>
>> In ethtool, two variables (eth_tp_mdix_ctrl and eth_tp_mdix) are used to
>> update the status, but the driver header has only one variable, mdix.
>> The driver header should also have another variable, such as mdix_ctrl.
>> Then ethtool can get/set the Auto MDIX/MDI setting.
>> If mdix is not configured with ETH_TP_MDI_AUTO, ethtool shows the error
>> "setting MDI not supported".

Agreed, we currently report eth_tp_mdix and eth_tp_mdix_ctrl using
phydev->mdix, but this is too limiting.

>>
>> Please suggest me if you have any better method to fix this issue.
> 
> Maybe we should add a new flag for the .flags member of the
> phy_driver. If PHY_HAS_MDIX is set, the phy core will set phydev->mdix
> to ETH_TP_MDI_AUTO?

I agree with Raju here: like most other Ethernet drivers, we should
allow PHY drivers to have an eth_tp_mdix_ctrl to indicate what the
configured MDI setting is, and read eth_tp_mdix to indicate what the
current status is; then ethtool can properly differentiate what is going on.

Raju, Andrew, does that work for you?
-- 
Florian
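
A rough sketch of what separating the configured setting from the resolved status could look like. This is an illustration only: struct phy_mdix_state and both helpers are hypothetical, and the constants merely mirror the ETH_TP_MDI_* values from the ethtool uapi header.

```c
#include <stdint.h>

/* Values mirror ETH_TP_MDI_* in include/uapi/linux/ethtool.h. */
enum {
	TP_MDI_INVALID = 0x00,	/* status unknown / control unsupported */
	TP_MDI         = 0x01,
	TP_MDI_X       = 0x02,
	TP_MDI_AUTO    = 0x03
};

/* Hypothetical per-PHY state with the two fields kept apart. */
struct phy_mdix_state {
	uint8_t mdix_ctrl;	/* what the user configured */
	uint8_t mdix;		/* what the link actually resolved to */
};

/* Store a requested setting; returns 0 on success, -1 if it is unknown. */
int phy_set_mdix_ctrl(struct phy_mdix_state *s, uint8_t ctrl)
{
	if (ctrl != TP_MDI && ctrl != TP_MDI_X && ctrl != TP_MDI_AUTO)
		return -1;
	s->mdix_ctrl = ctrl;
	return 0;
}

/* Record the resolved status; "crossover" stands in for a PHY status bit. */
void phy_update_mdix_status(struct phy_mdix_state *s, int link_up, int crossover)
{
	if (!link_up)
		s->mdix = TP_MDI_INVALID;
	else
		s->mdix = crossover ? TP_MDI_X : TP_MDI;
}
```

With two fields, a request for auto and a resolved MDI-X status can be reported at the same time, which a single mdix member cannot express.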


Re: [PATCH] devicetree: net: micrel-ksz90x1.txt: Properly explain skew settings

2016-10-06 Thread Florian Fainelli
On 10/05/2016 07:03 AM, Mike Looijmans wrote:
> The KSZ9031 skew registers contain an offset; the chip's default value
> is "neutral" which does not add any skew. Programming a 0 into a skew
> property will actually set it to the maximal negative adjustment and not
> to a neutral position as one would expect.
> 
> Explain this situation in the devicetree binding documentation and list
> the settings that the chip considers neutral.
> 
> Changing the implementation to accept negative values would have been
> a better solution, but would break existing configurations.
> 
> Signed-off-by: Mike Looijmans 

Reviewed-by: Florian Fainelli 

Thanks!
-- 
Florian
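
To illustrate the offset described in the patch, here is a sketch that converts a desired skew relative to neutral into the non-negative value a skew property would carry. The step size (60 ps) and the neutral code (7, for a 4-bit pad field) are assumptions not stated in this thread, and pad_skew_property() is a hypothetical helper.

```c
/*
 * Assumed field parameters (not taken from this thread):
 * 60 ps per step, 4-bit field, neutral at code 7.
 */
#define SKEW_STEP_PS	60
#define PAD_NEUTRAL_PS	(7 * SKEW_STEP_PS)	/* 420 */
#define PAD_MAX_PS	(15 * SKEW_STEP_PS)	/* 900 */

/*
 * Convert a signed skew relative to neutral (in ps) into the property
 * value. Returns -1 if the request does not fit the field or is not a
 * multiple of the step.
 */
int pad_skew_property(int relative_ps)
{
	int value = PAD_NEUTRAL_PS + relative_ps;

	if (value < 0 || value > PAD_MAX_PS || value % SKEW_STEP_PS)
		return -1;
	return value;
}
```

Under these assumptions a request for "no skew" maps to 420, while a property value of 0 corresponds to -420 ps, which is the surprise the documentation change warns about.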


Re: [PATCH net-next 2/2] net: phy: Add PHY Auto/Mdi/Mdix set driver for Microsemi PHYs.

2016-10-06 Thread Florian Fainelli
On 09/28/2016 01:24 PM, Andrew Lunn wrote:
>>  static int vsc85xx_wol_set(struct phy_device *phydev,
>> struct ethtool_wolinfo *wol)
>>  {
>> @@ -227,6 +281,7 @@ static int vsc85xx_default_config(struct phy_device *phydev)
>>  int rc;
>>  u16 reg_val;
>>  
>> +phydev->mdix = ETH_TP_MDI_AUTO;
> 
> Humm, interesting. The only other driver supporting mdix is the
> Marvell one. It does not do this, it leaves it to its default value of
> ETH_TP_MDI_INVALID. It does however interpret ETH_TP_MDI_INVALID as
> meaning ETH_TP_MDI_AUTO.
> 
> There needs to be consistency here. You either need to do the same as
> the Marvell driver, or you need to modify the Marvell driver to also
> set phydev->mdix to ETH_TP_MDI_AUTO.
> 
> I don't yet know which of these two is the right thing to do.
> 
> Florian?

It's really hard to tell because the other drivers I looked at do not
necessarily seem to be consistent either. Here, if the MDI status is
really auto, then this is the correct value to return; if it is unknown,
it should be ETH_TP_MDI_INVALID.

For the Marvell PHY, it sounds like we should be able to determine what
was configured and return the correct MDI status value.
-- 
Florian


Re: Kernel 4.6.7-rt13: Intel Ethernet driver igb causes huge latencies in cyclictest

2016-10-06 Thread Henri Roosen

On 10/06/2016 09:01 AM, Koehrer Mathias (ETAS/ESW5) wrote:

Hi all,


Hi Mathias,



Although, to be clear, it isn't the fact that there exist 8 threads, it's that
the device is firing all 8 interrupts at the same time.  The time spent in
hardirq context just waking up all 8 of those threads (and the cyclictest
wakeup) is enough to cause your regression.

netdev/igb folks-

Under what conditions should it be expected that the i350 trigger all of the
TxRx interrupts simultaneously?  Any ideas here?

See the start of this thread here:

  http://lkml.kernel.org/r/d648628329bc446fa63b5e19d4d3fb56@FE-MBX1012.de.bosch.com


Greg recommended using "ethtool -L eth2 combined 1" to reduce the number of
queues. I tried that. Now I actually have only three irqs (eth2, eth2-rx-0,
eth2-tx-0). However, the issue remains the same.

I ran the cyclictest again:
# cyclictest -a -i 105 -m -n -p 80 -t 1  -b 23 -C
(Note: When using 105us instead of 100us the long latencies seem to occur more often).

Here are the final lines of the kernel trace output:
  -0   4d...2.. 1344661649us : sched_switch: prev_comm=swapper/4 
prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcuc/4 next_pid=56 next_prio=98
ktimerso-46  3d...2.. 1344661650us : sched_switch: prev_comm=ktimersoftd/3 
prev_pid=46 prev_prio=98 prev_state=S ==> next_comm=swapper/3 next_pid=0 
next_prio=120
ktimerso-24  1d...2.. 1344661650us : sched_switch: prev_comm=ktimersoftd/1 
prev_pid=24 prev_prio=98 prev_state=S ==> next_comm=swapper/1 next_pid=0 
next_prio=120
ktimerso-79  6d...2.. 1344661650us : sched_switch: prev_comm=ktimersoftd/6 
prev_pid=79 prev_prio=98 prev_state=S ==> next_comm=swapper/6 next_pid=0 
next_prio=120
ktimerso-35  2d...2.. 1344661650us : sched_switch: prev_comm=ktimersoftd/2 
prev_pid=35 prev_prio=98 prev_state=S ==> next_comm=swapper/2 next_pid=0 
next_prio=120
  rcuc/5-67  5d...2.. 1344661650us : sched_switch: prev_comm=rcuc/5 
prev_pid=67 prev_prio=98 prev_state=S ==> next_comm=ktimersoftd/5 next_pid=68 
next_prio=98
  rcuc/7-89  7d...2.. 1344661650us : sched_switch: prev_comm=rcuc/7 
prev_pid=89 prev_prio=98 prev_state=S ==> next_comm=ktimersoftd/7 next_pid=90 
next_prio=98
ktimerso-4   0d...211 1344661650us : sched_wakeup: comm=rcu_preempt pid=8 
prio=98 target_cpu=000
  rcuc/4-56  4d...2.. 1344661651us : sched_switch: prev_comm=rcuc/4 
prev_pid=56 prev_prio=98 prev_state=S ==> next_comm=ktimersoftd/4 next_pid=57 
next_prio=98
ktimerso-4   0d...2.. 1344661651us : sched_switch: prev_comm=ktimersoftd/0 
prev_pid=4 prev_prio=98 prev_state=S ==> next_comm=rcu_preempt next_pid=8 
next_prio=98
ktimerso-90  7d...2.. 1344661651us : sched_switch: prev_comm=ktimersoftd/7 
prev_pid=90 prev_prio=98 prev_state=S ==> next_comm=swapper/7 next_pid=0 
next_prio=120
ktimerso-68  5d...2.. 1344661651us : sched_switch: prev_comm=ktimersoftd/5 
prev_pid=68 prev_prio=98 prev_state=S ==> next_comm=swapper/5 next_pid=0 
next_prio=120
rcu_pree-8   0d...3.. 1344661652us : sched_wakeup: comm=rcuop/0 pid=10 
prio=120 target_cpu=000
ktimerso-57  4d...2.. 1344661652us : sched_switch: prev_comm=ktimersoftd/4 
prev_pid=57 prev_prio=98 prev_state=S ==> next_comm=swapper/4 next_pid=0 
next_prio=120
rcu_pree-8   0d...2.. 1344661653us+: sched_switch: prev_comm=rcu_preempt 
prev_pid=8 prev_prio=98 prev_state=S ==> next_comm=kworker/0:0 next_pid=5 
next_prio=120
kworker/-5   0dN.h2.. 1344661741us : sched_wakeup: comm=cyclictest pid=6314 
prio=19 target_cpu=000
kworker/-5   0d...2.. 1344661742us : sched_switch: prev_comm=kworker/0:0 
prev_pid=5 prev_prio=120 prev_state=R+ ==> next_comm=cyclictest next_pid=6314 
next_prio=19
cyclicte-63140d...2.. 1344661743us : sched_switch: prev_comm=cyclictest 
prev_pid=6314 prev_prio=19 prev_state=S ==> next_comm=rcuop/0 next_pid=10 
next_prio=120
 rcuop/0-10  0d...2.. 1344661744us!: sched_switch: prev_comm=rcuop/0 
prev_pid=10 prev_prio=120 prev_state=S ==> next_comm=kworker/0:0 next_pid=5 
next_prio=120
kworker/-5   0dN.h2.. 1344661858us : sched_wakeup: comm=cyclictest pid=6314 
prio=19 target_cpu=000
kworker/-5   0d...2.. 1344661859us : sched_switch: prev_comm=kworker/0:0 
prev_pid=5 prev_prio=120 prev_state=R+ ==> next_comm=cyclictest next_pid=6314 
next_prio=19
cyclicte-63140d...2.. 1344661860us!: sched_switch: prev_comm=cyclictest 
prev_pid=6314 prev_prio=19 prev_state=S ==> next_comm=kworker/0:0 next_pid=5 
next_prio=120
kworker/-5   0dN.h2.. 1344661966us : sched_wakeup: comm=cyclictest pid=6314 
prio=19 target_cpu=000
kworker/-5   0d...2.. 1344661966us : sched_switch: prev_comm=kworker/0:0 
prev_pid=5 prev_prio=120 prev_state=R+ ==> next_comm=cyclictest next_pid=6314 
next_prio=19
cyclicte-63140d...2.. 1344661967us+: sched_switch: prev_comm=cyclictest 
prev_pid=6314 prev_prio=19 prev_state=S ==> next_comm=kworker/0:0 next_pid=5 
next_prio=120
kworker/-5   0dN.h2.. 1344662052us : sched_wakeup: comm=cyclictest pid=6314 
prio=19 

[PATCH net 11/13] afs: Check for fatal error when in waiting for ack state

2016-10-06 Thread David Howells
When it's in the waiting-for-ACK state, the AFS filesystem needs to check
the result of rxrpc_kernel_recv_data() any time it is notified to see if it
is indicating a fatal error.  If this is the case, it needs to mark the
call completed otherwise the call just sits there and never goes away.

Signed-off-by: David Howells 
---

 fs/afs/rxrpc.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 59bdaa7527b6..477928b25940 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -418,7 +418,7 @@ static void afs_deliver_to_call(struct afs_call *call)
 &call->abort_code);
if (ret == -EINPROGRESS || ret == -EAGAIN)
return;
-   if (ret == 1) {
+   if (ret == 1 || ret < 0) {
call->state = AFS_CALL_COMPLETE;
goto done;
}
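
The effect of the changed condition can be sketched as a small classifier. This is an illustration rather than the AFS code: classify_recv_result() and the enum are hypothetical, and treating 0 as "keep going" is an assumption, since the hunk only shows the wait and complete branches.

```c
#include <errno.h>

enum deliver_action {
	DELIVER_WAIT,		/* nothing to do yet, wait for the next notification */
	DELIVER_COMPLETE,	/* mark the call completed */
	DELIVER_CONTINUE	/* assumed: any other result keeps the call going */
};

/* Classify a receive result the way the patched check does. */
enum deliver_action classify_recv_result(int ret)
{
	if (ret == -EINPROGRESS || ret == -EAGAIN)
		return DELIVER_WAIT;
	if (ret == 1 || ret < 0)
		return DELIVER_COMPLETE;	/* success or fatal error */
	return DELIVER_CONTINUE;
}
```

Before the change only ret == 1 reached the completion branch, so a fatal negative result left the call sitting in the waiting-for-ACK state.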



[PATCH net 10/13] rxrpc: Return negative error code to kernel service

2016-10-06 Thread David Howells
In rxrpc_kernel_recv_data(), when we return the error number incurred by a
failed call, we must negate it before returning it as it's stored as
positive (that's what we have to pass back to userspace).

Signed-off-by: David Howells 
---

 net/rxrpc/recvmsg.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/rxrpc/recvmsg.c b/net/rxrpc/recvmsg.c
index 3fa7771c2a9d..db5b02a47518 100644
--- a/net/rxrpc/recvmsg.c
+++ b/net/rxrpc/recvmsg.c
@@ -652,7 +652,7 @@ excess_data:
goto out;
 call_complete:
*_abort = call->abort_code;
-   ret = call->error;
+   ret = -call->error;
if (call->completion == RXRPC_CALL_SUCCEEDED) {
ret = 1;
if (size > 0)
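
The sign convention can be shown in isolation. A minimal sketch, assuming a simplified stand-in structure (struct call_result and recv_data_result() are hypothetical names): the error is stored as a positive errno and negated only on the path that returns to an in-kernel caller.

```c
#include <errno.h>

/* Simplified stand-in for the fields the hunk touches. */
struct call_result {
	int succeeded;	/* non-zero if the call completed successfully */
	int error;	/* stored positive, as handed back to userspace */
};

/* Return 1 on success, otherwise the stored error as a negative errno. */
int recv_data_result(const struct call_result *c)
{
	if (c->succeeded)
		return 1;
	return -c->error;
}
```

A caller that tests for ret < 0 then sees the failure, which a positive return would have hidden.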



[PATCH net 13/13] rxrpc: Don't request an ACK on the last DATA packet of a call's Tx phase

2016-10-06 Thread David Howells
Don't request an ACK on the last DATA packet of a call's Tx phase as for a
client there will be a reply packet or some sort of ACK to shift phase.  If
the ACK is requested, OpenAFS sends a REQUESTED-ACK ACK with soft-ACKs in
it and doesn't follow up with a hard-ACK.

If we don't set the flag, OpenAFS will send a DELAY ACK that hard-ACKs the
reply data, thereby allowing the call to terminate cleanly.

Signed-off-by: David Howells 
---

 net/rxrpc/output.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c
index a12cea0cbc05..5dab1ff3a6c2 100644
--- a/net/rxrpc/output.c
+++ b/net/rxrpc/output.c
@@ -307,11 +307,12 @@ int rxrpc_send_data_packet(struct rxrpc_call *call, 
struct sk_buff *skb,
/* If our RTT cache needs working on, request an ACK.  Also request
 * ACKs if a DATA packet appears to have been lost.
 */
-   if (retrans ||
-   call->cong_mode == RXRPC_CALL_SLOW_START ||
-   (call->peer->rtt_usage < 3 && sp->hdr.seq & 1) ||
-   ktime_before(ktime_add_ms(call->peer->rtt_last_req, 1000),
-ktime_get_real()))
+   if (!(sp->hdr.flags & RXRPC_LAST_PACKET) &&
+   (retrans ||
+call->cong_mode == RXRPC_CALL_SLOW_START ||
+(call->peer->rtt_usage < 3 && sp->hdr.seq & 1) ||
+ktime_before(ktime_add_ms(call->peer->rtt_last_req, 1000),
+ ktime_get_real())))
whdr.flags |= RXRPC_REQUEST_ACK;
 
if (IS_ENABLED(CONFIG_AF_RXRPC_INJECT_LOSS)) {
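
The resulting decision can be written as a standalone predicate. This is a sketch only: struct tx_ctx and should_request_ack() are hypothetical, with each field standing in for the rxrpc expression named in its comment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative inputs; each field stands in for an expression in the hunk. */
struct tx_ctx {
	bool last_packet;	/* sp->hdr.flags & RXRPC_LAST_PACKET */
	bool retrans;
	bool slow_start;	/* cong_mode == RXRPC_CALL_SLOW_START */
	unsigned int rtt_usage;	/* number of RTT samples held */
	uint32_t seq;
	bool rtt_probe_due;	/* over 1000 ms since the last RTT request */
};

/* Decide whether RXRPC_REQUEST_ACK would be set on this DATA packet. */
bool should_request_ack(const struct tx_ctx *c)
{
	if (c->last_packet)
		return false;	/* the new rule: never on the last packet */

	return c->retrans || c->slow_start ||
	       (c->rtt_usage < 3 && (c->seq & 1)) || c->rtt_probe_due;
}
```

The last-packet test takes precedence over every other reason, including retransmission.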



[PATCH net 04/13] rxrpc: Only ping for lost reply in client call

2016-10-06 Thread David Howells
When a reply is deemed lost, we send a ping to find out whether the other end
received all the request data packets we sent.  This should be limited to
client calls and we shouldn't do this on service calls.

Signed-off-by: David Howells 
---

 net/rxrpc/input.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
index 3ad9f75031e3..103d2b0d4690 100644
--- a/net/rxrpc/input.c
+++ b/net/rxrpc/input.c
@@ -847,7 +847,8 @@ static void rxrpc_input_ack(struct rxrpc_call *call, struct 
sk_buff *skb,
 
if (call->rxtx_annotations[call->tx_top & RXRPC_RXTX_BUFF_MASK] &
RXRPC_TX_ANNO_LAST &&
-   summary.nr_acks == call->tx_top - hard_ack)
+   summary.nr_acks == call->tx_top - hard_ack &&
+   rxrpc_is_client_call(call))
rxrpc_propose_ACK(call, RXRPC_ACK_PING, skew, sp->hdr.serial,
  false, true,
  rxrpc_propose_ack_ping_for_lost_reply);



[PATCH net 09/13] rxrpc: Add missing notification

2016-10-06 Thread David Howells
The call's background processor work item needs to notify the socket when
it completes a call so that recvmsg() or the AFS fs can deal with it.
Without this, call expiry isn't handled.

Signed-off-by: David Howells 
---

 net/rxrpc/call_event.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/net/rxrpc/call_event.c b/net/rxrpc/call_event.c
index e2a987fd31ce..0f91d329e910 100644
--- a/net/rxrpc/call_event.c
+++ b/net/rxrpc/call_event.c
@@ -372,6 +372,7 @@ recheck_state:
 
if (call->state == RXRPC_CALL_COMPLETE) {
del_timer_sync(&call->timer);
+   rxrpc_notify_socket(call);
goto out_put;
}
 



[PATCH net 12/13] rxrpc: Need to produce an ACK for service op if op takes a long time

2016-10-06 Thread David Howells
We need to generate a DELAY ACK from the service end of an operation if we
start doing the actual operation work and it takes longer than expected.
This will hard-ACK the request data and allow the client to release its
resources.

To make this work:

 (1) We have to set the ack timer and propose an ACK when the call moves to
 the RXRPC_CALL_SERVER_ACK_REQUEST and clear the pending ACK and cancel
 the timer when we start transmitting the reply (the first DATA packet
 of the reply implicitly ACKs the request phase).

 (2) It must be possible to set the timer when the caller is holding
 call->state_lock, so split the lock-getting part of the timer function
 out.

 (3) Add trace notes for the ACK we're requesting and the timer we clear.

Signed-off-by: David Howells 
---

 net/rxrpc/ar-internal.h |3 +++
 net/rxrpc/call_event.c  |   16 
 net/rxrpc/misc.c|2 ++
 net/rxrpc/recvmsg.c |8 ++--
 net/rxrpc/sendmsg.c |5 +
 5 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index b56676be07c7..f60e35576526 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -733,6 +733,7 @@ extern const char 
rxrpc_rtt_rx_traces[rxrpc_rtt_rx__nr_trace][5];
 enum rxrpc_timer_trace {
rxrpc_timer_begin,
rxrpc_timer_init_for_reply,
+   rxrpc_timer_init_for_send_reply,
rxrpc_timer_expired,
rxrpc_timer_set_for_ack,
rxrpc_timer_set_for_ping,
@@ -749,6 +750,7 @@ enum rxrpc_propose_ack_trace {
rxrpc_propose_ack_ping_for_lost_ack,
rxrpc_propose_ack_ping_for_lost_reply,
rxrpc_propose_ack_ping_for_params,
+   rxrpc_propose_ack_processing_op,
rxrpc_propose_ack_respond_to_ack,
rxrpc_propose_ack_respond_to_ping,
rxrpc_propose_ack_retry_tx,
@@ -811,6 +813,7 @@ int rxrpc_reject_call(struct rxrpc_sock *);
 /*
  * call_event.c
  */
+void __rxrpc_set_timer(struct rxrpc_call *, enum rxrpc_timer_trace, ktime_t);
 void rxrpc_set_timer(struct rxrpc_call *, enum rxrpc_timer_trace, ktime_t);
 void rxrpc_propose_ACK(struct rxrpc_call *, u8, u16, u32, bool, bool,
   enum rxrpc_propose_ack_trace);
diff --git a/net/rxrpc/call_event.c b/net/rxrpc/call_event.c
index 0f91d329e910..97a17ada4431 100644
--- a/net/rxrpc/call_event.c
+++ b/net/rxrpc/call_event.c
@@ -24,15 +24,13 @@
 /*
  * Set the timer
  */
-void rxrpc_set_timer(struct rxrpc_call *call, enum rxrpc_timer_trace why,
-ktime_t now)
+void __rxrpc_set_timer(struct rxrpc_call *call, enum rxrpc_timer_trace why,
+  ktime_t now)
 {
unsigned long t_j, now_j = jiffies;
ktime_t t;
bool queue = false;
 
-   read_lock_bh(&call->state_lock);
-
if (call->state < RXRPC_CALL_COMPLETE) {
t = call->expire_at;
if (!ktime_after(t, now)) {
@@ -84,6 +82,16 @@ void rxrpc_set_timer(struct rxrpc_call *call, enum 
rxrpc_timer_trace why,
 out:
if (queue)
rxrpc_queue_call(call);
+}
+
+/*
+ * Set the timer
+ */
+void rxrpc_set_timer(struct rxrpc_call *call, enum rxrpc_timer_trace why,
+ktime_t now)
+{
+   read_lock_bh(&call->state_lock);
+   __rxrpc_set_timer(call, why, now);
read_unlock_bh(&call->state_lock);
 }
 
diff --git a/net/rxrpc/misc.c b/net/rxrpc/misc.c
index 1cdcba52f83b..6dee55fad2d3 100644
--- a/net/rxrpc/misc.c
+++ b/net/rxrpc/misc.c
@@ -195,6 +195,7 @@ const char rxrpc_timer_traces[rxrpc_timer__nr_trace][8] = {
[rxrpc_timer_begin] = "Begin ",
[rxrpc_timer_expired]   = "*EXPR*",
[rxrpc_timer_init_for_reply]= "IniRpl",
+   [rxrpc_timer_init_for_send_reply]   = "SndRpl",
[rxrpc_timer_set_for_ack]   = "SetAck",
[rxrpc_timer_set_for_ping]  = "SetPng",
[rxrpc_timer_set_for_send]  = "SetTx ",
@@ -207,6 +208,7 @@ const char 
rxrpc_propose_ack_traces[rxrpc_propose_ack__nr_trace][8] = {
[rxrpc_propose_ack_ping_for_lost_ack]   = "LostAck",
[rxrpc_propose_ack_ping_for_lost_reply] = "LostRpl",
[rxrpc_propose_ack_ping_for_params] = "Params ",
+   [rxrpc_propose_ack_processing_op]   = "ProcOp ",
[rxrpc_propose_ack_respond_to_ack]  = "Rsp2Ack",
[rxrpc_propose_ack_respond_to_ping] = "Rsp2Png",
[rxrpc_propose_ack_retry_tx]= "RetryTx",
diff --git a/net/rxrpc/recvmsg.c b/net/rxrpc/recvmsg.c
index db5b02a47518..c29362d50a92 100644
--- a/net/rxrpc/recvmsg.c
+++ b/net/rxrpc/recvmsg.c
@@ -151,17 +151,21 @@ static void rxrpc_end_rx_phase(struct rxrpc_call *call, 
rxrpc_serial_t serial)
switch (call->state) {
case RXRPC_CALL_CLIENT_RECV_REPLY:
__rxrpc_call_completed(call);
+   write_unlock_bh(&call->state_lock);
break;
 
case 
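
Point (2) of the description is the usual locked/unlocked function split. A userspace sketch of the pattern, using a toy lock so the rules are checkable (all names here are hypothetical, and the real code uses read_lock_bh() on call->state_lock):

```c
#include <assert.h>

/* Toy lock that only records whether it is held. */
struct demo_lock { int held; };

struct demo_call {
	struct demo_lock state_lock;
	int timer_armed;
};

void demo_lock(struct demo_lock *l)   { assert(!l->held); l->held = 1; }
void demo_unlock(struct demo_lock *l) { assert(l->held);  l->held = 0; }

/* Inner variant: the caller must already hold call->state_lock. */
void set_timer_locked(struct demo_call *call)
{
	assert(call->state_lock.held);
	call->timer_armed = 1;
}

/* Outer variant: takes the lock itself, for callers that do not hold it. */
void set_timer(struct demo_call *call)
{
	demo_lock(&call->state_lock);
	set_timer_locked(call);
	demo_unlock(&call->state_lock);
}
```

A caller that already holds the lock uses the inner variant; calling the outer one from there would try to take the lock twice.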

[PATCH net 08/13] rxrpc: Queue the call on expiry

2016-10-06 Thread David Howells
When a call expires, it must be queued for the background processor to deal
with otherwise a service call that is improperly terminated will just sit
there awaiting an ACK and won't expire.

Signed-off-by: David Howells 
---

 net/rxrpc/call_event.c |   10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/rxrpc/call_event.c b/net/rxrpc/call_event.c
index eeea9602cb89..e2a987fd31ce 100644
--- a/net/rxrpc/call_event.c
+++ b/net/rxrpc/call_event.c
@@ -35,8 +35,11 @@ void rxrpc_set_timer(struct rxrpc_call *call, enum 
rxrpc_timer_trace why,
 
if (call->state < RXRPC_CALL_COMPLETE) {
t = call->expire_at;
-   if (!ktime_after(t, now))
+   if (!ktime_after(t, now)) {
+   trace_rxrpc_timer(call, why, now, now_j);
+   queue = true;
goto out;
+   }
 
if (!ktime_after(call->resend_at, now)) {
call->resend_at = call->expire_at;
@@ -76,12 +79,11 @@ void rxrpc_set_timer(struct rxrpc_call *call, enum 
rxrpc_timer_trace why,
mod_timer(&call->timer, t_j);
trace_rxrpc_timer(call, why, now, now_j);
}
-
-   if (queue)
-   rxrpc_queue_call(call);
}
 
 out:
+   if (queue)
+   rxrpc_queue_call(call);
read_unlock_bh(&call->state_lock);
 }
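
The control flow after this change can be sketched in userspace. This is a simplified model, not the kernel function: struct call_timers and pick_next_deadline() are hypothetical, plain integers replace ktime_t, and the event bits and tracing are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified deadlines; plain integers stand in for ktime_t. */
struct call_timers {
	int64_t expire_at;
	int64_t resend_at;
	int64_t ack_at;
};

/*
 * Returns true if the background processor must be queued.
 * *next receives the time the timer would be armed for.
 */
bool pick_next_deadline(const struct call_timers *c, int64_t now, int64_t *next)
{
	bool queue = false;
	int64_t t = c->expire_at;

	if (t <= now) {
		/* Expired call: queue it instead of leaving silently. */
		*next = now;
		return true;
	}

	if (c->resend_at <= now)
		queue = true;
	else if (c->resend_at < t)
		t = c->resend_at;

	if (c->ack_at <= now)
		queue = true;
	else if (c->ack_at < t)
		t = c->ack_at;

	*next = t;
	return queue;
}
```

The point of the patch is the first branch: expiry now reaches the queueing step at the out label rather than skipping it.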
 



[PATCH net 07/13] rxrpc: Partially handle OpenAFS's improper termination of calls

2016-10-06 Thread David Howells
OpenAFS doesn't always correctly terminate client calls that it makes -
this includes calls the OpenAFS servers make to the cache manager service.
It should end the client call with either:

 (1) An ACK that has firstPacket set to one greater than the seq number of
 the reply DATA packet with the LAST_PACKET flag set (thereby
 hard-ACK'ing all packets).  nAcks should be 0 and acks[] should be
 empty (ie. no soft-ACKs).

 (2) An ACKALL packet.

OpenAFS, though, may send an ACK packet with firstPacket set to the last
seq number or less and soft-ACKs listed for all packets up to and including
the last DATA packet.

The transmitter, however, is obliged to keep the call live and the
soft-ACK'd DATA packets around until they're hard-ACK'd as the receiver is
permitted to drop any merely soft-ACK'd packet and request retransmission
by sending an ACK packet with a NACK in it.

Further, OpenAFS will also terminate a client call by beginning the next
client call on the same connection channel.  This implicitly completes the
previous call.

This patch handles implicit ACK of a call on a channel by the reception of
the first packet of the next call on that channel.

If another call doesn't come along to implicitly ACK a call, then we have
to time the call out.  There are some bugs there that will be addressed in
subsequent patches.

Signed-off-by: David Howells 
---

 net/rxrpc/input.c |   37 +
 1 file changed, 37 insertions(+)

diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
index a6da83f036d6..44fb8d893c7d 100644
--- a/net/rxrpc/input.c
+++ b/net/rxrpc/input.c
@@ -939,6 +939,33 @@ static void rxrpc_input_call_packet(struct rxrpc_call 
*call,
 }
 
 /*
+ * Handle a new call on a channel implicitly completing the preceding call on
+ * that channel.
+ *
+ * TODO: If callNumber > call_id + 1, renegotiate security.
+ */
+static void rxrpc_input_implicit_end_call(struct rxrpc_connection *conn,
+ struct rxrpc_call *call)
+{
+   switch (call->state) {
+   case RXRPC_CALL_SERVER_AWAIT_ACK:
+   rxrpc_call_completed(call);
+   break;
+   case RXRPC_CALL_COMPLETE:
+   break;
+   default:
+   if (rxrpc_abort_call("IMP", call, 0, RX_CALL_DEAD, ESHUTDOWN)) {
+   set_bit(RXRPC_CALL_EV_ABORT, &call->events);
+   rxrpc_queue_call(call);
+   }
+   break;
+   }
+
+   __rxrpc_disconnect_call(conn, call);
+   rxrpc_notify_socket(call);
+}
+
+/*
  * post connection-level events to the connection
  * - this includes challenges, responses, some aborts and call terminal packet
  *   retransmission.
@@ -1146,6 +1173,16 @@ void rxrpc_data_ready(struct sock *udp_sk)
}
 
call = rcu_dereference(chan->call);
+
+   if (sp->hdr.callNumber > chan->call_id) {
+   if (!(sp->hdr.flags & RXRPC_CLIENT_INITIATED)) {
+   rcu_read_unlock();
+   goto reject_packet;
+   }
+   if (call)
+   rxrpc_input_implicit_end_call(conn, call);
+   call = NULL;
+   }
} else {
skew = 0;
call = NULL;
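
The two decisions in this patch can be modelled separately from the kernel structures. A sketch with hypothetical names throughout (the enums and both functions are illustrative stand-ins for the checks in rxrpc_data_ready() and rxrpc_input_implicit_end_call()):

```c
#include <stdbool.h>
#include <stdint.h>

enum chan_verdict {
	CHAN_NO_IMPLICIT_END,	/* callNumber is not newer than the channel's */
	CHAN_REJECT,		/* newer, but the packet is not client-initiated */
	CHAN_END_PREVIOUS	/* newer and client-initiated: end the old call */
};

/* Decide what a packet's callNumber means for the call on the channel. */
enum chan_verdict check_call_number(uint32_t pkt_call_number,
				    uint32_t chan_call_id,
				    bool client_initiated)
{
	if (pkt_call_number <= chan_call_id)
		return CHAN_NO_IMPLICIT_END;
	if (!client_initiated)
		return CHAN_REJECT;
	return CHAN_END_PREVIOUS;
}

enum call_state { ST_SERVER_AWAIT_ACK, ST_COMPLETE, ST_OTHER };
enum end_how { END_COMPLETED, END_ALREADY_DONE, END_ABORTED };

/* How the previous call is finished, depending on the state it was in. */
enum end_how implicit_end(enum call_state state)
{
	switch (state) {
	case ST_SERVER_AWAIT_ACK:
		return END_COMPLETED;	/* the missing final ACK is implied */
	case ST_COMPLETE:
		return END_ALREADY_DONE;
	default:
		return END_ABORTED;	/* any other state aborts the call */
	}
}
```

Only a call that was merely waiting for the final ACK is completed normally; a call in any other live state is aborted.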



[PATCH net-next 00/13] rxrpc: Fixes

2016-10-06 Thread David Howells

This set of patches contains a bunch of fixes:

 (1) Fix an oops on incoming call to a local endpoint without a bound
 service.

 (2) Only ping for a lost reply in a client call (this is inapplicable to
 service calls).

 (3) Fix maybe uninitialised variable warnings in the ACK/ABORT sending
 function by splitting it.

 (4) Fix loss of PING RESPONSE ACKs due to them being subsumed by PING ACK
 generation.

 (5) OpenAFS improperly terminates calls it makes as a client under some
 circumstances by not fully hard-ACK'ing the last DATA packets.  This
 is alleviated by a new call appearing on the same channel implicitly
 completing the previous call on that channel.  Handle this implicit
 completion.

 (6) Properly handle expiry of service calls due to the aforementioned
 improper termination with no follow up call to implicitly complete it:

 (a) The call's background processor needs to be queued to complete the
 call, send an abort and notify the socket.

 (b) The call's background processor needs to notify the socket (or the
 kernel service) when it has completed the call.

 (c) A negative error code must thence be returned to the kernel
 service so that it knows the call died.

 (d) The AFS filesystem must detect the fatal error and end the call.

 (7) Must produce a DELAY ACK when the actual service operation takes a
 while to process and must cancel the ACK when the reply is ready.

 (8) Don't request an ACK on the last DATA packet of the Tx phase as this
 confuses OpenAFS.

The patches can be found here also:


http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=rxrpc-rewrite

Tagged thusly:

git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
rxrpc-rewrite-20161004

David
---
David Howells (13):
  rxrpc: Accesses of rxrpc_local::service need to be RCU managed
  rxrpc: Fix duplicate const
  rxrpc: Fix oops on incoming call to serviceless endpoint
  rxrpc: Only ping for lost reply in client call
  rxrpc: Fix warning by splitting rxrpc_send_call_packet()
  rxrpc: Fix loss of PING RESPONSE ACK production due to PING ACKs
  rxrpc: Partially handle OpenAFS's improper termination of calls
  rxrpc: Queue the call on expiry
  rxrpc: Add missing notification
  rxrpc: Return negative error code to kernel service
  afs: Check for fatal error when in waiting for ack state
  rxrpc: Need to produce an ACK for service op if op takes a long time
  rxrpc: Don't request an ACK on the last DATA packet of a call's Tx phase


 fs/afs/rxrpc.c  |2 -
 net/rxrpc/af_rxrpc.c|4 +
 net/rxrpc/ar-internal.h |   18 -
 net/rxrpc/call_accept.c |4 +
 net/rxrpc/call_event.c  |   77 +---
 net/rxrpc/call_object.c |3 +
 net/rxrpc/input.c   |   44 +++-
 net/rxrpc/misc.c|6 +-
 net/rxrpc/output.c  |  179 +++
 net/rxrpc/recvmsg.c |   14 ++--
 net/rxrpc/rxkad.c   |6 +-
 net/rxrpc/sendmsg.c |   12 ++-
 12 files changed, 252 insertions(+), 117 deletions(-)



[PATCH net 06/13] rxrpc: Fix loss of PING RESPONSE ACK production due to PING ACKs

2016-10-06 Thread David Howells
Separate the output of PING ACKs from the output of other sorts of ACK so
that if we receive a PING ACK and schedule transmission of a PING RESPONSE
ACK, the response doesn't get cancelled by a PING ACK we happen to be
scheduling transmission of at the same time.

If a PING RESPONSE gets lost, the other side might just sit there waiting
for it and refuse to proceed otherwise.

Signed-off-by: David Howells 
---

 net/rxrpc/ar-internal.h |   12 +---
 net/rxrpc/call_event.c  |   48 +++
 net/rxrpc/call_object.c |1 +
 net/rxrpc/input.c   |4 ++--
 net/rxrpc/misc.c|2 +-
 net/rxrpc/output.c  |   38 ++---
 net/rxrpc/recvmsg.c |4 ++--
 net/rxrpc/sendmsg.c |2 +-
 8 files changed, 82 insertions(+), 29 deletions(-)

diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index ef849a12a0f0..b56676be07c7 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -398,6 +398,7 @@ enum rxrpc_call_flag {
RXRPC_CALL_EXPOSED, /* The call was exposed to the world */
RXRPC_CALL_RX_LAST, /* Received the last packet (at 
rxtx_top) */
RXRPC_CALL_TX_LAST, /* Last packet in Tx buffer (at 
rxtx_top) */
+   RXRPC_CALL_SEND_PING,   /* A ping will need to be sent */
RXRPC_CALL_PINGING, /* Ping in process */
RXRPC_CALL_RETRANS_TIMEOUT, /* Retransmission due to timeout 
occurred */
 };
@@ -410,6 +411,7 @@ enum rxrpc_call_event {
RXRPC_CALL_EV_ABORT,/* need to generate abort */
RXRPC_CALL_EV_TIMER,/* Timer expired */
RXRPC_CALL_EV_RESEND,   /* Tx resend required */
+   RXRPC_CALL_EV_PING, /* Ping send required */
 };
 
 /*
@@ -466,6 +468,7 @@ struct rxrpc_call {
struct rxrpc_sock __rcu *socket;/* socket responsible */
ktime_t ack_at; /* When deferred ACK needs to 
happen */
ktime_t resend_at;  /* When next resend needs to 
happen */
+   ktime_t ping_at;/* When next to send a ping */
ktime_t expire_at;  /* When the call times out */
struct timer_list   timer;  /* Combined event timer */
struct work_struct  processor;  /* Event processor */
@@ -558,8 +561,10 @@ struct rxrpc_call {
rxrpc_seq_t ackr_prev_seq;  /* previous sequence number 
received */
rxrpc_seq_t ackr_consumed;  /* Highest packet shown 
consumed */
rxrpc_seq_t ackr_seen;  /* Highest packet shown seen */
-   rxrpc_serial_t  ackr_ping;  /* Last ping sent */
-   ktime_t ackr_ping_time; /* Time last ping sent */
+
+   /* ping management */
+   rxrpc_serial_t  ping_serial;/* Last ping sent */
+   ktime_t ping_time;  /* Time last ping sent */
 
/* transmission-phase ACK management */
ktime_t acks_latest_ts; /* Timestamp of latest ACK 
received */
@@ -730,6 +735,7 @@ enum rxrpc_timer_trace {
rxrpc_timer_init_for_reply,
rxrpc_timer_expired,
rxrpc_timer_set_for_ack,
+   rxrpc_timer_set_for_ping,
rxrpc_timer_set_for_resend,
rxrpc_timer_set_for_send,
rxrpc_timer__nr_trace
@@ -1068,7 +1074,7 @@ extern const s8 rxrpc_ack_priority[];
 /*
  * output.c
  */
-int rxrpc_send_ack_packet(struct rxrpc_call *);
+int rxrpc_send_ack_packet(struct rxrpc_call *, bool);
 int rxrpc_send_abort_packet(struct rxrpc_call *);
 int rxrpc_send_data_packet(struct rxrpc_call *, struct sk_buff *, bool);
 void rxrpc_reject_packets(struct rxrpc_local *);
diff --git a/net/rxrpc/call_event.c b/net/rxrpc/call_event.c
index e313099860d5..eeea9602cb89 100644
--- a/net/rxrpc/call_event.c
+++ b/net/rxrpc/call_event.c
@@ -54,6 +54,14 @@ void rxrpc_set_timer(struct rxrpc_call *call, enum 
rxrpc_timer_trace why,
t = call->ack_at;
}
 
+   if (!ktime_after(call->ping_at, now)) {
+   call->ping_at = call->expire_at;
+   if (!test_and_set_bit(RXRPC_CALL_EV_PING, &call->events))
+   queue = true;
+   } else if (ktime_before(call->ping_at, t)) {
+   t = call->ping_at;
+   }
+
t_j = nsecs_to_jiffies(ktime_to_ns(ktime_sub(t, now)));
t_j += jiffies;
 
@@ -78,6 +86,27 @@ out:
 }
 
 /*
+ * Propose a PING ACK be sent.
+ */
+static void rxrpc_propose_ping(struct rxrpc_call *call,
+  bool immediate, bool background)
+{
+   if (immediate) {
+   if (background &&
+   !test_and_set_bit(RXRPC_CALL_EV_PING, &call->events))
+   rxrpc_queue_call(call);
+   

[PATCH net 05/13] rxrpc: Fix warning by splitting rxrpc_send_call_packet()

2016-10-06 Thread David Howells
Split rxrpc_send_call_packet() to separate ACK generation (which is more
complicated) from ABORT generation.  This simplifies the code a bit and
fixes the following warning:

In file included from ../net/rxrpc/output.c:20:0:
net/rxrpc/output.c: In function 'rxrpc_send_call_packet':
net/rxrpc/ar-internal.h:1187:27: error: 'top' may be used uninitialized in this 
function [-Werror=maybe-uninitialized]
net/rxrpc/output.c:103:24: note: 'top' was declared here
net/rxrpc/output.c:225:25: error: 'hard_ack' may be used uninitialized in this 
function [-Werror=maybe-uninitialized]

Reported-by: Arnd Bergmann 
Signed-off-by: David Howells 
---

 net/rxrpc/ar-internal.h |3 +
 net/rxrpc/call_accept.c |2 -
 net/rxrpc/call_event.c  |6 +-
 net/rxrpc/call_object.c |2 -
 net/rxrpc/output.c  |  156 ++-
 net/rxrpc/recvmsg.c |4 +
 net/rxrpc/rxkad.c   |6 +-
 net/rxrpc/sendmsg.c |7 +-
 8 files changed, 102 insertions(+), 84 deletions(-)

diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index 4954e6e25819..ef849a12a0f0 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -1068,7 +1068,8 @@ extern const s8 rxrpc_ack_priority[];
 /*
  * output.c
  */
-int rxrpc_send_call_packet(struct rxrpc_call *, u8);
+int rxrpc_send_ack_packet(struct rxrpc_call *);
+int rxrpc_send_abort_packet(struct rxrpc_call *);
 int rxrpc_send_data_packet(struct rxrpc_call *, struct sk_buff *, bool);
 void rxrpc_reject_packets(struct rxrpc_local *);
 
diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 22cd8a18c481..832d854c2d5c 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -565,7 +565,7 @@ out_discard:
write_unlock_bh(&call->state_lock);
write_unlock(&rx->call_lock);
if (abort) {
-   rxrpc_send_call_packet(call, RXRPC_PACKET_TYPE_ABORT);
+   rxrpc_send_abort_packet(call);
rxrpc_release_call(rx, call);
rxrpc_put_call(call, rxrpc_call_put);
}
diff --git a/net/rxrpc/call_event.c b/net/rxrpc/call_event.c
index 4f00476630b9..e313099860d5 100644
--- a/net/rxrpc/call_event.c
+++ b/net/rxrpc/call_event.c
@@ -253,7 +253,7 @@ static void rxrpc_resend(struct rxrpc_call *call, ktime_t now)
goto out;
rxrpc_propose_ACK(call, RXRPC_ACK_PING, 0, 0, true, false,
  rxrpc_propose_ack_ping_for_lost_ack);
-   rxrpc_send_call_packet(call, RXRPC_PACKET_TYPE_ACK);
+   rxrpc_send_ack_packet(call);
goto out;
}
 
@@ -328,7 +328,7 @@ void rxrpc_process_call(struct work_struct *work)
 
 recheck_state:
if (test_and_clear_bit(RXRPC_CALL_EV_ABORT, &call->events)) {
-   rxrpc_send_call_packet(call, RXRPC_PACKET_TYPE_ABORT);
+   rxrpc_send_abort_packet(call);
goto recheck_state;
}
 
@@ -347,7 +347,7 @@ recheck_state:
if (test_and_clear_bit(RXRPC_CALL_EV_ACK, &call->events)) {
call->ack_at = call->expire_at;
if (call->ackr_reason) {
-   rxrpc_send_call_packet(call, RXRPC_PACKET_TYPE_ACK);
+   rxrpc_send_ack_packet(call);
goto recheck_state;
}
}
diff --git a/net/rxrpc/call_object.c b/net/rxrpc/call_object.c
index 364b42dc3dce..07094012ac15 100644
--- a/net/rxrpc/call_object.c
+++ b/net/rxrpc/call_object.c
@@ -498,7 +498,7 @@ void rxrpc_release_calls_on_socket(struct rxrpc_sock *rx)
  struct rxrpc_call, sock_link);
rxrpc_get_call(call, rxrpc_call_got);
rxrpc_abort_call("SKT", call, 0, RX_CALL_DEAD, ECONNRESET);
-   rxrpc_send_call_packet(call, RXRPC_PACKET_TYPE_ABORT);
+   rxrpc_send_abort_packet(call);
rxrpc_release_call(rx, call);
rxrpc_put_call(call, rxrpc_call_put);
}
diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c
index 0d47db886f6e..2dae877c0876 100644
--- a/net/rxrpc/output.c
+++ b/net/rxrpc/output.c
@@ -19,24 +19,24 @@
 #include 
 #include "ar-internal.h"
 
-struct rxrpc_pkt_buffer {
+struct rxrpc_ack_buffer {
struct rxrpc_wire_header whdr;
-   union {
-   struct {
-   struct rxrpc_ackpacket ack;
-   u8 acks[255];
-   u8 pad[3];
-   };
-   __be32 abort_code;
-   };
+   struct rxrpc_ackpacket ack;
+   u8 acks[255];
+   u8 pad[3];
struct rxrpc_ackinfo ackinfo;
 };
 
+struct rxrpc_abort_buffer {
+   struct rxrpc_wire_header whdr;
+   __be32 abort_code;
+};
+
 /*
  * Fill out an ACK packet.
  */
 static size_t rxrpc_fill_out_ack(struct rxrpc_call *call,
-struct rxrpc_pkt_buffer *pkt,
+struct 

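A reduced userspace sketch of the idea behind the split, assuming nothing about the real rxrpc structures: each packet kind gets its own buffer type and its own builder, so the ACK-only window state ('hard_ack', 'top') never appears on the ABORT path and the compiler has no conditionally-initialised locals to complain about. All names here (wire_header, ack_buffer, abort_buffer, build_ack, build_abort) and the field layout are illustrative stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the wire structures; layout is illustrative. */
struct wire_header { uint32_t serial; uint8_t type; };

struct ack_buffer {
	struct wire_header whdr;
	uint32_t first_packet;	/* only meaningful for ACKs */
	uint8_t nr_acks;
	uint8_t acks[255];
};

struct abort_buffer {
	struct wire_header whdr;
	uint32_t abort_code;	/* only meaningful for ABORTs */
};

enum { PKT_TYPE_ACK = 2, PKT_TYPE_ABORT = 4 };

/* ACK path: the window state is passed in and always defined here. */
size_t build_ack(struct ack_buffer *pkt, uint32_t hard_ack, uint32_t top)
{
	assert(top >= hard_ack && top - hard_ack <= 255);
	memset(pkt, 0, sizeof(*pkt));
	pkt->whdr.type = PKT_TYPE_ACK;
	pkt->first_packet = hard_ack + 1;
	pkt->nr_acks = (uint8_t)(top - hard_ack);
	memset(pkt->acks, 1, pkt->nr_acks);
	return offsetof(struct ack_buffer, acks) + pkt->nr_acks;
}

/* ABORT path: no window state at all, so nothing can be left unset. */
size_t build_abort(struct abort_buffer *pkt, uint32_t abort_code)
{
	memset(pkt, 0, sizeof(*pkt));
	pkt->whdr.type = PKT_TYPE_ABORT;
	pkt->abort_code = abort_code;
	return sizeof(*pkt);
}
```

A caller would pick build_ack() or build_abort() directly instead of passing a packet type into one shared function, which is the shape of the call-site changes in the hunks above.
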
[PATCH net 02/13] rxrpc: Fix duplicate const

2016-10-06 Thread David Howells
Remove a duplicate const keyword.

Signed-off-by: David Howells 
---

 net/rxrpc/ar-internal.h |2 +-
 net/rxrpc/misc.c|2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index d38dffd78085..4954e6e25819 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -777,7 +777,7 @@ extern const char rxrpc_congest_modes[NR__RXRPC_CONGEST_MODES][10];
 extern const char rxrpc_congest_changes[rxrpc_congest__nr_change][9];
 
 extern const char *const rxrpc_pkts[];
-extern const char const rxrpc_ack_names[RXRPC_ACK__INVALID + 1][4];
+extern const char rxrpc_ack_names[RXRPC_ACK__INVALID + 1][4];
 
 #include 
 
diff --git a/net/rxrpc/misc.c b/net/rxrpc/misc.c
index 9d1c721bc4e8..804a88e28739 100644
--- a/net/rxrpc/misc.c
+++ b/net/rxrpc/misc.c
@@ -96,7 +96,7 @@ const s8 rxrpc_ack_priority[] = {
[RXRPC_ACK_PING]= 9,
 };
 
-const char const rxrpc_ack_names[RXRPC_ACK__INVALID + 1][4] = {
+const char rxrpc_ack_names[RXRPC_ACK__INVALID + 1][4] = {
"---", "REQ", "DUP", "OOS", "WIN", "MEM", "PNG", "PNR", "DLY",
"IDL", "-?-"
 };


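For readers unsure why one const is redundant in the array case but not in the pointer case, here is a small standalone sketch. The C standard treats a repeated qualifier on the same type as if it appeared once, and some compilers warn about the repetition; in "const char *const" the two qualifiers apply to different things (the characters and the pointer). The array contents and the helper ack_name() are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* One const is enough: the array elements themselves are read-only. */
const char ack_names[3][4] = { "---", "REQ", "DUP" };

/* Both consts are needed here: read-only chars AND read-only pointers. */
const char *const pkt_names[] = { "DATA", "ACK", "ABORT" };

/* The 2-D array stores the strings inline, 4 bytes per entry. */
static_assert(sizeof(ack_names) == 3 * 4, "entries are stored inline");

/* Illustrative accessor with a fallback for out-of-range indices. */
const char *ack_name(unsigned int i)
{
	return i < 3 ? ack_names[i] : "-?-";
}
```
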

[PATCH net 01/13] rxrpc: Accesses of rxrpc_local::service need to be RCU managed

2016-10-06 Thread David Howells
struct rxrpc_local->service is marked __rcu - this means that accesses of
it need to be managed using RCU wrappers.  There are two such places in
rxrpc_release_sock() where the value is checked and cleared.  Fix this by
using the appropriate wrappers.

Signed-off-by: David Howells 
---

 net/rxrpc/af_rxrpc.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 44c9c2b0b190..2d59c9be40e1 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -678,9 +678,9 @@ static int rxrpc_release_sock(struct sock *sk)
sk->sk_state = RXRPC_CLOSE;
spin_unlock_bh(&sk->sk_receive_queue.lock);
 
-   if (rx->local && rx->local->service == rx) {
+   if (rx->local && rcu_access_pointer(rx->local->service) == rx) {
write_lock(&rx->local->services_lock);
-   rx->local->service = NULL;
+   rcu_assign_pointer(rx->local->service, NULL);
write_unlock(&rx->local->services_lock);
}
 

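A userspace model of the two wrappers used in this fix, written with C11 atomics. This is only an approximation under stated assumptions: model_access_pointer() stands in for rcu_access_pointer() (fetch the pointer value for comparison, never dereference it) and model_assign_pointer() stands in for rcu_assign_pointer() (publish with release ordering). The struct and function names are hypothetical; the kernel macros are implemented differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct sock_model { int id; };

struct local_model {
	_Atomic(struct sock_model *) service;	/* plays the role of the __rcu pointer */
};

/* Model of rcu_access_pointer(): read the value only to compare it. */
struct sock_model *model_access_pointer(struct local_model *l)
{
	return atomic_load_explicit(&l->service, memory_order_relaxed);
}

/* Model of rcu_assign_pointer(): publish the new value with release ordering. */
void model_assign_pointer(struct local_model *l, struct sock_model *s)
{
	atomic_store_explicit(&l->service, s, memory_order_release);
}

/* Same shape as the fixed hunk: clear the pointer only if it is ours.
 * Returns 1 if the service pointer was cleared, 0 otherwise. */
int release_service(struct local_model *local, struct sock_model *rx)
{
	assert(rx != NULL);
	if (local && model_access_pointer(local) == rx) {
		model_assign_pointer(local, NULL);
		return 1;
	}
	return 0;
}
```

The locking around the store (services_lock in the patch) is left out of the model; it only shows which accessor is used for the compare and which for the update.
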


[PATCH net 03/13] rxrpc: Fix oops on incoming call to serviceless endpoint

2016-10-06 Thread David Howells
If a call comes in to a local endpoint that isn't listening for any
incoming calls at the moment, an oops will happen.  We need to check that
the local endpoint's service pointer isn't NULL before we dereference it.

Signed-off-by: David Howells 
---

 net/rxrpc/call_accept.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 3cac231d8405..22cd8a18c481 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -337,7 +337,7 @@ struct rxrpc_call *rxrpc_new_incoming_call(struct rxrpc_local *local,
 
/* Get the socket providing the service */
rx = rcu_dereference(local->service);
-   if (service_id == rx->srx.srx_service)
+   if (rx && service_id == rx->srx.srx_service)
goto found_service;
 
trace_rxrpc_abort("INV", sp->hdr.cid, sp->hdr.callNumber, sp->hdr.seq,

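The one-line fix relies on short-circuit evaluation: the NULL test runs first and the member is only read when a service socket exists. A minimal userspace sketch, with hypothetical names (service_model, find_service, service_id_of):

```c
#include <assert.h>
#include <stddef.h>

struct service_model { unsigned short service_id; };

/* Unchecked accessor: callers must guarantee rx is non-NULL.  The assertion
 * is the userspace analogue of the oops described above. */
unsigned short service_id_of(const struct service_model *rx)
{
	assert(rx != NULL);
	return rx->service_id;
}

/* Checked lookup: '&&' evaluates left to right, so service_id_of() is never
 * reached when no service is registered. */
const struct service_model *find_service(const struct service_model *rx,
					 unsigned short service_id)
{
	if (rx && service_id == service_id_of(rx))
		return rx;
	return NULL;	/* caller falls through to its abort path */
}
```
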


[PATCH v6] net: ip, diag -- Add diag interface for raw sockets

2016-10-06 Thread Cyrill Gorcunov
In criu we actively use the diag interface to collect the sockets
present in the system when dumping applications. While this works as
expected for unix, tcp, udp[lite], packet and netlink sockets, raw
sockets have no diag interface. Thus add one.

v2:
 - add missing sock_put calls in raw_diag_dump_one (by eric.dumazet@)
 - implement @destroy for diag requests (by dsa@)

v3:
 - add export of raw_abort for IPv6 (by dsa@)
 - pass net-admin flag into inet_sk_diag_fill due to
   changes in net-next branch (by dsa@)

v4:
 - use @pad in struct inet_diag_req_v2 for raw socket
   protocol specification: raw module carries sockets
   which may have custom protocol passed from socket()
   syscall and @sdiag_protocol alone is not enough to
   match the underlying ones
 - start reporting protocol specified in socket() call
   when sockets are raw ones for the same reason: user
   space tools like ss may parse this attribute and use
   it for socket matching

v5 (by eric.dumazet@):
 - use sock_hold in raw_sock_get instead of atomic_inc,
   we're holding (raw_v4_hashinfo|raw_v6_hashinfo)->lock
   when looking up so counter won't be zero here.

v6:
 - use sdiag_raw_protocol() helper which will access @pad
   structure used for raw sockets protocol specification:
   we can't simply rename this member without breaking uapi.

CC: David S. Miller 
CC: Eric Dumazet 
CC: David Ahern 
CC: Alexey Kuznetsov 
CC: James Morris 
CC: Hideaki YOSHIFUJI 
CC: Patrick McHardy 
CC: Andrey Vagin 
CC: Stephen Hemminger 
Signed-off-by: Cyrill Gorcunov 
---
Really sorry for the delay. Please take a look once time permits.
I think the safest solution is to use a macro which wraps
@pad access, so that userspace programs won't fail to compile
if they still reference this field.

 include/net/raw.h  |6 +
 include/net/rawv6.h|7 +
 include/uapi/linux/inet_diag.h |9 +
 net/ipv4/Kconfig   |8 +
 net/ipv4/Makefile  |1 
 net/ipv4/inet_diag.c   |9 +
 net/ipv4/raw.c |   21 +++
 net/ipv4/raw_diag.c|  239 +
 net/ipv6/raw.c |7 -
 9 files changed, 303 insertions(+), 4 deletions(-)

Index: linux-ml.git/include/net/raw.h
===
--- linux-ml.git.orig/include/net/raw.h
+++ linux-ml.git/include/net/raw.h
@@ -23,6 +23,12 @@
 
 extern struct proto raw_prot;
 
+extern struct raw_hashinfo raw_v4_hashinfo;
+struct sock *__raw_v4_lookup(struct net *net, struct sock *sk,
+unsigned short num, __be32 raddr,
+__be32 laddr, int dif);
+
+int raw_abort(struct sock *sk, int err);
 void raw_icmp_error(struct sk_buff *, int, u32);
 int raw_local_deliver(struct sk_buff *, int);
 
Index: linux-ml.git/include/net/rawv6.h
===
--- linux-ml.git.orig/include/net/rawv6.h
+++ linux-ml.git/include/net/rawv6.h
@@ -3,6 +3,13 @@
 
 #include 
 
+extern struct raw_hashinfo raw_v6_hashinfo;
+struct sock *__raw_v6_lookup(struct net *net, struct sock *sk,
+unsigned short num, const struct in6_addr *loc_addr,
+const struct in6_addr *rmt_addr, int dif);
+
+int raw_abort(struct sock *sk, int err);
+
 void raw6_icmp_error(struct sk_buff *, int nexthdr,
u8 type, u8 code, int inner_offset, __be32);
 bool raw6_local_deliver(struct sk_buff *, int);
Index: linux-ml.git/include/uapi/linux/inet_diag.h
===
--- linux-ml.git.orig/include/uapi/linux/inet_diag.h
+++ linux-ml.git/include/uapi/linux/inet_diag.h
@@ -43,6 +43,15 @@ struct inet_diag_req_v2 {
struct inet_diag_sockid id;
 };
 
+/*
+ * SOCK_RAW sockets require the underlied protocol to be
+ * additionally specified so we can use @pad member for
+ * this, but we can't rename it because userspace programs
+ * still may depend on this name. Instead lets use an explicit
+ * helper.
+ */
+#define sdiag_raw_protocol(__req)  (__req)->pad
+
 enum {
INET_DIAG_REQ_NONE,
INET_DIAG_REQ_BYTECODE,
Index: linux-ml.git/net/ipv4/Kconfig
===
--- linux-ml.git.orig/net/ipv4/Kconfig
+++ linux-ml.git/net/ipv4/Kconfig
@@ -430,6 +430,14 @@ config INET_UDP_DIAG
  Support for UDP socket monitoring interface used by the ss tool.
  If unsure, say Y.
 
+config INET_RAW_DIAG
+   tristate "RAW: socket monitoring interface"
+   depends on INET_DIAG && (IPV6 || IPV6=n)
+   default n
+   ---help---
+ Support for RAW socket monitoring interface used by the ss tool.
+ If unsure, 

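A small sketch of why the accessor macro keeps existing userspace compiling: the member keeps its old name, and new code goes through the macro. The struct below is a hypothetical mirror of the first fields of the request (the socket id is omitted), and set_raw_proto() is an invented helper. The macro body is written with outer parentheses here, which the hunk above does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the request header; field names follow the patch. */
struct diag_req_model {
	uint8_t sdiag_family;
	uint8_t sdiag_protocol;
	uint8_t idiag_ext;
	uint8_t pad;		/* reused to carry the raw socket protocol */
	uint32_t idiag_states;
};

/* Same idea as the helper in the patch, fully parenthesised. */
#define sdiag_raw_protocol(__req) ((__req)->pad)

/* The layout is untouched: 'pad' is still the fourth byte. */
static_assert(offsetof(struct diag_req_model, pad) == 3, "layout must not move");

/* New-style code names the field by its purpose via the macro. */
void set_raw_proto(struct diag_req_model *req, uint8_t proto)
{
	req->sdiag_protocol = 255;	/* IPPROTO_RAW */
	sdiag_raw_protocol(req) = proto;
}
```

Old code that reads or writes req->pad directly keeps building, which would not be the case had the member been renamed.
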
RE: [PATCH] bluetooth.h: __ variants of u8 and friends are not neccessary inside kernel

2016-10-06 Thread David Laight
From: Joe Perches
...
> No worries, and bool is the same size as u8.

That is not guaranteed at all.
One of the ARM ABIs defined bool to be the size of int.

David

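The C standard leaves sizeof(bool) implementation-defined, so a layout that must be a fixed size (for example a packed protocol header) is usually declared with a fixed-width type and converted at the boundary. A minimal sketch, with invented helper names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* uint8_t, where it exists, is exactly one byte; bool makes no such promise. */
static_assert(sizeof(uint8_t) == 1, "uint8_t is one byte by definition");

/* Convert explicitly when crossing into a fixed-size layout. */
uint8_t bool_to_wire(bool v)
{
	return v ? 1 : 0;
}

bool wire_to_bool(uint8_t v)
{
	return v != 0;
}

/* Report what this particular ABI chose; portable code should not assume 1. */
size_t bool_size(void)
{
	return sizeof(bool);
}
```
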


Re: [PATCH net-next v2 2/3] openvswitch: remove unreachable code in vlan parsing

2016-10-06 Thread Jiri Benc
On Wed, 5 Oct 2016 22:22:13 -0700, Pravin Shelar wrote:
> User can turn off TX vlan offload for OVS internal device that would
> allow vlan tagged packet with vlan header on the skb-data. This case
> will cause issue here.

Good catch. This is the feedback I hoped for, not the bikeshedding about
the value of an unused variable :-)

> We could handle this case by not allowing this configuration.

I'm not sure how clean this is but let's try and see if anyone objects.
I'll send v3.

I also noticed we don't set NETIF_F_HW_VLAN_STAG_TX on internal ports.
I'll fix it, too.

Thanks!

 Jiri


Re: [PATCH] bluetooth.h: __ variants of u8 and friends are not neccessary inside kernel

2016-10-06 Thread Johan Hedberg
Hi,

On Thu, Oct 06, 2016, Joe Perches wrote:
> On Thu, 2016-10-06 at 09:02 +0200, Pavel Machek wrote:
> > I believe you are wrong. bit addressability does not matter, cpu can
> > definitely get the bit values.
> > 
> > u8 foo:1;
> > u8 bar:1;
> > u8 baz:1;
> > 
> > should take 1 byte, where
> > 
> > bool foo, bar, baz;
> > 
> > will take more like 3.
> 
> Definitely true.
> 
> There is only one single bitfield foo here though
> so what you wrote doesn't apply.

What's in the tree is a left-over from times when there were multiple
bit fields in this struct. When the others were removed and only one was
left, apparently no one bothered to update it to a bool or a single u8.

Johan


Re: [PATCH] bluetooth.h: __ variants of u8 and friends are not neccessary inside kernel

2016-10-06 Thread Joe Perches
On Thu, 2016-10-06 at 09:02 +0200, Pavel Machek wrote:
> I believe you are wrong. bit addressability does not matter, cpu can
> definitely get the bit values.
> 
> u8 foo:1;
> u8 bar:1;
> u8 baz:1;
> 
> should take 1 byte, where
> 
> bool foo, bar, baz;
> 
> will take more like 3.

Definitely true.

There is only one single bitfield foo here though
so what you wrote doesn't apply.


Re: [PATCH] bluetooth.h: __ variants of u8 and friends are not neccessary inside kernel

2016-10-06 Thread Pavel Machek
On Wed 2016-10-05 15:28:51, Joe Perches wrote:
> On Thu, 2016-10-06 at 00:13 +0200, Pavel Machek wrote:
> > On Wed 2016-10-05 12:15:34, Joe Perches wrote:
> > > On Wed, 2016-10-05 at 21:11 +0200, Pavel Machek wrote:
> > > > On Wed 2016-10-05 10:53:16, Joe Perches wrote:
> []
> > > > > trivia:
> > > > > It's generally faster to use bool instead of u8 foo:1;
> > > > Ok, but I'm not changing that in this patch.
> > > > (And actually, bool will take a lot more memory, right?)
> > > No worries, and bool is the same size as u8.
> > Exactly what I'm talking about :-). One byte vs. one bit, right?
> 
> Memory isn't bit addressable.
> So it's the same byte, it just doesn't use a read/modify/write
> operation to update a value.

I believe you are wrong. Bit addressability does not matter; the CPU can
definitely get the bit values.

u8 foo:1;
u8 bar:1;
u8 baz:1;

should take 1 byte, where

bool foo, bar, baz;

will take more like 3.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



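Both sides of this sub-thread can be checked with sizeof. The sketch below assumes a compiler that accepts unsigned char bit-fields (the kernel's u8 is unsigned char; GCC and Clang accept it). Exact sizes are implementation-defined, so only the relations that hold by construction are asserted at compile time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Three one-bit flags can share a single storage unit. */
struct flags_bits {
	unsigned char foo:1;
	unsigned char bar:1;
	unsigned char baz:1;
};

/* Three bools each occupy at least sizeof(bool) bytes. */
struct flags_bool {
	bool foo, bar, baz;
};

/* With a single flag there is nothing left to pack. */
struct one_bit  { unsigned char foo:1; };
struct one_bool { bool foo; };

static_assert(sizeof(struct flags_bool) >= 3 * sizeof(bool),
	      "separate bools cannot overlap");

/* Setting one bit-field is a read-modify-write of the shared storage unit,
 * which is the speed argument made for bool earlier in the thread. */
void set_bar(struct flags_bits *f)
{
	f->bar = 1;
}
```

On common ABIs sizeof(struct flags_bits) is 1 and sizeof(struct flags_bool) is 3, while struct one_bit and struct one_bool are both 1 byte, which matches the point that a single bit-field saves nothing.
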

Re: Kernel 4.6.7-rt13: Intel Ethernet driver igb causes huge latencies in cyclictest

2016-10-06 Thread Koehrer Mathias (ETAS/ESW5)
Hi all,
> 
> Although, to be clear, it isn't the fact that there exists 8 threads, it's
> that the device is firing all 8 interrupts at the same time.  The time spent
> in hardirq context just waking up all 8 of those threads (and the cyclictest
> wakeup) is enough to cause your regression.
> 
> netdev/igb folks-
> 
> Under what conditions should it be expected that the i350 trigger all of the
> TxRx interrupts simultaneously?  Any ideas here?
> 
> See the start of this thread here:
> 
>   http://lkml.kernel.org/r/d648628329bc446fa63b5e19d4d3fb56@FE-MBX1012.de.bosch.com
> 
Greg recommended using "ethtool -L eth2 combined 1" to reduce the number of
queues. I tried that. Now I actually have only three IRQs (eth2, eth2-rx-0,
eth2-tx-0). However, the issue remains the same.

I ran cyclictest again:
# cyclictest -a -i 105 -m -n -p 80 -t 1  -b 23 -C
(Note: when using 105us instead of 100us, the long latencies seem to occur
more often.)

Here are the final lines of the kernel trace output:
  <idle>-0   4d...2.. 1344661649us : sched_switch: prev_comm=swapper/4 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcuc/4 next_pid=56 next_prio=98
ktimerso-46  3d...2.. 1344661650us : sched_switch: prev_comm=ktimersoftd/3 prev_pid=46 prev_prio=98 prev_state=S ==> next_comm=swapper/3 next_pid=0 next_prio=120
ktimerso-24  1d...2.. 1344661650us : sched_switch: prev_comm=ktimersoftd/1 prev_pid=24 prev_prio=98 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120
ktimerso-79  6d...2.. 1344661650us : sched_switch: prev_comm=ktimersoftd/6 prev_pid=79 prev_prio=98 prev_state=S ==> next_comm=swapper/6 next_pid=0 next_prio=120
ktimerso-35  2d...2.. 1344661650us : sched_switch: prev_comm=ktimersoftd/2 prev_pid=35 prev_prio=98 prev_state=S ==> next_comm=swapper/2 next_pid=0 next_prio=120
  rcuc/5-67  5d...2.. 1344661650us : sched_switch: prev_comm=rcuc/5 prev_pid=67 prev_prio=98 prev_state=S ==> next_comm=ktimersoftd/5 next_pid=68 next_prio=98
  rcuc/7-89  7d...2.. 1344661650us : sched_switch: prev_comm=rcuc/7 prev_pid=89 prev_prio=98 prev_state=S ==> next_comm=ktimersoftd/7 next_pid=90 next_prio=98
ktimerso-4   0d...211 1344661650us : sched_wakeup: comm=rcu_preempt pid=8 prio=98 target_cpu=000
  rcuc/4-56  4d...2.. 1344661651us : sched_switch: prev_comm=rcuc/4 prev_pid=56 prev_prio=98 prev_state=S ==> next_comm=ktimersoftd/4 next_pid=57 next_prio=98
ktimerso-4   0d...2.. 1344661651us : sched_switch: prev_comm=ktimersoftd/0 prev_pid=4 prev_prio=98 prev_state=S ==> next_comm=rcu_preempt next_pid=8 next_prio=98
ktimerso-90  7d...2.. 1344661651us : sched_switch: prev_comm=ktimersoftd/7 prev_pid=90 prev_prio=98 prev_state=S ==> next_comm=swapper/7 next_pid=0 next_prio=120
ktimerso-68  5d...2.. 1344661651us : sched_switch: prev_comm=ktimersoftd/5 prev_pid=68 prev_prio=98 prev_state=S ==> next_comm=swapper/5 next_pid=0 next_prio=120
rcu_pree-8   0d...3.. 1344661652us : sched_wakeup: comm=rcuop/0 pid=10 prio=120 target_cpu=000
ktimerso-57  4d...2.. 1344661652us : sched_switch: prev_comm=ktimersoftd/4 prev_pid=57 prev_prio=98 prev_state=S ==> next_comm=swapper/4 next_pid=0 next_prio=120
rcu_pree-8   0d...2.. 1344661653us+: sched_switch: prev_comm=rcu_preempt prev_pid=8 prev_prio=98 prev_state=S ==> next_comm=kworker/0:0 next_pid=5 next_prio=120
kworker/-5   0dN.h2.. 1344661741us : sched_wakeup: comm=cyclictest pid=6314 prio=19 target_cpu=000
kworker/-5   0d...2.. 1344661742us : sched_switch: prev_comm=kworker/0:0 prev_pid=5 prev_prio=120 prev_state=R+ ==> next_comm=cyclictest next_pid=6314 next_prio=19
cyclicte-6314  0d...2.. 1344661743us : sched_switch: prev_comm=cyclictest prev_pid=6314 prev_prio=19 prev_state=S ==> next_comm=rcuop/0 next_pid=10 next_prio=120
 rcuop/0-10  0d...2.. 1344661744us!: sched_switch: prev_comm=rcuop/0 prev_pid=10 prev_prio=120 prev_state=S ==> next_comm=kworker/0:0 next_pid=5 next_prio=120
kworker/-5   0dN.h2.. 1344661858us : sched_wakeup: comm=cyclictest pid=6314 prio=19 target_cpu=000
kworker/-5   0d...2.. 1344661859us : sched_switch: prev_comm=kworker/0:0 prev_pid=5 prev_prio=120 prev_state=R+ ==> next_comm=cyclictest next_pid=6314 next_prio=19
cyclicte-6314  0d...2.. 1344661860us!: sched_switch: prev_comm=cyclictest prev_pid=6314 prev_prio=19 prev_state=S ==> next_comm=kworker/0:0 next_pid=5 next_prio=120
kworker/-5   0dN.h2.. 1344661966us : sched_wakeup: comm=cyclictest pid=6314 prio=19 target_cpu=000
kworker/-5   0d...2.. 1344661966us : sched_switch: prev_comm=kworker/0:0 prev_pid=5 prev_prio=120 prev_state=R+ ==> next_comm=cyclictest next_pid=6314 next_prio=19
cyclicte-6314  0d...2.. 1344661967us+: sched_switch: prev_comm=cyclictest prev_pid=6314 prev_prio=19 prev_state=S ==> next_comm=kworker/0:0 next_pid=5 next_prio=120
kworker/-5   0dN.h2.. 1344662052us : sched_wakeup: comm=cyclictest pid=6314 prio=19 target_cpu=000
kworker/-5   

4.9-rc0: nf_hooks_ingress missing, breaking compilation

2016-10-06 Thread Pavel Machek
Hi!

In kernel based on edadd0e, I get plenty of errors such as:

net/netfilter/core.c:96:3: note: in expansion of macro ‘rcu_assign_pointer’
   rcu_assign_pointer(reg->dev->nf_hooks_ingress, entry);
   ^
In file included from ./include/linux/linkage.h:4:0,
 from ./include/linux/kernel.h:6,
 from net/netfilter/core.c:10:
net/netfilter/core.c:96:30: error: ‘struct net_device’ has no member named ‘nf_hooks_ingress’
   rcu_assign_pointer(reg->dev->nf_hooks_ingress, entry);
  ^

Config is attached.

[Ok, I guess testing -rc0 is "a bit too brave" :-)]

Best regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


delme.gz
Description: application/gzip




Re: [PATCH 3/3] net: smsc911x: add u16 workaround for pxa platforms

2016-10-06 Thread Robert Jarzmik
Robert Jarzmik  writes:

> Mark Rutland  writes:
>
>> On Mon, Oct 03, 2016 at 06:11:23PM +0200, Robert Jarzmik wrote:
>>> Mark Rutland  writes:
>>> 
>>> reg-u16-align4 tells that a specific hardware doesn't support 16 bit writes
>>> not being 32 bits aligned, or said differently that a "store" 16 bits wide
>>> on an address of the format 4*n + 2 deserves a special handling in the
>>> driver, while a store 16 bits wide on an address of the format 4*n can
>>> follow the simple casual case.
>>
>> If I've understood correctly, effectively the low 2 address lines to the
>> device are hard-wired to zero, e.g. a 16-bit access to 4*n + 2 would go
>> to 4*n + 0 on the device? Or is the failure case distinct from that?
> It is distinct.
>
> The "awful truth" is that an FPGA lies between the system bus and the
> smc91c111. And this FPGA cannot handle correctly the 4*n + 2 u16 writes.
>
>> Do we have other platforms where similar is true? e.g. u8 accesses
>> requiring 16-bit alignment?
>
> Not really, i.e. not with an alignment requirement.
>
> But there are of course platforms with width restrictions, handled by
> reg-io-width and the SMC_USE_xxx_BITS flags as far as I understand it. These
> cases are when a platform declares SMC91X_USE_16BIT or SMC91X_USE_32BIT, but
> not SMC91X_USE_8BIT, which would make me think of:
>  - CONFIG_SH_SH4202_MICRODEV,
>  - CONFIG_M32R
>  - several omap1 boards
>  - 1 sa1100 board
>  - several MMP and realview boards
>
> With all these platforms, each u8 access is replaced with a u16 access and a
> mask / shift + mask.

So what should I call this entry if reg-u16-align4 is not a good candidate?

Cheers.

-- 
Robert

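The two access patterns discussed above can be modelled in userspace. This is a sketch under assumptions, not the driver's code: the register window is modelled as plain arrays, a little-endian lane layout is assumed (offset 4*n + 2 is the upper half of word n), and the names write16() and read8_via16() are invented. On real hardware the 4*n case would be an ordinary 16-bit store; the model merges into the word only so that both cases can be checked the same way.

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit write at byte offset 'off' into a window of 32-bit words.
 * A write at 4*n + 2 is turned into an aligned 32-bit read-modify-write. */
void write16(uint32_t *regs, unsigned int off, uint16_t val)
{
	assert((off & 1) == 0);		/* 16-bit accesses are 2-byte aligned */
	uint32_t *word = &regs[off / 4];

	if (off & 2)
		/* 4*n + 2: keep the lower half, replace the upper half */
		*word = (*word & 0x0000ffffu) | ((uint32_t)val << 16);
	else
		/* 4*n: the simple case, replace the lower half */
		*word = (*word & 0xffff0000u) | val;
}

/* 8-bit read on a bus that only does 16-bit accesses: read the containing
 * half-word, then shift and mask. */
uint8_t read8_via16(const uint16_t *regs16, unsigned int off)
{
	uint16_t w = regs16[off / 2];

	return (off & 1) ? (uint8_t)(w >> 8) : (uint8_t)(w & 0xff);
}
```
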

Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.

2016-10-06 Thread Krister Johansen
On Wed, Oct 05, 2016 at 11:01:38AM -0700, Cong Wang wrote:
> On Tue, Oct 4, 2016 at 11:52 PM, Krister Johansen wrote:
> > On Mon, Oct 03, 2016 at 11:22:33AM -0700, Cong Wang wrote:
> >> Please try the attached patch. I also convert the read path to RCU
> >> to avoid a possible deadlock. A quick test shows no lockdep splat.
> >
> > I tried this patch, but it doesn't solve the problem.  I got a panic on
> > my very first try:
> 
> Thanks for testing it.

Absolutely; thanks for helping to try to simplify this fix.

> > The problem here is the same as before: by using RCU the race isn't
> > fixed because the module is still discoverable from act_base before the
> > pernet initialization is completed.
> >
> > You can see from the trap frame that the first two arguments to
> > tcf_hash_check were 0.  It couldn't look up the correct per-subsystem
> > pointer because the id hadn't yet been registered.
> 
> I thought the problem is that we don't do pernet ops registration and
> action ops registration atomically therefore chose to use mutex+RCU,
> but I was wrong, the problem here is just ordering, we need to finish
> the pernet initialization before making action ops visible.
> 
> If so, why not just reorder them? Does the attached patch make any
> sense now? Our pernet init doesn't rely on act_base, so even we have
> some race, the worst case is after we initialize the pernet netns for an
> action but its ops still not visible, which seems fine (at least no crash).
> 
> Or I still miss something here?

I'm not sure.  The reason I didn't take this approach from the outset is
that all of TC's callers of tcf_register_action pass a pointer to a
static structure as their *ops argument.  The existence of code that
checks the action for uniqueness suggests that it's possible for
tcf_register_action to get passed two identical tc_action_ops.  If that
happens in the current code base, we'll also get passed a duplicate
pernet_operations pointer.  The code in register_pernet_subsys() makes
no attempt to check for duplicates.  If we add a pointer that's already
in the list, and subsequently call unregister, the results seem
undefined.  It looks like we'll remove the pernet_operations for the
existing action, assuming we don't corrupt the list in the process.

Is this actually safe?  If so, what corner case is the act->type /
act->kind protecting us from?

> (Sorry that I don't have the environment to reproduce your bug)

I'm sorry that I didn't do a good job of explaining how we end up in
this situation in the first place.  I can give a few more details,
because it may explain some of my concern about the request_module()
call.

The system that encounters this bug launches a bunch of containers from
systemd on boot.  Each container creates a new user, net, pid, and mount
namespace and begins its setup.  When the networking setup in all of these
containers, each in a new netns, tries to configure TC while no modules are
loaded, we end up with this race.

I can also reproduce by unloading the modules, and then launching a
bunch of processes that configure tc in new namespaces.

Part of the desire to inhibit extra modprobe calls is that if hundreds
of these all start at once on boot, it's really unnecessary to have all
of the rest of them wait while lots of extra modprobe calls are forked
by the kernel.

> Thanks for your patience and testing!

Thank you for taking the time to look through the fix and discuss
alternatives.

-K
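
The ordering question in this thread can be illustrated with a single-threaded userspace model; it is not a proposed fix and ignores locking entirely. The invariant it shows is that an entry becomes discoverable only after its per-namespace state is set up, and that the duplicate check runs before any side effect so a rejected duplicate cannot disturb the existing registration. All names (action_ops_model, act_base_model, pernet_init_model, register_action_model, lookup_action_model) are hypothetical stand-ins for the kernel structures discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OPS 8

struct action_ops_model {
	const char *kind;
	int net_id;		/* 0 until per-netns init has run */
};

static struct action_ops_model *act_base_model[MAX_OPS];
static int next_net_id = 1;

/* Stand-in for per-netns setup: allocates the id that lookups depend on. */
static int pernet_init_model(struct action_ops_model *ops)
{
	ops->net_id = next_net_id++;
	return 0;
}

/* Returns 0 on success, -1 if already registered or the table is full. */
int register_action_model(struct action_ops_model *ops)
{
	size_t i, slot = MAX_OPS;

	assert(ops && ops->kind);
	for (i = 0; i < MAX_OPS; i++) {
		if (act_base_model[i] == ops)
			return -1;	/* duplicate: nothing has been touched yet */
		if (!act_base_model[i] && slot == MAX_OPS)
			slot = i;
	}
	if (slot == MAX_OPS)
		return -1;

	if (pernet_init_model(ops))	/* step 1: initialise while invisible */
		return -1;
	act_base_model[slot] = ops;	/* step 2: only now make it discoverable */
	return 0;
}

/* Anything returned from here has a non-zero net_id. */
struct action_ops_model *lookup_action_model(const char *kind)
{
	size_t i;

	for (i = 0; i < MAX_OPS; i++)
		if (act_base_model[i] && !strcmp(act_base_model[i]->kind, kind))
			return act_base_model[i];
	return NULL;
}
```

In a concurrent setting the check, the init and the publish would have to be serialised against other registrations, which is the part this model leaves out.
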