Re: [PATCH v2 2/6] cgroup: add support for eBPF programs

2016-08-25 Thread kbuild test robot
Hi Daniel,

[auto build test WARNING on net-next/master]
[also build test WARNING on v4.8-rc3 next-20160824]
[cannot apply to linus/master linux/master]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]
[Suggest to use git(>=2.9.0) format-patch --base= (or --base=auto for 
convenience) to record what (public, well-known) commit your patch series was 
built on]
[Check https://git-scm.com/docs/git-format-patch for more information]

url:
https://github.com/0day-ci/linux/commits/Daniel-Mack/Add-eBPF-hooks-for-cgroups/20160825-042759
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

   include/linux/compiler.h:230:8: sparse: attribute 'no_sanitize_address': unknown attribute
>> kernel/bpf/cgroup.c:46:17: sparse: incompatible types in comparison expression (different address spaces)
>> kernel/bpf/cgroup.c:46:17: sparse: incompatible types in comparison expression (different address spaces)
   kernel/bpf/cgroup.c:97:17: sparse: incompatible types in comparison expression (different address spaces)
   kernel/bpf/cgroup.c:147:16: sparse: incompatible types in comparison expression (different address spaces)

vim +46 kernel/bpf/cgroup.c

30  continue;
31  
32  bpf_prog_put(cgrp->bpf.prog[type]);
33  static_branch_dec(&cgroup_bpf_enabled_key);
34  }
35  
36  rcu_read_unlock();
37  }
38  
39  void cgroup_bpf_inherit(struct cgroup *cgrp, struct cgroup *parent)
40  {
41  unsigned int type;
42  
43  rcu_read_lock();
44  
45  for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++)
  > 46  rcu_assign_pointer(cgrp->bpf.prog_effective[type],
47  rcu_dereference(parent->bpf.prog_effective[type]));
48  
49  rcu_read_unlock();
50  }
51  
52  /**
53   * __cgroup_bpf_update() - Update the pinned program of a cgroup, and
54   * propagate the change to descendants
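
The "different address spaces" warnings above are what sparse typically reports when rcu_dereference() or rcu_assign_pointer() is applied to a pointer that is not annotated with __rcu. A minimal userspace sketch of the annotated layout follows, assuming that annotating the array is the intended fix. The toy_ names, the struct layout and the empty __rcu stand-in are illustrative only and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in: in the kernel, __rcu is a sparse-only annotation that puts the
 * pointer in its own address space. In a normal build it expands to nothing,
 * which is what this sketch does as well. */
#define __rcu

#define TOY_MAX_ATTACH_TYPE 2	/* hypothetical, stands in for __MAX_BPF_ATTACH_TYPE */

struct toy_prog {
	int id;
};

struct toy_cgroup_bpf {
	/* The annotation on the array is what lets sparse accept the
	 * RCU accessors on its elements. */
	struct toy_prog __rcu *prog_effective[TOY_MAX_ATTACH_TYPE];
};

/* Copy the parent's effective programs into the child. Kernel code would
 * use rcu_assign_pointer()/rcu_dereference() under the appropriate lock;
 * plain assignment is used here because this sketch has no concurrency. */
void toy_inherit(struct toy_cgroup_bpf *child,
		 const struct toy_cgroup_bpf *parent)
{
	assert(child != NULL && parent != NULL);

	for (size_t type = 0; type < TOY_MAX_ATTACH_TYPE; type++)
		child->prog_effective[type] = parent->prog_effective[type];
}
```

Whether rcu_dereference() or a variant such as rcu_dereference_protected() is the right accessor depends on which lock protects the update path, which this sketch does not model.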

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


Re: [PATCH] bpf: fix size of copy_to_user in percpu map.

2016-08-25 Thread William Tu
Hi,

I've tested on KVM and hit a similar issue. If I boot a VM with
CPU hotplug enabled, like below:
./qemu-system-x86_64 -smp 2,maxcpus=4

then '/sys/devices/system/cpu/possible' does not match the
number of cpu* dirs in '/sys/devices/system/cpu/', which crashes
the percpu map testcase. Should we consider fixing it, or, as a simple
workaround, disable CONFIG_HOTPLUG_CPU?

Thanks
William
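
A minimal userspace sketch of the sizing problem described above: the buffer for a per-CPU map value has to be sized from the 'possible' CPU list, not from the number of cpu* directories. The function names are hypothetical, and the 8-byte rounding of each per-CPU slot is an assumption about the kernel's copy size rather than something stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Count the CPUs named by a sysfs-style list such as "0-239" or "0-1,4,6-7".
 * Returns -1 on malformed input. */
long count_cpus_in_list(const char *list)
{
	const char *p = list;
	long count = 0;

	assert(list != NULL);

	while (*p != '\0' && *p != '\n') {
		char *end;
		long first = strtol(p, &end, 10);
		long last = first;

		if (end == p || first < 0)
			return -1;
		p = end;

		if (*p == '-') {
			last = strtol(p + 1, &end, 10);
			if (end == p + 1 || last < first)
				return -1;
			p = end;
		}
		count += last - first + 1;

		if (*p == ',')
			p++;
		else if (*p != '\0' && *p != '\n')
			return -1;
	}
	return count;
}

/* Bytes userspace should reserve for one per-CPU map value.
 * Assumption: each per-CPU slot is rounded up to 8 bytes. */
size_t percpu_value_buf_size(long possible_cpus, size_t value_size)
{
	size_t slot = (value_size + 7) & ~(size_t)7;

	return (size_t)possible_cpus * slot;
}
```

With the Xeon numbers quoted below ('possible' = 0-239 but only 12 cpu* directories), sizing from the directory count would reserve room for 12 slots while 240 are copied.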

On Thu, Aug 18, 2016 at 6:38 PM, William Tu  wrote:
> Hi Alexei and Daniel,
>
> I got feedback from Fusion bios/chipset team. In short, the value
> 'possible' includes empty CPU socket. To verify, I tested on a
> physical Xeon machine with 2 CPU sockets, one of them is empty. I got
> 'possible' = 0-239, the number of 'cpu*' =12. As a result, extra bytes
> are copied from kernel to userspace.
>
> As for Fusion, there is a configuration in *.vmx. If we disable cpu
> hot plug by "vcpu.hotadd=FALSE", then 'possible' ==  the number of
> 'cpu*' dirs, which is the configuration in ESX. If "vcpu.hotadd=TRUE",
> then 'possible' is larger than 'cpu*' dirs, allowing users to add more
> vcpus.
>
> Regards,
> William
>
>
> On Fri, Aug 12, 2016 at 4:08 PM, Alexei Starovoitov
>  wrote:
>> On Fri, Aug 12, 2016 at 09:58:51AM -0700, William Tu wrote:
>>> Hi,
>>>
>>> I've tested on ESXi version 5.5 and it seems OK.
>>> - VM1: Ubuntu 14.04, kernel 3.19 ---> OK 3 cpu dirs, possible = 0-2
>>> - VM2: Centos7, kernel 3.10 ---> OK 8 cpu dirs, possible = 0-7
>>>
>>> I tried another MacBook with Fusion, same issue happens, the cpu[0-9]
>>> dirs are not equal to /sys/devices/system/cpu/possible
>>
>> great. thanks for testing. I think the issue is closed and
>> hopefully you can follow up with fusion guys ;)
>>


[PATCH net-next 6/6] net: dsa: bcm_sf2: Remove duplicate code

2016-08-25 Thread Florian Fainelli
Now that we are using b53_common for most VLAN, FDB and bridge
operations, delete all the redundant code that we had in bcm_sf2.c to
keep only the integration specific logic that we have to deal with:
power management, link management and the external interfaces (RGMII,
MDIO).

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/bcm_sf2.c  | 772 +
 drivers/net/dsa/bcm_sf2.h  |  73 +---
 drivers/net/dsa/bcm_sf2_regs.h | 122 ---
 3 files changed, 17 insertions(+), 950 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 56e898f01c0f..6aee184b2963 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -36,109 +36,6 @@
 #include "b53/b53_priv.h"
 #include "b53/b53_regs.h"
 
-/* String, offset, and register size in bytes if different from 4 bytes */
-static const struct bcm_sf2_hw_stats bcm_sf2_mib[] = {
-   { "TxOctets",   0x000, 8},
-   { "TxDropPkts", 0x020   },
-   { "TxQPKTQ0",   0x030   },
-   { "TxBroadcastPkts",0x040   },
-   { "TxMulticastPkts",0x050   },
-   { "TxUnicastPKts",  0x060   },
-   { "TxCollisions",   0x070   },
-   { "TxSingleCollision",  0x080   },
-   { "TxMultipleCollision", 0x090  },
-   { "TxDeferredCollision", 0x0a0  },
-   { "TxLateCollision",0x0b0   },
-   { "TxExcessiveCollision", 0x0c0 },
-   { "TxFrameInDisc",  0x0d0   },
-   { "TxPausePkts",0x0e0   },
-   { "TxQPKTQ1",   0x0f0   },
-   { "TxQPKTQ2",   0x100   },
-   { "TxQPKTQ3",   0x110   },
-   { "TxQPKTQ4",   0x120   },
-   { "TxQPKTQ5",   0x130   },
-   { "RxOctets",   0x140, 8},
-   { "RxUndersizePkts",0x160   },
-   { "RxPausePkts",0x170   },
-   { "RxPkts64Octets", 0x180   },
-   { "RxPkts65to127Octets", 0x190  },
-   { "RxPkts128to255Octets", 0x1a0 },
-   { "RxPkts256to511Octets", 0x1b0 },
-   { "RxPkts512to1023Octets", 0x1c0},
-   { "RxPkts1024toMaxPktsOctets", 0x1d0},
-   { "RxOversizePkts", 0x1e0   },
-   { "RxJabbers",  0x1f0   },
-   { "RxAlignmentErrors",  0x200   },
-   { "RxFCSErrors",0x210   },
-   { "RxGoodOctets",   0x220, 8},
-   { "RxDropPkts", 0x240   },
-   { "RxUnicastPkts",  0x250   },
-   { "RxMulticastPkts",0x260   },
-   { "RxBroadcastPkts",0x270   },
-   { "RxSAChanges",0x280   },
-   { "RxFragments",0x290   },
-   { "RxJumboPkt", 0x2a0   },
-   { "RxSymblErr", 0x2b0   },
-   { "InRangeErrCount",0x2c0   },
-   { "OutRangeErrCount",   0x2d0   },
-   { "EEELpiEvent",0x2e0   },
-   { "EEELpiDuration", 0x2f0   },
-   { "RxDiscard",  0x300, 8},
-   { "TxQPKTQ6",   0x320   },
-   { "TxQPKTQ7",   0x330   },
-   { "TxPkts64Octets", 0x340   },
-   { "TxPkts65to127Octets", 0x350  },
-   { "TxPkts128to255Octets", 0x360 },
-   { "TxPkts256to511Ocets", 0x370  },
-   { "TxPkts512to1023Ocets", 0x380 },
-   { "TxPkts1024toMaxPktOcets", 0x390  },
-};
-
-#define BCM_SF2_STATS_SIZE ARRAY_SIZE(bcm_sf2_mib)
-
-static void bcm_sf2_sw_get_strings(struct dsa_switch *ds,
-  int port, uint8_t *data)
-{
-   unsigned int i;
-
-   for (i = 0; i < BCM_SF2_STATS_SIZE; i++)
-   memcpy(data + i * ETH_GSTRING_LEN,
-  bcm_sf2_mib[i].string, ETH_GSTRING_LEN);
-}
-
-static void bcm_sf2_sw_get_ethtool_stats(struct dsa_switch *ds,
-int port, uint64_t *data)
-{
-   struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
-   const struct bcm_sf2_hw_stats *s;
-   unsigned int i;
-   u64 val = 0;
-   u32 offset;
-
-   mutex_lock(&priv->stats_mutex);
-
-   /* Now fetch the per-port counters */
-   for (i = 0; i < BCM_SF2_STATS_SIZE; i++) {
-   s = &bcm_sf2_mib[i];
-
-   /* Do a latched 64-bit read if needed */
-   offset = s->reg + CORE_P_MIB_OFFSET(port);
-   if (s->sizeof_stat == 8)
-   val = core_readq(priv, offset);
-   else
-   val = core_readl(priv, offset);
-
-   data[i] = (u64)val;
-   }
-
-   mutex_unlock(&priv->stats_mutex);
-}
-
-static int bcm_sf2_sw_get_sset_count(struct dsa_switch *ds)
-{
-   return B

Re: [PATCH net-next v1] gso: Support partial splitting at the frag_list pointer

2016-08-25 Thread Steffen Klassert
On Wed, Aug 24, 2016 at 02:25:29PM -0300, Marcelo Ricardo Leitner wrote:
> Em 24-08-2016 13:27, Alexander Duyck escreveu:
> >
> >I'm adding Marcelo as he could probably explain the GSO_BY_FRAGS
> >functionality better than I could since he is the original author.
> >
> >If I recall GSO_BY_FRAGS does something similar to what you are doing,
> >although I believe it doesn't carry any data in the first buffer other
> >than just a header.  I believe the idea behind GSO_BY_FRAGS was to
> >allow for segmenting a frame at the frag_list level instead of having
> >it done just based on MSS.  That was the only reason why I brought it
> >up.
> >
> 
> That's exactly it.
> 
> On this no data in the first buffer limitation, we probably can
> allow it have some data in there. It was done this way just because
> sctp is using skb_gro_receive() to build such skb and this was the
> way I found to get such frag_list skb generated by it, thus
> preserving frame boundaries.

Just to understand what you are doing. You generate MTU sized linear
buffers in sctp and then, skb_gro_receive() chains up these buffers
at the frag_list pointer. skb_gro_receive() does this because
skb_gro_offset is null and skb->head_frag is not set in your case.

At segmentation, you just need to split at the frag_list pointer
because you know that the chained buffers fit the MTU, right?



Re: [RFC PATCH] net: ip_finish_output_gso: Attempt gso_size clamping if segments exceed mtu

2016-08-25 Thread Shmulik Ladkani
Hi,

On Mon, 22 Aug 2016 14:58:42 +0200, f...@strlen.de wrote:
> Shmulik Ladkani  wrote:
> > There are cases where gso skbs (which originate from an ingress
> > interface) have a gso_size value that exceeds the output dst mtu:
> > 
> >  - ipv4 forwarding middlebox having in/out interfaces with different mtus
> >    addressed by fe6cc55f3a 'net: ip, ipv6: handle gso skbs in forwarding path'
> >  - bridge having a tunnel member interface stacked over a device with small mtu
> >    addressed by b8247f095e 'net: ip_finish_output_gso: If skb_gso_network_seglen exceeds MTU, allow segmentation for local udp tunneled skbs'
> > 
> > In both cases, such skbs are identified, then go through early software
> > segmentation+fragmentation as part of ip_finish_output_gso.
> > 
> > Another approach is to shrink the gso_size to a value suitable so
> > resulting segments are smaller than dst mtu, as suggested by Eric
> > Dumazet (as part of [1]) and Florian Westphal (as part of [2]).
> > 
> > This will void the need for software segmentation/fragmentation at
> > ip_finish_output_gso, thus significantly improve throughput and lower
> > cpu load.
> > 
> > This RFC patch attempts to implement this gso_size clamping.
> > 
> > [1] https://patchwork.ozlabs.org/patch/314327/
> > [2] https://patchwork.ozlabs.org/patch/644724/
> > 
> > Signed-off-by: Shmulik Ladkani 
> > ---
> > 
> >  Florian, in fe6cc55f you described a BUG due to gso_size decrease.
> >  I've tested both bridged and routed cases, but in my setups failed to
> >  hit the issue; Appreciate if you can provide some hints.
> 
> Still get the BUG, I applied this patch on top of net-next.

The BUG occurs when GRO occurs on the ingress, and only if GRO merges
skbs into the frag_list (OTOH when segments are only placed into frags[]
of a single skb, skb_segment succeeds even if gso_size was altered).

This is due to an assumption that the frag_list members terminate on
exact MSS boundaries (which no longer holds during gso_size clamping).

The assumption is dated back to original support of 'frag_list' to
skb_segment:

89319d3801 net: Add frag_list support to skb_segment

This patch adds limited support for handling frag_list packets in
skb_segment.  The intention is to support GRO (Generic Receive Offload)
packets which will be constructed by chaining normal packets using
frag_list.

As such we require all frag_list members terminate on exact MSS
boundaries.  This is checked using BUG_ON.

As there should only be one producer in the kernel of such packets,
namely GRO, this requirement should not be difficult to maintain.
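
The boundary requirement quoted above can be sketched in userspace as a check over the payload lengths of a chained packet. This is only an illustration of the invariant, not the arithmetic skb_segment actually performs; the function name and the array representation of the chain are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* lens[0] is the payload length of the head buffer and lens[1..n-1] are the
 * payload lengths of the chained buffers. Returns true when every buffer
 * except the last one ends at a multiple of mss. */
bool chain_ends_on_mss_boundaries(const size_t *lens, size_t n, size_t mss)
{
	size_t offset = 0;

	assert(lens != NULL && mss > 0);

	for (size_t i = 0; i + 1 < n; i++) {
		offset += lens[i];
		if (offset % mss != 0)
			return false;
	}
	return true;
}
```

A chain built by GRO with an MSS of 1448 passes the check for mss = 1448, but the same chain fails it once the segment size is changed to, say, 1400, which is the situation gso_size clamping creates.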

We have few alternatives for gso_size clamping:

1 Fix 'skb_segment' arithmetic to support inputs that do not match
  the "frag_list members terminate on exact MSS" assumption.

2 Perform gso_size clamping in 'ip_finish_output_gso' for non-GROed skbs.
  Other usecases will still benefit: (a) packets arriving from
  virtualized interfaces, e.g. tap and friends; (b) packets arriving from
  veth of other namespaces (packets are locally generated by TCP stack
  on a different netns).

3 Ditch the idea, again ;)

We can pursue (1), especially if there are other benefits to doing so.
OTOH, due to the current complexity of 'skb_segment', this is a bit risky.

Going with (2) could be reasonable too, as it brings value for
the virtualized environments that need gso_size clamping, while
presenting minimal risk.
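
As a rough illustration of what clamping means numerically, the sketch below picks the largest per-segment payload such that headers plus payload still fit in the output MTU. It is a userspace approximation under the assumption that a segment's network-level length is the network and transport header length plus gso_size; the function name is hypothetical and this is not code from the RFC patch.

```c
#include <assert.h>

/* Return the per-segment payload size to use so that hdr_len + payload
 * does not exceed mtu. hdr_len is the combined network and transport
 * header length. Returns 0 when the headers alone do not fit. */
unsigned int clamp_gso_size(unsigned int gso_size, unsigned int mtu,
			    unsigned int hdr_len)
{
	unsigned int room;

	assert(gso_size > 0);

	if (hdr_len >= mtu)
		return 0;

	room = mtu - hdr_len;
	return gso_size < room ? gso_size : room;
}
```

For example, a gso_size of 1448 with 52 bytes of headers and an MTU of 1400 would be clamped to 1348, while a gso_size that already fits is returned unchanged.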

Appreciate your opinions.

Regards,
Shmulik


Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection

2016-08-25 Thread Florian Westphal
Eric Dumazet  wrote:
> On Thu, 2016-08-18 at 14:48 +0200, Florian Westphal wrote:
> > commit ceaa1fef65a7c2e ("tcp: adding a per-socket timestamp offset")
> > added the main infrastructure that is needed for per-connection
> > randomization, in particular writing/reading the on-wire tcp header
> > format takes the offset into account so rest of stack can use normal
> > tcp_time_stamp (jiffies).
> > 
> > So only two items are left:
> >  - add a tsoffset for request sockets
> >  - extend the tcp isn generator to also return another 32bit number
> >  in addition to the ISN.
> > 
> > Re-use of ISN generator also means timestamps are still monotonically
> > increasing for same connection quadruple.
> 
> I like the idea, but the implementation looks a bit complex.
> 
> Instead of initializing tsoffset to 0, we could simply use
> 
> jhash(src_addr, dst_addr, boot_time_rnd)
> 
> This way, even syncookies would be handled, and we do not need to
> increase tcp_request_sock size.

So I gave this a try and it does avoid this tcp_request_sock increase,
but I feel that getting boot_time_rnd is too easy.

I tried a few other ideas but nothing satisfying/simpler came out of it
(e.g. i tried to also hash the isn but that gets scaled w. current
 clock so it doesn't work).

Are you more concerned wrt. complexity or the reqsk increase?

One could use tfo boolean padding in the struct to avoid size increase
(1 bit tfo_listener, 31 for tsoff).
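
A minimal sketch of the packing idea above, assuming the flag and the offset share one 32-bit word with the flag in the top bit. The helper names are hypothetical and explicit masks are used instead of C bit-fields so that the layout does not depend on the compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TSOFF_MASK 0x7fffffffu	/* low 31 bits carry the timestamp offset */

/* Pack a 1-bit flag and a 31-bit offset into one 32-bit word. */
uint32_t pack_tfo_tsoff(bool tfo_listener, uint32_t tsoff)
{
	return ((uint32_t)tfo_listener << 31) | (tsoff & TSOFF_MASK);
}

bool unpack_tfo_listener(uint32_t word)
{
	return (word >> 31) != 0;
}

uint32_t unpack_tsoff(uint32_t word)
{
	return word & TSOFF_MASK;
}

/* Round-trip example; the last check shows that bit 31 of the offset
 * is dropped by the packing. */
void tsoff_example(void)
{
	uint32_t word = pack_tfo_tsoff(true, 0x12345678u);

	assert(unpack_tfo_listener(word));
	assert(unpack_tsoff(word) == 0x12345678u);
	assert(unpack_tsoff(pack_tfo_tsoff(false, 0xffffffffu)) == TSOFF_MASK);
}
```

The cost of this layout is one bit of randomness: the offset is limited to 31 bits.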

I would then split this patch in two (one to add tsoff to reqsk, one
to add the randomization).

The only other alternative I see is to eat 2nd md5_transform and
add a tso_offset function to secure_seq.c -- but I don't like that
either.

Any other idea?


[PATCH 1/1] net: add killer E2500 device id

2016-08-25 Thread Owen Lin
Add Killer E2500 device ID in alx driver.

Signed-off-by: Owen Lin o...@rivetnetworks.com



diff -up1rN alx_orig/main.c alx/main.c
--- alx_orig/main.c 2016-08-25 11:12:34.170447500 +0800
+++ alx/main.c  2016-08-25 11:24:12.026853000 +0800
@@ -1547,2 +1547,4 @@ static const struct pci_device_id alx_pc
  .driver_data = ALX_DEV_QUIRK_MSI_INTX_DISABLE_BUG },
+   { PCI_VDEVICE(ATTANSIC, ALX_DEV_ID_E2500),
+ .driver_data = ALX_DEV_QUIRK_MSI_INTX_DISABLE_BUG },
{ PCI_VDEVICE(ATTANSIC, ALX_DEV_ID_AR8162),
diff -up1rN alx_orig/reg.h alx/reg.h
--- alx_orig/reg.h  2016-08-25 11:08:25.895534000 +0800
+++ alx/reg.h   2016-08-25 11:24:34.933147000 +0800
@@ -40,5 +40,6 @@
 #define ALX_DEV_ID_E2400   0xe0a1
+#define ALX_DEV_ID_E2500   0xe0b1
 #define ALX_DEV_ID_AR8162  0x1090
-#define ALX_DEV_ID_AR8171  0x10A1
-#define ALX_DEV_ID_AR8172  0x10A0
+#define ALX_DEV_ID_AR8171  0x10a1
+#define ALX_DEV_ID_AR8172  0x10a0

/* rev definition,


Re: wan-cosa: Use memdup_user() rather than duplicating its implementation

2016-08-25 Thread Jan Kasprzak
SF Markus Elfring wrote:
: > What about the GFP_DMA attribute, which your patch deletes?
: > The buffer in question has to be ISA DMA-able.
: 
: Thanks for your constructive feedback.
: 
: Would you be interested in using a variant of the function "memdup_…"
: with which the corresponding memory allocation option can be preserved?

I am not sure that extending an in-kernel API just for one
legacy driver is what we want. As I said, I would prefer the driver
unchanged, if possible.

Maybe it is the time for gradually phasing out ISA DMA support and
all the legacy drivers which use it?

Sincerely,

-Yenya

-- 
| Jan "Yenya" Kasprzak  |
| http://www.fi.muni.cz/~kas/ GPG: 4096R/A45477D5 |
 Like most things in Windows, on the surface it looks great.
 -- Jeremy Allison, A Tale of Two Standards


Re: [PATCH] usbnet: ax88179_178a: Add support for writing the EEPROM

2016-08-25 Thread Oliver Neukum
On Wed, 2016-08-24 at 16:40 +0200, Alban Bedel wrote:
> On Wed, 24 Aug 2016 16:30:39 +0200
> Oliver Neukum  wrote:
> 
> > On Wed, 2016-08-24 at 15:52 +0200, Alban Bedel wrote:

> > > +   if (block != data)
> > > +   kfree(block);  
> > 
> > And if block == data, what frees the memory?
> 
> In this case this function didn't allocate any memory, so there is
> nothing to free.

Hi,

I see. kfree() has a check for NULL, so you could drop the
test, but it doesn't matter much either way.

Regards
Oliver





Re: [PATCH 1/1] net: add killer E2500 device id

2016-08-25 Thread Sabrina Dubroca
Hi Owen,

2016-08-25, 03:33:11 +, Owen Lin wrote:
> Add Killer E2500 device ID in alx driver.
> 
> Signed-off-by: Owen Lin o...@rivetnetworks.com
> 
> 

This line shouldn't be here (it will end up in the commit message).


> diff -up1rN alx_orig/main.c alx/main.c
> --- alx_orig/main.c   2016-08-25 11:12:34.170447500 +0800
> +++ alx/main.c2016-08-25 11:24:12.026853000 +0800

That patch won't apply.  The patch should be generated from the root
directory, and include the full path of the files you're modifying, so
that the patch can be applied from the root directory of the kernel
tree.  i.e. the paths should be of the form

a/drivers/net/ethernet/atheros/alx/main.c

The easiest way to create patches that apply properly is to use the
git tree and the `git format-patch` tool.


Thanks,

-- 
Sabrina


Re: [PATCH] usbnet: ax88179_178a: Add support for writing the EEPROM

2016-08-25 Thread Alban Bedel
On Thu, 25 Aug 2016 11:16:36 +0200
Oliver Neukum  wrote:

> On Wed, 2016-08-24 at 16:40 +0200, Alban Bedel wrote:
> > On Wed, 24 Aug 2016 16:30:39 +0200
> > Oliver Neukum  wrote:
> >   
> > > On Wed, 2016-08-24 at 15:52 +0200, Alban Bedel wrote:  
> 
> > > > +   if (block != data)
> > > > +   kfree(block);
> > > 
> > > And if block == data, what frees the memory?  
> > 
> > In this case this function didn't allocate any memory, so there is
> > nothing to free.  
> 
> Hi,
> 
> I see. kfree() has a check for NULL, so you could drop the
> test, but it doesn't matter much either way.

I think you misunderstand something here. data is the buffer passed
by the caller and block is a local variable. There are two cases:

1) The data to write is block aligned, then we use the caller buffer
   as is and set block = data.
2) The requested data is not block aligned, then we allocate block.

In both cases the writing loop then uses the block pointer. Afterwards
we only need to kfree block in case 2, that is when block != data.
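
The two cases above can be sketched in userspace as follows. This is not the driver code: the function name, the simulated EEPROM array and the 2-byte block size are assumptions made for illustration, with malloc()/free() standing in for the kernel allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 2u	/* assumption: the device is written in 16-bit words */

/* Write len bytes from data into a simulated EEPROM at byte offset.
 * Returns 0 on success, -1 on a range or allocation failure. */
int toy_eeprom_write(uint8_t *eeprom, size_t eeprom_len,
		     size_t offset, size_t len, const uint8_t *data)
{
	const uint8_t *block = data;	/* case 1: use the caller's buffer */
	uint8_t *bounce = NULL;
	size_t first, last;

	assert(eeprom != NULL && data != NULL);

	if (len == 0 || offset > eeprom_len || len > eeprom_len - offset)
		return -1;

	/* Round the range outwards to block boundaries. */
	first = offset - offset % BLOCK_SIZE;
	last = offset + len;
	if (last % BLOCK_SIZE)
		last += BLOCK_SIZE - last % BLOCK_SIZE;
	if (last > eeprom_len)
		return -1;

	if (first != offset || last != offset + len) {
		/* Case 2: unaligned request. Build a padded copy seeded with
		 * the current contents of the partially covered blocks. */
		bounce = malloc(last - first);
		if (bounce == NULL)
			return -1;
		memcpy(bounce, eeprom + first, last - first);
		memcpy(bounce + (offset - first), data, len);
		block = bounce;
	}

	/* The writing loop only ever looks at block. */
	for (size_t i = 0; i < last - first; i += BLOCK_SIZE)
		memcpy(eeprom + first + i, block + i, BLOCK_SIZE);

	/* Only the bounce buffer belongs to this function, so it is freed
	 * only when block does not alias the caller's data. */
	if (block != data)
		free(bounce);
	return 0;
}
```

Freeing unconditionally would release the caller's buffer in case 1, which is why the comparison is not redundant even though freeing a NULL pointer is harmless.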

Alban




Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-08-25 Thread Simon Horman
On Tue, Aug 23, 2016 at 10:51:47AM +0200, Simon Horman wrote:
> On Mon, Aug 22, 2016 at 02:47:42PM -0700, Joe Stringer wrote:
> > On 22 August 2016 at 04:04, Simon Horman  wrote:
> > > On Wed, Aug 10, 2016 at 10:17:30AM -0700, Joe Stringer wrote:
> > >> On 10 August 2016 at 03:20, Simon Horman  
> > >> wrote:
> > >> > On Tue, Aug 09, 2016 at 08:47:40AM -0700, pravin shelar wrote:
> > >> >> On Mon, Aug 8, 2016 at 8:17 AM, Simon Horman 
> > >> >>  wrote:
> > >> >> > Light testing seems to indicate that it works for GSO skbs
> > >> >> > received over both L3 and L2 GRE tunnels by OvS with both
> > >> >> > IP-in-MPLS and IP (without MPLS) payloads.
> > >> >> >
> > >> >>
> > >> >> Thanks for testing it. Can you also add those tests to OVS kmod test 
> > >> >> suite?
> > >> >> ..
> > >> >
> > >> > Sure, I will look into doing that.
> > >> > Am I correct in thinking Joe Stringer is the best person to contact if
> > >> > I run into trouble there?
> > >>
> > >> Sure. The basics of running the tests is documented here:
> > >> https://github.com/openvswitch/ovs/blob/master/INSTALL.md#datapath-testing
> > >>
> > >> You should be able to get a good feel for how to add tests by perusing
> > >> the commits to tests/system-{traffic,kmod-macros}.at in the OVS source
> > >> tree.
> > >
> > > Thanks Joe,
> > >
> > > it took me a while but I think that I have something working
> > > against the head branch of the OVS tree. I'd value opinions
> > > on the direction I have taken.
> > >
> > > Subject: [PATCH] system-traffic: Exercise GSO
> > >
> > > Exercise GSO for: unencapsulated; MPLS; GRE; and MPLS in GRE.
> > >
> > > There is scope to extend this testing to other encapsulation formats
> > > if desired.
> > >
> > > This is motivated by a desire to test GRE and MPLS encapsulation in
> > > the context of L3/VPN (MPLS over non-TEB GRE work). That is not
> > > tested here but tests for those cases would ideally be based on those in
> > > this patch.
> > 
> > This makes sense to me. There's a few corners that could be improved,
> > primarily for reproducing sane results on a variety of systems, then a
> > couple of style comments. Please do run the tests via both "make
> > check-kernel" and "make check-system-userspace" before submitting,
> > ideally with at least two varieties of kernel: One where you would
> > expect the test to pass, and one where you would expect the tests to
> > be skipped.

Both make check-kernel and make check-system-userspace are now working.
I have tested against net-next and the 3.16 kernel that ships with
Debian stable.

> Thanks. I'm glad I ran this by you before expanding the number of tests.
> 
> > * CHECK_MPLS is defined in system-kmod-macros.at, so a corresponding
> > version should be provided in system-userspace-macros.at. If the
> > criteria for running the test(s) with both userspace and kernel
> > datapaths is the same, then this could instead be moved into
> > system-common-macros.at.
> 
> Understood.
>
> > * "datapath - ping over gre tunnel" adds a command to execute in
> > at_ns1, but that namespace doesn't exist.
> 
> Oops.

I have removed the chunk in question, it seems to be an artifact
of my development of the tests.

> > * "datapath - http over gre tunnel" is missing MPLS_CHECK.
> 
> Thanks, I'll fix that.

On further inspection it seems to me that this check does not use MPLS,
rather it is testing GSO for GRE (without MPLS).

> > * Is there a way to clear the netstat statistics before running the
> > tests which rely on it? I'm getting a failure on one of my systems
> > (ubuntu trusty with a 4.7 kernel), but I'm not sure if the counter was
> > already high before I ran the test.
> 
> I'll look into that. If not they could be recorded to allow a check
> for a non-zero delta.
> 
> Possibly an entirely different mechanism is needed to check for GSO
> functioning. But I'm not sure what it would be at this point.
>
> > * "datapath - http over mpls between two ports"  (maybe others too?)
> > should shift all openflow rules into a single section using AT_DATA,
> > similar to the other tests. This makes it easier to reason about the
> > flow table and understand what's going on before reading through the
> > rest of the test.
> 
> Sure, will do.
> 
> > * If there is a common set of configuration you do for local stack
> > within a namespace to route MPLS traffic, you could consider adding
> > another macro into system-common-macros.at.
> 
> Ok, possibly there is if some of the configuration is parametrised:
> e.g. over the namespace/netdev to send/receive MPLS using native Linux MPLS
> routing.
> 
> > I also see this error on "http over mpls over gre tunnel":
> > +sh: 1: cannot create /proc/sys/net/mpls/conf/ns_gre0/input: Directory
> > nonexistent
> > 
> > Maybe MPLS + GRE needs a separate check?
> 
> Yes, that is probably the case.
> 
> I believe some versions of the kernel support MPLS routing for
> some interfaces but not GRE interfaces.

Please find my working patch below.

From: Simon H

Re: [PATCH] usbnet: ax88179_178a: Add support for writing the EEPROM

2016-08-25 Thread Oliver Neukum
On Thu, 2016-08-25 at 12:07 +0200, Alban Bedel wrote:
> On Thu, 25 Aug 2016 11:16:36 +0200
> Oliver Neukum  wrote:
> 
> > On Wed, 2016-08-24 at 16:40 +0200, Alban Bedel wrote:
> > > On Wed, 24 Aug 2016 16:30:39 +0200
> > > Oliver Neukum  wrote:
> > >   
> > > > On Wed, 2016-08-24 at 15:52 +0200, Alban Bedel wrote:  
> > 
> > > > > +   if (block != data)
> > > > > +   kfree(block);
> > > > 
> > > > And if block == data, what frees the memory?  
> > > 
> > > In this case this function didn't allocate any memory, so there is
> > > nothing to free.  
> > 
> > Hi,
> > 
> > I see. kfree() has a check for NULL, so you could drop the
> > test, but it doesn't matter much either way.
> 
> I think you misunderstand something here. data is the buffer passed
> by the caller and block is a local variable. There are two cases:
> 
> 1) The data to write is block aligned, then we use the caller buffer
>    as is and set block = data.
> 2) The requested data is not block aligned, then we allocate block.
> 
> In both cases the writing loop then uses the block pointer. Afterwards
> we only need to kfree block in case 2, that is when block != data.

Thanks for the clarification. Maybe worth a comment in the code?

Regards
Oliver




[PATCH net-next 4/6] net: dsa: b53: Add JOIN_ALL_VLAN support

2016-08-25 Thread Florian Fainelli
In order to migrate the bcm_sf2 driver over to the b53 driver for most
VLAN/FDB/bridge operations, we need to add support for the "join all
VLANs" register and behavior which allows us to make a given port join
all VLANs and avoid setting specific VLAN entries when it is leaving the
bridge.

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/b53/b53_common.c | 30 ++
 drivers/net/dsa/b53/b53_regs.h   |  3 +++
 2 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index e59d799880e4..4b270a401336 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1315,9 +1315,21 @@ static int b53_br_join(struct dsa_switch *ds, int port,
   struct net_device *bridge)
 {
struct b53_device *dev = ds_to_priv(ds);
+   s8 cpu_port = ds->dst->cpu_port;
u16 pvlan, reg;
unsigned int i;
 
+   /* Make this port leave the all VLANs join since we will have proper
+* VLAN entries from now on
+*/
+   if (is58xx(dev)) {
+   b53_read16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, &reg);
+   reg &= ~BIT(port);
+   if ((reg & BIT(cpu_port)) == BIT(cpu_port))
+   reg &= ~BIT(cpu_port);
+   b53_write16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, reg);
+   }
+
dev->ports[port].bridge_dev = bridge;
b53_read16(dev, B53_PVLAN_PAGE, B53_PVLAN_PORT_MASK(port), &pvlan);
 
@@ -1350,6 +1362,7 @@ static void b53_br_leave(struct dsa_switch *ds, int port)
struct b53_device *dev = ds_to_priv(ds);
struct net_device *bridge = dev->ports[port].bridge_dev;
struct b53_vlan *vl = &dev->vlans[0];
+   s8 cpu_port = ds->dst->cpu_port;
unsigned int i;
u16 pvlan, reg, pvid;
 
@@ -1379,10 +1392,19 @@ static void b53_br_leave(struct dsa_switch *ds, int port)
else
pvid = 0;
 
-   b53_get_vlan_entry(dev, pvid, vl);
-   vl->members |= BIT(port) | BIT(dev->cpu_port);
-   vl->untag |= BIT(port) | BIT(dev->cpu_port);
-   b53_set_vlan_entry(dev, pvid, vl);
+   /* Make this port join all VLANs without VLAN entries */
+   if (is58xx(dev)) {
+   b53_read16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, &reg);
+   reg |= BIT(port);
+   if (!(reg & BIT(cpu_port)))
+   reg |= BIT(cpu_port);
+   b53_write16(dev, B53_VLAN_PAGE, B53_JOIN_ALL_VLAN_EN, reg);
+   } else {
+   b53_get_vlan_entry(dev, pvid, vl);
+   vl->members |= BIT(port) | BIT(dev->cpu_port);
+   vl->untag |= BIT(port) | BIT(dev->cpu_port);
+   b53_set_vlan_entry(dev, pvid, vl);
+   }
 }
 
 static void b53_br_set_stp_state(struct dsa_switch *ds, int port,
diff --git a/drivers/net/dsa/b53/b53_regs.h b/drivers/net/dsa/b53/b53_regs.h
index a0b453ea34c9..dac0af4e2cd0 100644
--- a/drivers/net/dsa/b53/b53_regs.h
+++ b/drivers/net/dsa/b53/b53_regs.h
@@ -309,6 +309,9 @@
 /* Port VLAN mask (16 bit) IMP port is always 8, also on 5325 & co */
 #define B53_PVLAN_PORT_MASK(i) ((i) * 2)
 
+/* Join all VLANs register (16 bit) */
+#define B53_JOIN_ALL_VLAN_EN   0x50
+
 /*
  * 802.1Q Page Registers
  */
-- 
2.7.4
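
The register manipulation in the patch above boils down to clearing or setting two bits in a 16-bit bitmap. A minimal userspace sketch follows, assuming one bit per port in the join-all register; the helper names and the TOY_BIT macro are hypothetical and no hardware access is modelled.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_BIT(n) ((uint16_t)(1u << (n)))

/* Bridge join: the port, and the CPU port with it, stop joining all
 * VLANs because proper VLAN entries exist from now on. */
uint16_t join_all_vlan_leave(uint16_t reg, unsigned int port,
			     unsigned int cpu_port)
{
	assert(port < 16 && cpu_port < 16);

	return (uint16_t)(reg & ~(TOY_BIT(port) | TOY_BIT(cpu_port)));
}

/* Bridge leave: the port, and the CPU port with it, join all VLANs
 * without needing specific VLAN entries. */
uint16_t join_all_vlan_enter(uint16_t reg, unsigned int port,
			     unsigned int cpu_port)
{
	assert(port < 16 && cpu_port < 16);

	return (uint16_t)(reg | TOY_BIT(port) | TOY_BIT(cpu_port));
}
```

Clearing an already clear bit or setting an already set bit is a no-op, so the sketch does not test the CPU port bit before changing it.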



[PATCH net 02/10] net: ethernet: mediatek: fix incorrect return value of devm_clk_get with EPROBE_DEFER

2016-08-25 Thread Sean
From: Sean Wang 

If the return value of devm_clk_get is EPROBE_DEFER, we should
defer probing the driver. The change is verified and works based
on 4.8-rc1 staying with the latest clk-next code for MT7623.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 6e4a6ca..02b048f 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1851,8 +1851,15 @@ static int mtk_probe(struct platform_device *pdev)
eth->clk_gp1 = devm_clk_get(&pdev->dev, "gp1");
eth->clk_gp2 = devm_clk_get(&pdev->dev, "gp2");
if (IS_ERR(eth->clk_esw) || IS_ERR(eth->clk_gp1) ||
-   IS_ERR(eth->clk_gp2) || IS_ERR(eth->clk_ethif))
-   return -ENODEV;
+   IS_ERR(eth->clk_gp2) || IS_ERR(eth->clk_ethif)) {
+   if (PTR_ERR(eth->clk_esw) == -EPROBE_DEFER ||
+   PTR_ERR(eth->clk_gp1) == -EPROBE_DEFER ||
+   PTR_ERR(eth->clk_ethif) == -EPROBE_DEFER ||
+   PTR_ERR(eth->clk_gp2) == -EPROBE_DEFER)
+   return -EPROBE_DEFER;
+   else
+   return -ENODEV;
+   }
 
clk_prepare_enable(eth->clk_ethif);
clk_prepare_enable(eth->clk_esw);
-- 
1.9.1
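
The deferral logic can be sketched in userspace by looping over all of the clock handles, which avoids having to list each one by hand. The toy_ helpers below imitate the kernel's error-pointer convention and are hypothetical stand-ins, not the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() definitions; TOY_EPROBE_DEFER is defined locally because the value is kernel-internal.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_EPROBE_DEFER 517	/* kernel-internal value, absent from userspace <errno.h> */
#define TOY_MAX_ERRNO 4095

/* Encode a negative errno in a pointer, as the kernel convention does. */
static inline void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline int toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

static inline long toy_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Returns 0 when every handle is valid, -TOY_EPROBE_DEFER when any handle
 * asks for deferral, and -ENODEV for any other error. */
int toy_check_clocks(void *const clks[], size_t n)
{
	int err = 0;

	assert(clks != NULL);

	for (size_t i = 0; i < n; i++) {
		if (!toy_is_err(clks[i]))
			continue;
		if (toy_ptr_err(clks[i]) == -TOY_EPROBE_DEFER)
			return -TOY_EPROBE_DEFER;
		err = -ENODEV;
	}
	return err;
}
```

Deferral takes priority over other errors here, on the assumption that a retry after the clock provider has probed may resolve them as well.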



[PATCH net 03/10] net: ethernet: mediatek: fix API usage with skb_free_frag

2016-08-25 Thread Sean
From: Sean Wang 

Use skb_free_frag() instead of the legacy put_page().

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 02b048f..1b131a1 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -864,7 +864,7 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
/* receive data */
skb = build_skb(data, ring->frag_size);
if (unlikely(!skb)) {
-   put_page(virt_to_head_page(new_data));
+   skb_free_frag(new_data);
netdev->stats.rx_dropped++;
goto release_desc;
}
-- 
1.9.1



[PATCH net 06/10] net: ethernet: mediatek: fix the loss of pin-mux setting for GMAC2

2016-08-25 Thread Sean
From: Sean Wang 

The pin-mux setting was omitted, which results in incorrect signals
being routed on GMAC2.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 14 ++
 drivers/net/ethernet/mediatek/mtk_eth_soc.h |  3 +++
 2 files changed, 17 insertions(+)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 5bd31f8..0a4c782 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1415,6 +1415,7 @@ static int __init mtk_hw_init(struct mtk_eth *eth)
usleep_range(10, 20);
reset_control_deassert(eth->rstc);
usleep_range(10, 20);
+   pinctrl_select_state(eth->pins, eth->ephy_default);
 
/* Set GE2 driving and slew rate */
regmap_write(eth->pctl, GPIO_DRV_SEL10, 0xa00);
@@ -1858,6 +1859,19 @@ static int mtk_probe(struct platform_device *pdev)
return -ENODEV;
}
 
+   eth->pins = devm_pinctrl_get(&pdev->dev);
+   if (IS_ERR(eth->pins)) {
+   dev_err(&pdev->dev, "cannot get pinctrl\n");
+   return PTR_ERR(eth->pins);
+   }
+
+   eth->ephy_default =
+   pinctrl_lookup_state(eth->pins, "default");
+   if (IS_ERR(eth->ephy_default)) {
+   dev_err(&pdev->dev, "cannot get pinctrl state\n");
+   return PTR_ERR(eth->ephy_default);
+   }
+
clk_prepare_enable(eth->clk_ethif);
clk_prepare_enable(eth->clk_esw);
clk_prepare_enable(eth->clk_gp1);
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
index f82e3ac..13d3f1b 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
@@ -404,6 +404,9 @@ struct mtk_eth {
struct clk  *clk_esw;
struct clk  *clk_gp1;
struct clk  *clk_gp2;
+   struct pinctrl  *pins;
+   struct pinctrl_state*ephy_default;
+
struct mii_bus  *mii_bus;
struct work_struct  pending_work;
 };
-- 
1.9.1



[PATCH net 05/10] net: ethernet: mediatek: fix logic unbalance between probe and remove

2016-08-25 Thread Sean
From: Sean Wang 

The original mtk_mdio_cleanup() call is not placed symmetrically to
mtk_mdio_init(), so relocate it to the matching place in the remove path.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 9883dac..5bd31f8 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1504,7 +1504,6 @@ static void mtk_uninit(struct net_device *dev)
struct mtk_eth *eth = mac->hw;
 
phy_disconnect(mac->phy_dev);
-   mtk_mdio_cleanup(eth);
mtk_irq_disable(eth, ~0);
 }
 
@@ -1915,6 +1914,7 @@ static int mtk_remove(struct platform_device *pdev)
netif_napi_del(&eth->tx_napi);
netif_napi_del(&eth->rx_napi);
mtk_cleanup(eth);
+   mtk_mdio_cleanup(eth);
 
return 0;
 }
-- 
1.9.1



[PATCH net 07/10] net: ethernet: mediatek: fix issue of driver removal with interface is up

2016-08-25 Thread Sean
From: Sean Wang 

1) mtk_stop() must be called on module removal to free the DMA
resources acquired, and to restore the state changed, by mtk_open().

2) Group the clock-disabling calls into mtk_hw_deinit(), which can be
reused by other functionality such as the whole-ethernet reset that
will be posted in a later series of patches.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 22 ++
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 0a4c782..c573475 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1478,6 +1478,16 @@ static int __init mtk_hw_init(struct mtk_eth *eth)
return 0;
 }
 
+static int mtk_hw_deinit(struct mtk_eth *eth)
+{
+   clk_disable_unprepare(eth->clk_esw);
+   clk_disable_unprepare(eth->clk_gp1);
+   clk_disable_unprepare(eth->clk_gp2);
+   clk_disable_unprepare(eth->clk_ethif);
+
+   return 0;
+}
+
 static int __init mtk_init(struct net_device *dev)
 {
struct mtk_mac *mac = netdev_priv(dev);
@@ -1919,11 +1929,15 @@ err_free_dev:
 static int mtk_remove(struct platform_device *pdev)
 {
struct mtk_eth *eth = platform_get_drvdata(pdev);
+   int i;
 
-   clk_disable_unprepare(eth->clk_ethif);
-   clk_disable_unprepare(eth->clk_esw);
-   clk_disable_unprepare(eth->clk_gp1);
-   clk_disable_unprepare(eth->clk_gp2);
+   /* stop all devices to make sure that dma is properly shut down */
+   for (i = 0; i < MTK_MAC_COUNT; i++) {
+   if (!eth->netdev[i])
+   continue;
+   mtk_stop(eth->netdev[i]);
+   }
+   mtk_hw_deinit(eth);
 
netif_napi_del(&eth->tx_napi);
netif_napi_del(&eth->rx_napi);
-- 
1.9.1



[PATCH net 09/10] net: ethernet: mediatek: use devm_mdiobus_alloc instead of mdiobus_alloc inside mtk_mdio_init

2016-08-25 Thread Sean
From: Sean Wang 

Many parts of the driver use devm_* APIs to benefit from device
resource management, so use devm_mdiobus_alloc() instead of
mdiobus_alloc() for a simpler code flow.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 13 +
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index e3baa63..05d85da 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -304,7 +304,7 @@ static int mtk_mdio_init(struct mtk_eth *eth)
goto err_put_node;
}
 
-   eth->mii_bus = mdiobus_alloc();
+   eth->mii_bus = devm_mdiobus_alloc(eth->dev);
if (!eth->mii_bus) {
err = -ENOMEM;
goto err_put_node;
@@ -318,18 +318,9 @@ static int mtk_mdio_init(struct mtk_eth *eth)
 
snprintf(eth->mii_bus->id, MII_BUS_ID_SIZE, "%s", mii_np->name);
err = of_mdiobus_register(eth->mii_bus, mii_np);
-   if (err)
-   goto err_free_bus;
-   of_node_put(mii_np);
-
-   return 0;
-
-err_free_bus:
-   mdiobus_free(eth->mii_bus);
 
 err_put_node:
of_node_put(mii_np);
-   eth->mii_bus = NULL;
return err;
 }
 
@@ -339,8 +330,6 @@ static void mtk_mdio_cleanup(struct mtk_eth *eth)
return;
 
mdiobus_unregister(eth->mii_bus);
-   of_node_put(eth->mii_bus->dev.of_node);
-   mdiobus_free(eth->mii_bus);
 }
 
 static inline void mtk_irq_disable(struct mtk_eth *eth, u32 mask)
-- 
1.9.1



[PATCH net 08/10] net: ethernet: mediatek: fix the missing of_node_put() after node is used done inside mtk_mdio_init

2016-08-25 Thread Sean
From: Sean Wang 

This patch adds the missing of_node_put() after finishing the usage
of of_get_child_by_name.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index c573475..e3baa63 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -320,6 +320,7 @@ static int mtk_mdio_init(struct mtk_eth *eth)
err = of_mdiobus_register(eth->mii_bus, mii_np);
if (err)
goto err_free_bus;
+   of_node_put(mii_np);
 
return 0;
 
-- 
1.9.1



[PATCH net 10/10] net: ethernet: mediatek: fix error handling inside mtk_mdio_init

2016-08-25 Thread Sean
From: Sean Wang 

Return -ENODEV if no child is found on the MDIO bus.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 05d85da..2d547c2 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -300,7 +300,7 @@ static int mtk_mdio_init(struct mtk_eth *eth)
}
 
if (!of_device_is_available(mii_np)) {
-   err = 0;
+   err = -ENODEV;
goto err_put_node;
}
 
-- 
1.9.1
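
The control flow that mtk_mdio_init() ends up with after patches 08 to 10 (a single exit label that drops the node reference on success and on failure, and -ENODEV for an unavailable node) can be sketched in userspace as below. Everything prefixed with fake_ is hypothetical and only models a reference-counted device-tree node; the early return for a missing node is an assumption, since that part of the function is not shown in the hunks.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted device-tree node. */
struct fake_node {
	int refcount;
	bool available;
};

/* Models of_get_child_by_name(): takes a reference on success. */
static struct fake_node *fake_node_get(struct fake_node *np)
{
	if (np)
		np->refcount++;
	return np;
}

/* Models of_node_put(): drops the reference taken above. */
static void fake_node_put(struct fake_node *np)
{
	assert(np && np->refcount > 0);
	np->refcount--;
}

/* Sketch of the init flow: every path after the lookup ends at one label. */
static int fake_mdio_init(struct fake_node *mdio_node)
{
	struct fake_node *np = fake_node_get(mdio_node);
	int err = 0;

	if (!np)
		return -ENODEV;	/* assumed: nothing to put if lookup failed */

	if (!np->available) {
		err = -ENODEV;	/* patch 10: report the error instead of 0 */
		goto err_put_node;
	}

	/* Bus allocation and registration would set err here. */

err_put_node:
	fake_node_put(np);	/* patches 08/09: always drop the reference */
	return err;
}
```

Falling through to the label on success keeps the reference count balanced without a second of_node_put() call on the success path.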



[PATCH net 04/10] net: ethernet: mediatek: remove redundant free_irq for devm_request_irq allocated irq

2016-08-25 Thread Sean
From: Sean Wang 

These IRQs are not shared and are disabled when the ethernet device stops.
An IRQ requested with devm_request_irq() is freed automatically on
driver detach, so the explicit free_irq() calls are redundant.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 1b131a1..9883dac 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1506,8 +1506,6 @@ static void mtk_uninit(struct net_device *dev)
phy_disconnect(mac->phy_dev);
mtk_mdio_cleanup(eth);
mtk_irq_disable(eth, ~0);
-   free_irq(eth->irq[1], dev);
-   free_irq(eth->irq[2], dev);
 }
 
 static int mtk_do_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
-- 
1.9.1



[PATCH net 00/10] net: ethernet: mediatek: a couple of fixes

2016-08-25 Thread Sean
From: Sean Wang 

A couple of fixes that came out of integrating with linux-4.8 rc1.
All of them have been verified to work on linux-4.8 rc1.

Sean Wang (10):
  net: ethernet: mediatek: fix fails from TX housekeeping due to
incorrect port setup
  net: ethernet: mediatek: fix incorrect return value of devm_clk_get
with EPROBE_DEFER
  net: ethernet: mediatek: fix API usage with skb_free_frag
  net: ethernet: mediatek: remove redundant free_irq for
devm_request_irq allocated irq
  net: ethernet: mediatek: fix logic unbalance between probe and remove
  net: ethernet: mediatek: fix the loss of pin-mux setting for GMAC2
  net: ethernet: mediatek: fix issue of driver removal with interface is
up
  net: ethernet: mediatek: fix the missing of_node_put() after node is
used done inside mtk_mdio_init
  net: ethernet: mediatek: use devm_mdiobus_alloc instead of
mdiobus_alloc inside mtk_mdio_init
  net: ethernet: mediatek: fix error handling inside mtk_mdio_init

 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 74 +++--
 drivers/net/ethernet/mediatek/mtk_eth_soc.h |  3 ++
 2 files changed, 52 insertions(+), 25 deletions(-)

-- 
1.9.1



[PATCH net 01/10] net: ethernet: mediatek: fix fails from TX housekeeping due to incorrect port setup

2016-08-25 Thread Sean
From: Sean Wang 

Which net device a completed SKB belongs to is determined by the forward
port in txd4 of the corresponding TX descriptor, but this information is
not set for the descriptors of SKB fragments, which leads to a watchdog
timeout from the upper layer. Fix it up.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 1801fd8..6e4a6ca 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -587,14 +587,15 @@ static int mtk_tx_map(struct sk_buff *skb, struct 
net_device *dev,
dma_addr_t mapped_addr;
unsigned int nr_frags;
int i, n_desc = 1;
-   u32 txd4 = 0;
+   u32 txd4 = 0, fport;
 
itxd = ring->next_free;
if (itxd == ring->last_free)
return -ENOMEM;
 
/* set the forward port */
-   txd4 |= (mac->id + 1) << TX_DMA_FPORT_SHIFT;
+   fport = (mac->id + 1) << TX_DMA_FPORT_SHIFT;
+   txd4 |= fport;
 
tx_buf = mtk_desc_to_tx_buf(ring, itxd);
memset(tx_buf, 0, sizeof(*tx_buf));
@@ -652,7 +653,7 @@ static int mtk_tx_map(struct sk_buff *skb, struct 
net_device *dev,
WRITE_ONCE(txd->txd3, (TX_DMA_SWC |
   TX_DMA_PLEN0(frag_map_size) |
   last_frag * TX_DMA_LS0));
-   WRITE_ONCE(txd->txd4, 0);
+   WRITE_ONCE(txd->txd4, fport);
 
tx_buf->skb = (struct sk_buff *)MTK_DMA_DUMMY_DESC;
tx_buf = mtk_desc_to_tx_buf(ring, txd);
-- 
1.9.1
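
To make the effect of the patch above concrete, here is a small userspace sketch of the idea: the forward-port bits are computed once and written into txd4 of every descriptor of a packet, not only the first one. The shift value, the two-MAC limit, fill_txd4() and head_flags are assumptions made for illustration; the real definitions live in mtk_eth_soc.h and mtk_tx_map().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; see mtk_eth_soc.h for the real one. */
#define TX_DMA_FPORT_SHIFT 25

/* Hypothetical helper: forward-port bits for a 0-based MAC index. */
static uint32_t tx_fport_bits(unsigned int mac_id)
{
	assert(mac_id < 2);	/* assumed: the SoC exposes two MACs */
	return (uint32_t)(mac_id + 1) << TX_DMA_FPORT_SHIFT;
}

/*
 * Hypothetical helper: fill the txd4 word of each descriptor of one
 * packet. Before the fix, fragment descriptors were written as 0, so
 * completion handling could not tell which net device they belonged to.
 */
static void fill_txd4(uint32_t *txd4, unsigned int n_desc,
		      unsigned int mac_id, uint32_t head_flags)
{
	uint32_t fport = tx_fport_bits(mac_id);
	unsigned int i;

	for (i = 0; i < n_desc; i++)
		txd4[i] = fport;	/* every descriptor carries the port */
	if (n_desc)
		txd4[0] |= head_flags;	/* extra flags only on the head */
}
```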



[RESEND PATCH net 08/10] net: ethernet: mediatek: fix the missing of_node_put() after node is used done inside mtk_mdio_init

2016-08-25 Thread Sean Wang
This patch adds the missing of_node_put() after finishing the usage
of of_get_child_by_name.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index c573475..e3baa63 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -320,6 +320,7 @@ static int mtk_mdio_init(struct mtk_eth *eth)
err = of_mdiobus_register(eth->mii_bus, mii_np);
if (err)
goto err_free_bus;
+   of_node_put(mii_np);
 
return 0;
 
-- 
1.9.1



[RESEND PATCH net 09/10] net: ethernet: mediatek: use devm_mdiobus_alloc instead of mdiobus_alloc inside mtk_mdio_init

2016-08-25 Thread Sean Wang
Many parts of the driver use devm_* APIs to benefit from device
resource management, so use devm_mdiobus_alloc() instead of
mdiobus_alloc() for a simpler code flow.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 13 +
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index e3baa63..05d85da 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -304,7 +304,7 @@ static int mtk_mdio_init(struct mtk_eth *eth)
goto err_put_node;
}
 
-   eth->mii_bus = mdiobus_alloc();
+   eth->mii_bus = devm_mdiobus_alloc(eth->dev);
if (!eth->mii_bus) {
err = -ENOMEM;
goto err_put_node;
@@ -318,18 +318,9 @@ static int mtk_mdio_init(struct mtk_eth *eth)
 
snprintf(eth->mii_bus->id, MII_BUS_ID_SIZE, "%s", mii_np->name);
err = of_mdiobus_register(eth->mii_bus, mii_np);
-   if (err)
-   goto err_free_bus;
-   of_node_put(mii_np);
-
-   return 0;
-
-err_free_bus:
-   mdiobus_free(eth->mii_bus);
 
 err_put_node:
of_node_put(mii_np);
-   eth->mii_bus = NULL;
return err;
 }
 
@@ -339,8 +330,6 @@ static void mtk_mdio_cleanup(struct mtk_eth *eth)
return;
 
mdiobus_unregister(eth->mii_bus);
-   of_node_put(eth->mii_bus->dev.of_node);
-   mdiobus_free(eth->mii_bus);
 }
 
 static inline void mtk_irq_disable(struct mtk_eth *eth, u32 mask)
-- 
1.9.1



[RESEND PATCH net 01/10] net: ethernet: mediatek: fix fails from TX housekeeping due to incorrect port setup

2016-08-25 Thread Sean Wang
Which net device a completed SKB belongs to is determined by the forward
port in txd4 of the corresponding TX descriptor, but this information is
not set for the descriptors of SKB fragments, which leads to a watchdog
timeout from the upper layer. Fix it up.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 1801fd8..6e4a6ca 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -587,14 +587,15 @@ static int mtk_tx_map(struct sk_buff *skb, struct 
net_device *dev,
dma_addr_t mapped_addr;
unsigned int nr_frags;
int i, n_desc = 1;
-   u32 txd4 = 0;
+   u32 txd4 = 0, fport;
 
itxd = ring->next_free;
if (itxd == ring->last_free)
return -ENOMEM;
 
/* set the forward port */
-   txd4 |= (mac->id + 1) << TX_DMA_FPORT_SHIFT;
+   fport = (mac->id + 1) << TX_DMA_FPORT_SHIFT;
+   txd4 |= fport;
 
tx_buf = mtk_desc_to_tx_buf(ring, itxd);
memset(tx_buf, 0, sizeof(*tx_buf));
@@ -652,7 +653,7 @@ static int mtk_tx_map(struct sk_buff *skb, struct 
net_device *dev,
WRITE_ONCE(txd->txd3, (TX_DMA_SWC |
   TX_DMA_PLEN0(frag_map_size) |
   last_frag * TX_DMA_LS0));
-   WRITE_ONCE(txd->txd4, 0);
+   WRITE_ONCE(txd->txd4, fport);
 
tx_buf->skb = (struct sk_buff *)MTK_DMA_DUMMY_DESC;
tx_buf = mtk_desc_to_tx_buf(ring, txd);
-- 
1.9.1



[RESEND PATCH net 04/10] net: ethernet: mediatek: remove redundant free_irq for devm_request_irq allocated irq

2016-08-25 Thread Sean Wang
These IRQs are not shared and are disabled when the ethernet device stops.
An IRQ requested with devm_request_irq() is freed automatically on
driver detach, so the explicit free_irq() calls are redundant.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 1b131a1..9883dac 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1506,8 +1506,6 @@ static void mtk_uninit(struct net_device *dev)
phy_disconnect(mac->phy_dev);
mtk_mdio_cleanup(eth);
mtk_irq_disable(eth, ~0);
-   free_irq(eth->irq[1], dev);
-   free_irq(eth->irq[2], dev);
 }
 
 static int mtk_do_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
-- 
1.9.1



[RESEND PATCH net 06/10] net: ethernet: mediatek: fix the loss of pin-mux setting for GMAC2

2016-08-25 Thread Sean Wang
The pin-mux setting was omitted, which results in incorrect signals
being routed on GMAC2.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 14 ++
 drivers/net/ethernet/mediatek/mtk_eth_soc.h |  3 +++
 2 files changed, 17 insertions(+)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 5bd31f8..0a4c782 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1415,6 +1415,7 @@ static int __init mtk_hw_init(struct mtk_eth *eth)
usleep_range(10, 20);
reset_control_deassert(eth->rstc);
usleep_range(10, 20);
+   pinctrl_select_state(eth->pins, eth->ephy_default);
 
/* Set GE2 driving and slew rate */
regmap_write(eth->pctl, GPIO_DRV_SEL10, 0xa00);
@@ -1858,6 +1859,19 @@ static int mtk_probe(struct platform_device *pdev)
return -ENODEV;
}
 
+   eth->pins = devm_pinctrl_get(&pdev->dev);
+   if (IS_ERR(eth->pins)) {
+   dev_err(&pdev->dev, "cannot get pinctrl\n");
+   return PTR_ERR(eth->pins);
+   }
+
+   eth->ephy_default =
+   pinctrl_lookup_state(eth->pins, "default");
+   if (IS_ERR(eth->ephy_default)) {
+   dev_err(&pdev->dev, "cannot get pinctrl state\n");
+   return PTR_ERR(eth->ephy_default);
+   }
+
clk_prepare_enable(eth->clk_ethif);
clk_prepare_enable(eth->clk_esw);
clk_prepare_enable(eth->clk_gp1);
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
index f82e3ac..13d3f1b 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
@@ -404,6 +404,9 @@ struct mtk_eth {
struct clk  *clk_esw;
struct clk  *clk_gp1;
struct clk  *clk_gp2;
+   struct pinctrl  *pins;
+   struct pinctrl_state*ephy_default;
+
struct mii_bus  *mii_bus;
struct work_struct  pending_work;
 };
-- 
1.9.1



[RESEND PATCH net 00/10] net: ethernet: mediatek: a couple of fixes

2016-08-25 Thread Sean Wang
A couple of fixes that came out of integrating with linux-4.8 rc1.
All of them have been verified to work on linux-4.8 rc1.

Sean Wang (10):
  net: ethernet: mediatek: fix fails from TX housekeeping due to
incorrect port setup
  net: ethernet: mediatek: fix incorrect return value of devm_clk_get
with EPROBE_DEFER
  net: ethernet: mediatek: fix API usage with skb_free_frag
  net: ethernet: mediatek: remove redundant free_irq for
devm_request_irq allocated irq
  net: ethernet: mediatek: fix logic unbalance between probe and remove
  net: ethernet: mediatek: fix the loss of pin-mux setting for GMAC2
  net: ethernet: mediatek: fix issue of driver removal with interface is
up
  net: ethernet: mediatek: fix the missing of_node_put() after node is
used done inside mtk_mdio_init
  net: ethernet: mediatek: use devm_mdiobus_alloc instead of
mdiobus_alloc inside mtk_mdio_init
  net: ethernet: mediatek: fix error handling inside mtk_mdio_init

 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 74 +++--
 drivers/net/ethernet/mediatek/mtk_eth_soc.h |  3 ++
 2 files changed, 52 insertions(+), 25 deletions(-)

-- 
1.9.1



[RESEND PATCH net 10/10] net: ethernet: mediatek: fix error handling inside mtk_mdio_init

2016-08-25 Thread Sean Wang
Return -ENODEV if no child is found on the MDIO bus.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 05d85da..2d547c2 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -300,7 +300,7 @@ static int mtk_mdio_init(struct mtk_eth *eth)
}
 
if (!of_device_is_available(mii_np)) {
-   err = 0;
+   err = -ENODEV;
goto err_put_node;
}
 
-- 
1.9.1



[RESEND PATCH net 07/10] net: ethernet: mediatek: fix issue of driver removal with interface is up

2016-08-25 Thread Sean Wang
1) mtk_stop() must be called on module removal to free the DMA
resources acquired, and to restore the state changed, by mtk_open().

2) Group the clock-disabling calls into mtk_hw_deinit(), which can be
reused by other functionality such as the whole-ethernet reset that
will be posted in a later series of patches.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 22 ++
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 0a4c782..c573475 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1478,6 +1478,16 @@ static int __init mtk_hw_init(struct mtk_eth *eth)
return 0;
 }
 
+static int mtk_hw_deinit(struct mtk_eth *eth)
+{
+   clk_disable_unprepare(eth->clk_esw);
+   clk_disable_unprepare(eth->clk_gp1);
+   clk_disable_unprepare(eth->clk_gp2);
+   clk_disable_unprepare(eth->clk_ethif);
+
+   return 0;
+}
+
 static int __init mtk_init(struct net_device *dev)
 {
struct mtk_mac *mac = netdev_priv(dev);
@@ -1919,11 +1929,15 @@ err_free_dev:
 static int mtk_remove(struct platform_device *pdev)
 {
struct mtk_eth *eth = platform_get_drvdata(pdev);
+   int i;
 
-   clk_disable_unprepare(eth->clk_ethif);
-   clk_disable_unprepare(eth->clk_esw);
-   clk_disable_unprepare(eth->clk_gp1);
-   clk_disable_unprepare(eth->clk_gp2);
+   /* stop all devices to make sure that dma is properly shut down */
+   for (i = 0; i < MTK_MAC_COUNT; i++) {
+   if (!eth->netdev[i])
+   continue;
+   mtk_stop(eth->netdev[i]);
+   }
+   mtk_hw_deinit(eth);
 
netif_napi_del(&eth->tx_napi);
netif_napi_del(&eth->rx_napi);
-- 
1.9.1



[RFC v2 02/10] bpf: Move u64_to_ptr() to BPF headers and inline it

2016-08-25 Thread Mickaël Salaün
This helper will be useful for arraymap (next commit).

Signed-off-by: Mickaël Salaün 
Cc: Alexei Starovoitov 
Cc: David S. Miller 
Cc: Daniel Borkmann 
---
 include/linux/bpf.h  | 6 ++
 kernel/bpf/syscall.c | 6 --
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 0de4de6dd43e..ca3742729ae7 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -251,6 +251,12 @@ static inline void bpf_long_memcpy(void *dst, const void 
*src, u32 size)
 
 /* verify correctness of eBPF program */
 int bpf_check(struct bpf_prog **fp, union bpf_attr *attr);
+
+/* helper to convert user pointers passed inside __aligned_u64 fields */
+static inline void __user *u64_to_ptr(__u64 val)
+{
+   return (void __user *) (unsigned long) val;
+}
 #else
 static inline void bpf_register_prog_type(struct bpf_prog_type_list *tl)
 {
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 46ecce4b79ed..d305a3ce0fa7 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -247,12 +247,6 @@ struct bpf_map *bpf_map_get_with_uref(u32 ufd)
return map;
 }
 
-/* helper to convert user pointers passed inside __aligned_u64 fields */
-static void __user *u64_to_ptr(__u64 val)
-{
-   return (void __user *) (unsigned long) val;
-}
-
 int __weak bpf_stackmap_copy(struct bpf_map *map, void *key, void *value)
 {
return -ENOTSUPP;
-- 
2.8.1
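
As a userspace illustration of what the helper moved by this patch does, the sketch below round-trips a pointer through a 64-bit field, the way a pointer travels inside a __aligned_u64 member of a syscall attribute. ptr_to_u64() and struct fake_attr are hypothetical names added for the example, and uintptr_t is used where the kernel casts through unsigned long (which has pointer width there).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace analogue of u64_to_ptr(): recover a pointer from a u64. */
static inline void *u64_to_ptr(uint64_t val)
{
	return (void *)(uintptr_t)val;
}

/* Hypothetical inverse, as the caller filling the field would use. */
static inline uint64_t ptr_to_u64(const void *ptr)
{
	/* A pointer must fit in the 64-bit field for this to be lossless. */
	assert(sizeof(ptr) <= sizeof(uint64_t));
	return (uint64_t)(uintptr_t)ptr;
}

/* Hypothetical attribute struct carrying a pointer as a 64-bit integer. */
struct fake_attr {
	uint64_t key;
};
```

Carrying pointers as fixed-width integers keeps the structure layout identical for 32-bit and 64-bit callers, which is why the conversion goes through an integer type of pointer width first.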



[RFC v2 08/10] landlock: Handle file system comparisons

2016-08-25 Thread Mickaël Salaün
Add eBPF functions to compare file system access with a Landlock file
system handle:
* bpf_landlock_cmp_fs_prop_with_struct_file(prop, map, map_op, file)
  This function compares the dentry, inode, device or mount point of
  the currently accessed file with a reference handle.
* bpf_landlock_cmp_fs_beneath_with_struct_file(opt, map, map_op, file)
  This function allows an eBPF program to check if the currently accessed
  file is the same as, or in the hierarchy of, a reference handle.

The goal of a file system handle is to abstract kernel objects such as a
struct file or a struct inode. Userland can create this kind of handle
with the BPF_MAP_UPDATE_ELEM command. The element is a struct
landlock_handle containing the handle type (e.g.
BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD) and a file descriptor. This could
also be any description able to match a struct file or a struct inode
(e.g. path or glob string).

Signed-off-by: Mickaël Salaün 
Cc: Kees Cook 
Cc: Alexei Starovoitov 
Cc: James Morris 
Cc: Serge E. Hallyn 
Cc: David S. Miller 
Cc: Daniel Borkmann 
---
 include/linux/bpf.h|   4 +-
 include/uapi/linux/bpf.h   |  52 +++-
 kernel/bpf/arraymap.c  |  17 +++-
 kernel/bpf/verifier.c  |   6 ++
 security/landlock/Makefile |   2 +-
 security/landlock/checker_fs.c | 183 +
 security/landlock/checker_fs.h |  20 +
 security/landlock/lsm.c|  11 ++-
 8 files changed, 288 insertions(+), 7 deletions(-)
 create mode 100644 security/landlock/checker_fs.c
 create mode 100644 security/landlock/checker_fs.h

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 557e7efdf0cd..79014aedbea4 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -84,6 +84,7 @@ enum bpf_arg_type {
 
ARG_PTR_TO_STRUCT_FILE, /* pointer to struct file */
ARG_PTR_TO_STRUCT_CRED, /* pointer to struct cred */
+   ARG_CONST_PTR_TO_LANDLOCK_HANDLE_FS,/* pointer to Landlock FS 
handle */
 };
 
 /* type of values returned from helper functions */
@@ -146,6 +147,7 @@ enum bpf_reg_type {
/* Landlock */
PTR_TO_STRUCT_FILE,
PTR_TO_STRUCT_CRED,
+   CONST_PTR_TO_LANDLOCK_HANDLE_FS,
 };
 
 struct bpf_prog;
@@ -207,7 +209,7 @@ struct bpf_array {
 
 #ifdef CONFIG_SECURITY_LANDLOCK
 struct map_landlock_handle {
-   u32 type;
+   u32 type; /* e.g. BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD */
union {
struct file *file;
};
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 983d14e910ff..88af79dd668c 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -89,10 +89,20 @@ enum bpf_map_type {
 
 enum bpf_map_array_type {
BPF_MAP_ARRAY_TYPE_UNSPEC,
+   BPF_MAP_ARRAY_TYPE_LANDLOCK_FS,
 };
 
 enum bpf_map_handle_type {
BPF_MAP_HANDLE_TYPE_UNSPEC,
+   BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD,
+   BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_GLOB,
+};
+
+enum bpf_map_array_op {
+   BPF_MAP_ARRAY_OP_UNSPEC,
+   BPF_MAP_ARRAY_OP_OR,
+   BPF_MAP_ARRAY_OP_AND,
+   BPF_MAP_ARRAY_OP_XOR,
 };
 
 enum bpf_prog_type {
@@ -325,6 +335,35 @@ enum bpf_func_id {
 */
BPF_FUNC_skb_get_tunnel_opt,
BPF_FUNC_skb_set_tunnel_opt,
+
+   /**
+* bpf_landlock_cmp_fs_prop_with_struct_file(prop, map, map_op, file)
+* Compare file system handles with a struct file
+*
+* @prop: properties to check against (e.g. LANDLOCK_FLAG_FS_DENTRY)
+* @map: handles to compare against
+* @map_op: which elements of the map to use (e.g. BPF_MAP_ARRAY_OP_OR)
+* @file: struct file address to compare with (taken from the context)
+*
+* Return: 0 if the file match the handles, 1 otherwise, or a negative
+* value if an error occurred.
+*/
+   BPF_FUNC_landlock_cmp_fs_prop_with_struct_file,
+
+   /**
+* bpf_landlock_cmp_fs_beneath_with_struct_file(opt, map, map_op, file)
+* Check if a struct file is a leaf of file system handles
+*
+* @opt: check options (e.g. LANDLOCK_FLAG_OPT_REVERSE)
+* @map: handles to compare against
+* @map_op: which elements of the map to use (e.g. BPF_MAP_ARRAY_OP_OR)
+* @file: struct file address to compare with (taken from the context)
+*
+* Return: 0 if the file is the same or beneath the handles,
+* 1 otherwise, or a negative value if an error occurred.
+*/
+   BPF_FUNC_landlock_cmp_fs_beneath_with_struct_file,
+
__BPF_FUNC_MAX_ID,
 };
 
@@ -398,6 +437,17 @@ struct bpf_tunnel_key {
__u32 tunnel_label;
 };
 
+/* Handle check flags */
+#define LANDLOCK_FLAG_FS_DENTRY(1 << 0)
+#define LANDLOCK_FLAG_FS_INODE (1 << 1)
+#define LANDLOCK_FLAG_FS_DEVICE(1 << 2)
+#define LANDLOCK_FLAG_FS_MOUNT (1 << 3)
+#define _LANDLOCK_FLAG_FS_MASK ((1 << 4) - 1)
+
+/* Handle option flags */
+#define LANDLO

[RFC v2 10/10] samples/landlock: Add sandbox example

2016-08-25 Thread Mickaël Salaün
Add a basic sandbox tool to create a process isolated from some parts of
the system. This can depend on the current cgroup.

Example:

  $ mkdir /sys/fs/cgroup/sandboxed
  $ ls /home
  user1
  $ LANDLOCK_CGROUPS='/sys/fs/cgroup/sandboxed' \
  LANDLOCK_ALLOWED='/bin:/lib:/usr:/tmp:/proc/self/fd/0' \
  ./sandbox /bin/sh -i
  $ ls /home
  user1
  $ echo $$ > /sys/fs/cgroup/sandboxed/cgroup.procs
  $ ls /home
  ls: cannot open directory '/home': Permission denied

Signed-off-by: Mickaël Salaün 
Cc: Kees Cook 
Cc: Alexei Starovoitov 
Cc: James Morris 
Cc: Serge E. Hallyn 
Cc: David S. Miller 
Cc: Daniel Borkmann 
---
 samples/Makefile|   2 +-
 samples/landlock/.gitignore |   1 +
 samples/landlock/Makefile   |  16 +++
 samples/landlock/sandbox.c  | 295 
 4 files changed, 313 insertions(+), 1 deletion(-)
 create mode 100644 samples/landlock/.gitignore
 create mode 100644 samples/landlock/Makefile
 create mode 100644 samples/landlock/sandbox.c

diff --git a/samples/Makefile b/samples/Makefile
index 2e3b523d7097..42e6a613f728 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -2,4 +2,4 @@
 
 obj-$(CONFIG_SAMPLES)  += kobject/ kprobes/ trace_events/ livepatch/ \
   hw_breakpoint/ kfifo/ kdb/ hidraw/ rpmsg/ seccomp/ \
-  configfs/ connector/ v4l/
+  configfs/ connector/ v4l/ landlock/
diff --git a/samples/landlock/.gitignore b/samples/landlock/.gitignore
new file mode 100644
index ..f6c6da930a30
--- /dev/null
+++ b/samples/landlock/.gitignore
@@ -0,0 +1 @@
+/sandbox
diff --git a/samples/landlock/Makefile b/samples/landlock/Makefile
new file mode 100644
index ..d1044b2afd27
--- /dev/null
+++ b/samples/landlock/Makefile
@@ -0,0 +1,16 @@
+# kbuild trick to avoid linker error. Can be omitted if a module is built.
+obj- := dummy.o
+
+hostprogs-$(CONFIG_SECURITY_LANDLOCK) := sandbox
+sandbox-objs := sandbox.o
+
+always := $(hostprogs-y)
+
+HOSTCFLAGS += -I$(objtree)/usr/include
+
+# Trick to allow make to be run from this directory
+all:
+   $(MAKE) -C ../../ $$PWD/
+
+clean:
+   $(MAKE) -C ../../ M=$$PWD clean
diff --git a/samples/landlock/sandbox.c b/samples/landlock/sandbox.c
new file mode 100644
index ..86604963c30c
--- /dev/null
+++ b/samples/landlock/sandbox.c
@@ -0,0 +1,295 @@
+/*
+ * Landlock LSM - Sandbox Example
+ *
+ * Copyright (C) 2016  Mickaël Salaün 
+ *
+ * The code may be used by anyone for any purpose, and can serve as a starting
+ * point for developing a sandbox.
+ */
+
+#define _GNU_SOURCE
+#include 
+#include  /* open() */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "../../tools/include/linux/filter.h"
+
+#include "../bpf/libbpf.c"
+
+#ifndef seccomp
+static int seccomp(unsigned int op, unsigned int flags, void *args)
+{
+   errno = 0;
+   return syscall(__NR_seccomp, op, flags, args);
+}
+#endif
+
+#define ARRAY_SIZE(a)  (sizeof(a) / sizeof(a[0]))
+
+static int apply_sandbox(const char **allowed_paths, int path_nb, const char 
**cgroup_paths, int cgroup_nb)
+{
+   __u32 key;
+   int i, ret = 0, map_fs = -1, map_cg = -1, offset;
+
+   /* set up the test sandbox */
+   if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
+   perror("prctl(no_new_priv)");
+   return 1;
+   }
+
+   /* register a new syscall filter */
+   struct sock_filter filter0[] = {
+   /* pass a cookie containing 5 to the LSM hook filter */
+   BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_LANDLOCK | 5),
+   };
+   struct sock_fprog prog0 = {
+   .len = (unsigned short)ARRAY_SIZE(filter0),
+   .filter = filter0,
+   };
+   if (seccomp(SECCOMP_SET_MODE_FILTER, 0, &prog0)) {
+   perror("seccomp(set_filter)");
+   return 1;
+   }
+
+   if (path_nb) {
+   map_fs = bpf_create_map(BPF_MAP_TYPE_LANDLOCK_ARRAY, 
sizeof(key), sizeof(struct landlock_handle), 10, 0);
+   if (map_fs < 0) {
+   fprintf(stderr, "bpf_create_map(fs");
+   perror(")");
+   return 1;
+   }
+   for (key = 0; key < path_nb; key++) {
+   int fd = open(allowed_paths[key], O_RDONLY | O_CLOEXEC);
+   if (fd < 0) {
+   fprintf(stderr, "open(fs: \"%s\"", 
allowed_paths[key]);
+   perror(")");
+   return 1;
+   }
+   struct landlock_handle handle = {
+   .type = BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD,
+   .fd = (__u64)fd,
+   };
+
+   /* register a new LSM handle */
+   if (bpf_update_elem(map_f


[RFC v2 07/10] landlock: Add errno check

2016-08-25 Thread Mickaël Salaün
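Add a max errno value (_ERRNO_LAST) and use it to bound the value returned by
a Landlock program: anything above it is replaced by EPERM.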

This is not strictly needed but should improve reliability.
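
A minimal user-space sketch of the bounding rule this patch adds, assuming the
same behaviour as the lsm.c hunk (a result above the last base errno becomes
EPERM). The helper name sketch_clamp_ret() and the SKETCH_ERRNO_LAST constant
are illustrative only and are not part of the patch.

```c
#include <errno.h>

/* Assumption: mirrors _ERRNO_LAST from the patch, i.e. ERANGE. */
#define SKETCH_ERRNO_LAST ((unsigned int)ERANGE)

/* Hypothetical helper: keep a program result inside the base errno range. */
static int sketch_clamp_ret(unsigned int cur_ret)
{
	if (cur_ret > SKETCH_ERRNO_LAST)
		return EPERM;	/* out-of-range values are turned into a denial */
	return (int)cur_ret;
}
```

With this rule a program can return 0 (allow) or any base errno unchanged, and
cannot hand an arbitrary value back to kernel code.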

Signed-off-by: Mickaël Salaün 
Cc: Arnd Bergmann 
Cc: Serge E. Hallyn 
Cc: James Morris 
Cc: Kees Cook 
---
 include/uapi/asm-generic/errno-base.h | 1 +
 security/landlock/lsm.c   | 6 +++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/uapi/asm-generic/errno-base.h b/include/uapi/asm-generic/errno-base.h
index 65115978510f..43407a403e72 100644
--- a/include/uapi/asm-generic/errno-base.h
+++ b/include/uapi/asm-generic/errno-base.h
@@ -35,5 +35,6 @@
 #define EPIPE       32  /* Broken pipe */
 #define EDOM        33  /* Math argument out of domain of func */
 #define ERANGE      34  /* Math result not representable */
+#define _ERRNO_LAST ERANGE
 
 #endif
diff --git a/security/landlock/lsm.c b/security/landlock/lsm.c
index aa9d4a64826e..322309068066 100644
--- a/security/landlock/lsm.c
+++ b/security/landlock/lsm.c
@@ -11,7 +11,6 @@
 #include 
 #include  /* enum bpf_reg_type, struct landlock_data */
 #include 
-#include  /* MAX_ERRNO */
 #include  /* struct bpf_prog, BPF_PROG_RUN() */
 #include  /* FIELD_SIZEOF() */
 #include 
@@ -104,8 +103,9 @@ static int landlock_run_prog(__u64 args[6])
}
}
if (!ret) {
-   if (cur_ret > MAX_ERRNO)
-   ret = MAX_ERRNO;
+   /* check errno to not mess with kernel code */
+   if (cur_ret > _ERRNO_LAST)
+   ret = EPERM;
else
ret = cur_ret;
}
-- 
2.8.1



[RFC v2 00/10] Landlock LSM: Unprivileged sandboxing

2016-08-25 Thread Mickaël Salaün
Hi,

This series is a proof of concept that fills some gaps in seccomp, such as the
ability to check syscall argument pointers or to create more dynamic security
policies. The goal of this new stackable Linux Security Module (LSM) called
Landlock is to allow any process, including unprivileged ones, to create
powerful security sandboxes comparable to the Seatbelt/XNU Sandbox or the
OpenBSD Pledge. This kind of sandbox helps to mitigate the security impact of
bugs or unexpected/malicious behaviors in userland applications.

The first RFC [1] was focused on extending seccomp while staying at the syscall
level. This brought a working PoC but with some (mitigated) ToCToU race
conditions due to the seccomp ptrace hole (now fixed) and the non-atomic
syscall argument evaluation (hence the LSM hooks).


# Landlock LSM

This second RFC is a fresh revamp of the code while keeping some working ideas.
This series is mainly focused on LSM hooks, while keeping the possibility to
tie them to syscalls. This new code removes all race conditions by design. It
now uses eBPF instead of a subset of cBPF (as used by seccomp-bpf). This allows
removing the previous stacked cBPF hack and doing complex access checks thanks
to dedicated eBPF functions. An eBPF program is still very limited (i.e. it can
only call a whitelist of functions) and cannot cause a denial of service (i.e.
no loop). The other major improvement is the replacement of the previous custom
checker groups of syscall arguments with a new dedicated eBPF map to collect
and compare Landlock handles with system resources (e.g. files or network
connections).

The approach taken is to add the minimum amount of code while still allowing
userland to create quite complex access rules. A dedicated security policy
language such as those used by SELinux, AppArmor and other major LSMs requires a
lot of code and is dedicated to a trusted process (i.e. root/administrator).


# eBPF

To get an expressive language while still being safe and small, Landlock is
based on eBPF. Landlock should be usable by untrusted processes and must
therefore expose a minimal attack surface. The eBPF bytecode is minimal yet
powerful, widely used and designed to be used by not-so-trusted applications.
Reusing this code avoids reproducing the same mistakes and minimizes new code
while still taking a generic approach. There are only a few new features: a new
kind of arraymap and a few dedicated eBPF functions.

An eBPF program has access to an eBPF context which contains the LSM hook
arguments (as seccomp-bpf does with syscall arguments). They can be used
directly or passed to helper functions according to their types. It is then
possible to do complex access checks without race conditions or inconsistent
evaluation (i.e. incorrect mirroring of the OS code and state [2]).

There is one new eBPF program type per LSM hook. This allows checking statically
which context accesses are performed by an eBPF program. This is needed to
prevent kernel address leaks and to ensure the right use of LSM hook arguments
with eBPF functions. Moreover, this safe pointer handling removes the need for
runtime checks or abstract data, which improves performance. Any user can add
multiple Landlock eBPF programs per LSM hook. They are stacked and evaluated one
after the other (cf. seccomp-bpf).
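
As an illustration of "stacked and evaluated one after the other", here is a
small user-space model. It assumes one plausible reading of the series (every
program runs, and the first non-zero result is the one kept); the type and
function names are hypothetical and do not come from the patches.

```c
#include <stddef.h>

/* Hypothetical program signature: 0 allows, non-zero is an errno-style denial. */
typedef int (*sketch_prog_fn)(const void *ctx);

/* Run every stacked program in order; keep the first non-zero result. */
static int sketch_run_stacked(const sketch_prog_fn *progs, size_t n,
			      const void *ctx)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int cur = progs[i](ctx);

		if (!ret)
			ret = cur;
	}
	return ret;
}

static int sketch_allow(const void *ctx)   { (void)ctx; return 0; }
static int sketch_deny_13(const void *ctx) { (void)ctx; return 13; }
static int sketch_deny_1(const void *ctx)  { (void)ctx; return 1; }

/* Example stack: one permissive program followed by two denying ones. */
static const sketch_prog_fn sketch_stack[] = {
	sketch_allow, sketch_deny_13, sketch_deny_1,
};
```

In this model a later program can never turn an earlier denial back into an
allow, which matches the usual expectation for stacked filters.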


# LSM hooks

Contrary to syscalls, LSM hooks are security checkpoints and are not
architecture dependent. They are designed to match a security need reflected by
a security policy (e.g. access to a file). Exposing parts of some LSM hooks
instead of using the syscall API for sandboxing should help to avoid bugs and
hacks as encountered by the first RFC. Instead of redoing the work of the LSM
hooks through syscalls, we should use and expose them as the policies of
access control LSMs do.

Only a subset of the hooks is meaningful for an unprivileged sandbox mechanism
(e.g. file system or network access control). Landlock uses an abstraction of
raw LSM hooks, which makes it possible to deal with future changes of the LSM
hook API. Moreover, thanks to the eBPF program typing (per LSM hook) used by
Landlock, it should not be hard to make such evolutions backward compatible.


# Use case scenario

First, a process needs to create a new dedicated eBPF map containing handles.
These handles are references to system resources (e.g. file or directory) and
are grouped in one or multiple maps to be efficiently managed and checked in
batches. This kind of map can be passed to Landlock eBPF functions to compare,
for example, with a file access request. The handles are only accessible from
the eBPF programs created by the same thread.

The loaded Landlock eBPF programs can be triggered by a seccomp filter
returning RET_LANDLOCK. In addition, a cookie (16-bit value) can be passed from
a seccomp filter to eBPF programs. This allows flexible security policies
between seccomp and Landlock.
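
The cookie mechanism can be sketched as plain bit manipulation on the 32-bit
seccomp return value. The action and data masks below follow the existing
seccomp UAPI layout; the RET_LANDLOCK value is an assumption derived from this
series and is not a mainline constant. All SKETCH_ and sketch_ names are
illustrative.

```c
#include <stdint.h>

/* Layout of a seccomp return value: action in the high bits, data in the low 16. */
#define SKETCH_RET_ACTION   0x7fff0000U
#define SKETCH_RET_DATA     0x0000ffffU
/* Assumption: action value proposed by this RFC. */
#define SKETCH_RET_LANDLOCK 0x00070000U

/* Build the value a seccomp filter would return to trigger Landlock. */
static uint32_t sketch_landlock_ret(uint16_t cookie)
{
	return SKETCH_RET_LANDLOCK | cookie;
}

/* Tell whether a filter result asks for Landlock evaluation. */
static int sketch_is_landlock_ret(uint32_t ret)
{
	return (ret & SKETCH_RET_ACTION) == SKETCH_RET_LANDLOCK;
}

/* Extract the 16-bit cookie made visible to the Landlock programs. */
static uint16_t sketch_ret_cookie(uint32_t ret)
{
	return (uint16_t)(ret & SKETCH_RET_DATA);
}
```

This is the same pattern as the `SECCOMP_RET_LANDLOCK | 5` return in the sample
filter of this series, where 5 is the cookie.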

A triggered Landlock eBPF program can then allow or deny an access, according
to its type (i.e. LSM hook), th

[RFC v2 01/10] landlock: Add Kconfig

2016-08-25 Thread Mickaël Salaün
Initial Landlock Kconfig needed to split the Landlock eBPF and seccomp
parts to ease the review.

Signed-off-by: Mickaël Salaün 
Cc: James Morris 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
---
 security/Kconfig  |  1 +
 security/landlock/Kconfig | 16 
 2 files changed, 17 insertions(+)
 create mode 100644 security/landlock/Kconfig

diff --git a/security/Kconfig b/security/Kconfig
index 176758cdfa57..be6c549dd0ca 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -124,6 +124,7 @@ source security/tomoyo/Kconfig
 source security/apparmor/Kconfig
 source security/loadpin/Kconfig
 source security/yama/Kconfig
+source security/landlock/Kconfig
 
 source security/integrity/Kconfig
 
diff --git a/security/landlock/Kconfig b/security/landlock/Kconfig
new file mode 100644
index ..dc8328d216d7
--- /dev/null
+++ b/security/landlock/Kconfig
@@ -0,0 +1,16 @@
+config SECURITY_LANDLOCK
+   bool "Landlock sandbox support"
+   depends on SECURITY
+   select BPF_SYSCALL
+   select SECCOMP
+   default y
+   help
+ Landlock is a stacked LSM which allows any user to load a security policy
+ to restrict their processes (i.e. create a sandbox). The policy is a list
+ of stacked eBPF programs for some LSM hooks. Each program can do some
+ access comparison to check if an access request is legitimate.
+
+ Further information about eBPF can be found in
+ Documentation/networking/filter.txt
+
+ If you are unsure how to answer this question, answer Y.
-- 
2.8.1



[RESEND PATCH net 03/10] net: ethernet: mediatek: fix API usage with skb_free_frag

2016-08-25 Thread Sean Wang
Use skb_free_frag() instead of the legacy put_page().

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 02b048f..1b131a1 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -864,7 +864,7 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
/* receive data */
skb = build_skb(data, ring->frag_size);
if (unlikely(!skb)) {
-   put_page(virt_to_head_page(new_data));
+   skb_free_frag(new_data);
netdev->stats.rx_dropped++;
goto release_desc;
}
-- 
1.9.1



[RFC v2 05/10] seccomp: Handle Landlock

2016-08-25 Thread Mickaël Salaün
A Landlock program can be triggered when a seccomp filter returns
RET_LANDLOCK. Moreover, it is possible to return a 16-bit cookie which
will be readable by the Landlock programs.

Only seccomp filters loaded from the same thread and before a Landlock
program can trigger it. Multiple Landlock programs can be triggered by
one or more seccomp filters. This way, each RET_LANDLOCK (with specific
cookie) will trigger all the allowed Landlock programs once.

Signed-off-by: Mickaël Salaün 
Cc: Kees Cook 
Cc: Andy Lutomirski 
Cc: Will Drewry 
Cc: Andrew Morton 
---
 include/linux/seccomp.h  |  49 +++
 include/uapi/linux/seccomp.h |   2 +
 kernel/fork.c|  39 -
 kernel/seccomp.c | 190 ++-
 4 files changed, 275 insertions(+), 5 deletions(-)

diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
index 29b20fe8fd4d..785ccbebf687 100644
--- a/include/linux/seccomp.h
+++ b/include/linux/seccomp.h
@@ -10,7 +10,33 @@
 #include 
 #include 
 
+#ifdef CONFIG_SECURITY_LANDLOCK
+#include  /* struct bpf_prog */
+#endif /* CONFIG_SECURITY_LANDLOCK */
+
 struct seccomp_filter;
+
+#ifdef CONFIG_SECURITY_LANDLOCK
+struct seccomp_landlock_ret {
+   struct seccomp_landlock_ret *prev;
+   /* @filter points to a @landlock_filter list */
+   struct seccomp_filter *filter;
+   u16 cookie;
+   bool triggered;
+};
+
+struct seccomp_landlock_prog {
+   atomic_t usage;
+   struct seccomp_landlock_prog *prev;
+   /*
+* List of filters (through filter->landlock_prev) allowed to trigger
+* this Landlock program.
+*/
+   struct seccomp_filter *filter;
+   struct bpf_prog *prog;
+};
+#endif /* CONFIG_SECURITY_LANDLOCK */
+
 /**
  * struct seccomp - the state of a seccomp'ed process
  *
@@ -18,6 +44,10 @@ struct seccomp_filter;
  * system calls available to a process.
  * @filter: must always point to a valid seccomp-filter or NULL as it is
  *  accessed without locking during system call entry.
+ * @landlock_filter: list of filters allowed to trigger an associated
+ *Landlock hook via a RET_LANDLOCK.
+ * @landlock_ret: stored values from a RET_LANDLOCK.
+ * @landlock_prog: list of Landlock programs.
  *
  *  @filter must only be accessed from the context of current as there
  *  is no read locking.
@@ -25,6 +55,12 @@ struct seccomp_filter;
 struct seccomp {
int mode;
struct seccomp_filter *filter;
+
+#ifdef CONFIG_SECURITY_LANDLOCK
+   struct seccomp_filter *landlock_filter;
+   struct seccomp_landlock_ret *landlock_ret;
+   struct seccomp_landlock_prog *landlock_prog;
+#endif /* CONFIG_SECURITY_LANDLOCK */
 };
 
 #ifdef CONFIG_HAVE_ARCH_SECCOMP_FILTER
@@ -85,6 +121,12 @@ static inline int seccomp_mode(struct seccomp *s)
 #ifdef CONFIG_SECCOMP_FILTER
 extern void put_seccomp(struct task_struct *tsk);
 extern void get_seccomp_filter(struct task_struct *tsk);
+#ifdef CONFIG_SECURITY_LANDLOCK
+extern void put_landlock_ret(struct seccomp_landlock_ret *landlock_ret);
+extern struct seccomp_landlock_ret *dup_landlock_ret(
+   struct seccomp_landlock_ret *ret_orig);
+#endif /* CONFIG_SECURITY_LANDLOCK */
+
 #else  /* CONFIG_SECCOMP_FILTER */
 static inline void put_seccomp(struct task_struct *tsk)
 {
@@ -95,6 +137,13 @@ static inline void get_seccomp_filter(struct task_struct *tsk)
 {
return;
 }
+
+#ifdef CONFIG_SECURITY_LANDLOCK
+static inline void put_landlock_ret(struct seccomp_landlock_ret *landlock_ret) {}
+static inline struct seccomp_landlock_ret *dup_landlock_ret(
+   struct seccomp_landlock_ret *ret_orig) { return NULL; }
+#endif /* CONFIG_SECURITY_LANDLOCK */
+
 #endif /* CONFIG_SECCOMP_FILTER */
 
 #if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_CHECKPOINT_RESTORE)
diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 0f238a43ff1e..b4aab1c19b8a 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -13,6 +13,7 @@
 /* Valid operations for seccomp syscall. */
 #define SECCOMP_SET_MODE_STRICT    0
 #define SECCOMP_SET_MODE_FILTER    1
+#define SECCOMP_SET_LANDLOCK_HOOK  2
 
 /* Valid flags for SECCOMP_SET_MODE_FILTER */
 #define SECCOMP_FILTER_FLAG_TSYNC  1
@@ -28,6 +29,7 @@
 #define SECCOMP_RET_KILL       0x00000000U /* kill the task immediately */
 #define SECCOMP_RET_TRAP       0x00030000U /* disallow and force a SIGSYS */
 #define SECCOMP_RET_ERRNO      0x00050000U /* returns an errno */
+#define SECCOMP_RET_LANDLOCK   0x00070000U /* trigger LSM evaluation */
 #define SECCOMP_RET_TRACE      0x7ff00000U /* pass to a tracer or disallow */
 #define SECCOMP_RET_ALLOW      0x7fff0000U /* allow */
 
diff --git a/kernel/fork.c b/kernel/fork.c
index b23a71ec8003..3658c1e95e03 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -369,7 +369,12 @@ static struct task_struct *dup_task_struct(struct 
task_struct *orig, int node)

[RFC v2 06/10] landlock: Add LSM hooks

2016-08-25 Thread Mickaël Salaün
Add LSM hooks which can be used by userland through Landlock (eBPF)
programs. These programs are limited to a whitelist of functions (cf.
next commit). The eBPF program context is described by the struct
landlock_data (cf. include/uapi/linux/bpf.h):
* hook: LSM hook ID (useful when using the same program for multiple LSM
  hooks);
* cookie: the 16-bit value from the seccomp filter that triggered this
  Landlock program;
* args[6]: array of LSM hook arguments.

The LSM hook arguments can contain raw values such as integers or
(unleakable) pointers. The only way to use the pointers is to pass them
to an eBPF function according to their types (e.g. the
bpf_landlock_cmp_fs_beneath_with_struct_file function can use a struct
file pointer).

For now, there are three hooks for file system access control:
* file_open;
* file_permission;
* mmap_file.

Signed-off-by: Mickaël Salaün 
Cc: Alexei Starovoitov 
Cc: Kees Cook 
Cc: Andy Lutomirski 
Cc: Will Drewry 
Cc: James Morris 
Cc: Serge E. Hallyn 
Cc: David S. Miller 
Cc: Daniel Borkmann 
---
 include/linux/bpf.h|   7 ++
 include/linux/lsm_hooks.h  |   5 ++
 include/uapi/linux/bpf.h   |  20 +
 kernel/bpf/syscall.c   |   3 +
 kernel/bpf/verifier.c  |   8 ++
 kernel/seccomp.c   |   7 +-
 security/Makefile  |   2 +
 security/landlock/Makefile |   3 +
 security/landlock/lsm.c| 211 +
 security/security.c|   1 +
 10 files changed, 265 insertions(+), 2 deletions(-)
 create mode 100644 security/landlock/Makefile
 create mode 100644 security/landlock/lsm.c

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 9a5b388be099..557e7efdf0cd 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -81,6 +81,9 @@ enum bpf_arg_type {
 
ARG_PTR_TO_CTX, /* pointer to context */
ARG_ANYTHING,   /* any (initialized) argument is ok */
+
+   ARG_PTR_TO_STRUCT_FILE, /* pointer to struct file */
+   ARG_PTR_TO_STRUCT_CRED, /* pointer to struct cred */
 };
 
 /* type of values returned from helper functions */
@@ -139,6 +142,10 @@ enum bpf_reg_type {
 */
PTR_TO_PACKET,
PTR_TO_PACKET_END,   /* skb->data + headlen */
+
+   /* Landlock */
+   PTR_TO_STRUCT_FILE,
+   PTR_TO_STRUCT_CRED,
 };
 
 struct bpf_prog;
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 7ae397669d8b..6792ae8fb53d 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1898,5 +1898,10 @@ void __init loadpin_add_hooks(void);
 #else
 static inline void loadpin_add_hooks(void) { };
 #endif
+#ifdef CONFIG_SECURITY_LANDLOCK
+extern void __init landlock_add_hooks(void);
+#else
+static inline void __init landlock_add_hooks(void) { }
+#endif /* CONFIG_SECURITY_LANDLOCK */
 
 #endif /* ! __LINUX_LSM_HOOKS_H */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index a60eedc17d40..983d14e910ff 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -102,6 +102,9 @@ enum bpf_prog_type {
BPF_PROG_TYPE_SCHED_CLS,
BPF_PROG_TYPE_SCHED_ACT,
BPF_PROG_TYPE_TRACEPOINT,
+   BPF_PROG_TYPE_LANDLOCK_FILE_OPEN,
+   BPF_PROG_TYPE_LANDLOCK_FILE_PERMISSION,
+   BPF_PROG_TYPE_LANDLOCK_MMAP_FILE,
 };
 
 #define BPF_PSEUDO_MAP_FD  1
@@ -404,4 +407,21 @@ struct landlock_handle {
};
 } __attribute__((aligned(8)));
 
+/**
+ * struct landlock_data
+ *
+ * @hook: LSM hook ID
+ * @cookie: value set by a seccomp-filter return value RET_LANDLOCK. This comes
+ *  from a trusted seccomp-bpf program: the same process that loaded
+ *  this Landlock hook program.
+ * @args: LSM hook arguments, see include/linux/lsm_hooks.h for their
+ *description and the LANDLOCK_HOOK* definitions from
+ *security/landlock/lsm.c for their types.
+ */
+struct landlock_data {
+   __u32 hook;
+   __u16 cookie;
+   __u64 args[6];
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 32a10ef4b878..6b8bfc34c751 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -719,6 +719,9 @@ static int bpf_prog_load(union bpf_attr *attr)
 
switch (type) {
case BPF_PROG_TYPE_SOCKET_FILTER:
+   case BPF_PROG_TYPE_LANDLOCK_FILE_OPEN:
+   case BPF_PROG_TYPE_LANDLOCK_FILE_PERMISSION:
+   case BPF_PROG_TYPE_LANDLOCK_MMAP_FILE:
break;
default:
if (!capable(CAP_SYS_ADMIN))
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c15f6cc28e00..2931e2efcc10 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -244,6 +244,8 @@ static const char * const reg_type_str[] = {
[CONST_IMM] = "imm",
[PTR_TO_PACKET] = "pkt",
[PTR_TO_PACKET_END] = "pkt_end",
+   [PTR_TO_STRUCT_FILE]= "struct_file",
+   [PTR_TO_STRUCT_CRED]= "struct_cred",
 };
 
 static void prin

[RESEND PATCH net 05/10] net: ethernet: mediatek: fix logic unbalance between probe and remove

2016-08-25 Thread Sean Wang
The original mdio_cleanup call is not symmetric with the mdio_init call, so
move it from mtk_uninit() to mtk_remove().

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 9883dac..5bd31f8 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1504,7 +1504,6 @@ static void mtk_uninit(struct net_device *dev)
struct mtk_eth *eth = mac->hw;
 
phy_disconnect(mac->phy_dev);
-   mtk_mdio_cleanup(eth);
mtk_irq_disable(eth, ~0);
 }
 
@@ -1915,6 +1914,7 @@ static int mtk_remove(struct platform_device *pdev)
netif_napi_del(&eth->tx_napi);
netif_napi_del(&eth->rx_napi);
mtk_cleanup(eth);
+   mtk_mdio_cleanup(eth);
 
return 0;
 }
-- 
1.9.1



[RFC v2 03/10] bpf,landlock: Add a new arraymap type to deal with (Landlock) handles

2016-08-25 Thread Mickaël Salaün
This new arraymap looks like a set and brings new properties:
* strong typing of entries: the eBPF functions get the array type of
  elements instead of CONST_PTR_TO_MAP (e.g.
  CONST_PTR_TO_LANDLOCK_HANDLE_FS);
* forced sequential filling (i.e. replace or append-only update), which
  allows quick browsing of all entries.
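
The "replace or append-only" rule can be sketched in user space as follows. The
structure, the capacity and the error codes are assumptions made for the
illustration; only the n_entries idea (number of filled entries) comes from this
patch.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_ENTRIES 4

/* Hypothetical stand-in for the handle array: slots [0, n_entries) are valid. */
struct sketch_handle_array {
	uint32_t n_entries;
	uint32_t value[SKETCH_MAX_ENTRIES];
};

/* Zero-initialized, i.e. empty, example map. */
static struct sketch_handle_array sketch_map;

/* Replace an existing slot or append at the first free one; never leave a hole. */
static int sketch_update(struct sketch_handle_array *a, uint32_t index,
			 uint32_t v)
{
	if (index >= SKETCH_MAX_ENTRIES)
		return -E2BIG;
	if (index > a->n_entries)
		return -EINVAL;	/* would create a gap before this slot */
	a->value[index] = v;
	if (index == a->n_entries)
		a->n_entries++;	/* this was an append */
	return 0;
}
```

Because the filled slots are always contiguous, walking all entries is a simple
loop from 0 to n_entries with no validity check per slot.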

This strong typing is useful to statically check if the content of a map
can be passed to an eBPF function. For example, Landlock uses it to store
and manage kernel objects (e.g. struct file) instead of dealing with
userland raw data. This improves efficiency and ensures that an eBPF
program can only call functions with the right high-level arguments.

The enum bpf_map_handle_type lists low-level types (e.g.
BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD) which are identified when
updating a map entry (handle). These handle types are used to infer a
high-level arraymap type, which is listed in enum bpf_map_array_type
(e.g. BPF_MAP_ARRAY_TYPE_LANDLOCK_FS).
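
A hedged sketch of the inference from handle type to array type. The enum
members are shortened stand-ins for the BPF_MAP_HANDLE_TYPE_* and
BPF_MAP_ARRAY_TYPE_* values that appear across this series, and the exact
mapping (in particular FS_GLOB to FS) is an assumption, not something the patch
text states.

```c
/* Stand-ins for enum bpf_map_handle_type values seen in the series. */
enum sketch_handle_type {
	SK_HANDLE_UNSPEC,
	SK_HANDLE_FS_FD,
	SK_HANDLE_FS_GLOB,
	SK_HANDLE_CGROUP_FD,
};

/* Stand-ins for enum bpf_map_array_type values seen in the series. */
enum sketch_array_type {
	SK_ARRAY_UNSPEC,
	SK_ARRAY_FS,
	SK_ARRAY_CGROUP,
};

/* Hypothetical helper: derive the high-level array type from a handle type. */
static enum sketch_array_type sketch_infer_array_type(enum sketch_handle_type t)
{
	switch (t) {
	case SK_HANDLE_FS_FD:
	case SK_HANDLE_FS_GLOB:
		return SK_ARRAY_FS;
	case SK_HANDLE_CGROUP_FD:
		return SK_ARRAY_CGROUP;
	default:
		return SK_ARRAY_UNSPEC;
	}
}
```

A map could then reject an update whose inferred type differs from the type
already recorded for the array, which is what makes the static check possible.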

For now, this new arraymap is only used by Landlock LSM (cf. next
commits) but it could be useful for other needs.

Signed-off-by: Mickaël Salaün 
Cc: Alexei Starovoitov 
Cc: David S. Miller 
Cc: Daniel Borkmann 
Cc: James Morris 
Cc: Kees Cook 
---
 include/linux/bpf.h  |  18 +
 include/uapi/linux/bpf.h |  18 +
 kernel/bpf/arraymap.c| 181 +++
 kernel/bpf/syscall.c |   9 ++-
 kernel/bpf/verifier.c|  12 +++-
 5 files changed, 235 insertions(+), 3 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index ca3742729ae7..9a5b388be099 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -12,6 +12,10 @@
 #include 
 #include 
 
+#ifdef CONFIG_SECURITY_LANDLOCK
+#include  /* struct file */
+#endif /* CONFIG_SECURITY_LANDLOCK */
+
 struct bpf_map;
 
 /* map is generic key/value storage optionally accesible by eBPF programs */
@@ -34,6 +38,7 @@ struct bpf_map_ops {
 struct bpf_map {
atomic_t refcnt;
enum bpf_map_type map_type;
+   enum bpf_map_array_type map_array_type;
u32 key_size;
u32 value_size;
u32 max_entries;
@@ -183,12 +188,25 @@ struct bpf_array {
 */
enum bpf_prog_type owner_prog_type;
bool owner_jited;
+#ifdef CONFIG_SECURITY_LANDLOCK
+   u32 n_entries;  /* number of entries in a handle array */
+#endif /* CONFIG_SECURITY_LANDLOCK */
union {
char value[0] __aligned(8);
void *ptrs[0] __aligned(8);
void __percpu *pptrs[0] __aligned(8);
};
 };
+
+#ifdef CONFIG_SECURITY_LANDLOCK
+struct map_landlock_handle {
+   u32 type;
+   union {
+   struct file *file;
+   };
+};
+#endif /* CONFIG_SECURITY_LANDLOCK */
+
 #define MAX_TAIL_CALL_CNT 32
 
 u64 bpf_tail_call(u64 ctx, u64 r2, u64 index, u64 r4, u64 r5);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 406459b935a2..a60eedc17d40 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -84,6 +84,15 @@ enum bpf_map_type {
BPF_MAP_TYPE_PERCPU_HASH,
BPF_MAP_TYPE_PERCPU_ARRAY,
BPF_MAP_TYPE_STACK_TRACE,
+   BPF_MAP_TYPE_LANDLOCK_ARRAY,
+};
+
+enum bpf_map_array_type {
+   BPF_MAP_ARRAY_TYPE_UNSPEC,
+};
+
+enum bpf_map_handle_type {
+   BPF_MAP_HANDLE_TYPE_UNSPEC,
 };
 
 enum bpf_prog_type {
@@ -386,4 +395,13 @@ struct bpf_tunnel_key {
__u32 tunnel_label;
 };
 
+/* Map handle entry */
+struct landlock_handle {
+   __u32 type; /* enum bpf_map_handle_type */
+   union {
+   __u32 fd;
+   __aligned_u64 glob;
+   };
+} __attribute__((aligned(8)));
+
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 76d5a794e426..5938b8ee475b 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -16,6 +16,8 @@
 #include 
 #include 
 #include 
+#include  /* fput() */
+#include  /* struct file */
 
 static void bpf_array_free_percpu(struct bpf_array *array)
 {
@@ -491,3 +493,182 @@ static int __init register_perf_event_array_map(void)
return 0;
 }
 late_initcall(register_perf_event_array_map);
+
+
+#ifdef CONFIG_SECURITY_LANDLOCK
+static struct bpf_map *landlock_array_map_alloc(union bpf_attr *attr)
+{
+   if (attr->value_size != sizeof(struct landlock_handle))
+   return ERR_PTR(-EINVAL);
+   attr->value_size = sizeof(struct map_landlock_handle);
+
+   return array_map_alloc(attr);
+}
+
+static void landlock_put_handle(struct map_landlock_handle *handle)
+{
+   switch (handle->type) {
+   /* TODO: add handle types */
+   default:
+   WARN_ON(1);
+   }
+   /* safeguard */
+   handle->type = BPF_MAP_HANDLE_TYPE_UNSPEC;
+}
+
+static void landlock_array_map_free(struct bpf_map *map)
+{
+   struct bpf_array *array = container_of(map, struct bpf_array, map);
+   int i;
+
+   synchronize_rcu();
+
+   for (i = 0; i < array->n_entries

[RFC v2 04/10] seccomp: Split put_seccomp_filter() with put_seccomp()

2016-08-25 Thread Mickaël Salaün
The semantics are unchanged. This will be useful for the Landlock
integration with seccomp (next commit).
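
The split can be modelled in user space as below: one function drops a
reference on a filter chain, and a thin wrapper applies it to a task. This is a
simplified sketch with a plain int counter instead of atomic_t; all sketch_
names are hypothetical.

```c
#include <stdlib.h>

struct sketch_filter {
	int usage;			/* plain int here; the kernel uses atomic_t */
	struct sketch_filter *prev;
};

struct sketch_task {
	struct sketch_filter *filter;
};

static int sketch_freed;		/* counts frees, for illustration only */

/* Drop one reference; free single-reference ancestors iteratively. */
static void sketch_put_filter(struct sketch_filter *filter)
{
	struct sketch_filter *orig = filter;

	while (orig && --orig->usage == 0) {
		struct sketch_filter *freeme = orig;

		orig = orig->prev;
		free(freeme);
		sketch_freed++;
	}
}

/* Task-level wrapper, in the spirit of the new put_seccomp(). */
static void sketch_put_seccomp(struct sketch_task *tsk)
{
	sketch_put_filter(tsk->filter);
}

/* Returns 1 if the chain is released in the expected two steps, -1 on OOM. */
static int sketch_demo(void)
{
	struct sketch_filter *parent = calloc(1, sizeof(*parent));
	struct sketch_filter *child = calloc(1, sizeof(*child));
	struct sketch_task tsk;
	int after_first;

	if (!parent || !child) {
		free(parent);
		free(child);
		return -1;
	}
	parent->usage = 2;	/* held by the child and by one other holder */
	child->usage = 1;
	child->prev = parent;
	tsk.filter = child;

	sketch_freed = 0;
	sketch_put_seccomp(&tsk);	/* frees child, parent drops to 1 */
	after_first = sketch_freed;
	sketch_put_filter(parent);	/* last reference: frees parent */
	return after_first == 1 && sketch_freed == 2;
}
```

Taking the filter pointer as the argument lets the same helper be reused for
filter lists that do not hang directly off task->seccomp.filter.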

Signed-off-by: Mickaël Salaün 
Cc: Kees Cook 
Cc: Andy Lutomirski 
Cc: Will Drewry 
---
 include/linux/seccomp.h |  5 +++--
 kernel/fork.c   |  2 +-
 kernel/seccomp.c| 18 +-
 3 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
index 2296e6b2f690..29b20fe8fd4d 100644
--- a/include/linux/seccomp.h
+++ b/include/linux/seccomp.h
@@ -83,13 +83,14 @@ static inline int seccomp_mode(struct seccomp *s)
 #endif /* CONFIG_SECCOMP */
 
 #ifdef CONFIG_SECCOMP_FILTER
-extern void put_seccomp_filter(struct task_struct *tsk);
+extern void put_seccomp(struct task_struct *tsk);
 extern void get_seccomp_filter(struct task_struct *tsk);
 #else  /* CONFIG_SECCOMP_FILTER */
-static inline void put_seccomp_filter(struct task_struct *tsk)
+static inline void put_seccomp(struct task_struct *tsk)
 {
return;
 }
+
 static inline void get_seccomp_filter(struct task_struct *tsk)
 {
return;
diff --git a/kernel/fork.c b/kernel/fork.c
index 4a7ec0c6c88c..b23a71ec8003 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -235,7 +235,7 @@ void free_task(struct task_struct *tsk)
free_thread_stack(tsk->stack);
rt_mutex_debug_task_free(tsk);
ftrace_graph_exit_task(tsk);
-   put_seccomp_filter(tsk);
+   put_seccomp(tsk);
arch_release_task_struct(tsk);
free_task_struct(tsk);
 }
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 7002796f14a4..f1f475691c27 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -60,6 +60,8 @@ struct seccomp_filter {
struct bpf_prog *prog;
 };
 
+static void put_seccomp_filter(struct seccomp_filter *filter);
+
 /* Limit any path through the tree to 256KB worth of instructions. */
 #define MAX_INSNS_PER_PATH ((1 << 18) / sizeof(struct sock_filter))
 
@@ -313,7 +315,7 @@ static inline void seccomp_sync_threads(void)
 * current's path will hold a reference.  (This also
 * allows a put before the assignment.)
 */
-   put_seccomp_filter(thread);
+   put_seccomp_filter(thread->seccomp.filter);
smp_store_release(&thread->seccomp.filter,
  caller->seccomp.filter);
 
@@ -475,10 +477,11 @@ static inline void seccomp_filter_free(struct seccomp_filter *filter)
}
 }
 
-/* put_seccomp_filter - decrements the ref count of tsk->seccomp.filter */
-void put_seccomp_filter(struct task_struct *tsk)
+/* put_seccomp_filter - decrements the ref count of a filter */
+static void put_seccomp_filter(struct seccomp_filter *filter)
 {
-   struct seccomp_filter *orig = tsk->seccomp.filter;
+   struct seccomp_filter *orig = filter;
+
/* Clean up single-reference branches iteratively. */
while (orig && atomic_dec_and_test(&orig->usage)) {
struct seccomp_filter *freeme = orig;
@@ -487,6 +490,11 @@ void put_seccomp_filter(struct task_struct *tsk)
}
 }
 
+void put_seccomp(struct task_struct *tsk)
+{
+   put_seccomp_filter(tsk->seccomp.filter);
+}
+
 /**
  * seccomp_send_sigsys - signals the task to allow in-process syscall emulation
  * @syscall: syscall number to send to userland
@@ -926,7 +934,7 @@ long seccomp_get_filter(struct task_struct *task, unsigned long filter_off,
if (copy_to_user(data, fprog->filter, bpf_classic_proglen(fprog)))
ret = -EFAULT;
 
-   put_seccomp_filter(task);
+   put_seccomp_filter(task->seccomp.filter);
return ret;
 
 out:
-- 
2.8.1



[RFC v2 09/10] landlock: Handle cgroups

2016-08-25 Thread Mickaël Salaün
Add an eBPF function bpf_landlock_cmp_cgroup_beneath(opt, map, map_op)
to compare the current process cgroup with a cgroup handle. The handle
matches if the current cgroup is the same as, or a descendant of, the
cgroup it refers to. This allows making conditional rules according to
the current cgroup.
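
The "same or beneath" comparison can be sketched as a walk up the parent chain.
The structure below is a hypothetical stand-in for a cgroup, and the return
convention (0 on match, 1 otherwise) follows the helper documentation in this
patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a cgroup: only the parent link matters here. */
struct sketch_cgroup {
	const struct sketch_cgroup *parent;
};

/* Return 0 if cur is the handle itself or one of its descendants, else 1. */
static int sketch_cmp_beneath(const struct sketch_cgroup *cur,
			      const struct sketch_cgroup *handle)
{
	for (; cur; cur = cur->parent)
		if (cur == handle)
			return 0;
	return 1;
}

/* Example hierarchy: root -> a -> a_b, and root -> c. */
static const struct sketch_cgroup sk_root = { NULL };
static const struct sketch_cgroup sk_a    = { &sk_root };
static const struct sketch_cgroup sk_a_b  = { &sk_a };
static const struct sketch_cgroup sk_c    = { &sk_root };
```

A process in a_b matches a handle on a, while a process in c does not.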

A cgroup handle is a map entry created from a file descriptor referring
to a cgroup directory (e.g. by opening /sys/fs/cgroup/X). In this case, the
map entry is of type BPF_MAP_HANDLE_TYPE_LANDLOCK_CGROUP_FD and the
inferred array map is of type BPF_MAP_ARRAY_TYPE_LANDLOCK_CGROUP.

An unprivileged process can create and manipulate cgroups thanks to
cgroup delegation.

Signed-off-by: Mickaël Salaün 
Cc: Kees Cook 
Cc: Alexei Starovoitov 
Cc: James Morris 
Cc: Serge E. Hallyn 
Cc: David S. Miller 
Cc: Daniel Borkmann 
---
 include/linux/bpf.h|  8 
 include/uapi/linux/bpf.h   | 15 ++
 kernel/bpf/arraymap.c  | 30 
 kernel/bpf/verifier.c  |  6 +++
 security/landlock/Kconfig  |  3 ++
 security/landlock/Makefile |  2 +-
 security/landlock/checker_cgroup.c | 96 ++
 security/landlock/checker_cgroup.h | 18 +++
 security/landlock/lsm.c|  8 
 9 files changed, 185 insertions(+), 1 deletion(-)
 create mode 100644 security/landlock/checker_cgroup.c
 create mode 100644 security/landlock/checker_cgroup.h

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 79014aedbea4..9e6786e7a40a 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -14,6 +14,9 @@
 
 #ifdef CONFIG_SECURITY_LANDLOCK
 #include  /* struct file */
+#ifdef CONFIG_CGROUPS
+#include  /* struct cgroup_subsys_state */
+#endif /* CONFIG_CGROUPS */
 #endif /* CONFIG_SECURITY_LANDLOCK */
 
 struct bpf_map;
@@ -85,6 +88,7 @@ enum bpf_arg_type {
ARG_PTR_TO_STRUCT_FILE, /* pointer to struct file */
ARG_PTR_TO_STRUCT_CRED, /* pointer to struct cred */
ARG_CONST_PTR_TO_LANDLOCK_HANDLE_FS, /* pointer to Landlock FS handle */
+   ARG_CONST_PTR_TO_LANDLOCK_HANDLE_CGROUP, /* pointer to Landlock cgroup handle */
 };
 
 /* type of values returned from helper functions */
@@ -148,6 +152,7 @@ enum bpf_reg_type {
PTR_TO_STRUCT_FILE,
PTR_TO_STRUCT_CRED,
CONST_PTR_TO_LANDLOCK_HANDLE_FS,
+   CONST_PTR_TO_LANDLOCK_HANDLE_CGROUP,
 };
 
 struct bpf_prog;
@@ -212,6 +217,9 @@ struct map_landlock_handle {
u32 type; /* e.g. BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD */
union {
struct file *file;
+#ifdef CONFIG_CGROUPS
+   struct cgroup_subsys_state *css;
+#endif /* CONFIG_CGROUPS */
};
 };
 #endif /* CONFIG_SECURITY_LANDLOCK */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 88af79dd668c..7f60b9fdb35c 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -90,12 +90,14 @@ enum bpf_map_type {
 enum bpf_map_array_type {
BPF_MAP_ARRAY_TYPE_UNSPEC,
BPF_MAP_ARRAY_TYPE_LANDLOCK_FS,
+   BPF_MAP_ARRAY_TYPE_LANDLOCK_CGROUP,
 };
 
 enum bpf_map_handle_type {
BPF_MAP_HANDLE_TYPE_UNSPEC,
BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD,
BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_GLOB,
+   BPF_MAP_HANDLE_TYPE_LANDLOCK_CGROUP_FD,
 };
 
 enum bpf_map_array_op {
@@ -364,6 +366,19 @@ enum bpf_func_id {
 */
BPF_FUNC_landlock_cmp_fs_beneath_with_struct_file,
 
+   /**
+* bpf_landlock_cmp_cgroup_beneath(opt, map, map_op)
+* Check if the current process is a leaf of cgroup handles
+*
+* @opt: check options (e.g. LANDLOCK_FLAG_OPT_REVERSE)
+* @map: handles to compare against
+* @map_op: which elements of the map to use (e.g. BPF_MAP_ARRAY_OP_OR)
+*
+* Return: 0 if the current cgroup is the same or beneath the handle,
+* 1 otherwise, or a negative value if an error occurred.
+*/
+   BPF_FUNC_landlock_cmp_cgroup_beneath,
+
__BPF_FUNC_MAX_ID,
 };
 
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 6804dafd8355..050b3d8d88c8 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -19,6 +19,12 @@
 #include  /* fput() */
 #include  /* struct file */
 
+#ifdef CONFIG_SECURITY_LANDLOCK
+#ifdef CONFIG_CGROUPS
+#include  /* struct cgroup_subsys_state */
+#endif /* CONFIG_CGROUPS */
+#endif /* CONFIG_SECURITY_LANDLOCK */
+
 static void bpf_array_free_percpu(struct bpf_array *array)
 {
int i;
@@ -514,6 +520,12 @@ static void landlock_put_handle(struct map_landlock_handle *handle)
else
WARN_ON(1);
break;
+   case BPF_MAP_HANDLE_TYPE_LANDLOCK_CGROUP_FD:
+   if (likely(handle->css))
+   css_put(handle->css);
+   else
+   WARN_ON(1);
+   break;
default:
WARN_ON(1);
}
@@ -541,6 +553,10 @@ static enum bpf_map_array

[RESEND PATCH net 02/10] net: ethernet: mediatek: fix incorrect return value of devm_clk_get with EPROBE_DEFER

2016-08-25 Thread Sean Wang
If the return value of devm_clk_get is EPROBE_DEFER, we should
defer probing the driver. The change is verified and works based
on 4.8-rc1 staying with the latest clk-next code for MT7623.

Signed-off-by: Sean Wang 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 6e4a6ca..02b048f 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1851,8 +1851,15 @@ static int mtk_probe(struct platform_device *pdev)
eth->clk_gp1 = devm_clk_get(&pdev->dev, "gp1");
eth->clk_gp2 = devm_clk_get(&pdev->dev, "gp2");
if (IS_ERR(eth->clk_esw) || IS_ERR(eth->clk_gp1) ||
-   IS_ERR(eth->clk_gp2) || IS_ERR(eth->clk_ethif))
-   return -ENODEV;
+   IS_ERR(eth->clk_gp2) || IS_ERR(eth->clk_ethif)) {
+   if (PTR_ERR(eth->clk_esw) == -EPROBE_DEFER ||
+   PTR_ERR(eth->clk_gp1) == -EPROBE_DEFER ||
+   PTR_ERR(eth->clk_ethif) == -EPROBE_DEFER ||
+   PTR_ERR(eth->clk_gp2) == -EPROBE_DEFER)
+   return -EPROBE_DEFER;
+   else
+   return -ENODEV;
+   }
 
clk_prepare_enable(eth->clk_ethif);
clk_prepare_enable(eth->clk_esw);
-- 
1.9.1
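
The deferral logic described in the commit message can be sketched in user
space as follows. This is only an illustration under stated assumptions:
ERR_PTR(), IS_ERR() and PTR_ERR() are minimal stand-ins for the kernel
helpers, EPROBE_DEFER is defined locally because it is not exported through
<errno.h>, and clk_lookup_status() is a hypothetical helper, not part of the
driver.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's error-pointer helpers. */
#define EPROBE_DEFER 517	/* kernel-internal errno, absent from <errno.h> */
#define MAX_ERRNO    4095

static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical helper: given the results of several clock lookups,
 * return 0 if all succeeded, -EPROBE_DEFER if any lookup asked for
 * deferral, and -ENODEV for any other failure.
 */
static int clk_lookup_status(void *const clks[], size_t n)
{
	int failed = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!IS_ERR(clks[i]))
			continue;
		if (PTR_ERR(clks[i]) == -EPROBE_DEFER)
			return -EPROBE_DEFER;
		failed = 1;
	}
	return failed ? -ENODEV : 0;
}
```

Looping over all clock pointers checks each one exactly once, which a
hand-written chain of comparisons does not guarantee.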



Re: [PATCH net-next v1] gso: Support partial splitting at the frag_list pointer

2016-08-25 Thread Steffen Klassert
On Wed, Aug 24, 2016 at 09:27:54AM -0700, Alexander Duyck wrote:
> 
> In your case though we may be able to make this easier.  If I am not
> mistaken I believe we should have the main skb, and any in the chain
> excluding the last containing the same amount of data. 

Yes, it seems to be like that. With this observation we can simplify
things. 

> That being the
> case we should be able to determine the size that you would need to
> segment at by taking skb->len, and removing the length of all the
> skbuffs hanging off of frag_list.  At that point you just use that as
> your MSS for segmentation and it should break things up so that you
> have a series of equal sized segments split at the frag_list buffer
> boundaries.
> 
> After that all that is left is to update the gso info for the buffers.
> For GSO_PARTIAL I was handling that on the first segment only.  For
> this change you would need to update that code to address the fact
> that you would have to determine the number of segments on the first
> frame and the last since the last could be less than the first, but
> all of the others in-between should have the same number of segments.

I tried to do this and ended up with the patch below.
Seems to work, but still needs some tests. So it is
not an official patch submission.
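
The arithmetic discussed above can be sketched in user space as below. This
is a simplified illustration, not the kernel code: it ignores headers, assumes
the head buffer and all chained buffers except the last carry the same amount
of data, and uses a hypothetical struct split_plan and function name.

```c
#include <stddef.h>

/* Hypothetical result type, used for this illustration only. */
struct split_plan {
	unsigned int chunk_len;		/* bytes in the head buffer, used as new MSS */
	unsigned int chunk_segs;	/* gso_segs for every chunk except the last */
	unsigned int last_segs;		/* gso_segs for the final chunk */
};

/*
 * total_len plays the role of skb->len, frag_len[] holds the lengths of
 * the buffers chained at frag_list, gso_size is the original MSS.
 */
static struct split_plan plan_fraglist_split(unsigned int total_len,
					     const unsigned int *frag_len,
					     size_t nfrags,
					     unsigned int gso_size)
{
	struct split_plan plan = { 0, 0, 0 };
	unsigned int head_len = total_len;
	size_t i;

	/* Remove the length of everything hanging off frag_list. */
	for (i = 0; i < nfrags; i++)
		head_len -= frag_len[i];

	plan.chunk_len = head_len;
	plan.chunk_segs = head_len / gso_size;
	/* The last chunk may be shorter, so it gets its own count. */
	plan.last_segs = nfrags ? frag_len[nfrags - 1] / gso_size
				: plan.chunk_segs;
	return plan;
}
```

For example, with an MSS of 1448, a 15928 byte chain whose frag_list buffers
hold 4344, 4344 and 2896 bytes leaves 4344 bytes in the head, so each full
chunk covers 3 segments and the last one covers 2.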

Subject: [PATCH net-next RFC] gso: Support partial splitting at the frag_list pointer

Since commit 8a29111c7 ("net: gro: allow to build full sized skb")
gro may build buffers with a frag_list. This can hurt forwarding
because most NICs can't offload such packets, they need to be
segmented in software. This patch splits buffers with a frag_list
at the frag_list pointer into buffers that can be TSO offloaded.

Signed-off-by: Steffen Klassert 
---
 net/core/skbuff.c  | 46 --
 net/ipv4/af_inet.c |  6 --
 net/ipv4/gre_offload.c |  4 +++-
 net/ipv4/tcp_offload.c |  3 +++
 net/ipv4/udp_offload.c |  6 --
 net/ipv6/ip6_offload.c |  5 -
 6 files changed, 54 insertions(+), 16 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 3864b4b6..cb326e5 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3060,6 +3060,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
unsigned int offset = doffset;
unsigned int tnl_hlen = skb_tnl_header_len(head_skb);
unsigned int partial_segs = 0;
+   unsigned int fraglist_segs = 0;
unsigned int headroom;
unsigned int len = head_skb->len;
__be16 proto;
@@ -3078,16 +3079,27 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
sg = !!(features & NETIF_F_SG);
csum = !!can_checksum_protocol(features, proto);
 
-   /* GSO partial only requires that we trim off any excess that
-* doesn't fit into an MSS sized block, so take care of that
-* now.
-*/
-   if (sg && csum && (features & NETIF_F_GSO_PARTIAL)) {
-   partial_segs = len / mss;
-   if (partial_segs > 1)
-   mss *= partial_segs;
-   else
-   partial_segs = 0;
+   if (sg && csum) {
+   /* GSO partial only requires that we trim off any excess that
+* doesn't fit into an MSS sized block, so take care of that
+* now.
+*/
+   if ((features & NETIF_F_GSO_PARTIAL)) {
+   partial_segs = len / mss;
+   if (partial_segs > 1)
+   mss *= partial_segs;
+   else
+   partial_segs = 0;
+   } else if (list_skb && (mss != GSO_BY_FRAGS) &&
+  net_gso_ok(features, skb_shinfo(head_skb)->gso_type)) {
+
+   skb_walk_frags(head_skb, segs) {
+   len -= segs->len;
+   }
+   fraglist_segs = len / mss;
+   mss = len;
+   segs = NULL;
+   }
}
 
headroom = skb_headroom(head_skb);
@@ -3298,6 +3310,20 @@ perform_csum_check:
SKB_GSO_CB(segs)->data_offset = skb_headroom(segs) + doffset;
}
 
+   if (fraglist_segs) {
+   struct sk_buff *iter;
+
+   for (iter = segs; iter; iter = iter->next) {
+   if (iter->next) {
+   skb_shinfo(iter)->gso_size = skb_shinfo(head_skb)->gso_size;
+   skb_shinfo(iter)->gso_segs = fraglist_segs;
+   } else {
+   skb_shinfo(iter)->gso_size = skb_shinfo(head_skb)->gso_size;
+   skb_shinfo(iter)->gso_segs = iter->len / skb_shinfo(head_skb)->gso_size;
+   }
+   }
+   }
+
/* Following permits correct backpressure, for protocols
 * using skb_set_owner_w().
 * Idea is

Re: [RFC v2 07/10] landlock: Add errno check

2016-08-25 Thread Andy Lutomirski
On Thu, Aug 25, 2016 at 3:32 AM, Mickaël Salaün  wrote:
> Add a max errno value.
>
> This is not strictly needed but should improve reliability.
>
> Signed-off-by: Mickaël Salaün 
> Cc: Arnd Bergmann 
> Cc: Serge E. Hallyn 
> Cc: James Morris 
> Cc: Kees Cook 
> ---
>  include/uapi/asm-generic/errno-base.h | 1 +
>  security/landlock/lsm.c   | 6 +++---
>  2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/include/uapi/asm-generic/errno-base.h 
> b/include/uapi/asm-generic/errno-base.h
> index 65115978510f..43407a403e72 100644
> --- a/include/uapi/asm-generic/errno-base.h
> +++ b/include/uapi/asm-generic/errno-base.h
> @@ -35,5 +35,6 @@
>  #define EPIPE   32  /* Broken pipe */
>  #define EDOM    33  /* Math argument out of domain of func */
>  #define ERANGE  34  /* Math result not representable */
> +#define _ERRNO_LAST ERANGE

At the very least this needs a more sensible name.


Re: [RFC v2 09/10] landlock: Handle cgroups

2016-08-25 Thread Andy Lutomirski
On Thu, Aug 25, 2016 at 3:32 AM, Mickaël Salaün  wrote:
> Add an eBPF function bpf_landlock_cmp_cgroup_beneath(opt, map, map_op)
> to compare the current process cgroup with a cgroup handle. The handle
> can match the current cgroup if it is the same or a child. This allows
> making conditional rules according to the current cgroup.
>
> A cgroup handle is a map entry created from a file descriptor referring
> to a cgroup directory (e.g. by opening /sys/fs/cgroup/X). In this case, the
> map entry is of type BPF_MAP_HANDLE_TYPE_LANDLOCK_CGROUP_FD and the
> inferred array map is of type BPF_MAP_ARRAY_TYPE_LANDLOCK_CGROUP.

Can you elaborate on why this is useful?  I.e. why not just supply
different policies to different subtrees.

Also, how does this interact with the current cgroup v1 vs v2 mess?
As far as I can tell, no one can even really agree on what "what
cgroup am I in" means right now.

>
> An unprivileged process can create and manipulate cgroups thanks to
> cgroup delegation.

What is cgroup delegation?

--Andy
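
As a rough illustration of the "same or a child" match described in the
quoted commit message, the check amounts to walking parent links. The sketch
below assumes a hypothetical, simplified struct cg_node with only a parent
pointer; it is not the kernel's struct cgroup, and it says nothing about how
the v1/v2 question raised above would be resolved. The return values follow
the convention documented for this helper in the patch's uapi comment (0 for
a match, 1 otherwise).

```c
#include <stddef.h>

/* Hypothetical, simplified cgroup node: only the parent link matters. */
struct cg_node {
	const struct cg_node *parent;
};

/*
 * Return 0 if `cur` is `handle` itself or one of its descendants,
 * 1 otherwise.
 */
static int cmp_cgroup_beneath(const struct cg_node *cur,
			      const struct cg_node *handle)
{
	for (; cur; cur = cur->parent)
		if (cur == handle)
			return 0;
	return 1;
}
```

The walk costs one step per level of nesting, and a sibling or an ancestor of
the handle never matches.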


Re: [RFC v2 08/10] landlock: Handle file system comparisons

2016-08-25 Thread Andy Lutomirski
On Thu, Aug 25, 2016 at 3:32 AM, Mickaël Salaün  wrote:
> Add eBPF functions to compare file system access with a Landlock file
> system handle:
> * bpf_landlock_cmp_fs_prop_with_struct_file(prop, map, map_op, file)
>   This function allows comparing the dentry, inode, device or mount
>   point of the currently accessed file with a reference handle.
> * bpf_landlock_cmp_fs_beneath_with_struct_file(opt, map, map_op, file)
>   This function allows an eBPF program to check if the currently accessed
>   file is the same as, or in the hierarchy of, a reference handle.
>
> The goal of a file system handle is to abstract kernel objects such as a
> struct file or a struct inode. Userland can create this kind of handle
> thanks to the BPF_MAP_UPDATE_ELEM command. The element is a struct
> landlock_handle containing the handle type (e.g.
> BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD) and a file descriptor. This could
> also be any description able to match a struct file or a struct inode
> (e.g. path or glob string).

This needs Eric's opinion.

Also, where do all the struct file *'s get stashed?  Are they
preserved in the arraymap?  What prevents reference cycles or absurdly
large numbers of struct files getting pinned?

--Andy


Re: [RFC v2 00/10] Landlock LSM: Unprivileged sandboxing

2016-08-25 Thread Andy Lutomirski
On Thu, Aug 25, 2016 at 3:32 AM, Mickaël Salaün  wrote:
> Hi,
>
> This series is a proof of concept to fill some missing parts of seccomp, such as
> the ability to check syscall argument pointers or to create more dynamic security
> policies. The goal of this new stackable Linux Security Module (LSM) called
> Landlock is to allow any process, including unprivileged ones, to create
> powerful security sandboxes comparable to the Seatbelt/XNU Sandbox or the
> OpenBSD Pledge. This kind of sandbox helps to mitigate the security impact of
> bugs or unexpected/malicious behaviors in userland applications.
>

Maybe I'm missing an obvious description, but: do you have a
description of the eBPF API to landlock?  What function do you
provide, when is it called, what functions can it call, what does the
fancy new arraymap do, etc?

--Andy


RE: [PATCH for-next 0/2] {IB,net}/hns: Add support of ACPI to the Hisilicon RoCE Driver

2016-08-25 Thread Salil Mehta


> -Original Message-
> From: David Miller [mailto:da...@davemloft.net]
> Sent: Thursday, August 25, 2016 5:54 AM
> To: Salil Mehta
> Cc: dledf...@redhat.com; Huwei (Xavier); oulijun; Zhuangyuzeng (Yisen);
> mehta.salil@gmail.com; linux-r...@vger.kernel.org;
> netdev@vger.kernel.org; linux-ker...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH for-next 0/2] {IB,net}/hns: Add support of ACPI to
> the Hisilicon RoCE Driver
> 
> From: Salil Mehta 
> Date: Wed, 24 Aug 2016 04:44:48 +0800
> 
> > This patch is meant to add support of ACPI to the Hisilicon RoCE driver.
> > Following changes have been made in the driver(s):
> >
> > Patch 1/2: HNS Ethernet Driver: changes to support ACPI have been done in
> >    the RoCE reset function part of the HNS ethernet driver. Earlier it only
> >    supported DT/syscon.
> >
> > Patch 2/2. HNS RoCE driver: changes done in RoCE driver are meant to detect
> >    the type and then either use DT specific or ACPI specific functions. Where
> >    ever possible, this patch tries to make use of "Unified Device Property
> >    Interface" APIs to support both DT and ACPI through single interface.
> >
> > NOTE 1: ACPI changes done in both of the drivers depend upon the ACPI Table
> >  (DSDT and IORT tables) changes part of UEFI/BIOS. These changes are NOT
> >  part of this patch-set.
> > NOTE 2: Reset function in Patch 1/2 depends upon the reset function added in
> >  ACPI tables(basically DSDT table) part of the UEFI/BIOS. Again, this
> >  change is NOT reflected in this patch-set.
> 
> I can't apply this series to my tree because the hns infiniband driver
> doesn't exist in it.
Hi David,
I understand your point. This patch-set was primarily sent for Doug Ledford
and is based on his internal repository (which has been rebased on
net-next).

We were hoping, though, that the acceptance of the patch below (part of this
patch-set) into net-next could be expedited. This might also help Doug Ledford
later on when he pushes the already accepted RoCE driver and ACPI patches
to linux-next:

"[PATCH for-next 1/2] net: hns: Add support of ACPI to HNS driver RoCE Reset
 function"

The HNS RoCE reset function patch below has already been accepted and is part
of your net-next:
https://patchwork.kernel.org/patch/9287497/

The ACPI support patch for the RoCE reset function named above applies cleanly
on top of the already accepted patch in the link. It does not depend on the
other accompanying RoCE driver ACPI patch or even on the presence of the
Infiniband/RoCE driver in the net-next repository.

Could you please suggest anything here?  

Thanks 
Salil



Re: [PATCH for-next 0/2] {IB,net}/hns: Add support of ACPI to the Hisilicon RoCE Driver

2016-08-25 Thread Doug Ledford
On 8/25/2016 12:53 AM, David Miller wrote:
> From: Salil Mehta 
> Date: Wed, 24 Aug 2016 04:44:48 +0800
> 
>> This patch is meant to add support of ACPI to the Hisilicon RoCE driver.
>> Following changes have been made in the driver(s):
>>
>> Patch 1/2: HNS Ethernet Driver: changes to support ACPI have been done in
>>the RoCE reset function part of the HNS ethernet driver. Earlier it only
>>supported DT/syscon.
>>
>> Patch 2/2. HNS RoCE driver: changes done in RoCE driver are meant to detect
>>the type and then either use DT specific or ACPI specific functions. Where
>>ever possible, this patch tries to make use of "Unified Device Property
>>Interface" APIs to support both DT and ACPI through single interface.
>>
>> NOTE 1: ACPI changes done in both of the drivers depend upon the ACPI Table
>>  (DSDT and IORT tables) changes part of UEFI/BIOS. These changes are NOT
>>  part of this patch-set.
>> NOTE 2: Reset function in Patch 1/2 depends upon the reset function added in
>>  ACPI tables(basically DSDT table) part of the UEFI/BIOS. Again, this
>>  change is NOT reflected in this patch-set.
> 
> I can't apply this series to my tree because the hns infiniband driver
> doesn't exist in it.
> 

No.  This probably needs to go through my tree.  Although with all of
the requirements, I'm a bit concerned about those being present elsewhere.

-- 
Doug Ledford 
GPG Key ID: 0E572FDD





Re: [PATCH ipsec-next] xfrm: state: remove per-netns gc task

2016-08-25 Thread Steffen Klassert
On Tue, Aug 23, 2016 at 04:00:12PM +0200, Florian Westphal wrote:
> After commit 5b8ef3415a21f173
> ("xfrm: Remove ancient sleeping when the SA is in acquire state")
> gc does not need any per-netns data anymore.
> 
> As far as gc is concerned all state structs are the same, so we
> can use a global work struct for it.
> 
> Signed-off-by: Florian Westphal 

Applied to ipsec-next, thanks Florian!


RE: [PATCH for-next 0/2] {IB,net}/hns: Add support of ACPI to the Hisilicon RoCE Driver

2016-08-25 Thread Salil Mehta


> -Original Message-
> From: Doug Ledford [mailto:dledf...@redhat.com]
> Sent: Thursday, August 25, 2016 12:57 PM
> To: David Miller; Salil Mehta
> Cc: Huwei (Xavier); oulijun; Zhuangyuzeng (Yisen);
> mehta.salil@gmail.com; linux-r...@vger.kernel.org;
> netdev@vger.kernel.org; linux-ker...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH for-next 0/2] {IB,net}/hns: Add support of ACPI to
> the Hisilicon RoCE Driver
> 
> On 8/25/2016 12:53 AM, David Miller wrote:
> > From: Salil Mehta 
> > Date: Wed, 24 Aug 2016 04:44:48 +0800
> >
> >> This patch is meant to add support of ACPI to the Hisilicon RoCE driver.
> >> Following changes have been made in the driver(s):
> >>
> >> Patch 1/2: HNS Ethernet Driver: changes to support ACPI have been done in
> >>    the RoCE reset function part of the HNS ethernet driver. Earlier it only
> >>    supported DT/syscon.
> >>
> >> Patch 2/2. HNS RoCE driver: changes done in RoCE driver are meant to detect
> >>    the type and then either use DT specific or ACPI specific functions. Where
> >>    ever possible, this patch tries to make use of "Unified Device Property
> >>    Interface" APIs to support both DT and ACPI through single interface.
> >>
> >> NOTE 1: ACPI changes done in both of the drivers depend upon the ACPI Table
> >>  (DSDT and IORT tables) changes part of UEFI/BIOS. These changes are NOT
> >>  part of this patch-set.
> >> NOTE 2: Reset function in Patch 1/2 depends upon the reset function added in
> >>  ACPI tables(basically DSDT table) part of the UEFI/BIOS. Again, this
> >>  change is NOT reflected in this patch-set.
> >
> > I can't apply this series to my tree because the hns infiniband
> driver
> > doesn't exist in it.
> >
> 
> No.  This probably needs to go through my tree.  Although with all of
> the requirements, I'm a bit concerned about those being present
> elsewhere.
> 
> --
> Doug Ledford 
> GPG Key ID: 0E572FDD
Hello Doug,
Thanks for your reply. I have just replied to David's email as well and did
not realize your response was already on the way. Sorry for this!

I would just like to ask whether, by any chance, we can expedite the acceptance
of the patch below (part of this patch-set) into net-next. This might help you
as well in the future when you actually push the RoCE driver to linux-next.

"[PATCH for-next 1/2] net: hns: Add support of ACPI to HNS driver RoCE Reset
 function"

The HNS RoCE reset function patch below has already been accepted by Dave
Miller and is part of net-next:
https://patchwork.kernel.org/patch/9287497/

Also, the ACPI support patch for the RoCE reset function named above applies
cleanly on top of the already accepted patch in the link and does not depend
on the other accompanying RoCE driver ACPI changes or even on the presence of
the Infiniband/RoCE driver in the net-next repository.

Could you please let me know whether this is something that can be considered?

Thanks in anticipation
Salil Mehta





Re: [PATCH net-next v3 1/2] net: ethernet: mediatek: modify to use the PDMA instead of the QDMA for Ethernet RX

2016-08-25 Thread John Crispin


On 25/08/2016 04:26, Nelson Chang wrote:
> Because the PDMA has richer features than the QDMA for Ethernet RX
> (such as multiple RX rings, HW LRO, etc.),
> the patch modifies to use the PDMA to handle Ethernet RX.
> 
> Signed-off-by: Nelson Chang 

Acked-by: John Crispin 
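
The core of the interrupt helper change in this patch is that the mask
register becomes a parameter, so the same read-modify-write can serve both the
QDMA and the PDMA mask. A minimal user-space sketch of that idea follows. The
register indices, struct fake_eth and the function names are hypothetical, the
register file is a plain array instead of MMIO, and the spinlock the driver
holds around the update is omitted.

```c
#include <stdint.h>

/* Hypothetical register indices standing in for the two mask registers. */
enum { REG_QDMA_INT_MASK, REG_PDMA_INT_MASK, REG_COUNT };

struct fake_eth {
	uint32_t regs[REG_COUNT];	/* simulated register space */
};

/* Set the bits in `mask` in the interrupt mask register `reg`. */
static void irq_enable(struct fake_eth *eth, unsigned int reg, uint32_t mask)
{
	eth->regs[reg] |= mask;
}

/* Clear the bits in `mask` in the interrupt mask register `reg`. */
static void irq_disable(struct fake_eth *eth, unsigned int reg, uint32_t mask)
{
	eth->regs[reg] &= ~mask;
}
```

With the register passed in, enabling an RX interrupt on the PDMA mask leaves
the QDMA mask used for TX untouched.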


> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 76 
> +
>  drivers/net/ethernet/mediatek/mtk_eth_soc.h | 31 +++-
>  2 files changed, 74 insertions(+), 33 deletions(-)
>  mode change 100644 => 100755 drivers/net/ethernet/mediatek/mtk_eth_soc.c
>  mode change 100644 => 100755 drivers/net/ethernet/mediatek/mtk_eth_soc.h
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> old mode 100644
> new mode 100755
> index 1801fd8..cbeb793
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -342,25 +342,27 @@ static void mtk_mdio_cleanup(struct mtk_eth *eth)
>   mdiobus_free(eth->mii_bus);
>  }
>  
> -static inline void mtk_irq_disable(struct mtk_eth *eth, u32 mask)
> +static inline void mtk_irq_disable(struct mtk_eth *eth,
> +unsigned reg, u32 mask)
>  {
>   unsigned long flags;
>   u32 val;
>  
>   spin_lock_irqsave(&eth->irq_lock, flags);
> - val = mtk_r32(eth, MTK_QDMA_INT_MASK);
> - mtk_w32(eth, val & ~mask, MTK_QDMA_INT_MASK);
> + val = mtk_r32(eth, reg);
> + mtk_w32(eth, val & ~mask, reg);
>   spin_unlock_irqrestore(&eth->irq_lock, flags);
>  }
>  
> -static inline void mtk_irq_enable(struct mtk_eth *eth, u32 mask)
> +static inline void mtk_irq_enable(struct mtk_eth *eth,
> +   unsigned reg, u32 mask)
>  {
>   unsigned long flags;
>   u32 val;
>  
>   spin_lock_irqsave(&eth->irq_lock, flags);
> - val = mtk_r32(eth, MTK_QDMA_INT_MASK);
> - mtk_w32(eth, val | mask, MTK_QDMA_INT_MASK);
> + val = mtk_r32(eth, reg);
> + mtk_w32(eth, val | mask, reg);
>   spin_unlock_irqrestore(&eth->irq_lock, flags);
>  }
>  
> @@ -897,12 +899,12 @@ release_desc:
>* we continue
>*/
>   wmb();
> - mtk_w32(eth, ring->calc_idx, MTK_QRX_CRX_IDX0);
> + mtk_w32(eth, ring->calc_idx, MTK_PRX_CRX_IDX0);
>   done++;
>   }
>  
>   if (done < budget)
> - mtk_w32(eth, MTK_RX_DONE_INT, MTK_QMTK_INT_STATUS);
> + mtk_w32(eth, MTK_RX_DONE_INT, MTK_PDMA_INT_STATUS);
>  
>   return done;
>  }
> @@ -1012,7 +1014,7 @@ static int mtk_napi_tx(struct napi_struct *napi, int 
> budget)
>   return budget;
>  
>   napi_complete(napi);
> - mtk_irq_enable(eth, MTK_TX_DONE_INT);
> + mtk_irq_enable(eth, MTK_QDMA_INT_MASK, MTK_TX_DONE_INT);
>  
>   return tx_done;
>  }
> @@ -1024,12 +1026,12 @@ static int mtk_napi_rx(struct napi_struct *napi, int 
> budget)
>   int rx_done = 0;
>  
>   mtk_handle_status_irq(eth);
> - mtk_w32(eth, MTK_RX_DONE_INT, MTK_QMTK_INT_STATUS);
> + mtk_w32(eth, MTK_RX_DONE_INT, MTK_PDMA_INT_STATUS);
>   rx_done = mtk_poll_rx(napi, budget, eth);
>  
>   if (unlikely(netif_msg_intr(eth))) {
> - status = mtk_r32(eth, MTK_QMTK_INT_STATUS);
> - mask = mtk_r32(eth, MTK_QDMA_INT_MASK);
> + status = mtk_r32(eth, MTK_PDMA_INT_STATUS);
> + mask = mtk_r32(eth, MTK_PDMA_INT_MASK);
>   dev_info(eth->dev,
>"done rx %d, intr 0x%08x/0x%x\n",
>rx_done, status, mask);
> @@ -1038,12 +1040,12 @@ static int mtk_napi_rx(struct napi_struct *napi, int 
> budget)
>   if (rx_done == budget)
>   return budget;
>  
> - status = mtk_r32(eth, MTK_QMTK_INT_STATUS);
> + status = mtk_r32(eth, MTK_PDMA_INT_STATUS);
>   if (status & MTK_RX_DONE_INT)
>   return budget;
>  
>   napi_complete(napi);
> - mtk_irq_enable(eth, MTK_RX_DONE_INT);
> + mtk_irq_enable(eth, MTK_PDMA_INT_MASK, MTK_RX_DONE_INT);
>  
>   return rx_done;
>  }
> @@ -1092,6 +1094,7 @@ static int mtk_tx_alloc(struct mtk_eth *eth)
>   mtk_w32(eth,
>   ring->phys + ((MTK_DMA_SIZE - 1) * sz),
>   MTK_QTX_DRX_PTR);
> + mtk_w32(eth, (QDMA_RES_THRES << 8) | QDMA_RES_THRES, MTK_QTX_CFG(0));
>  
>   return 0;
>  
> @@ -1162,11 +1165,10 @@ static int mtk_rx_alloc(struct mtk_eth *eth)
>*/
>   wmb();
>  
> - mtk_w32(eth, eth->rx_ring.phys, MTK_QRX_BASE_PTR0);
> - mtk_w32(eth, MTK_DMA_SIZE, MTK_QRX_MAX_CNT0);
> - mtk_w32(eth, eth->rx_ring.calc_idx, MTK_QRX_CRX_IDX0);
> - mtk_w32(eth, MTK_PST_DRX_IDX0, MTK_QDMA_RST_IDX);
> - mtk_w32(eth, (QDMA_RES_THRES << 8) | QDMA_RES_THRES, MTK_QTX_CFG(0));
> + mtk_w32(eth, eth->rx_ring.phys, MTK_PRX_BASE_PTR0);
> + mtk_w32(eth, MTK_DMA_SIZE, MTK_PRX_MAX_CNT0);
> + mtk_w32(eth, eth->rx_ring.calc_idx, MTK_PRX_CRX_IDX0);
> + mtk_w32(eth, MT

Re: [PATCH net-next v3 2/2] net: ethernet: mediatek: modify GDM to send packets to the PDMA for RX

2016-08-25 Thread John Crispin


On 25/08/2016 04:26, Nelson Chang wrote:
> Because we change to use the PDMA as the Ethernet RX DMA engine,
> the patch modifies to set GDM to send packets to PDMA for RX.
> 
> Signed-off-by: Nelson Chang 
> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 4 ++--
>  drivers/net/ethernet/mediatek/mtk_eth_soc.h | 0
>  2 files changed, 2 insertions(+), 2 deletions(-)
>  mode change 100755 => 100644 drivers/net/ethernet/mediatek/mtk_eth_soc.c
>  mode change 100755 => 100644 drivers/net/ethernet/mediatek/mtk_eth_soc.h
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> old mode 100755
> new mode 100644
> index cbeb793..c47fef4
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -1473,9 +1473,9 @@ static int __init mtk_hw_init(struct mtk_eth *eth)
>   for (i = 0; i < 2; i++) {
>   u32 val = mtk_r32(eth, MTK_GDMA_FWD_CFG(i));
>  
> - /* setup the forward port to send frame to QDMA */
> + /* setup the forward port to send frame to PDMA */
>   val &= ~0x;
> - val |= 0x;
> + val |= 0x;

masking it and then setting it to 0 makes no sense. simply remove the
2nd line.

John
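
To make the point concrete: once a field has been cleared with a mask, OR-ing
in zero leaves the value unchanged, so the second statement has no effect. A
small sketch, assuming a hypothetical 16-bit field mask FWD_MASK (the actual
constants are elided in the quoted patch):

```c
#include <stdint.h>

/* Hypothetical field mask; the real constants are elided in the quote. */
#define FWD_MASK 0xffffu

/* Clear the forward-port field, then OR in the requested port bits. */
static uint32_t set_fwd_port(uint32_t val, uint32_t port_bits)
{
	val &= ~FWD_MASK;
	val |= port_bits & FWD_MASK;
	return val;
}
```

Calling set_fwd_port(val, 0) returns exactly val & ~FWD_MASK, which is why
the OR can be dropped when the target value is zero.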

>  
>   /* Enable RX checksum */
>   val |= MTK_GDMA_ICS_EN | MTK_GDMA_TCS_EN | MTK_GDMA_UCS_EN;
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
> old mode 100755
> new mode 100644
> 


Re: [PATCH net-next v1] gso: Support partial splitting at the frag_list pointer

2016-08-25 Thread Marcelo Ricardo Leitner
On Thu, Aug 25, 2016 at 09:31:26AM +0200, Steffen Klassert wrote:
> On Wed, Aug 24, 2016 at 02:25:29PM -0300, Marcelo Ricardo Leitner wrote:
> > Em 24-08-2016 13:27, Alexander Duyck escreveu:
> > >
> > >I'm adding Marcelo as he could probably explain the GSO_BY_FRAGS
> > >functionality better than I could since he is the original author.
> > >
> > >If I recall GSO_BY_FRAGS does something similar to what you are doing,
> > >although I believe it doesn't carry any data in the first buffer other
> > >than just a header.  I believe the idea behind GSO_BY_FRAGS was to
> > >allow for segmenting a frame at the frag_list level instead of having
> > >it done just based on MSS.  That was the only reason why I brought it
> > >up.
> > >
> > 
> > That's exactly it.
> > 
> > On this no data in the first buffer limitation, we probably can
> > allow it have some data in there. It was done this way just because
> > sctp is using skb_gro_receive() to build such skb and this was the
> > way I found to get such frag_list skb generated by it, thus
> > preserving frame boundaries.
> 
> Just to understand what you are doing. You generate MTU sized linear
> buffers in sctp and then, skb_gro_receive() chains up these buffers
> at the frag_list pointer. skb_gro_receive() does this because
> skb_gro_offset is null and skb->head_frag is not set in your case.
> 
> At segmentation, you just need to split at the frag_list pointer
> because you know that the chained buffers fit the MTU, right?
> 

Correct. Just note that these buffers fit the MTU, but do not necessarily
use all of it. That is the main point here: variable segmentation size.


Re: [PATCH 1/1] netfilter: gre: Use the consitent GRE and PPTP struct instead of the structures defined in netfilter

2016-08-25 Thread Pablo Neira Ayuso
On Fri, Aug 19, 2016 at 11:03:46PM +0800, Feng Gao wrote:
> My email server reports the last same patch email failed to send.
> So I just sent it again.
> 
> I am sorry, if anyone receives duplicated ones.

git am v2-1-2-net-next-netfilter-gre-Use-consistent-GRE_-macros-instead-of-ones-defined-by-netfilter..patch -s
Applying: netfilter: gre: Use consistent GRE_* macros instead of ones defined by netfilter.
error: patch failed: include/uapi/linux/if_tunnel.h:36
error: include/uapi/linux/if_tunnel.h: patch does not apply

It seems your base was missing this patch:

commit ab10dccb11608b96b43b557c12a5ad867723e503
Author: Gao Feng 
Date:   Tue Aug 9 12:38:24 2016 +0800

rps: Inspect PPTP encapsulated by GRE to get flow hash

Since I cannot see GRE_FLAGS in your patch as context.

Please rebase and resubmit, thanks!


Re: [RESEND PATCH net 01/10] net: ethernet: mediatek: fix fails from TX housekeeping due to incorrect port setup

2016-08-25 Thread John Crispin


On 25/08/2016 12:44, Sean Wang wrote:
> Which net device an SKB is completed for depends on the forward port
> in txd4 of the corresponding TX descriptor, but that information isn't
> set up for SKB fragments, which can lead to a watchdog timeout
> from the upper layer, so fix it up.
> 
> Signed-off-by: Sean Wang 

Acked-by: John Crispin 
> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index 1801fd8..6e4a6ca 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -587,14 +587,15 @@ static int mtk_tx_map(struct sk_buff *skb, struct 
> net_device *dev,
>   dma_addr_t mapped_addr;
>   unsigned int nr_frags;
>   int i, n_desc = 1;
> - u32 txd4 = 0;
> + u32 txd4 = 0, fport;
>  
>   itxd = ring->next_free;
>   if (itxd == ring->last_free)
>   return -ENOMEM;
>  
>   /* set the forward port */
> - txd4 |= (mac->id + 1) << TX_DMA_FPORT_SHIFT;
> + fport = (mac->id + 1) << TX_DMA_FPORT_SHIFT;
> + txd4 |= fport;
>  
>   tx_buf = mtk_desc_to_tx_buf(ring, itxd);
>   memset(tx_buf, 0, sizeof(*tx_buf));
> @@ -652,7 +653,7 @@ static int mtk_tx_map(struct sk_buff *skb, struct 
> net_device *dev,
>   WRITE_ONCE(txd->txd3, (TX_DMA_SWC |
>  TX_DMA_PLEN0(frag_map_size) |
>  last_frag * TX_DMA_LS0));
> - WRITE_ONCE(txd->txd4, 0);
> + WRITE_ONCE(txd->txd4, fport);
>  
>   tx_buf->skb = (struct sk_buff *)MTK_DMA_DUMMY_DESC;
>   tx_buf = mtk_desc_to_tx_buf(ring, txd);
> 
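
The fix quoted above boils down to writing the forward port into every
descriptor of a packet rather than only the first. A simplified user-space
sketch of why that matters is shown below. The descriptor layout, FPORT_SHIFT,
FPORT_MASK and both function names are assumptions made for illustration; the
driver's real descriptors carry more fields and flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's forward-port shift and mask. */
#define FPORT_SHIFT 25
#define FPORT_MASK  0x7u

struct fake_txd {
	uint32_t txd4;
};

/*
 * Fill `n` descriptors for one packet sent by MAC `mac_id`. Every
 * descriptor, not only the first, carries the forward port.
 */
static void fill_descs(struct fake_txd *txd, size_t n, unsigned int mac_id)
{
	uint32_t fport = (uint32_t)(mac_id + 1) << FPORT_SHIFT;
	size_t i;

	for (i = 0; i < n; i++)
		txd[i].txd4 = fport;
}

/* Recover the MAC index from a descriptor; -1 means no port was set. */
static int desc_to_mac(const struct fake_txd *txd)
{
	return (int)((txd->txd4 >> FPORT_SHIFT) & FPORT_MASK) - 1;
}
```

A fragment descriptor whose txd4 was left at zero decodes to -1, so a
completion path relying on it cannot find the owning device.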


Re: [RESEND PATCH net 04/10] net: ethernet: mediatek: remove redundant free_irq for devm_request_irq allocated irq

2016-08-25 Thread John Crispin


On 25/08/2016 12:44, Sean Wang wrote:
> these irqs are not used for shared irq and disabled during ethernet stops.
> irq requested by devm_request_irq is safe to be freed automatically on
> driver detach.
> 
> Signed-off-by: Sean Wang 

Acked-by: John Crispin 

> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index 1b131a1..9883dac 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -1506,8 +1506,6 @@ static void mtk_uninit(struct net_device *dev)
>   phy_disconnect(mac->phy_dev);
>   mtk_mdio_cleanup(eth);
>   mtk_irq_disable(eth, ~0);
> - free_irq(eth->irq[1], dev);
> - free_irq(eth->irq[2], dev);
>  }
>  
>  static int mtk_do_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
> 


Re: [RESEND PATCH net 03/10] net: ethernet: mediatek: fix API usage with skb_free_frag

2016-08-25 Thread John Crispin


On 25/08/2016 12:44, Sean Wang wrote:
> use skb_free_frag() instead of legacy put_page()
> 
> Signed-off-by: Sean Wang 

Acked-by: John Crispin 

> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index 02b048f..1b131a1 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -864,7 +864,7 @@ static int mtk_poll_rx(struct napi_struct *napi, int 
> budget,
>   /* receive data */
>   skb = build_skb(data, ring->frag_size);
>   if (unlikely(!skb)) {
> - put_page(virt_to_head_page(new_data));
> + skb_free_frag(new_data);
>   netdev->stats.rx_dropped++;
>   goto release_desc;
>   }
> 


Re: [PATCH net 2/2] sctp: not copying duplicate addrs to the assoc's bind address list

2016-08-25 Thread Marcelo Ricardo Leitner
On Thu, Aug 25, 2016 at 12:03:30PM +0800, Xin Long wrote:
> > Or add a refcnt to its members. 
> > NETDEV_UP, it gets a ++ if it's already there
> > NETDEV_DOWN, it gets a -- and cleans it up if it reaches 0
> > And the rest probably could stay the same.
> >
> Yes, it could also avoid the issue of large numbers of duplicate addrs.
> Or add a nic index variable to its members.
> 
> But I still prefer the current patch.
> 1. This issue only happens when the server binds 'ANY' addresses.
> We don't need to add any new members to struct sctp_sockaddr_entry,
> especially if it's really a corner case; we fix this as an improvement.
> 
> 2. There are two cases here: the duplicate addrs may be from
>a) different local NICs.
>b) the same NIC.
>It may be unexpected to filter them in NETDEV_UP/DOWN events.
> 
> 3. We check it only when sctp really binds it, just like sctp_do_bind.
> 
> What do you think ?

Yep, +1. LGTM the current patch, thanks.
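
For reference, the refcount alternative mentioned in the quoted text (the one
not chosen here) could look roughly like the sketch below. Everything in it is
hypothetical: the fixed-size list, the IPv4-only entry and the function names
are for illustration and do not reflect struct sctp_sockaddr_entry.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ADDRS 8

/* Hypothetical entry: an IPv4 address plus a count of NICs carrying it. */
struct addr_entry {
	uint32_t addr;
	unsigned int refcnt;
};

struct addr_list {
	struct addr_entry e[MAX_ADDRS];
	size_t n;
};

static struct addr_entry *addr_find(struct addr_list *l, uint32_t addr)
{
	size_t i;

	for (i = 0; i < l->n; i++)
		if (l->e[i].addr == addr)
			return &l->e[i];
	return NULL;
}

/* NETDEV_UP-like event: bump the count if present, else add an entry.
 * Returns 0 on success, -1 if the list is full. */
static int addr_up(struct addr_list *l, uint32_t addr)
{
	struct addr_entry *e = addr_find(l, addr);

	if (e) {
		e->refcnt++;
		return 0;
	}
	if (l->n == MAX_ADDRS)
		return -1;
	l->e[l->n].addr = addr;
	l->e[l->n].refcnt = 1;
	l->n++;
	return 0;
}

/* NETDEV_DOWN-like event: drop one reference, remove the entry at zero. */
static void addr_down(struct addr_list *l, uint32_t addr)
{
	struct addr_entry *e = addr_find(l, addr);

	if (!e)
		return;
	if (--e->refcnt == 0) {
		*e = l->e[l->n - 1];	/* move the last entry into the hole */
		l->n--;
	}
}
```

An address present on two NICs then appears once with a count of 2 and only
disappears after both have gone down.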



Re: [RESEND PATCH net 05/10] net: ethernet: mediatek: fix logic unbalance between probe and remove

2016-08-25 Thread John Crispin


On 25/08/2016 12:44, Sean Wang wrote:
> original mdio_cleanup is not in the symmetric place against where
> mdio_init is, so relocate mdio_cleanup to the right one.
> 
> Signed-off-by: Sean Wang 

Acked-by: John Crispin 

> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index 9883dac..5bd31f8 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -1504,7 +1504,6 @@ static void mtk_uninit(struct net_device *dev)
>   struct mtk_eth *eth = mac->hw;
>  
>   phy_disconnect(mac->phy_dev);
> - mtk_mdio_cleanup(eth);
>   mtk_irq_disable(eth, ~0);
>  }
>  
> @@ -1915,6 +1914,7 @@ static int mtk_remove(struct platform_device *pdev)
>   netif_napi_del(&eth->tx_napi);
>   netif_napi_del(&eth->rx_napi);
>   mtk_cleanup(eth);
> + mtk_mdio_cleanup(eth);
>  
>   return 0;
>  }
> 


[PATCH iproute2 net-next v3] bridge: vlan: add support to display per-vlan statistics

2016-08-25 Thread Nikolay Aleksandrov
This patch adds support for the stats argument to the bridge
vlan command which will display the per-vlan statistics and the device
each vlan belongs to with its flags. The supported command filtering
options are dev and vid. Also the man page is updated to explain the new
option.
The patch uses the new RTM_GETSTATS interface with a filter_mask to dump
all bridges and ports vlans. Later we can add support for using the
per-device dump and filter it in the kernel instead.

Example:
$ bridge -s vlan show
port              vlan id
br0               1 Egress Untagged
                RX: 2536 bytes 20 packets
                TX: 2536 bytes 20 packets
                  101
                RX: 43158 bytes 50 packets
                TX: 43158 bytes 50 packets
eth1              1 Egress Untagged
                RX: 2536 bytes 20 packets
                TX: 2536 bytes 20 packets
                  100
                RX: 0 bytes 0 packets
                TX: 0 bytes 0 packets
                  101
                RX: 43158 bytes 50 packets
                TX: 43158 bytes 50 packets
                  102
                RX: 16897 bytes 93 packets
                TX: 0 bytes 0 packets

The format is the same as bridge vlan show but with stats, even though
under the hood the calls done to the kernel are different.

Signed-off-by: Nikolay Aleksandrov 
---
v3: Make the output like "bridge vlan show" and extract vlan flags, this
is only possible with the latest net-next patches and slave vlan stats
dump support, thus target net-next tree. Revert back to use the "-s" switch
as we don't need a special command anymore.

v2: Change the output format as per Stephen's comment and change the -s use
to a subcommand called stats in order to have a different format than show,
update the man page appropriately.

 bridge/vlan.c | 154 --
 include/libnetlink.h  |   8 +++
 include/linux/if_bridge.h |   2 +-
 lib/libnetlink.c  |  20 ++
 man/man8/bridge.8 |   5 ++
 5 files changed, 168 insertions(+), 21 deletions(-)

diff --git a/bridge/vlan.c b/bridge/vlan.c
index d3505b59b6fc..0b6c69077160 100644
--- a/bridge/vlan.c
+++ b/bridge/vlan.c
@@ -15,14 +15,16 @@
 #include "utils.h"
 
 static unsigned int filter_index, filter_vlan;
+static int last_ifidx = -1;
 
 json_writer_t *jw_global = NULL;
 
 static void usage(void)
 {
-   fprintf(stderr, "Usage: bridge vlan { add | del } vid VLAN_ID dev DEV [ 
pvid] [ untagged ]\n");
+   fprintf(stderr, "Usage: bridge vlan { add | del } vid VLAN_ID dev DEV [ 
pvid ] [ untagged ]\n");
fprintf(stderr, " [ 
self ] [ master ]\n");
fprintf(stderr, "   bridge vlan { show } [ dev DEV ] [ vid VLAN_ID 
]\n");
+   fprintf(stderr, "   bridge vlan { stats } [ dev DEV ] [ vid VLAN_ID 
]\n");
exit(-1);
 }
 
@@ -298,6 +300,88 @@ static int print_vlan(const struct sockaddr_nl *who,
return 0;
 }
 
+static void print_one_vlan_stats(FILE *fp,
+const struct bridge_vlan_xstats *vstats,
+int ifindex)
+{
+   const char *ifname = "";
+
+   if (filter_vlan && filter_vlan != vstats->vid)
+   return;
+   /* skip pure port entries, they'll be dumped via the slave stats call */
+   if ((vstats->flags & BRIDGE_VLAN_INFO_MASTER) &&
+   !(vstats->flags & BRIDGE_VLAN_INFO_BRENTRY))
+   return;
+
+   if (last_ifidx != ifindex) {
+   ifname = ll_index_to_name(ifindex);
+   last_ifidx = ifindex;
+   }
+   fprintf(fp, "%-16s  %hu", ifname, vstats->vid);
+   if (vstats->flags & BRIDGE_VLAN_INFO_PVID)
+   fprintf(fp, " PVID");
+   if (vstats->flags & BRIDGE_VLAN_INFO_UNTAGGED)
+   fprintf(fp, " Egress Untagged");
+   fprintf(fp, "\n");
+   fprintf(fp, "%-16sRX: %llu bytes %llu packets\n",
+   "", vstats->rx_bytes, vstats->rx_packets);
+   fprintf(fp, "%-16sTX: %llu bytes %llu packets\n",
+   "", vstats->tx_bytes, vstats->tx_packets);
+}
+
+static void print_vlan_stats_attr(FILE *fp, struct rtattr *attr, int ifindex)
+{
+   struct rtattr *brtb[LINK_XSTATS_TYPE_MAX+1];
+   struct rtattr *i, *list;
+   int rem;
+
+   parse_rtattr(brtb, LINK_XSTATS_TYPE_MAX, RTA_DATA(attr),
+RTA_PAYLOAD(attr));
+   if (!brtb[LINK_XSTATS_TYPE_BRIDGE])
+   return;
+
+   list = brtb[LINK_XSTATS_TYPE_BRIDGE];
+   rem = RTA_PAYLOAD(list);
+   for (i = RTA_DATA(list); RTA_OK(i, rem); i = RTA_NEXT(i, rem)) {
+   if (i->rta_type != BRIDGE_XSTATS_VLAN)
+   continue;
+   print_one_vlan_stats(fp, RTA_DATA(i), ifindex);
+   }
+}
+
+static int print_vlan_stats(const struct sockaddr_nl *who,
+   st

[PATCH net-next] net: bridge: export also pvid flag in the xstats flags

2016-08-25 Thread Nikolay Aleksandrov
When I added support to export the vlan entry flags via xstats I forgot to
add support for the pvid since it is manually matched, so check if the
entry matches the vlan_group's pvid and set the flag appropriately.

Signed-off-by: Nikolay Aleksandrov 
---
 net/bridge/br_netlink.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 872d4c0deb59..190a5bc00f4a 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1313,6 +1313,9 @@ static int br_fill_linkxstats(struct sk_buff *skb,
return -EMSGSIZE;
 
if (vg) {
+   u16 pvid;
+
+   pvid = br_get_pvid(vg);
list_for_each_entry(v, &vg->vlan_list, vlist) {
struct bridge_vlan_xstats vxi;
struct br_vlan_stats stats;
@@ -1322,6 +1325,8 @@ static int br_fill_linkxstats(struct sk_buff *skb,
memset(&vxi, 0, sizeof(vxi));
vxi.vid = v->vid;
vxi.flags = v->flags;
+   if (v->vid == pvid)
+   vxi.flags |= BRIDGE_VLAN_INFO_PVID;
br_vlan_get_stats(v, &stats);
vxi.rx_bytes = stats.rx_bytes;
vxi.rx_packets = stats.rx_packets;
-- 
2.1.4



Re: [PATCH net-next v1] gso: Support partial splitting at the frag_list pointer

2016-08-25 Thread Marcelo Ricardo Leitner
On Thu, Aug 25, 2016 at 01:00:19PM +0200, Steffen Klassert wrote:
> On Wed, Aug 24, 2016 at 09:27:54AM -0700, Alexander Duyck wrote:
> > 
> > In your case though we may be able to make this easier.  If I am not
> > mistaken I believe we should have the main skb, and any in the chain
> > excluding the last containing the same amount of data. 
> 
> Yes, it seems to be like that. With this observation we can simplify
> things. 
> 
> > That being the
> > case we should be able to determine the size that you would need to
> > segment at by taking skb->len, and removing the length of all the
> > skbuffs hanging off of frag_list.  At that point you just use that as
> > your MSS for segmentation and it should break things up so that you
> > have a series of equal sized segments split as the frag_list buffer
> > boundaries.
> > 
> > After that all that is left is to update the gso info for the buffers.
> > For GSO_PARTIAL I was handling that on the first segment only.  For
> > this change you would need to update that code to address the fact
> > that you would have to determine the number of segments on the first
> > frame and the last since the last could be less than the first, but
> > all of the others in-between should have the same number of segments.
> 
> I tried to do this and ended up with the patch below.
> Seems to work, but still needs some tests. So it is
> not an official patch submission.
> 
> Subject: [PATCH net-next RFC] gso: Support partial splitting at the frag_list 
> pointer
> 
> Since commit 8a29111c7 ("net: gro: allow to build full sized skb")
> gro may build buffers with a frag_list. This can hurt forwarding
> because most NICs can't offload such packets, they need to be
> segmented in software. This patch splits buffers with a frag_list
> at the frag_list pointer into buffers that can be TSO offloaded.
> 
> Signed-off-by: Steffen Klassert 
> ---
>  net/core/skbuff.c  | 46 --
>  net/ipv4/af_inet.c |  6 --
>  net/ipv4/gre_offload.c |  4 +++-
>  net/ipv4/tcp_offload.c |  3 +++
>  net/ipv4/udp_offload.c |  6 --
>  net/ipv6/ip6_offload.c |  5 -
>  6 files changed, 54 insertions(+), 16 deletions(-)
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 3864b4b6..cb326e5 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -3060,6 +3060,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>   unsigned int offset = doffset;
>   unsigned int tnl_hlen = skb_tnl_header_len(head_skb);
>   unsigned int partial_segs = 0;
> + unsigned int fraglist_segs = 0;
>   unsigned int headroom;
>   unsigned int len = head_skb->len;
>   __be16 proto;
> @@ -3078,16 +3079,27 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>   sg = !!(features & NETIF_F_SG);
>   csum = !!can_checksum_protocol(features, proto);
>  
> - /* GSO partial only requires that we trim off any excess that
> -  * doesn't fit into an MSS sized block, so take care of that
> -  * now.
> -  */
> - if (sg && csum && (features & NETIF_F_GSO_PARTIAL)) {
> - partial_segs = len / mss;
> - if (partial_segs > 1)
> - mss *= partial_segs;
> - else
> - partial_segs = 0;
> + if (sg && csum) {
> + /* GSO partial only requires that we trim off any excess that
> +  * doesn't fit into an MSS sized block, so take care of that
> +  * now.
> +  */
> + if ((features & NETIF_F_GSO_PARTIAL)) {
> + partial_segs = len / mss;
> + if (partial_segs > 1)
> + mss *= partial_segs;
> + else
> + partial_segs = 0;
> + } else if (list_skb && (mss != GSO_BY_FRAGS) &&
> +net_gso_ok(features, 
> skb_shinfo(head_skb)->gso_type)) {
> +
> + skb_walk_frags(head_skb, segs) {
> + len -= segs->len;
> + }
> + fraglist_segs = len / mss;
> + mss = len;
> + segs = NULL;
> + }
>   }
>  
>   headroom = skb_headroom(head_skb);
> @@ -3298,6 +3310,20 @@ perform_csum_check:
>   SKB_GSO_CB(segs)->data_offset = skb_headroom(segs) + doffset;
>   }
>  
> + if (fraglist_segs) {
> + struct sk_buff *iter;
> +
> + for (iter = segs; iter; iter = iter->next) {
> + if (iter->next) {
> + skb_shinfo(iter)->gso_size = 
> skb_shinfo(head_skb)->gso_size;
> + skb_shinfo(iter)->gso_segs = fraglist_segs;
> + } else {
> + skb_shinfo(iter)->gso_size = 
> skb_shinfo(head_skb)->gso_size;
> + skb_shinfo(iter)->gso_segs = iter->len / 
> skb_shinfo(head

Re: CVE-2014-9900 fix is not upstream

2016-08-25 Thread Johannes Berg

> If we want to go down this route, probably the only option is to add
> __attribute__((packed)) to those structs so they have no padding at all,
> thus breaking uapi.
> 

We could also spell out the padding bytes as reserved, i.e. instead of

struct ethtool_wolinfo {
        __u32   cmd;
        __u32   supported;
        __u32   wolopts;
        __u8    sopass[SOPASS_MAX]; // 6, actually
};

we could do

struct ethtool_wolinfo {
        __u32   cmd;
        __u32   supported;
        __u32   wolopts;
        __u8    sopass[SOPASS_MAX]; // 6, actually
        __u8    reserved[2];
};

and then the compiler has to properly treat it, since it's no longer
unnamed padding.

Maybe somebody can come up with a smart BUILD_BUG_ON() to ensure such
structs have no padding.

That would allow us to keep the C99 initializers (which is nice) and
not have to worry about this.

johannes
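
One possible shape for the build-time check mentioned above, as a userspace sketch only: it uses C11 static_assert rather than the kernel's BUILD_BUG_ON(), fixed-width stdint types in place of __u32/__u8, and the hypothetical names struct wolinfo_reserved and WOLINFO_RESERVED_MEMBER_BYTES. The idea is simply that a struct has no unnamed padding exactly when its size equals the sum of its member sizes.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

#define SOPASS_MAX 6

/* Layout from the thread, with the tail padding spelled out as a member. */
struct wolinfo_reserved {
	uint32_t cmd;
	uint32_t supported;
	uint32_t wolopts;
	uint8_t  sopass[SOPASS_MAX];
	uint8_t  reserved[2];
};

/* Hypothetical helper: the member sizes added up by hand. */
#define WOLINFO_RESERVED_MEMBER_BYTES \
	(3 * sizeof(uint32_t) + SOPASS_MAX + 2)

/* Fails the build if the compiler inserted any unnamed padding. */
static_assert(sizeof(struct wolinfo_reserved) == WOLINFO_RESERVED_MEMBER_BYTES,
	      "struct wolinfo_reserved contains unnamed padding");

/* Runtime view of the same condition: 1 when the struct is padding-free. */
static int wolinfo_reserved_is_padding_free(void)
{
	return sizeof(struct wolinfo_reserved) == WOLINFO_RESERVED_MEMBER_BYTES;
}
```

The hand-written sum has to be kept in step with the struct, which is the main weakness of this form; a smarter macro would derive it from the member list.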


Re: CVE-2014-9900 fix is not upstream

2016-08-25 Thread Johannes Berg

> struct ethtool_wolinfo {
>         __u32   cmd;
>         __u32   supported;
>         __u32   wolopts;
>         __u8    sopass[SOPASS_MAX]; // 6, actually
> };
> 
> we could do
> 
> struct ethtool_wolinfo {
>         __u32   cmd;
>         __u32   supported;
>         __u32   wolopts;
>         __u8    sopass[SOPASS_MAX]; // 6, actually
>         __u8    reserved[2];
> };
> 
> and then the compiler has to properly treat it, since it's no longer
> unnamed padding.
> 

Although, on some architectures, that could actually break the ABI by
changing the size, oh well.

johannes
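
To make the size concern concrete, here is a small sketch that simulates an ABI where 32-bit integers only need 2-byte alignment. The simulation uses #pragma pack(2), which GCC and Clang accept; that pragma and the struct names are assumptions made for illustration and say nothing about which real architectures behave this way.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

#define SOPASS_MAX 6

/* Assumption: pack(2) stands in for an ABI with 2-byte int alignment. */
#pragma pack(push, 2)
struct wol_plain_align2 {
	uint32_t cmd;
	uint32_t supported;
	uint32_t wolopts;
	uint8_t  sopass[SOPASS_MAX];
};

struct wol_reserved_align2 {
	uint32_t cmd;
	uint32_t supported;
	uint32_t wolopts;
	uint8_t  sopass[SOPASS_MAX];
	uint8_t  reserved[2];
};
#pragma pack(pop)

/* Under 2-byte alignment the original needs no tail padding, so the
 * explicit reserved bytes make the struct two bytes larger.
 */
static_assert(sizeof(struct wol_reserved_align2) ==
	      sizeof(struct wol_plain_align2) + 2,
	      "reserved[] changes the struct size under 2-byte alignment");

static size_t wol_plain_size(void)
{
	return sizeof(struct wol_plain_align2);
}

static size_t wol_reserved_size(void)
{
	return sizeof(struct wol_reserved_align2);
}
```

On an ABI with 4-byte alignment both variants come out at the same size, which is why the problem only shows up on some architectures.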


Re: [RESEND PATCH net 06/10] net: ethernet: mediatek: fix the loss of pin-mux setting for GMAC2

2016-08-25 Thread Andrew Lunn
On Thu, Aug 25, 2016 at 06:44:57PM +0800, Sean Wang wrote:
> omitted the setting about pin-mux which results in incorrect signals
> being routed on GMAC2.

Hi Sean

Please could you explain this some more. I don't know too much about
pinctrl, but I've never seen a driver have to do anything with it. The
core driver code handles it all, selecting the default state. So
seeing this here makes me wonder if it is correct.

Thanks
Andrew

> 
> Signed-off-by: Sean Wang 
> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 14 ++
>  drivers/net/ethernet/mediatek/mtk_eth_soc.h |  3 +++
>  2 files changed, 17 insertions(+)
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index 5bd31f8..0a4c782 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -1415,6 +1415,7 @@ static int __init mtk_hw_init(struct mtk_eth *eth)
>   usleep_range(10, 20);
>   reset_control_deassert(eth->rstc);
>   usleep_range(10, 20);
> + pinctrl_select_state(eth->pins, eth->ephy_default);
>  
>   /* Set GE2 driving and slew rate */
>   regmap_write(eth->pctl, GPIO_DRV_SEL10, 0xa00);
> @@ -1858,6 +1859,19 @@ static int mtk_probe(struct platform_device *pdev)
>   return -ENODEV;
>   }
>  
> + eth->pins = devm_pinctrl_get(&pdev->dev);
> + if (IS_ERR(eth->pins)) {
> + dev_err(&pdev->dev, "cannot get pinctrl\n");
> + return PTR_ERR(eth->pins);
> + }
> +
> + eth->ephy_default =
> + pinctrl_lookup_state(eth->pins, "default");
> + if (IS_ERR(eth->ephy_default)) {
> + dev_err(&pdev->dev, "cannot get pinctrl state\n");
> + return PTR_ERR(eth->ephy_default);
> + }
> +
>   clk_prepare_enable(eth->clk_ethif);
>   clk_prepare_enable(eth->clk_esw);
>   clk_prepare_enable(eth->clk_gp1);
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
> index f82e3ac..13d3f1b 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
> @@ -404,6 +404,9 @@ struct mtk_eth {
>   struct clk  *clk_esw;
>   struct clk  *clk_gp1;
>   struct clk  *clk_gp2;
> + struct pinctrl  *pins;
> + struct pinctrl_state*ephy_default;
> +
>   struct mii_bus  *mii_bus;
>   struct work_struct  pending_work;
>  };
> -- 
> 1.9.1
> 


Re: [RESEND PATCH net 09/10] net: ethernet: mediatek: use devm_mdiobus_alloc instead of mdiobus_alloc inside mtk_mdio_init

2016-08-25 Thread John Crispin
Hi Sean,

small nitpick inline

On 25/08/2016 12:45, Sean Wang wrote:
> a lot of parts in the driver uses devm_* APIs to gain benefits from the
> device resource management, so devm_mdiobus_alloc is also used instead
> of mdiobus_alloc to have more elegant code flow.
> 
> Signed-off-by: Sean Wang 
> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 13 +
>  1 file changed, 1 insertion(+), 12 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index e3baa63..05d85da 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -304,7 +304,7 @@ static int mtk_mdio_init(struct mtk_eth *eth)
>   goto err_put_node;
>   }
>  
> - eth->mii_bus = mdiobus_alloc();
> + eth->mii_bus = devm_mdiobus_alloc(eth->dev);
>   if (!eth->mii_bus) {
>   err = -ENOMEM;
>   goto err_put_node;
> @@ -318,18 +318,9 @@ static int mtk_mdio_init(struct mtk_eth *eth)
>  
>   snprintf(eth->mii_bus->id, MII_BUS_ID_SIZE, "%s", mii_np->name);
>   err = of_mdiobus_register(eth->mii_bus, mii_np);
> - if (err)
> - goto err_free_bus;
> - of_node_put(mii_np);
> -
> - return 0;
> -
> -err_free_bus:
> - mdiobus_free(eth->mii_bus);
>  
>  err_put_node:
>   of_node_put(mii_np);
> - eth->mii_bus = NULL;
>   return err;

you might want to rename err to ret. that would make it more readable.
right now it looks like the code always flows through the error path.

John

>  }
>  
> @@ -339,8 +330,6 @@ static void mtk_mdio_cleanup(struct mtk_eth *eth)
>   return;
>  
>   mdiobus_unregister(eth->mii_bus);
> - of_node_put(eth->mii_bus->dev.of_node);
> - mdiobus_free(eth->mii_bus);
>  }
>  
>  static inline void mtk_irq_disable(struct mtk_eth *eth, u32 mask)
> 


Re: [RESEND PATCH net 10/10] net: ethernet: mediatek: fix error handling inside mtk_mdio_init

2016-08-25 Thread John Crispin


On 25/08/2016 12:45, Sean Wang wrote:
> return -ENODEV if no child is found in MDIO bus.
> 
> Signed-off-by: Sean Wang 

Acked-by: John Crispin 

> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index 05d85da..2d547c2 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -300,7 +300,7 @@ static int mtk_mdio_init(struct mtk_eth *eth)
>   }
>  
>   if (!of_device_is_available(mii_np)) {
> - err = 0;
> + err = -ENODEV;
>   goto err_put_node;
>   }
>  
> 


Re: [PATCH net] qdisc: fix a module refcount leak in qdisc_create_dflt()

2016-08-25 Thread Wei Yongjun
On 08/25/2016 12:39 AM, Eric Dumazet wrote:
> From: Eric Dumazet 
>
> Should qdisc_alloc() fail, we must release the module refcount
> we got right before.
>
> Fixes: 6da7c8fcbcbd ("qdisc: allow setting default queuing discipline")
> Signed-off-by: Eric Dumazet 
> ---
>  net/sched/sch_generic.c |9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index e95b67cd5718..657c13362b19 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -643,18 +643,19 @@ struct Qdisc *qdisc_create_dflt(struct netdev_queue 
> *dev_queue,
>   struct Qdisc *sch;
>  
>   if (!try_module_get(ops->owner))
> - goto errout;
> + return NULL;
>  
>   sch = qdisc_alloc(dev_queue, ops);
> - if (IS_ERR(sch))
> - goto errout;
> + if (IS_ERR(sch)) {
> + module_put(ops->owner);
> + return NULL;
> + }
>   sch->parent = parentid;
>  
>   if (!ops->init || ops->init(sch, NULL) == 0)
>   return sch;
>  
>   qdisc_destroy(sch);

ops->init() failed, missing module_put() here.


> -errout:
>   return NULL;
>  }
>  EXPORT_SYMBOL(qdisc_create_dflt);
>
>
> .
>



Re: [PATCH 1/1] netfilter: gre: Use the consitent GRE and PPTP struct instead of the structures defined in netfilter

2016-08-25 Thread Feng Gao
Hi Pablo,

inline

On Thu, Aug 25, 2016 at 8:16 PM, Pablo Neira Ayuso  wrote:
> On Fri, Aug 19, 2016 at 11:03:46PM +0800, Feng Gao wrote:
>> My email server reports the last same patch email failed to send.
>> So I just sent it again.
>>
>> I am sorry, if anyone receives duplicated ones.
>
> git am 
> v2-1-2-net-next-netfilter-gre-Use-consistent-GRE_-macros-instead-of-ones-defined-by-netfilter..patch
> -s
> Applying: netfilter: gre: Use consistent GRE_* macros instead of ones
> defined by netfilter.
> error: patch failed: include/uapi/linux/if_tunnel.h:36
> error: include/uapi/linux/if_tunnel.h: patch does not apply
>
> It seems your base was missing this patch:
>
> commit ab10dccb11608b96b43b557c12a5ad867723e503
> Author: Gao Feng 
> Date:   Tue Aug 9 12:38:24 2016 +0800
>
> rps: Inspect PPTP encapsulated by GRE to get flow hash
>
> Since I cannot see GRE_FLAGS in your patch as context.

Oh, it is possible that the patches are generated based on my local
branch which has this commit locally, not the latest net-next.
Now, I am more familiar with the rules than before.

>
> Please rebase and resubmit, thanks!

Ok, I will resubmit. Sorry for this fault.

Regards
Feng


Re: [RESEND PATCH net 02/10] net: ethernet: mediatek: fix incorrect return value of devm_clk_get with EPROBE_DEFER

2016-08-25 Thread John Crispin


On 25/08/2016 12:44, Sean Wang wrote:
> If the return value of devm_clk_get is EPROBE_DEFER, we should
> defer probing the driver. The change is verified and works based
> on 4.8-rc1 staying with the latest clk-next code for MT7623.
> 
> Signed-off-by: Sean Wang 
> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index 6e4a6ca..02b048f 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -1851,8 +1851,15 @@ static int mtk_probe(struct platform_device *pdev)
>   eth->clk_gp1 = devm_clk_get(&pdev->dev, "gp1");
>   eth->clk_gp2 = devm_clk_get(&pdev->dev, "gp2");
>   if (IS_ERR(eth->clk_esw) || IS_ERR(eth->clk_gp1) ||
> - IS_ERR(eth->clk_gp2) || IS_ERR(eth->clk_ethif))
> - return -ENODEV;
> + IS_ERR(eth->clk_gp2) || IS_ERR(eth->clk_ethif)) {
> + if (PTR_ERR(eth->clk_esw) == -EPROBE_DEFER ||
> + PTR_ERR(eth->clk_gp1) == -EPROBE_DEFER ||
> + PTR_ERR(eth->clk_gp1) == -EPROBE_DEFER ||
> + PTR_ERR(eth->clk_gp2) == -EPROBE_DEFER)
> + return -EPROBE_DEFER;
> + else
> + return -ENODEV;
> + }

Hi Sean,

this looks a bit tedious. maybe a better solution would be to add an
array to struct mtk_eth for the clocks and an enum for the index
mapping. that would allow the usage of loops to work out if all clocks
are fine. the following code calling clk_prepare_enable() could then
also be turned into a loop

John
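
A rough userspace sketch of the array-plus-enum idea, under stated assumptions: enum mtk_clk_id and mtk_check_clocks() are hypothetical names, and each clock lookup result is mocked as a plain int (0 on success, a negative errno otherwise) standing in for what PTR_ERR() would return on the devm_clk_get() result. EPROBE_DEFER is kernel-internal, so the sketch defines it locally.

```c
#include <assert.h>	/* static_assert (C11) */
#include <errno.h>

/* EPROBE_DEFER is not exported to userspace; mirror the kernel value. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical index mapping for the four clocks the driver requests. */
enum mtk_clk_id {
	MTK_CLK_ETHIF,
	MTK_CLK_ESW,
	MTK_CLK_GP1,
	MTK_CLK_GP2,
	MTK_CLK_MAX
};

static_assert(MTK_CLK_MAX == 4, "one slot per requested clock");

/*
 * status[i] mocks the lookup result for clock i.
 * Returns -EPROBE_DEFER if any clock asked for deferral, -ENODEV if any
 * other lookup failed, and 0 when every clock is usable.
 */
static int mtk_check_clocks(const int status[MTK_CLK_MAX])
{
	int ret = 0;

	for (int i = 0; i < MTK_CLK_MAX; i++) {
		if (status[i] == -EPROBE_DEFER)
			return -EPROBE_DEFER;
		if (status[i])
			ret = -ENODEV;
	}
	return ret;
}
```

The same index range could then drive the enable loop, so every clock is covered by construction and none can be checked twice or left out.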

>  
>   clk_prepare_enable(eth->clk_ethif);
>   clk_prepare_enable(eth->clk_esw);
> 


Re: [PATCH for-next 0/2] {IB,net}/hns: Add support of ACPI to the Hisilicon RoCE Driver

2016-08-25 Thread Doug Ledford
On 8/25/2016 8:00 AM, Salil Mehta wrote:
> 
> 
>> -Original Message-
>> From: David Miller [mailto:da...@davemloft.net]
>> Sent: Thursday, August 25, 2016 5:54 AM
>> To: Salil Mehta
>> Cc: dledf...@redhat.com; Huwei (Xavier); oulijun; Zhuangyuzeng (Yisen);
>> mehta.salil@gmail.com; linux-r...@vger.kernel.org;
>> netdev@vger.kernel.org; linux-ker...@vger.kernel.org; Linuxarm
>> Subject: Re: [PATCH for-next 0/2] {IB,net}/hns: Add support of ACPI to
>> the Hisilicon RoCE Driver
>>
>> From: Salil Mehta 
>> Date: Wed, 24 Aug 2016 04:44:48 +0800
>>
>>> This patch is meant to add support of ACPI to the Hisilicon RoCE
>> driver.
>>> Following changes have been made in the driver(s):
>>>
>>> Patch 1/2: HNS Ethernet Driver: changes to support ACPI have been
>> done in
>>>the RoCE reset function part of the HNS ethernet driver. Earlier
>> it only
>>>supported DT/syscon.
>>>
>>> Patch 2/2. HNS RoCE driver: changes done in RoCE driver are meant to
>> detect
>>>the type and then either use DT specific or ACPI specific
>> functions. Where
>>>ever possible, this patch tries to make use of "Unified Device
>> Property
>>>Interface" APIs to support both DT and ACPI through single
>> interface.
>>>
>>> NOTE 1: ACPI changes done in both of the drivers depend upon the ACPI
>> Table
>>>  (DSDT and IORT tables) changes part of UEFI/BIOS. These changes
>> are NOT
>>>  part of this patch-set.
>>> NOTE 2: Reset function in Patch 1/2 depends upon the reset function
>> added in
>>>  ACPI tables(basically DSDT table) part of the UEFI/BIOS. Again,
>> this
>>>  change is NOT reflected in this patch-set.
>>
>> I can't apply this series to my tree because the hns infiniband driver
>> doesn't exist in it.
> Hi David,
> I understand your point. This patch-set was primarily sent for Doug Ledford
> and is based on his internal repository (which has been rebased on the
> net-next). 
> 
> Though we were hoping, if by any chance, we can expedite the acceptance of the
> below patch part of patch-set in the net-next. This might help Doug Ledford as
> well later on when he pushes the already accepted RoCE driver and ACPI patches
> to linux-next,
> 
> "[PATCH for-next 1/2] net: hns: Add support of ACPI to HNS driver RoCE Reset
>  function"
> 
> Below HNS RoCE reset function patch has already been accepted and is part of
> your net-next,
> https://patchwork.kernel.org/patch/9287497/
> 
> Above ACPI support of RoCE Reset patch cleanly applies over the already
> accepted patch in the link. It is not dependent on other accompanying RoCE
> driver ACPI changes related patch or even the presence of the Infiniband/RoCE
> Driver in the net-next repository.
> 
> Could you please suggest anything here?  
> 
> Thanks 
> Salil
> 

I can take both.  I already pulled net-next to get the initial hns roce
reset patch from Dave, so these will apply cleanly with my tree and
merge cleanly with Dave's due to the common ancestral base.  The only
problem is that if you intend to send any other patches that affect this
code, then they would need to come through me until the 4.9 merge window
is complete so that we don't have later merge conflicts.

-- 
Doug Ledford 
GPG Key ID: 0E572FDD




Re: [RFC v2 00/10] Landlock LSM: Unprivileged sandboxing

2016-08-25 Thread Mickaël Salaün

On 25/08/2016 13:05, Andy Lutomirski wrote:
> On Thu, Aug 25, 2016 at 3:32 AM, Mickaël Salaün  wrote:
>> Hi,
>>
>> This series is a proof of concept to fill some missing part of seccomp as the
>> ability to check syscall argument pointers or creating more dynamic security
>> policies. The goal of this new stackable Linux Security Module (LSM) called
>> Landlock is to allow any process, including unprivileged ones, to create
>> powerful security sandboxes comparable to the Seatbelt/XNU Sandbox or the
>> OpenBSD Pledge. This kind of sandbox helps to mitigate the security impact of
>> bugs or unexpected/malicious behaviors in userland applications.
>>
> 
> Maybe I'm missing an obvious description, but: do you have a
> description of the eBPF API to landlock?  What function do you
> provide, when is it called, what functions can it call, what does the
> fancy new arraymap do, etc?
> 
> --Andy
> 

The eBPF context is described in "[RFC v2 06/10] landlock: Add LSM hooks".

The provided eBPF functions are described in "[RFC v2 08/10] landlock:
Handle file system comparisons"
(bpf_landlock_cmp_fs_prop_with_struct_file and
bpf_landlock_cmp_fs_beneath_with_struct_file) and "[RFC v2 09/10]
landlock: Handle cgroups" (bpf_landlock_cmp_cgroup_beneath). The
function descriptions are summarized in include/uapi/linux/bpf.h .

These functions can be called by an eBPF program of type
BPF_PROG_TYPE_LANDLOCK_FILE_OPEN, BPF_PROG_TYPE_LANDLOCK_FILE_PERMISSION
and BPF_PROG_TYPE_LANDLOCK_MMAP_FILE as described in "[RFC v2 06/10]
landlock: Add LSM hooks".

I tried to split the commits as much as possible to ease the review. The
"[RFC v2 10/10] samples/landlock: Add sandbox example" may help to see
the whole picture.

Hope this helps,
 Mickaël




[PATCH net-next] net: flush the softnet backlog in process context

2016-08-25 Thread Paolo Abeni
Currently in process_backlog(), the process_queue dequeuing is
performed with local IRQ disabled, to protect against
flush_backlog(), which runs in hard IRQ context.

This patch moves the flush operation to a work queue and runs the
callback with bottom half disabled to protect the process_queue
against dequeuing.
Since process_queue is now always manipulated in bottom half context,
the irq disable/enable pair around the dequeue operation is removed.

To keep the flush time as low as possible, the flush
works are scheduled on all online cpu simultaneously, using the
high priority work-queue and statically allocated, per cpu,
work structs.

Overall, this change increases the time required to destroy a device
in exchange for slightly better packet reinjection performance.

Acked-by: Hannes Frederic Sowa 
Signed-off-by: Paolo Abeni 
---
rfc -> v1: rebased

 net/core/dev.c | 72 --
 1 file changed, 50 insertions(+), 22 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index a75df86..7feae74 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4292,15 +4292,25 @@ int netif_receive_skb(struct sk_buff *skb)
 }
 EXPORT_SYMBOL(netif_receive_skb);
 
-/* Network device is going away, flush any packets still pending
- * Called with irqs disabled.
- */
-static void flush_backlog(void *arg)
+struct flush_work {
+   struct net_device *dev;
+   struct work_struct work;
+};
+
+DEFINE_PER_CPU(struct flush_work, flush_works);
+
+/* Network device is going away, flush any packets still pending */
+static void flush_backlog(struct work_struct *work)
 {
-   struct net_device *dev = arg;
-   struct softnet_data *sd = this_cpu_ptr(&softnet_data);
+   struct flush_work *flush = container_of(work, typeof(*flush), work);
+   struct net_device *dev = flush->dev;
struct sk_buff *skb, *tmp;
+   struct softnet_data *sd;
+
+   local_bh_disable();
+   sd = this_cpu_ptr(&softnet_data);
 
+   local_irq_disable();
rps_lock(sd);
skb_queue_walk_safe(&sd->input_pkt_queue, skb, tmp) {
if (skb->dev == dev) {
@@ -4310,6 +4320,7 @@ static void flush_backlog(void *arg)
}
}
rps_unlock(sd);
+   local_irq_enable();
 
skb_queue_walk_safe(&sd->process_queue, skb, tmp) {
if (skb->dev == dev) {
@@ -4318,6 +4329,27 @@ static void flush_backlog(void *arg)
input_queue_head_incr(sd);
}
}
+   local_bh_enable();
+}
+
+static void flush_all_backlogs(struct net_device *dev)
+{
+   unsigned int cpu;
+
+   get_online_cpus();
+
+   for_each_online_cpu(cpu) {
+   struct flush_work *flush = per_cpu_ptr(&flush_works, cpu);
+
+   INIT_WORK(&flush->work, flush_backlog);
+   flush->dev = dev;
+   queue_work_on(cpu, system_highpri_wq, &flush->work);
+   }
+
+   for_each_online_cpu(cpu)
+   flush_work(&per_cpu_ptr(&flush_works, cpu)->work);
+
+   put_online_cpus();
 }
 
 static int napi_gro_complete(struct sk_buff *skb)
@@ -4805,8 +4837,9 @@ static bool sd_has_rps_ipi_waiting(struct softnet_data 
*sd)
 
 static int process_backlog(struct napi_struct *napi, int quota)
 {
-   int work = 0;
struct softnet_data *sd = container_of(napi, struct softnet_data, 
backlog);
+   bool again = true;
+   int work = 0;
 
/* Check if we have pending ipi, its better to send them now,
 * not waiting net_rx_action() end.
@@ -4817,23 +4850,20 @@ static int process_backlog(struct napi_struct *napi, 
int quota)
}
 
napi->weight = weight_p;
-   local_irq_disable();
-   while (1) {
+   while (again) {
struct sk_buff *skb;
 
while ((skb = __skb_dequeue(&sd->process_queue))) {
rcu_read_lock();
-   local_irq_enable();
__netif_receive_skb(skb);
rcu_read_unlock();
-   local_irq_disable();
input_queue_head_incr(sd);
-   if (++work >= quota) {
-   local_irq_enable();
+   if (++work >= quota)
return work;
-   }
+
}
 
+   local_irq_disable();
rps_lock(sd);
if (skb_queue_empty(&sd->input_pkt_queue)) {
/*
@@ -4845,16 +4875,14 @@ static int process_backlog(struct napi_struct *napi, 
int quota)
 * and we dont need an smp_mb() memory barrier.
 */
napi->state = 0;
-   rps_unlock(sd);
-
-   break;
+   again = false;
+   } else {
+   skb_queue_splice_tail_init(&sd->input_pkt_queue,
+ 

Re: [RESEND PATCH net 07/10] net: ethernet: mediatek: fix issue of driver removal with interface is up

2016-08-25 Thread John Crispin


On 25/08/2016 12:44, Sean Wang wrote:
> 1) mtk_stop() must be called to stop for freeing DMA resources
> acquired and restoring state changed by mtk_open() when module
> removal.
> 
> 2) group clock disabled related function into mtk_hw_deinit which
> could be reused with others functionality such as the whole ethernet
> reset that would be posted in the later series of patches.
> 

Hi Sean,

these are 2 unrelated changes so they really need to go into two
separate patches. i also think that change 1) would better fit into the
future series making use of that functionality.

John

> Signed-off-by: Sean Wang 
> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 22 ++
>  1 file changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index 0a4c782..c573475 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -1478,6 +1478,16 @@ static int __init mtk_hw_init(struct mtk_eth *eth)
>   return 0;
>  }
>  
> +static int mtk_hw_deinit(struct mtk_eth *eth)
> +{
> + clk_disable_unprepare(eth->clk_esw);
> + clk_disable_unprepare(eth->clk_gp1);
> + clk_disable_unprepare(eth->clk_gp2);
> + clk_disable_unprepare(eth->clk_ethif);
> +
> + return 0;
> +}
> +
>  static int __init mtk_init(struct net_device *dev)
>  {
>   struct mtk_mac *mac = netdev_priv(dev);
> @@ -1919,11 +1929,15 @@ err_free_dev:
>  static int mtk_remove(struct platform_device *pdev)
>  {
>   struct mtk_eth *eth = platform_get_drvdata(pdev);
> + int i;
>  
> - clk_disable_unprepare(eth->clk_ethif);
> - clk_disable_unprepare(eth->clk_esw);
> - clk_disable_unprepare(eth->clk_gp1);
> - clk_disable_unprepare(eth->clk_gp2);
> + /* stop all devices to make sure that dma is properly shut down */
> + for (i = 0; i < MTK_MAC_COUNT; i++) {
> + if (!eth->netdev[i])
> + continue;
> + mtk_stop(eth->netdev[i]);
> + }
> + mtk_hw_deinit(eth);
>  
>   netif_napi_del(&eth->tx_napi);
>   netif_napi_del(&eth->rx_napi);
> 


Re: [RESEND PATCH net 08/10] net: ethernet: mediatek: fix the missing of_node_put() after node is used done inside mtk_mdio_init

2016-08-25 Thread John Crispin


On 25/08/2016 12:44, Sean Wang wrote:
> This patch adds the missing of_node_put() after finishing the usage
> of of_get_child_by_name.
> 
> Signed-off-by: Sean Wang 

Acked-by: John Crispin 

> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index c573475..e3baa63 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -320,6 +320,7 @@ static int mtk_mdio_init(struct mtk_eth *eth)
>   err = of_mdiobus_register(eth->mii_bus, mii_np);
>   if (err)
>   goto err_free_bus;
> + of_node_put(mii_np);
>  
>   return 0;
>  
> 


Re: [for-next 00/15][PULL request] Mellanox mlx5 core driver updates 2016-08-24

2016-08-25 Thread Doug Ledford
On 8/24/2016 12:38 PM, David Miller wrote:
> From: Saeed Mahameed 
> Date: Wed, 24 Aug 2016 13:38:59 +0300
> 
>> This series contains some low level and API updates for mlx5 core
>> driver interface and mlx5_ifc.h, plus mlx5 LAG core driver support,
>> to be shared as base code for net-next and rdma mlx5 4.9 submissions.
> 
> Pulled, thanks.
> 

Ditto.  The full set of shared patches is now in my for-next branch and
the remainder of your 4.9 IB submissions can be built upon it.

-- 
Doug Ledford 
GPG Key ID: 0E572FDD





Re: [PATCH for-next 0/2] {IB,net}/hns: Add support of ACPI to the Hisilicon RoCE Driver

2016-08-25 Thread Doug Ledford
On 8/25/2016 8:08 AM, Salil Mehta wrote:
> 
> 
>> -Original Message-
>> From: Doug Ledford [mailto:dledf...@redhat.com]
>> Sent: Thursday, August 25, 2016 12:57 PM
>> To: David Miller; Salil Mehta
>> Cc: Huwei (Xavier); oulijun; Zhuangyuzeng (Yisen);
>> mehta.salil@gmail.com; linux-r...@vger.kernel.org;
>> netdev@vger.kernel.org; linux-ker...@vger.kernel.org; Linuxarm
>> Subject: Re: [PATCH for-next 0/2] {IB,net}/hns: Add support of ACPI to
>> the Hisilicon RoCE Driver
>>
>> On 8/25/2016 12:53 AM, David Miller wrote:
>>> From: Salil Mehta 
>>> Date: Wed, 24 Aug 2016 04:44:48 +0800
>>>
>>>> This patch is meant to add support of ACPI to the Hisilicon RoCE
>> driver.
>>>> Following changes have been made in the driver(s):
>>>>
>>>> Patch 1/2: HNS Ethernet Driver: changes to support ACPI have been
>> done in
>>>>    the RoCE reset function part of the HNS ethernet driver. Earlier
>> it only
>>>>    supported DT/syscon.
>>>>
>>>> Patch 2/2. HNS RoCE driver: changes done in RoCE driver are meant to
>> detect
>>>>    the type and then either use DT specific or ACPI spcific
>> functions. Where
>>>>    ever possible, this patch tries to make use of "Unified Device
>> Property
>>>>    Interface" APIs to support both DT and ACPI through single
>> interface.
>>>>
>>>> NOTE 1: ACPI changes done in both of the drivers depend upon the
>> ACPI Table
>>>>   (DSDT and IORT tables) changes part of UEFI/BIOS. These changes
>> are NOT
>>>>   part of this patch-set.
>>>> NOTE 2: Reset function in Patch 1/2 depends upon the reset function
>> added in
>>>>   ACPI tables(basically DSDT table) part of the UEFI/BIOS. Again,
>> this
>>>>   change is NOT reflected in this patch-set.
>>>
>>> I can't apply this series to my tree because the hns infiniband
>> driver
>>> doesn't exist in it.
>>>
>>
>> No.  This probably needs to go through my tree.  Although with all of
>> the requirements, I'm a bit concerned about those being present
>> elsewhere.
>>
>> --
>> Doug Ledford 
>> GPG Key ID: 0E572FDD
> Hello Doug,
> Thanks for your reply. I have just replied to David email as well and did
> not realize your response was already on the way. Sorry for this!
> 
> I would just like to request, if by any chance, we can expedite the acceptance
> of the below patch (part of patch-set) in the net-next. This might help you as
> well in future when you will actually push the RoCE driver to the linux-next.
> 
> "[PATCH for-next 1/2] net: hns: Add support of ACPI to HNS driver RoCE Reset
>  function"
> 
> Below HNS RoCE reset function patch has already been accepted by Dave Miller 
> and
> is part of net-next,
> https://patchwork.kernel.org/patch/9287497/
> 
> Also, above ACPI support of RoCE Reset patch cleanly applies over the already
> accepted patch in the link and is not dependent on other accompanying RoCE
> driver ACPI changes or even the presence of the Infiniband/RoCE Driver in the
> net-next repository.
> 
> Could you please suggest if this is something which can be considered?

I've pulled both of these patches in.  I usually merge late in the merge
window, so it won't be any stretch to wait until the ACPI tree has been
merged before I send Linus my pull request.


-- 
Doug Ledford 
GPG Key ID: 0E572FDD





Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection

2016-08-25 Thread Eric Dumazet
On Thu, 2016-08-25 at 11:06 +0200, Florian Westphal wrote:
> So I gave this a try and it does avoid this tcp_request_sock increase,
> but I feel that getting boot_time_rnd is too easy.

Fair enough, I didn't think very hard about it.

> 
> I tried a few other ideas but nothing satisfying/simpler came out of it
> (e.g. i tried to also hash the isn but that gets scaled w. current
>  clock so it doesn't work).
> 
> Are you more concerned wrt. complexity or the reqsk increase?
> 

No, a reqsk increase is really fine.

I guess I was simply worrying your work would make my future submission
more tricky.

Here at Google we have been using usec resolution in TCP timestamps for
a while for all our DC traffic, and we have to upstream this at some
point.

It would be nice if the randomization was optional, because having usec
timestamps with a common base (ie no per flow random base) helps a lot
to understand the host delays at transmit time, and receive time.
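
The point about a common base can be illustrated with a small model. This is only a sketch under stated assumptions: `tsval()` and `ts_delta()` are hypothetical helpers, not functions from the patch or from the TCP stack, and the offsets are made-up values. With the same offset on every flow, the difference between two stamps from different flows is the real elapsed time; with per-flow offsets, only stamps from the same flow can be compared.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: the timestamp put on the wire is the local usec clock
 * plus an offset. offset == 0 for every flow models a common base; a
 * different offset per flow models per-connection randomization. */
static uint32_t tsval(uint32_t now_usec, uint32_t offset)
{
	return now_usec + offset;	/* wraps modulo 2^32 */
}

/* Elapsed time between two stamps; unsigned subtraction handles wrap-around. */
static uint32_t ts_delta(uint32_t later, uint32_t earlier)
{
	return later - earlier;
}
```

Comparing `tsval(1050, 0)` with `tsval(1000, 0)` gives the true 50 usec gap even if the two stamps come from different flows, which is what makes host delays easy to read off a trace.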

I will review your patch more in depth today, thanks.





Re: [RFC v2 08/10] landlock: Handle file system comparisons

2016-08-25 Thread Mickaël Salaün

On 25/08/2016 13:12, Andy Lutomirski wrote:
> On Thu, Aug 25, 2016 at 3:32 AM, Mickaël Salaün  wrote:
>> Add eBPF functions to compare file system access with a Landlock file
>> system handle:
>> * bpf_landlock_cmp_fs_prop_with_struct_file(prop, map, map_op, file)
>>   This function allows to compare the dentry, inode, device or mount
>>   point of the currently accessed file, with a reference handle.
>> * bpf_landlock_cmp_fs_beneath_with_struct_file(opt, map, map_op, file)
>>   This function allows an eBPF program to check if the current accessed
>>   file is the same or in the hierarchy of a reference handle.
>>
>> The goal of file system handle is to abstract kernel objects such as a
>> struct file or a struct inode. Userland can create this kind of handle
>> thanks to the BPF_MAP_UPDATE_ELEM command. The element is a struct
>> landlock_handle containing the handle type (e.g.
>> BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD) and a file descriptor. This could
>> also be any descriptions able to match a struct file or a struct inode
>> (e.g. path or glob string).
> 
> This needs Eric's opinion.
> 
> Also, where do all the struct file *'s get stashed?  Are they
> preserved in the arraymap?  What prevents reference cycles or absurdly
> large numbers of struct files getting pinned?

Yes, the struct files are kept in the arraymap and dropped when there are
no more references to them. Currently, the limitations are the maximum
number of open file descriptors referring to an arraymap and the maximum
number of eBPF Landlock programs loaded in a process
(LANDLOCK_PROG_LIST_MAX_PAGES in kernel/seccomp.c).

What kind of reference cycles do you have in mind?

It probably needs another limit for kernel object references as well.
What is the best option here? Add another static limitation or use an
existing one?

 Mickaël





Re: [PATCH v1 1/1 net-next] 8139cp: Fix one possible deadloop in cp_rx_poll

2016-08-25 Thread Feng Gao
On Thu, Aug 25, 2016 at 9:45 AM,   wrote:
> From: Gao Feng 
>
> When cp_rx_poll does not get enough packet, it will check the rx
> interrupt status again. If so, it will jump to rx_status_loop again.
> But the goto jump resets the rx variable as zero too.
>
> As a result, it causes one possible deadloop. Assume this case,
> rx_status_loop only gets the packet count which is less than budget,
> and (cpr16(IntrStatus) & cp_rx_intr_mask) condition is always true.
> It causes the deadloop happens and system is blocked.
>
> Signed-off-by: Gao Feng 
> ---
>  drivers/net/ethernet/realtek/8139cp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/realtek/8139cp.c 
> b/drivers/net/ethernet/realtek/8139cp.c
> index deae10d..5297bf7 100644
> --- a/drivers/net/ethernet/realtek/8139cp.c
> +++ b/drivers/net/ethernet/realtek/8139cp.c
> @@ -467,8 +467,8 @@ static int cp_rx_poll(struct napi_struct *napi, int 
> budget)
> unsigned int rx_tail = cp->rx_tail;
> int rx;
>
> -rx_status_loop:
> rx = 0;
> +rx_status_loop:
> cpw16(IntrStatus, cp_rx_intr_mask);
>
> while (rx < budget) {
> --
> 1.9.1
>
>

Sorry, this commit should be for net.git, not net-next.git,
because it fixes a possible infinite loop.

Best Regards
Feng
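
The loop described in the quoted commit message can be modelled in user space. This is a hypothetical sketch, not the driver code: `model_poll_passes()` assumes every pass over the ring finds `per_pass` packets and that the rx interrupt status is asserted again after each pass, and `max_passes` exists only so that the model terminates.

```c
#include <assert.h>

/* Hypothetical model of the poll loop, not the 8139cp code itself.
 * reset_on_retry != 0 models the old label placement, where "rx = 0" is
 * executed again on every jump back; 0 models the patched placement. */
static int model_poll_passes(int budget, int per_pass, int reset_on_retry,
			     int max_passes)
{
	int rx = 0;
	int passes = 0;

	while (passes < max_passes) {
		int got = 0;

		if (reset_on_retry)
			rx = 0;		/* old code: label sits above "rx = 0" */

		while (rx < budget && got < per_pass) {
			rx++;
			got++;
		}
		passes++;

		if (rx >= budget)
			break;		/* budget exhausted, poll returns */
		/* otherwise the status is still asserted: retry (the goto) */
	}
	return passes;
}
```

With a budget of 16 and 3 packets per pass, the patched placement leaves the loop after 6 passes, while the old placement never accumulates enough packets and only stops at the artificial cap.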


Re: [PATCH net-next] net: flush the softnet backlog in process context

2016-08-25 Thread Eric Dumazet
On Thu, 2016-08-25 at 15:58 +0200, Paolo Abeni wrote:
> Currently in process_backlog(), the process_queue dequeuing is
> performed with local IRQ disabled, to protect against
> flush_backlog(), which runs in hard IRQ context.

Acked-by: Eric Dumazet 

> @@ -6707,7 +6735,7 @@ static void rollback_registered_many(struct list_head 
> *head)
>   unlist_netdevice(dev);
>  
>   dev->reg_state = NETREG_UNREGISTERING;
> - on_each_cpu(flush_backlog, dev, 1);
> + flush_all_backlogs(dev);
>   }
>  
>   synchronize_net();

In a future patch, we could change this so that we kick
flush_all_backlogs() once for all devices, instead of one device at a
time.

We would not pass @dev anymore as a parameter and simply look at
skb->dev->reg_state to decide to remove packets from queues in
flush_backlog()

Batching matters for some guys using hundreds of devices and suddenly
removing them all in one go.
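
A rough user-space sketch of that idea, assuming a plain array instead of an skb queue: all `model_*` names are hypothetical and only mirror the shape of `skb->dev->reg_state`; this is not the proposed kernel change.

```c
#include <assert.h>
#include <stddef.h>

enum model_reg_state { MODEL_REGISTERED, MODEL_UNREGISTERING };

struct model_dev { enum model_reg_state reg_state; };
struct model_pkt { struct model_dev *dev; int id; };

/* Drop every queued packet whose device is being unregistered, keeping the
 * order of the rest. No device argument is needed: the decision is taken
 * from the packet's own device state. Returns the new queue length. */
static size_t model_flush_backlog(struct model_pkt *queue, size_t len)
{
	size_t kept = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		if (queue[i].dev->reg_state == MODEL_UNREGISTERING)
			continue;	/* the kernel would free the skb here */
		queue[kept++] = queue[i];
	}
	return kept;
}
```

One pass over the queue then handles any number of devices that are going away at the same time.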

Thanks.




[PATCH] smc91x: remove ARM hack for unaligned 16-bit writes

2016-08-25 Thread Arnd Bergmann
The ARM specific I/O operations are almost the same as the generic
ones, with the exception of the SMC_outw macro that works around
a problem of some platforms that cannot write to 16-bit registers
at an address that is not 32-bit aligned.

By inspection, I found that this is handled already in the
register abstractions for almost all cases, the exceptions being
SMC_SET_MAC_ADDR() and SMC_SET_MCAST(). I assume that all
platforms that require the hack for the other registers also
need it here, so the ones listed explicitly here are the only
ones that work correctly, while the other ones either don't
need the hack at all, or they will set an incorrect MAC
address (which can often go unnoticed).

This changes the two macros that set the unaligned registers
to use 32-bit writes if possible, which should do the right
thing in all combinations. The ARM specific SMC_outw gets removed
as a consequence.

The only difference between the ARM behavior and the default is
the selection of the LED settings. The fact that we have different
defaults based on the CPU architectures here is a bit suspicious,
but probably harmless, and I have no plan of touching that.

Signed-off-by: Arnd Bergmann 
---
 drivers/net/ethernet/smsc/smc91x.h | 50 +++---
 1 file changed, 30 insertions(+), 20 deletions(-)

While this patch fixes one bug on Neponset, it probably doesn't address
the one that Russell ran into first, so this is for review only for now,
until the remaining problem(s) have been worked out.
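
For readers following the arithmetic, here is a minimal sketch of the two operations involved, written as pure functions so it can be checked without hardware. `pack_le32()` and `rmw_upper_half()` are hypothetical names; the first mirrors the byte packing used for the 32-bit write, the second mirrors the read-modify-write that the removed ARM helper performs for a halfword at an address that is not 32-bit aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four consecutive bytes into one 32-bit value, lowest byte first. */
static uint32_t pack_le32(const uint8_t *b)
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Replace the upper halfword of a 32-bit register value and keep the lower
 * halfword, as a read-modify-write would do for a 16-bit register that sits
 * at offset 2 within the word. */
static uint32_t rmw_upper_half(uint32_t old_word, uint16_t val)
{
	return ((uint32_t)val << 16) | (old_word & 0xffff);
}
```

A single 32-bit write of `pack_le32(addr)` covers the first two 16-bit address registers at once, so the unaligned 16-bit write is never needed for them.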


diff --git a/drivers/net/ethernet/smsc/smc91x.h 
b/drivers/net/ethernet/smsc/smc91x.h
index 22333477d0b5..908473d9ede0 100644
--- a/drivers/net/ethernet/smsc/smc91x.h
+++ b/drivers/net/ethernet/smsc/smc91x.h
@@ -58,6 +58,7 @@
 #define SMC_inw(a, r)  readw((a) + (r))
 #define SMC_inl(a, r)  readl((a) + (r))
 #define SMC_outb(v, a, r)  writeb(v, (a) + (r))
+#define SMC_outw(v, a, r)  writew(v, (a) + (r))
 #define SMC_outl(v, a, r)  writel(v, (a) + (r))
 #define SMC_insw(a, r, p, l)   readsw((a) + (r), p, l)
 #define SMC_outsw(a, r, p, l)  writesw((a) + (r), p, l)
@@ -65,19 +66,6 @@
 #define SMC_outsl(a, r, p, l)  writesl((a) + (r), p, l)
 #define SMC_IRQ_FLAGS  (-1)/* from resource */
 
-/* We actually can't write halfwords properly if not word aligned */
-static inline void SMC_outw(u16 val, void __iomem *ioaddr, int reg)
-{
-   if ((machine_is_mainstone() || machine_is_stargate2() ||
-machine_is_pxa_idp()) && reg & 2) {
-   unsigned int v = val << 16;
-   v |= readl(ioaddr + (reg & ~2)) & 0xffff;
-   writel(v, ioaddr + (reg & ~2));
-   } else {
-   writew(val, ioaddr + reg);
-   }
-}
-
 #elif  defined(CONFIG_SH_SH4202_MICRODEV)
 
 #define SMC_CAN_USE_8BIT   0
@@ -1029,18 +1017,40 @@ static const char * chip_ids[ 16 ] =  {
 
 #define SMC_SET_MAC_ADDR(lp, addr) \
do {\
-   SMC_out16(addr[0]|(addr[1] << 8), ioaddr, ADDR0_REG(lp)); \
-   SMC_out16(addr[2]|(addr[3] << 8), ioaddr, ADDR1_REG(lp)); \
-   SMC_out16(addr[4]|(addr[5] << 8), ioaddr, ADDR2_REG(lp)); \
+   if (SMC_32BIT(lp)) {\
+   SMC_outl((addr[0]  )|(addr[1] <<  8) |  \
+(addr[2] << 16)|(addr[3] << 24),   \
+ioaddr, ADDR0_REG(lp));\
+   } else {\
+   SMC_out16(addr[0]|(addr[1] << 8), ioaddr,   \
+ ADDR0_REG(lp));   \
+   SMC_out16(addr[2]|(addr[3] << 8), ioaddr,   \
+ ADDR1_REG(lp));   \
+   }   \
+   SMC_out16(addr[4]|(addr[5] << 8), ioaddr,   \
+ ADDR2_REG(lp)); \
} while (0)
 
 #define SMC_SET_MCAST(lp, x)   \
do {\
const unsigned char *mt = (x);  \
-   SMC_out16(mt[0] | (mt[1] << 8), ioaddr, MCAST_REG1(lp)); \
-   SMC_out16(mt[2] | (mt[3] << 8), ioaddr, MCAST_REG2(lp)); \
-   SMC_out16(mt[4] | (mt[5] << 8), ioaddr, MCAST_REG3(lp)); \
-   SMC_out16(mt[6] | (mt[7] << 8), ioaddr, MCAST_REG4(lp)); \
+   if (SMC_32BIT(lp)) {\
+   SMC_outl((mt[0]  ) | (mt[1] <<  8) |\
+(mt[2] << 16) | (mt[3] << 24), \
+ioaddr, MCAST_REG1(lp));   \
+   SMC_outl(

[PATCH 2/2] smc91x: remove ARM hack for unaligned 16-bit writes

2016-08-25 Thread Arnd Bergmann
The ARM specific I/O operations are almost the same as the generic
ones, with the exception of the SMC_outw macro that works around
a problem of some platforms that cannot write to 16-bit registers
at an address that is not 32-bit aligned.

By inspection, I found that this is handled already in the
register abstractions for almost all cases, the exceptions being
SMC_SET_MAC_ADDR() and SMC_SET_MCAST(). I assume that all
platforms that require the hack for the other registers also
need it here, so the ones listed explicitly here are the only
ones that work correctly, while the other ones either don't
need the hack at all, or they will set an incorrect MAC
address (which can often go unnoticed).

This changes the two macros that set the unaligned registers
to use 32-bit writes if possible, which should do the right
thing in all combinations. The ARM specific SMC_outw gets removed
as a consequence.

The only difference between the ARM behavior and the default is
the selection of the LED settings. The fact that we have different
defaults based on the CPU architectures here is a bit suspicious,
but probably harmless, and I have no plan of touching that.

Signed-off-by: Arnd Bergmann 
---
 drivers/net/ethernet/smsc/smc91x.h | 50 +++---
 1 file changed, 30 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/smsc/smc91x.h 
b/drivers/net/ethernet/smsc/smc91x.h
index 22333477d0b5..908473d9ede0 100644
--- a/drivers/net/ethernet/smsc/smc91x.h
+++ b/drivers/net/ethernet/smsc/smc91x.h
@@ -58,6 +58,7 @@
 #define SMC_inw(a, r)  readw((a) + (r))
 #define SMC_inl(a, r)  readl((a) + (r))
 #define SMC_outb(v, a, r)  writeb(v, (a) + (r))
+#define SMC_outw(v, a, r)  writew(v, (a) + (r))
 #define SMC_outl(v, a, r)  writel(v, (a) + (r))
 #define SMC_insw(a, r, p, l)   readsw((a) + (r), p, l)
 #define SMC_outsw(a, r, p, l)  writesw((a) + (r), p, l)
@@ -65,19 +66,6 @@
 #define SMC_outsl(a, r, p, l)  writesl((a) + (r), p, l)
 #define SMC_IRQ_FLAGS  (-1)/* from resource */
 
-/* We actually can't write halfwords properly if not word aligned */
-static inline void SMC_outw(u16 val, void __iomem *ioaddr, int reg)
-{
-   if ((machine_is_mainstone() || machine_is_stargate2() ||
-machine_is_pxa_idp()) && reg & 2) {
-   unsigned int v = val << 16;
-   v |= readl(ioaddr + (reg & ~2)) & 0xffff;
-   writel(v, ioaddr + (reg & ~2));
-   } else {
-   writew(val, ioaddr + reg);
-   }
-}
-
 #elif  defined(CONFIG_SH_SH4202_MICRODEV)
 
 #define SMC_CAN_USE_8BIT   0
@@ -1029,18 +1017,40 @@ static const char * chip_ids[ 16 ] =  {
 
 #define SMC_SET_MAC_ADDR(lp, addr) \
do {\
-   SMC_out16(addr[0]|(addr[1] << 8), ioaddr, ADDR0_REG(lp)); \
-   SMC_out16(addr[2]|(addr[3] << 8), ioaddr, ADDR1_REG(lp)); \
-   SMC_out16(addr[4]|(addr[5] << 8), ioaddr, ADDR2_REG(lp)); \
+   if (SMC_32BIT(lp)) {\
+   SMC_outl((addr[0]  )|(addr[1] <<  8) |  \
+(addr[2] << 16)|(addr[3] << 24),   \
+ioaddr, ADDR0_REG(lp));\
+   } else {\
+   SMC_out16(addr[0]|(addr[1] << 8), ioaddr,   \
+ ADDR0_REG(lp));   \
+   SMC_out16(addr[2]|(addr[3] << 8), ioaddr,   \
+ ADDR1_REG(lp));   \
+   }   \
+   SMC_out16(addr[4]|(addr[5] << 8), ioaddr,   \
+ ADDR2_REG(lp)); \
} while (0)
 
 #define SMC_SET_MCAST(lp, x)   \
do {\
const unsigned char *mt = (x);  \
-   SMC_out16(mt[0] | (mt[1] << 8), ioaddr, MCAST_REG1(lp)); \
-   SMC_out16(mt[2] | (mt[3] << 8), ioaddr, MCAST_REG2(lp)); \
-   SMC_out16(mt[4] | (mt[5] << 8), ioaddr, MCAST_REG3(lp)); \
-   SMC_out16(mt[6] | (mt[7] << 8), ioaddr, MCAST_REG4(lp)); \
+   if (SMC_32BIT(lp)) {\
+   SMC_outl((mt[0]  ) | (mt[1] <<  8) |\
+(mt[2] << 16) | (mt[3] << 24), \
+ioaddr, MCAST_REG1(lp));   \
+   SMC_outl((mt[4]  ) | (mt[5] <<  8) |\
+(mt[6] << 16) | (mt[7] << 24), \
+ioaddr, MCAST_REG3(lp));   \
+  

[PATCH 1/2] smc91x: always use 8-bit access if necessary

2016-08-25 Thread Arnd Bergmann
As Russell King found out the hard way, a change I did to fix multiplatform
builds with this driver broke the old Assabet/Neponset platform: It turns
out that while the driver is runtime configurable in principle, the
runtime configuration does not cover the specific case of machines that
can not do any 16-bit I/O on the smc91x registers.

The driver currently provides helpers to access 16-bit registers for
architectures that are known at compile-time to only have 8-bit I/O,
but my patch changed it to a runtime flag that never gets consulted for
most register accesses.

This introduces new SMC_out16()/SMC_in16() helpers (if anyone can suggest
a better name, I'm glad to modify this) that behave like SMC_outw()/SMC_inw()
most of the time, but use a pair of 8-bit accesses on platforms that
have no support for wider register accesses.

Signed-off-by: Arnd Bergmann 
Reported-by: Russell King 
Fixes: b70661c70830d ("net: smc91x: use run-time configuration on all ARM 
machines")
---
 drivers/net/ethernet/smsc/smc91x.h | 125 -
 1 file changed, 66 insertions(+), 59 deletions(-)


While this patch fixes one bug on Neponset, it probably doesn't address
the one that Russell ran into first, so this is for review only for now,
until the remaining problem(s) have been worked out.

Please ignore the first submission, I accidentally only sent out patch 2/2,
which does not actually fix a bug.
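
The 8-bit fallback described in the commit message can be sketched in user space as follows. This is only an illustration under assumptions: `model_regs` stands in for the device registers, the `model_*` helpers are hypothetical, and consecutive byte registers are assumed to be one address apart (an I/O shift of 0).

```c
#include <assert.h>
#include <stdint.h>

static uint8_t model_regs[8];	/* stand-in for the device register window */

static void model_out8(uint8_t v, unsigned int reg) { model_regs[reg] = v; }
static uint8_t model_in8(unsigned int reg) { return model_regs[reg]; }

/* 16-bit write done as two 8-bit writes: low byte first, then high byte. */
static void model_out16(uint16_t v, unsigned int reg)
{
	model_out8(v & 0xff, reg);
	model_out8(v >> 8, reg + 1);
}

/* 16-bit read assembled from two 8-bit reads. */
static uint16_t model_in16(unsigned int reg)
{
	return (uint16_t)(model_in8(reg) | (model_in8(reg + 1) << 8));
}
```

The low byte lands at the lower address, so a value written with `model_out16()` reads back unchanged through `model_in16()`.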

diff --git a/drivers/net/ethernet/smsc/smc91x.h 
b/drivers/net/ethernet/smsc/smc91x.h
index 1a55c7976df0..22333477d0b5 100644
--- a/drivers/net/ethernet/smsc/smc91x.h
+++ b/drivers/net/ethernet/smsc/smc91x.h
@@ -414,25 +414,32 @@ smc_pxa_dma_insw(void __iomem *ioaddr, struct smc_local 
*lp, int reg, int dma,
 #define SMC_outsl(a, r, p, l)  BUG()
 #endif
 
-#if ! SMC_CAN_USE_16BIT
-
 /*
- * Any 16-bit access is performed with two 8-bit accesses if the hardware
- * can't do it directly. Most registers are 16-bit so those are mandatory.
+ * Any 16-bit register access is performed with two 8-bit accesses if the
+ * hardware can't do it directly.
  */
-#define SMC_outw(x, ioaddr, reg)   \
-   do {\
-   unsigned int __val16 = (x); \
-   SMC_outb( __val16, ioaddr, reg );   \
-   SMC_outb( __val16 >> 8, ioaddr, reg + (1 << SMC_IO_SHIFT));\
-   } while (0)
-#define SMC_inw(ioaddr, reg)   \
-   ({  \
-   unsigned int __val16;   \
-   __val16 =  SMC_inb( ioaddr, reg );  \
+#define SMC_out16(x, ioaddr, reg)   \
+do {\
+   if (SMC_CAN_USE_8BIT && !SMC_16BIT(lp)) {\
+   unsigned int __val16 = (x);  \
+   SMC_outb(__val16, ioaddr, reg ); \
+   SMC_outb(__val16 >> 8, ioaddr, reg + (1 << SMC_IO_SHIFT));   \
+   } else { \
+   SMC_outw(x, ioaddr, reg);\
+   }\
+} while (0)
+
+#define SMC_in16(ioaddr, reg)   \
+({  \
+   unsigned int __val16;\
+   if (SMC_CAN_USE_8BIT && !SMC_16BIT(lp)) {\
+   __val16 =  SMC_inb( ioaddr, reg );   \
__val16 |= SMC_inb( ioaddr, reg + (1 << SMC_IO_SHIFT)) << 8; \
-   __val16;\
-   })
+   } else { \
+   __val16 = SMC_inw(ioaddr, reg);  \
+   }\
+   __val16; \
+})
 
 #define SMC_insw(a, r, p, l)   BUG()
 #define SMC_outsw(a, r, p, l)  BUG()
@@ -927,113 +934,113 @@ static const char * chip_ids[ 16 ] =  {
SMC_outw((x) << 8, ioaddr, INT_REG(lp));\
} while (0)
 
-#define SMC_CURRENT_BANK(lp)   SMC_inw(ioaddr, BANK_SELECT)
+#define SMC_CURRENT_BANK(lp)   SMC_in16(ioaddr, BANK_SELECT)
 
 #define SMC_SELECT_BANK(lp, x) \
do {\
if (SMC_MUST_ALIGN_WRITE(lp))   \
  

Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection

2016-08-25 Thread Florian Westphal
Eric Dumazet  wrote:
> Here at Google we have been using usec resolution in TCP timestamps for
> a while for all our DC traffic, and we have to upstream this at some
> point.
> 
> It would be nice if the randomization was optional, because having usec
> timestamps with a common base (ie no per flow random base) helps a lot
> to understand the host delays at transmit time, and receive time.

Would it help to make it per-host instead of per-flow?

> I will review your patch more in depth today, thanks.

Great, thanks a lot!


[PATCH nf-next,v2 2/2] netfilter: nf_tables: honor NLM_F_EXCL flag in set element insertion

2016-08-25 Thread Pablo Neira Ayuso
If the NLM_F_EXCL flag is set, then new elements that clash with an
existing one return EEXIST. In case you try to add an element whose
data area differs from what we have, then this returns EBUSY. If no
flag is specified at all, then this returns success to userspace.

This patch also updates the set insert operation so we can fetch the
existing element that clashes with the one you want to add; we need
this to make sure the element data doesn't differ.

Signed-off-by: Pablo Neira Ayuso 
---
v2: Adapt this patch to semantics changes in rhashtable_lookup_get_insert_key()
proposed by Herbert.
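
The return-code rules can be summarised in a small stand-alone function. This is a sketch that follows the order of checks in the nft_add_set_elem() hunk of this patch; `model_resolve_clash()` and `MODEL_F_EXCL` are hypothetical names, and a NULL data pointer models an element without a data area.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_F_EXCL 0x1	/* hypothetical stand-in for NLM_F_EXCL */

/* Decide what to report when inserting an element.
 * clash:    non-zero if an element with the same key already exists
 * new_data: data area of the element being added (may be NULL)
 * old_data: data area of the existing element (may be NULL) */
static int model_resolve_clash(int clash, const void *new_data,
			       const void *old_data, size_t dlen,
			       unsigned int flags)
{
	if (!clash)
		return 0;
	if (new_data && old_data && memcmp(new_data, old_data, dlen) != 0)
		return -EBUSY;		/* same key, different data */
	if (flags & MODEL_F_EXCL)
		return -EEXIST;		/* caller asked for exclusive add */
	return 0;			/* duplicate add is tolerated */
}
```

Note that the data comparison comes first, so a clash with different data reports EBUSY whether or not the exclusive flag is set.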

 include/net/netfilter/nf_tables.h |  3 ++-
 net/netfilter/nf_tables_api.c | 20 +++-
 net/netfilter/nft_set_hash.c  | 17 +
 net/netfilter/nft_set_rbtree.c| 12 
 4 files changed, 38 insertions(+), 14 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h 
b/include/net/netfilter/nf_tables.h
index f2f1339..8972468 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -251,7 +251,8 @@ struct nft_set_ops {
 
int (*insert)(const struct net *net,
  const struct nft_set *set,
- const struct nft_set_elem 
*elem);
+ const struct nft_set_elem 
*elem,
+ struct nft_set_ext **ext);
void(*activate)(const struct net *net,
const struct nft_set *set,
const struct nft_set_elem 
*elem);
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 221d27f..bd9715e 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -3483,12 +3483,12 @@ static int nft_setelem_parse_flags(const struct nft_set 
*set,
 }
 
 static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
-   const struct nlattr *attr)
+   const struct nlattr *attr, u32 nlmsg_flags)
 {
struct nlattr *nla[NFTA_SET_ELEM_MAX + 1];
struct nft_data_desc d1, d2;
struct nft_set_ext_tmpl tmpl;
-   struct nft_set_ext *ext;
+   struct nft_set_ext *ext, *ext2;
struct nft_set_elem elem;
struct nft_set_binding *binding;
struct nft_userdata *udata;
@@ -3615,9 +3615,19 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct 
nft_set *set,
goto err4;
 
ext->genmask = nft_genmask_cur(ctx->net) | NFT_SET_ELEM_BUSY_MASK;
-   err = set->ops->insert(ctx->net, set, &elem);
-   if (err < 0)
+   err = set->ops->insert(ctx->net, set, &elem, &ext2);
+   if (err) {
+   if (err == -EEXIST) {
+   if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA) &&
+   nft_set_ext_exists(ext2, NFT_SET_EXT_DATA) &&
+   memcmp(nft_set_ext_data(ext),
+  nft_set_ext_data(ext2), set->dlen) != 0)
+   err = -EBUSY;
+   else if (!(nlmsg_flags & NLM_F_EXCL))
+   err = 0;
+   }
goto err5;
+   }
 
nft_trans_elem(trans) = elem;
list_add_tail(&trans->list, &ctx->net->nft.commit_list);
@@ -3673,7 +3683,7 @@ static int nf_tables_newsetelem(struct net *net, struct 
sock *nlsk,
!atomic_add_unless(&set->nelems, 1, set->size + 
set->ndeact))
return -ENFILE;
 
-   err = nft_add_set_elem(&ctx, set, attr);
+   err = nft_add_set_elem(&ctx, set, attr, nlh->nlmsg_flags);
if (err < 0) {
atomic_dec(&set->nelems);
break;
diff --git a/net/netfilter/nft_set_hash.c b/net/netfilter/nft_set_hash.c
index 564fa79..3794cb2 100644
--- a/net/netfilter/nft_set_hash.c
+++ b/net/netfilter/nft_set_hash.c
@@ -126,7 +126,8 @@ err1:
 }
 
 static int nft_hash_insert(const struct net *net, const struct nft_set *set,
-  const struct nft_set_elem *elem)
+  const struct nft_set_elem *elem,
+  struct nft_set_ext **ext)
 {
struct nft_hash *priv = nft_set_priv(set);
struct nft_hash_elem *he = elem->priv;
@@ -135,9 +136,17 @@ static int nft_hash_insert(const struct net *net, const 
struct nft_set *set,
.set = set,
.key = elem->key.val.data,
};
-
-   return rhashtable_lookup_insert_key(&priv->ht, &arg, &he->node,
-   nft_hash_params);
+   struct nft_hash_elem *prev;
+
+   prev = rhashtable_lookup_get_insert_key(&priv->ht, &arg, &he->node,
+

[PATCH net-next] devlink: remove unused priv_size

2016-08-25 Thread Ivan Vecera
Remove unused and useless priv_size member from struct devlink_ops.

Cc: Jiri Pirko 
Signed-off-by: Ivan Vecera 
---
 include/net/devlink.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index c99ffe8..211bd3c 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -50,7 +50,6 @@ struct devlink_sb_pool_info {
 };
 
 struct devlink_ops {
-   size_t priv_size;
int (*port_type_set)(struct devlink_port *devlink_port,
 enum devlink_port_type port_type);
int (*port_split)(struct devlink *devlink, unsigned int port_index,
-- 
2.7.3



RE: [PATCH for-next 0/2] {IB,net}/hns: Add support of ACPI to the Hisilicon RoCE Driver

2016-08-25 Thread Salil Mehta


> -Original Message-
> From: Doug Ledford [mailto:dledf...@redhat.com]
> Sent: Thursday, August 25, 2016 2:53 PM
> To: Salil Mehta; David Miller
> Cc: Huwei (Xavier); oulijun; Zhuangyuzeng (Yisen);
> mehta.salil@gmail.com; linux-r...@vger.kernel.org;
> netdev@vger.kernel.org; linux-ker...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH for-next 0/2] {IB,net}/hns: Add support of ACPI to
> the Hisilicon RoCE Driver
> 
> On 8/25/2016 8:00 AM, Salil Mehta wrote:
> >
> >
> >> -Original Message-
> >> From: David Miller [mailto:da...@davemloft.net]
> >> Sent: Thursday, August 25, 2016 5:54 AM
> >> To: Salil Mehta
> >> Cc: dledf...@redhat.com; Huwei (Xavier); oulijun; Zhuangyuzeng
> (Yisen);
> >> mehta.salil@gmail.com; linux-r...@vger.kernel.org;
> >> netdev@vger.kernel.org; linux-ker...@vger.kernel.org; Linuxarm
> >> Subject: Re: [PATCH for-next 0/2] {IB,net}/hns: Add support of ACPI
> to
> >> the Hisilicon RoCE Driver
> >>
> >> From: Salil Mehta 
> >> Date: Wed, 24 Aug 2016 04:44:48 +0800
> >>
> >>> This patch is meant to add support of ACPI to the Hisilicon RoCE
> >> driver.
> >>> Following changes have been made in the driver(s):
> >>>
> >>> Patch 1/2: HNS Ethernet Driver: changes to support ACPI have been
> >> done in
> >>>the RoCE reset function part of the HNS ethernet driver. Earlier
> >> it only
> >>>supported DT/syscon.
> >>>
> >>> Patch 2/2. HNS RoCE driver: changes done in RoCE driver are meant
> to
> >> detect
> >>>the type and then either use DT specific or ACPI spcific
> >> functions. Where
> >>>ever possible, this patch tries to make use of "Unified Device
> >> Property
> >>>Interface" APIs to support both DT and ACPI through single
> >> interface.
> >>>
> >>> NOTE 1: ACPI changes done in both of the drivers depend upon the
> ACPI
> >> Table
> >>>  (DSDT and IORT tables) changes part of UEFI/BIOS. These
> changes
> >> are NOT
> >>>  part of this patch-set.
> >>> NOTE 2: Reset function in Patch 1/2 depends upon the reset function
> >> added in
> >>>  ACPI tables(basically DSDT table) part of the UEFI/BIOS.
> Again,
> >> this
> >>>  change is NOT reflected in this patch-set.
> >>
> >> I can't apply this series to my tree because the hns infiniband
> driver
> >> doesn't exist in it.
> > Hi David,
> > I understand your point. This patch-set was primarily sent for Doug
> Ledford
> > and is based on his internal repository (which has been rebased on
> the
> > net-next).
> >
> > Though we were hoping, if by any chance, we can expedite the
> acceptance of the
> > below patch part of patch-set in the net-next. This might help Doug
> Ledford as
> > well later on when he pushes the already accepted RoCE driver and
> ACPI patches
> > to linux-next,
> >
> > "[PATCH for-next 1/2] net: hns: Add support of ACPI to HNS driver
> RoCE Reset
> >  function"
> >
> > Below HNS RoCE reset function patch has already been accepted and is
> part of your
> > net-next,
> > https://patchwork.kernel.org/patch/9287497/
> >
> > Above ACPI support of RoCE Reset patch cleanly applies over the
> already accepted
> > patch in the link. It is not dependent on other accompanying RoCE
> driver ACPI
> > changes related patch or even the presence of the Infiniband/RoCE
> Driver in the
> > net-next repository.
> >
> > Could you please suggest anything here?
> >
> > Thanks
> > Salil
> >
> 
> I can take both.  I already pulled net-next to get the initial hns roce
> reset patch from Dave, so these will apply cleanly with my tree and
> merge cleanly with Dave's due to the common ancestral base.  The only
> problem is that if you intend to send any other patches that effect
> this
> code, then they would need to come through me until the 4.9 merge
> window
> is complete so that we don't have later merge conflicts.
Ok sure, I got your point. Yes, there are a few patches we need to push in,
but they are related to RoCE CM (Connection Manager) mode and will follow
soon. We foresee no further patches for the RoCE driver that depend on the
HNS Ethernet driver.

But kindly note, there could be some patches in development in HNS Ethernet 
driver
which might sneak in through net-next. These might not be related to RoCE 
Driver but
might have some common files which might lead to conflict again further down
the line when you try to merge ACPI RoCE reset again. This HNS driver change
is very difficult for us to control since amount of development going on in HNS
is of much higher magnitude than the RoCE as of now. It will be almost 
impossible
for us to convince internally and shift that entire development being done right
now on net-next and rebase it to your internal hns-roce branch for a month of 
time
till 4.9. This will affect many features deadlines internally. 

So, if I understood you correctly, this delta (which could be large), when next 
merge
window open would be taken care by you. And we can expect below to be part of 
4.9
1) RoCE Base driver (*Already Accepted*)
2) AC

RE: [PATCH for-next 0/2] {IB,net}/hns: Add support of ACPI to the Hisilicon RoCE Driver

2016-08-25 Thread Salil Mehta


> -----Original Message-----
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
> ow...@vger.kernel.org] On Behalf Of Doug Ledford
> Sent: Thursday, August 25, 2016 3:09 PM
> To: Salil Mehta; David Miller
> Cc: Huwei (Xavier); oulijun; Zhuangyuzeng (Yisen);
> mehta.salil@gmail.com; linux-r...@vger.kernel.org;
> netdev@vger.kernel.org; linux-ker...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH for-next 0/2] {IB,net}/hns: Add support of ACPI to
> the Hisilicon RoCE Driver
> 
> On 8/25/2016 8:08 AM, Salil Mehta wrote:
> >
> >
> >> -----Original Message-----
> >> From: Doug Ledford [mailto:dledf...@redhat.com]
> >> Sent: Thursday, August 25, 2016 12:57 PM
> >> To: David Miller; Salil Mehta
> >> Cc: Huwei (Xavier); oulijun; Zhuangyuzeng (Yisen);
> >> mehta.salil@gmail.com; linux-r...@vger.kernel.org;
> >> netdev@vger.kernel.org; linux-ker...@vger.kernel.org; Linuxarm
> >> Subject: Re: [PATCH for-next 0/2] {IB,net}/hns: Add support of ACPI to
> >> the Hisilicon RoCE Driver
> >>
> >> On 8/25/2016 12:53 AM, David Miller wrote:
> >>> From: Salil Mehta 
> >>> Date: Wed, 24 Aug 2016 04:44:48 +0800
> >>>
> >>>> This patch is meant to add support of ACPI to the Hisilicon RoCE
> >>>> driver. Following changes have been made in the driver(s):
> >>>>
> >>>> Patch 1/2: HNS Ethernet Driver: changes to support ACPI have been
> >>>>    done in the RoCE reset function part of the HNS ethernet driver.
> >>>>    Earlier it only supported DT/syscon.
> >>>>
> >>>> Patch 2/2: HNS RoCE driver: changes done in RoCE driver are meant to
> >>>>    detect the type and then either use DT specific or ACPI specific
> >>>>    functions. Wherever possible, this patch tries to make use of
> >>>>    "Unified Device Property Interface" APIs to support both DT and
> >>>>    ACPI through single interface.
> >>>>
> >>>> NOTE 1: ACPI changes done in both of the drivers depend upon the ACPI
> >>>>    Table (DSDT and IORT tables) changes part of UEFI/BIOS. These
> >>>>    changes are NOT part of this patch-set.
> >>>> NOTE 2: Reset function in Patch 1/2 depends upon the reset function
> >>>>    added in ACPI tables (basically DSDT table) part of the UEFI/BIOS.
> >>>>    Again, this change is NOT reflected in this patch-set.
> >>>
> >>> I can't apply this series to my tree because the hns infiniband driver
> >>> doesn't exist in it.
> >>>
> >>
> >> No.  This probably needs to go through my tree.  Although with all of
> >> the requirements, I'm a bit concerned about those being present
> >> elsewhere.
> >>
> >> --
> >> Doug Ledford 
> >> GPG Key ID: 0E572FDD
> > Hello Doug,
> > Thanks for your reply. I have just replied to David's email as well and did
> > not realize your response was already on the way. Sorry for this!
> >
> > I would just like to request, if by any chance, we can expedite the
> > acceptance of the below patch (part of patch-set) in the net-next. This
> > might help you as well in future when you will actually push the RoCE
> > driver to the linux-next.
> >
> > "[PATCH for-next 1/2] net: hns: Add support of ACPI to HNS driver RoCE Reset
> >  function"
> >
> > Below HNS RoCE reset function patch has already been accepted by Dave Miller
> > and is part of net-next,
> > https://patchwork.kernel.org/patch/9287497/
> >
> > Also, above ACPI support of RoCE Reset patch cleanly applies over the already
> > accepted patch in the link and is not dependent on other accompanying RoCE
> > driver ACPI changes or even the presence of the Infiniband/RoCE Driver in the
> > net-next repository.
> >
> > Could you please suggest if this is something which can be considered?
> 
> I've pulled both of these patches in.  I usually merge late in the merge
> window, so it won't be any stretch to wait until the ACPI tree has been
> merged before I send Linus my pull request.
Thanks Doug! I hope we can take care of the delta that might be created by
unrelated HNS Ethernet changes (changes from other people that have nothing
to do with the RoCE driver)?

HNS development and RoCE development are proceeding at different paces and
magnitudes.

Best regards
Salil
> 
> 
> --
> Doug Ledford 
> GPG Key ID: 0E572FDD



Re: [PATCH net-next] net: flush the softnet backlog in process context

2016-08-25 Thread Paolo Abeni
On Thu, 2016-08-25 at 07:47 -0700, Eric Dumazet wrote:
> On Thu, 2016-08-25 at 07:32 -0700, Eric Dumazet wrote:
> 
> > In a future patch, we could change this so that we kick
> > flush_all_backlogs() once for all devices, instead of one device at a
> > time.
> > 
> > We would not pass @dev anymore as a parameter and simply look at
> > skb->dev->reg_state to decide to remove packets from queues in
> > flush_backlog()
> 
> This would be something like :

This is actually a nice cleanup. I hope to test it later.

> diff --git a/net/core/dev.c b/net/core/dev.c
> index 7feae74ca928..793ace2c600f 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4293,7 +4293,6 @@ int netif_receive_skb(struct sk_buff *skb)
>  EXPORT_SYMBOL(netif_receive_skb);
>  
>  struct flush_work {
> - struct net_device *dev;
>   struct work_struct work;
>  };

With 'dev' removed, I think we can use 'work_struct' directly and avoid
the 'container_of' usage in flush_backlog().
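
To illustrate the point, here is a minimal userspace sketch in C. It is not
kernel code: the struct definitions, the 'ran' flag and the run_work() helper
are assumptions made up for this example, and container_of is re-defined
locally. It only shows that once the wrapper struct carries no extra state,
the handler can work on the work item it is handed.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; illustrative assumptions only. */
struct work_struct {
	void (*func)(struct work_struct *work);
	int ran;	/* hypothetical flag so the effect can be observed */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Before: the wrapper carries 'dev', so the handler recovers it via container_of(). */
struct flush_work {
	void *dev;	/* placeholder for struct net_device * */
	struct work_struct work;
};

static void flush_backlog_wrapped(struct work_struct *work)
{
	struct flush_work *flush = container_of(work, struct flush_work, work);

	assert(flush != NULL);
	flush->work.ran = 1;
}

/* After: no extra state is needed, so the handler uses the work item directly. */
static void flush_backlog_direct(struct work_struct *work)
{
	assert(work != NULL);
	work->ran = 1;
}

/* Hypothetical helper mimicking a workqueue: invoke the handler once. */
static int run_work(struct work_struct *work)
{
	work->func(work);
	return work->ran;
}
```

In the direct variant a per-CPU 'struct work_struct' would be enough, which is
the simplification suggested above.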

Paolo



Re: [PATCH net] qdisc: fix a module refcount leak in qdisc_create_dflt()

2016-08-25 Thread Eric Dumazet
On Thu, 2016-08-25 at 21:26 +0800, Wei Yongjun wrote:
> On 08/25/2016 12:39 AM, Eric Dumazet wrote:
> > From: Eric Dumazet 
> >
> > Should qdisc_alloc() fail, we must release the module refcount
> > we got right before.
> >
> > Fixes: 6da7c8fcbcbd ("qdisc: allow setting default queuing discipline")
> > Signed-off-by: Eric Dumazet 
> > ---
> >  net/sched/sch_generic.c |    9 +++++----
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> > index e95b67cd5718..657c13362b19 100644
> > --- a/net/sched/sch_generic.c
> > +++ b/net/sched/sch_generic.c
> > @@ -643,18 +643,19 @@ struct Qdisc *qdisc_create_dflt(struct netdev_queue *dev_queue,
> > struct Qdisc *sch;
> >  
> > if (!try_module_get(ops->owner))
> > -   goto errout;
> > +   return NULL;
> >  
> > sch = qdisc_alloc(dev_queue, ops);
> > -   if (IS_ERR(sch))
> > -   goto errout;
> > +   if (IS_ERR(sch)) {
> > +   module_put(ops->owner);
> > +   return NULL;
> > +   }
> > sch->parent = parentid;
> >  
> > if (!ops->init || ops->init(sch, NULL) == 0)
> > return sch;
> >  
> > qdisc_destroy(sch);
> 
> ops->init() failed, missing module_put() here.

qdisc_destroy() already does this for us; we do not want a double
module_put().
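
The ownership rule can be modelled in a few lines of userspace C. This is
only a toy sketch: every type and function below is a stand-in written for
this example (the 'init_fails' and 'alloc_fails' knobs are hypothetical), not
the kernel API. It shows one reference being taken and exactly one being
dropped on each failure path.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: all names are userspace stand-ins, not the kernel API. */
struct module { int refcnt; };
struct qdisc_ops { struct module *owner; int init_fails; };
struct qdisc { const struct qdisc_ops *ops; int live; };

static int try_module_get(struct module *m) { m->refcnt++; return 1; }
static void module_put(struct module *m) { assert(m->refcnt > 0); m->refcnt--; }

/* Destroy drops the module reference taken when the qdisc was created. */
static void qdisc_destroy(struct qdisc *q)
{
	q->live = 0;
	module_put(q->ops->owner);
}

/* alloc_fails simulates qdisc_alloc() returning an error. */
static struct qdisc *create_dflt(struct qdisc *storage,
				 const struct qdisc_ops *ops, int alloc_fails)
{
	if (!try_module_get(ops->owner))
		return NULL;

	if (alloc_fails) {
		/* No qdisc exists yet, so nobody else will drop the reference. */
		module_put(ops->owner);
		return NULL;
	}
	storage->ops = ops;
	storage->live = 1;

	if (!ops->init_fails)
		return storage;

	/* init failed: destroy already puts the module, so no second put here. */
	qdisc_destroy(storage);
	return NULL;
}
```

With this model the reference count returns to zero after either failure and
stays at one after a successful create.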





Re: [PATCH net-next] net: flush the softnet backlog in process context

2016-08-25 Thread Eric Dumazet
On Thu, 2016-08-25 at 07:32 -0700, Eric Dumazet wrote:

> In a future patch, we could change this so that we kick
> flush_all_backlogs() once for all devices, instead of one device at a
> time.
> 
> We would not pass @dev anymore as a parameter and simply look at
> skb->dev->reg_state to decide to remove packets from queues in
> flush_backlog()

This would be something like :

diff --git a/net/core/dev.c b/net/core/dev.c
index 7feae74ca928..793ace2c600f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4293,7 +4293,6 @@ int netif_receive_skb(struct sk_buff *skb)
 EXPORT_SYMBOL(netif_receive_skb);
 
 struct flush_work {
-	struct net_device *dev;
 	struct work_struct work;
 };
 
@@ -4303,7 +4302,6 @@ DEFINE_PER_CPU(struct flush_work, flush_works);
 static void flush_backlog(struct work_struct *work)
 {
 	struct flush_work *flush = container_of(work, typeof(*flush), work);
-	struct net_device *dev = flush->dev;
 	struct sk_buff *skb, *tmp;
 	struct softnet_data *sd;
 
@@ -4313,7 +4311,7 @@ static void flush_backlog(struct work_struct *work)
 	local_irq_disable();
 	rps_lock(sd);
 	skb_queue_walk_safe(&sd->input_pkt_queue, skb, tmp) {
-		if (skb->dev == dev) {
+		if (skb->dev->reg_state == NETREG_UNREGISTERING) {
 			__skb_unlink(skb, &sd->input_pkt_queue);
 			kfree_skb(skb);
 			input_queue_head_incr(sd);
@@ -4323,7 +4321,7 @@ static void flush_backlog(struct work_struct *work)
 	local_irq_enable();
 
 	skb_queue_walk_safe(&sd->process_queue, skb, tmp) {
-		if (skb->dev == dev) {
+		if (skb->dev->reg_state == NETREG_UNREGISTERING) {
 			__skb_unlink(skb, &sd->process_queue);
 			kfree_skb(skb);
 			input_queue_head_incr(sd);
@@ -4332,7 +4330,7 @@ static void flush_backlog(struct work_struct *work)
 	local_bh_enable();
 }
 
-static void flush_all_backlogs(struct net_device *dev)
+static void flush_all_backlogs(void)
 {
 	unsigned int cpu;
 
@@ -4342,7 +4340,6 @@ static void flush_all_backlogs(struct net_device *dev)
 		struct flush_work *flush = per_cpu_ptr(&flush_works, cpu);
 
 		INIT_WORK(&flush->work, flush_backlog);
-		flush->dev = dev;
 		queue_work_on(cpu, system_highpri_wq, &flush->work);
 	}
 
@@ -6735,8 +6732,8 @@ static void rollback_registered_many(struct list_head *head)
 		unlist_netdevice(dev);
 
 		dev->reg_state = NETREG_UNREGISTERING;
-		flush_all_backlogs(dev);
 	}
+	flush_all_backlogs();
 
 	synchronize_net();
 

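The filtering idea in the diff above can be tried out in userspace. The
sketch below is only an approximation under stated assumptions: the reduced
enum, the singly linked 'sk_buff' and the flush_queue() helper are made up
for this example and do not match the kernel definitions. It drops every
queued packet whose device is marked as unregistering, so a single pass
covers any number of devices.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the kernel types; illustrative assumptions only. */
enum reg_state { NETREG_REGISTERED, NETREG_UNREGISTERING };

struct net_device { enum reg_state reg_state; };
struct sk_buff { struct net_device *dev; struct sk_buff *next; };

/* Unlink every packet whose device is unregistering; return the drop count. */
static int flush_queue(struct sk_buff **head)
{
	struct sk_buff **link = head;
	int dropped = 0;

	assert(head != NULL);
	while (*link) {
		struct sk_buff *skb = *link;

		if (skb->dev->reg_state == NETREG_UNREGISTERING) {
			*link = skb->next;	/* unlink; a real queue would also free skb */
			skb->next = NULL;
			dropped++;
		} else {
			link = &skb->next;
		}
	}
	return dropped;
}
```

Because the test looks at the state of each packet's own device, the caller
no longer has to pass a specific device, which is what lets the flush be
kicked once after the whole list has been marked.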



Re: [PATCH for-next 0/2] {IB,net}/hns: Add support of ACPI to the Hisilicon RoCE Driver

2016-08-25 Thread Doug Ledford
On 8/25/2016 10:50 AM, Salil Mehta wrote:

>> I can take both.  I already pulled net-next to get the initial hns roce
>> reset patch from Dave, so these will apply cleanly with my tree and
>> merge cleanly with Dave's due to the common ancestral base.  The only
>> problem is that if you intend to send any other patches that affect this
>> code, then they would need to come through me until the 4.9 merge window
>> is complete so that we don't have later merge conflicts.
> Ok sure, I got your point. Yes, there are a few patches we need to push in,
> but they are related to RoCE CM (Connection Manager) mode and will follow
> soon. We foresee no further patches for the RoCE driver that depend on the
> HNS Ethernet driver.

Ok.

> But kindly note, there could be some patches in development in the HNS
> Ethernet driver which might come in through net-next. These might not be
> related to the RoCE driver but might touch some common files, which could
> lead to conflicts further down the line when you try to merge the ACPI RoCE
> reset again. These HNS driver changes are very difficult for us to control,
> since the amount of development going on in HNS is much greater than in RoCE
> as of now. It will be almost impossible for us to convince people internally
> to shift all the development being done right now on net-next and rebase it
> onto your internal hns-roce branch for the month or so until 4.9. This would
> affect many feature deadlines internally.

This is what Linus wants to avoid.  It's not necessary to shift your
work from one tree to another; what is needed is for your RoCE team and
your net team to plan out what you are going to submit for the next
kernel and provide a complete list of conflicting code patches to both
Dave and myself and allow us to pull those patches into both our trees
so there are no conflicts.  See the recent threads on linux-rdma about
the pull requests from Mellanox.  This is how it needs to be done.
Neither team needs to slow down, or not do your work, you simply need to
plan that work out and provide a common base for Dave and I to apply the
separate patches on top of.

> So, if I understood you correctly, this delta (which could be large) would
> be taken care of by you when the next merge window opens. And we can expect
> the below to be part of 4.9:
> 1) RoCE Base driver (*Already Accepted*)
> 2) ACPI changes for RoCE Driver (*if accepted*)
>    * ACPI changes for the RoCE Driver
>    * ACPI changes for RoCE reset function part of the HNS driver

Both of these changes are already applied to my tree.  However, if you
submit other changes to net-next and it starts generating merge
conflicts, you and the net team are going to get yelled at.  If you are
going to have a shared driver, then you *HAVE* to work as a larger team
and plan the changes you submit to the Linux kernel.


-- 
Doug Ledford 
GPG Key ID: 0E572FDD


