Re: [PATCH V2] Add flow control to the portmapper

2016-07-24 Thread Leon Romanovsky
On Fri, Jul 22, 2016 at 10:26:01AM -0500, Shiraz Saleem wrote:
> On Thu, Jul 21, 2016 at 08:29:42PM +0300, Leon Romanovsky wrote:
> > On Wed, Jul 20, 2016 at 09:47:50PM -0500, Shiraz Saleem wrote:
> > > On Tue, Jul 19, 2016 at 08:32:53PM +0300, Leon Romanovsky wrote:
> > > > On Tue, Jul 19, 2016 at 09:50:24AM -0500, Shiraz Saleem wrote:
> > > > > On Tue, Jul 19, 2016 at 08:40:06AM +0300, Leon Romanovsky wrote:
> > > > > > 
> > > > > > You are the one user of this new inline function.
> > > > > > Why don't you directly call to netlink_unicast() in your 
> > > > > > ibnl_unicast()
> > > > > > without messing with widely visible header file?
> > > > > 
> > > > > Since there is a non-blocking version of nlmsg_unicast(), the idea is 
> > > > > to make a blocking version available to others as well as maintain 
> > > > > consistency of existing code.
> > > > > 
> > > > 
> > > > In such way, please provide patch series which will convert all other
> > > > users to this new call.
> > > > 
> > > > ➜  linux-rdma git:(master) grep -rI netlink_unicast * | grep -I 0
> > > > kernel/audit.c: err = netlink_unicast(audit_sock, skb, 
> > > > audit_nlk_portid, 0);
> > > > kernel/audit.c: netlink_unicast(aunet->nlsk, skb, dest->portid, 
> > > > 0);
> > > > kernel/audit.c: netlink_unicast(aunet->nlsk , reply->skb, 
> > > > reply->portid, 0);
> > > > kernel/audit.c: return netlink_unicast(audit_sock, skb, 
> > > > audit_nlk_portid, 0);
> > > > samples/connector/cn_test.c:netlink_unicast(nls, skb, 0, 0);
> > > 
> > > These usages of netlink_unicast() with blocking are not the same as the 
> > > new
> > > nlmsg_unicast_block() function. 
> > 
> > Really?
> > Did you look in the code?
> > Let's take first function from that grep output
> > 
> > 414 err = netlink_unicast(audit_sock, skb, audit_nlk_portid, 0);
> > 415 if (err < 0) {
> > ... do something ...
> > 437 } else
> > ... do something else ...
> > 
> > which fits nicely with your proposal.
> > 
> > +static inline int nlmsg_unicast_block(struct sock *sk, struct sk_buff 
> > *skb, u32 portid)
> > +{
> > +   int err;
> > +
> > +   err = netlink_unicast(sk, skb, portid, 0);
> > +   if (err > 0)
> > +   err = 0;
> > +
> > +   return err;
> > +}
> > 
> > 
> > > You can't drop in nlmsg_unicast_block() in 
> > > place of netlink_unicast() in these places. I'm not going to introduce 
> > > code 
> > > which modifies old behavior.
> > 
> > Again, you aren't changing any behaviour.
> > Anyway we are not adding general function to common include file just
> > because one caller wants it.
> > 
> 
> We assumed the nlmsg_ API in linux/include/net/netlink.h is there for a 
> purpose. 
> That purpose is to normalize the return code. That API is used in places 
> where 
> the return code needs to be normalized, and when normalization is not needed, 
> then the direct calls are used. 
> 
> Now since the nlmsg_ API in netlink.h is missing a blocking version of the 
> nlmsg_unicast function, it would seem reasonable to add it there.
> 
> Changing all the direct calls as you suggest would at the very least be 
> less efficient since it would normalize return codes when not needed. 

That is one "if" with one assignment, in a non-data path.
Please look at the code.

> 
> However, if there is a strict rule against adding an API unless you 
> immediately 
> have at least 2 callers, then I guess, we will make the direct call. The 
> amount 
> of code added will be the same, except that the next person who wants a 
> normalized 
> return code will have to duplicate the same code.

Yes, we do not add code to a general header file unless it has
multiple callers.

> 
> Changing other code to be less efficient so that we can meet the 2 caller 
> criteria 
> doesn't seem reasonable.

I'm sorry to hear that you didn't look at the code.

> 
> 




Re: [PATCH V2] Add flow control to the portmapper

2016-07-24 Thread Leon Romanovsky
On Thu, Jul 21, 2016 at 12:42:42PM -0500, Steve Wise wrote:
> > 
> > On Wed, Jul 20, 2016 at 09:47:50PM -0500, Shiraz Saleem wrote:
> > > On Tue, Jul 19, 2016 at 08:32:53PM +0300, Leon Romanovsky wrote:
> > > > On Tue, Jul 19, 2016 at 09:50:24AM -0500, Shiraz Saleem wrote:
> > > > > On Tue, Jul 19, 2016 at 08:40:06AM +0300, Leon Romanovsky wrote:
> > > > > >
> > > > > > You are the one user of this new inline function.
> > > > > > Why don't you directly call to netlink_unicast() in your 
> > > > > > ibnl_unicast()
> > > > > > without messing with widely visible header file?
> > > > >
> > > > > Since there is a non-blocking version of nlmsg_unicast(), the idea is
> > > > > to make a blocking version available to others as well as maintain
> > > > > consistency of existing code.
> > > > >
> > > >
> > > > In such way, please provide patch series which will convert all other
> > > > users to this new call.
> > > >
> > > > ➜  linux-rdma git:(master) grep -rI netlink_unicast * | grep -I 0
> > > > kernel/audit.c: err = netlink_unicast(audit_sock, skb, 
> > > > audit_nlk_portid, 0);
> > > > kernel/audit.c: netlink_unicast(aunet->nlsk, skb, dest->portid, 
> > > > 0);
> > > > kernel/audit.c: netlink_unicast(aunet->nlsk , reply->skb, 
> > > > reply->portid, 0);
> > > > kernel/audit.c: return netlink_unicast(audit_sock, skb, 
> > > > audit_nlk_portid, 0);
> > > > samples/connector/cn_test.c:netlink_unicast(nls, skb, 0, 0);
> > >
> > > These usages of netlink_unicast() with blocking are not the same as the 
> > > new
> > > nlmsg_unicast_block() function.
> > 
> > Really?
> > Did you look in the code?
> > Let's take first function from that grep output
> > 
> > 414 err = netlink_unicast(audit_sock, skb, audit_nlk_portid, 0);
> > 415 if (err < 0) {
> > ... do something ...
> > 437 } else
> > ... do something else ...
> > 
> > which fits nicely with your proposal.
> >
> 
> The key is to ensure that places calling a blocking service are never called 
> in a non-blocking context.   Leon, do you know if the new sites are always 
> safe to block?  
> 
> In general, I think blocking due to sockbuf flow control vs dropping or 
> retrying is a good thing for all the users in the rdma core, assuming they 
> are safe to block.

Steve,
Sorry for my slow response,

I'm afraid that you were misled by the author of the proposed patch, who made
two logical changes in one patch. One is the move from non-blocking mode to
blocking mode, which is fine now that a justification has been added. The
second change is the introduction of a general inline function in a common
header file (include/net/netlink.h) with only one caller.

This second change is the one in question, and I'm not comfortable with
half-done work.

> 
>  
> > +static inline int nlmsg_unicast_block(struct sock *sk, struct sk_buff 
> > *skb, u32
> > portid)
> > +{
> > +   int err;
> > +
> > +   err = netlink_unicast(sk, skb, portid, 0);
> > +   if (err > 0)
> > +   err = 0;
> > +
> > +   return err;
> > +}
> > 
> > 
> > > You can't drop in nlmsg_unicast_block() in
> > > place of netlink_unicast() in these places. I'm not going to introduce 
> > > code
> > > which modifies old behavior.
> > 
> > Again, you aren't changing any behaviour.
> 
> Potential block/sleep is a change.  But if we can conclude that these 
> additional sites are safe to block, then probably its ok to just go ahead and 
> use the blocking service everywhere.

These potential sites already make the same blocking call today,
netlink_unicast(... , ... , ... , 0).
The only difference, and the open question, is whether they can handle the
normalized return value from the new nlmsg_unicast_block() function.
I'm convinced that the answer is yes.
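
For readers following the thread, here is a minimal userspace sketch of the
return-code normalization being argued about. fake_unicast() is a hypothetical
stand-in for netlink_unicast() (positive byte count on success, negative errno
on failure); none of these helpers are kernel APIs. It only illustrates why a
caller that tests "err < 0" would take the same branch with either variant.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for netlink_unicast(): returns the number of bytes
 * delivered on success, or a negative errno on failure. */
static int fake_unicast(int len, int fail)
{
	return fail ? -ECONNREFUSED : len;
}

/* Same shape as the proposed nlmsg_unicast_block(): collapse any positive
 * byte count to 0 so callers only ever see 0 or a negative errno. */
static int fake_unicast_normalized(int len, int fail)
{
	int err;

	assert(len > 0);
	err = fake_unicast(len, fail);
	if (err > 0)
		err = 0;

	return err;
}

/* A caller written like the audit.c site quoted above: it only checks
 * whether the result is negative. */
static int caller_sees_error(int err)
{
	return err < 0;
}
```

A caller that used the positive byte count for anything else would of course
notice the difference; the sketch does not cover that case.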




Re: [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()

2016-07-24 Thread YOSHIFUJI Hideaki/吉藤英明
Hi,

Chunhui He wrote:
> Hi,
> 
> On Fri, 22 Jul 2016 10:20:01 +0300 (EEST), Julian Anastasov  
> wrote:
>>
>>  Hello,
>>
>> On Thu, 21 Jul 2016, Chunhui He wrote:
>>
>>> If neigh entry was CONNECTED and address is not changed, and if new state is
>>> STALE, entry state will not change. Because DELAY is not in CONNECTED, it's
>>> possible to change state from DELAY to STALE.
>>>
>>> That is bad. Consider a host in IPv4 network, a neigh entry in STALE state
>>> is referenced to send packets, so goes to DELAY state. If the entry is not
>>> confirmed by upper layer, it goes to PROBE state, and sends ARP request.
>>> The neigh host sends ARP reply, then the entry goes to REACHABLE state.
>>> But the entry state may be reset to STALE by broadcast ARP packets, before
>>> the entry goes to PROBE state. So it's possible that the entry will never go
>>> to REACHABLE state, without external confirmation.
>>>
>>> In my case, the gateway refuses to send unicast packets to me, before it 
>>> sees
>>> my ARP request. So it's critical to enter REACHABLE state by sending ARP
>>> request, but not by external confirmation.
>>>
>>> This fixes neigh_update() not to change to STALE if old state is CONNECTED 
>>> or
>>> DELAY.
>>>
>>> Signed-off-by: Chunhui He 
>>> ---
>>>  net/core/neighbour.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>>> index 510cd62..29429eb 100644
>>> --- a/net/core/neighbour.c
>>> +++ b/net/core/neighbour.c
>>> @@ -1152,7 +1152,7 @@ int neigh_update(struct neighbour *neigh, const u8 
>>> *lladdr, u8 new,
>>> } else {
>>> if (lladdr == neigh->ha && new == NUD_STALE &&
>>> ((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) ||
>>> -(old & NUD_CONNECTED))
>>> +(old & (NUD_CONNECTED | NUD_DELAY)))
>>> )
>>> new = old;
>>> }
>>
>>  Your change looks correct to me. But this place
>> has more problems. There is no good reason to set NUD_STALE
>> for any state that is NUD_VALID if address is not changed.
>> This matches perfectly the comment above this code:
>> NUD_STALE should change a NUD_VALID state only when
>> address changes. It also means that IPv6 does not need
>> to provide NEIGH_UPDATE_F_WEAK_OVERRIDE anymore when
>> NEIGH_UPDATE_F_OVERRIDE is also present.
>>
> 
> The NEIGH_UPDATE_F_WEAK_OVERRIDE is confusing to me, so I choose not to deal
> with the flag.

IPv6 depends on WEAK_OVERRIDE.  Please do not change.

> 
>>  By this way the state machine can continue with
>> the resolving: NUD_STALE -> NUD_DELAY (traffic) ->
>> NUD_PROBE (retries) -> NUD_REACHABLE (unicast reply)
>> while the address is not changed. Your change covers only
>> NUD_DELAY, not NUD_PROBE, so it is better to allow more
>> retries to send. We should not give up until success (NUD_REACHABLE).
>>
> 
> I have thought about this.
> The original code allows NUD_DELAY -> NUD_STALE and NUD_PROBE -> NUD_STALE.
> This part has been in the kernel since v2.1.79; I don't know exactly why it
> allows that.
> 
> My analysis:
> (1) As shown in my previous mail, NUD_DELAY -> NUD_STALE may cause "dead 
> loop",
> so it should be fixed.
> 
> (2) But NUD_PROBE -> NUD_STALE is acceptable, because in NUD_PROBE, ARP 
> request
> has been sent, it is sufficient to break the "dead loop".
> More attempts are accomplished by the following sequence:
> NUD_STALE --> NUD_DELAY -(sent req)-> NUD_PROBE -(reset by neigh_update())->
> NUD_STALE --> NUD_DELAY -(send req again)-> ... -->
> NUD_REACHABLE
> 
> 
> But I also agree with your change.
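
The condition under discussion can be sketched in userspace as follows. This
is only an illustration of the one "if" touched by the patch, not the full
neigh_update() logic. The NUD_* bit values and the WEAK_OVERRIDE flag value are
given as I believe they are defined in the kernel headers of this period;
next_state() and its keep_mask parameter are hypothetical names introduced for
the example.

```c
#include <stdbool.h>

/* NUD state bits, believed to match include/uapi/linux/neighbour.h. */
#define NUD_REACHABLE	0x02
#define NUD_STALE	0x04
#define NUD_DELAY	0x08
#define NUD_PROBE	0x10
#define NUD_NOARP	0x40
#define NUD_PERMANENT	0x80
#define NUD_CONNECTED	(NUD_PERMANENT | NUD_NOARP | NUD_REACHABLE)

/* Flag value believed to match include/net/neighbour.h. */
#define NEIGH_UPDATE_F_WEAK_OVERRIDE	0x00000002

/*
 * Hypothetical helper mirroring only the condition changed by the patch.
 * Pass keep_mask = NUD_CONNECTED for the original code and
 * keep_mask = NUD_CONNECTED | NUD_DELAY for the patched code.
 */
static unsigned char next_state(unsigned char old_state,
				unsigned char new_state,
				unsigned int flags, bool same_lladdr,
				unsigned char keep_mask)
{
	if (same_lladdr && new_state == NUD_STALE &&
	    ((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) ||
	     (old_state & keep_mask)))
		new_state = old_state;	/* refuse the downgrade to STALE */

	return new_state;
}
```

With the original mask a DELAY entry is reset to STALE; with the patched mask
it stays in DELAY, while a PROBE entry can still be reset, which is the
remaining case raised above.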
> 
>>  Second problem: NEIGH_UPDATE_F_WEAK_OVERRIDE has no
>> priority over NEIGH_UPDATE_F_ADMIN. For example, now I can not
>> change from NUD_PERMANENT to NUD_STALE:
>>
>> # ip neigh add 192.168.168.111 lladdr 00:11:22:33:44:55 nud perm dev wlan0
>> # ip neigh show to 192.168.168.111
>> 192.168.168.111 dev wlan0 lladdr 00:11:22:33:44:55 PERMANENT
>> # ip neigh change 192.168.168.111 lladdr 00:11:22:33:44:55 nud stale dev 
>> wlan0
>> # ip neigh show to 192.168.168.111
>> 192.168.168.111 dev wlan0 lladdr 00:11:22:33:44:55 PERMANENT
>>
>>  IMHO, here is how this place should look:
>>
>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>> index 5cdc62a..2b1cb91 100644
>> --- a/net/core/neighbour.c
>> +++ b/net/core/neighbour.c
>> @@ -1151,10 +1151,8 @@ int neigh_update(struct neighbour *neigh, const u8 
>> *lladdr, u8 new,
>>  goto out;
>>  } else {
>>  if (lladdr == neigh->ha && new == NUD_STALE &&
>> -((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) ||
>> - (old & NUD_CONNECTED))
>> -)
>> -new = old;
>> +!(flags & NEIGH_UPDATE_F_ADMIN))
>> +  

Re: [PATCH net-next v4] cdc_ether: Improve ZTE MF823/831/910 handling

2016-07-24 Thread David Miller
From: Kristian Evensen 
Date: Thu, 21 Jul 2016 11:10:06 +0200

> The firmware in several ZTE devices (at least the MF823/831/910
> modems/mifis) use OS fingerprinting to determine which type of device to
> export. In addition, these devices export a REST API which can be used to
> control the type of device. So far, on Linux, the devices have been seen as
> RNDIS or CDC Ether.
> 
> When CDC Ether is used, devices of the same type are, as with RNDIS,
> exported with the same, bogus random MAC address. In addition, the devices
> (at least on all firmware revisions I have found) use the bogus MAC when
> sending traffic routed from external networks. And as a final feature, the
> devices sometimes export the link state incorrectly. There are also
> references online to several other ZTE devices displaying this behavior,
> with several different PIDs and MAC addresses.
> 
> This patch tries to improve the handling of ZTE devices by doing the
> following:
 ...
> v3->v4:
> * Forgot to remove unused variables, sorry about that (thanks David
> Miller).

Applied, thanks.


Re: [PATCH net-next V3] net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)

2016-07-24 Thread Leon Romanovsky
On Tue, Jul 19, 2016 at 10:49:57AM -0700, Joe Perches wrote:
> On Tue, 2016-07-19 at 20:26 +0300, Leon Romanovsky wrote:
> > On Tue, Jul 19, 2016 at 02:09:25PM +0300, Netanel Belgazal wrote:
> > > This is the debugging message interface.
> > > https://www.kernel.org/doc/Documentation/networking/netif-msg.txt
> > This document was updated last time in 2006 and I doubt that it is
> > relevant in 2016. You have dynamic debug prints infrastructure for it,
> > use it.
> 
> I think this is uninformed.
> 
> netif_ works well, is compatible with dynamic debug,
> and is commonly used with new networking drivers.
> 

I have a very strong feeling that it is not "used in new drivers" but was
influenced by (copied from) "old drivers". The same goes for the real-life
usage of the module version which was introduced in this patch.




[PATCH net-next 3/3] bnxt_en: Add new NPAR and dual media device IDs.

2016-07-24 Thread Michael Chan
Add 5741X/5731X NPAR device IDs and dual media SFP/10GBase-T device IDs.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 39 ++-
 1 file changed, 33 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 398ecba..1d5dd5b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -73,19 +73,28 @@ enum board_idx {
BCM57301,
BCM57302,
BCM57304,
+   BCM57417_NPAR,
BCM58700,
BCM57311,
BCM57312,
BCM57402,
BCM57404,
BCM57406,
-   BCM57404_NPAR,
+   BCM57402_NPAR,
+   BCM57407,
BCM57412,
BCM57414,
BCM57416,
BCM57417,
-   BCM57414_NPAR,
+   BCM57412_NPAR,
BCM57314,
+   BCM57417_SFP,
+   BCM57416_SFP,
+   BCM57404_NPAR,
+   BCM57406_NPAR,
+   BCM57407_SFP,
+   BCM57414_NPAR,
+   BCM57416_NPAR,
BCM57304_VF,
BCM57404_VF,
BCM57414_VF,
@@ -99,19 +108,28 @@ static const struct {
{ "Broadcom BCM57301 NetXtreme-C Single-port 10Gb Ethernet" },
{ "Broadcom BCM57302 NetXtreme-C Dual-port 10Gb/25Gb Ethernet" },
{ "Broadcom BCM57304 NetXtreme-C Dual-port 10Gb/25Gb/40Gb/50Gb 
Ethernet" },
+   { "Broadcom BCM57417 NetXtreme-E Ethernet Partition" },
{ "Broadcom BCM58700 Nitro 4-port 1Gb/2.5Gb/10Gb Ethernet" },
{ "Broadcom BCM57311 NetXtreme-C Single-port 10Gb Ethernet" },
{ "Broadcom BCM57312 NetXtreme-C Dual-port 10Gb/25Gb Ethernet" },
{ "Broadcom BCM57402 NetXtreme-E Dual-port 10Gb Ethernet" },
{ "Broadcom BCM57404 NetXtreme-E Dual-port 10Gb/25Gb Ethernet" },
{ "Broadcom BCM57406 NetXtreme-E Dual-port 10GBase-T Ethernet" },
-   { "Broadcom BCM57404 NetXtreme-E Ethernet Partition" },
+   { "Broadcom BCM57402 NetXtreme-E Ethernet Partition" },
+   { "Broadcom BCM57407 NetXtreme-E Dual-port 10GBase-T Ethernet" },
{ "Broadcom BCM57412 NetXtreme-E Dual-port 10Gb Ethernet" },
{ "Broadcom BCM57414 NetXtreme-E Dual-port 10Gb/25Gb Ethernet" },
{ "Broadcom BCM57416 NetXtreme-E Dual-port 10GBase-T Ethernet" },
{ "Broadcom BCM57417 NetXtreme-E Dual-port 10GBase-T Ethernet" },
-   { "Broadcom BCM57414 NetXtreme-E Ethernet Partition" },
+   { "Broadcom BCM57412 NetXtreme-E Ethernet Partition" },
{ "Broadcom BCM57314 NetXtreme-C Dual-port 10Gb/25Gb/40Gb/50Gb 
Ethernet" },
+   { "Broadcom BCM57417 NetXtreme-E Dual-port 10Gb/25Gb Ethernet" },
+   { "Broadcom BCM57416 NetXtreme-E Dual-port 10Gb Ethernet" },
+   { "Broadcom BCM57404 NetXtreme-E Ethernet Partition" },
+   { "Broadcom BCM57406 NetXtreme-E Ethernet Partition" },
+   { "Broadcom BCM57407 NetXtreme-E Dual-port 25Gb Ethernet" },
+   { "Broadcom BCM57414 NetXtreme-E Ethernet Partition" },
+   { "Broadcom BCM57416 NetXtreme-E Ethernet Partition" },
{ "Broadcom BCM57304 NetXtreme-C Ethernet Virtual Function" },
{ "Broadcom BCM57404 NetXtreme-E Ethernet Virtual Function" },
{ "Broadcom BCM57414 NetXtreme-E Ethernet Virtual Function" },
@@ -122,19 +140,28 @@ static const struct pci_device_id bnxt_pci_tbl[] = {
{ PCI_VDEVICE(BROADCOM, 0x16c8), .driver_data = BCM57301 },
{ PCI_VDEVICE(BROADCOM, 0x16c9), .driver_data = BCM57302 },
{ PCI_VDEVICE(BROADCOM, 0x16ca), .driver_data = BCM57304 },
+   { PCI_VDEVICE(BROADCOM, 0x16cc), .driver_data = BCM57417_NPAR },
{ PCI_VDEVICE(BROADCOM, 0x16cd), .driver_data = BCM58700 },
{ PCI_VDEVICE(BROADCOM, 0x16ce), .driver_data = BCM57311 },
{ PCI_VDEVICE(BROADCOM, 0x16cf), .driver_data = BCM57312 },
{ PCI_VDEVICE(BROADCOM, 0x16d0), .driver_data = BCM57402 },
{ PCI_VDEVICE(BROADCOM, 0x16d1), .driver_data = BCM57404 },
{ PCI_VDEVICE(BROADCOM, 0x16d2), .driver_data = BCM57406 },
-   { PCI_VDEVICE(BROADCOM, 0x16d4), .driver_data = BCM57404_NPAR },
+   { PCI_VDEVICE(BROADCOM, 0x16d4), .driver_data = BCM57402_NPAR },
+   { PCI_VDEVICE(BROADCOM, 0x16d5), .driver_data = BCM57407 },
{ PCI_VDEVICE(BROADCOM, 0x16d6), .driver_data = BCM57412 },
{ PCI_VDEVICE(BROADCOM, 0x16d7), .driver_data = BCM57414 },
{ PCI_VDEVICE(BROADCOM, 0x16d8), .driver_data = BCM57416 },
{ PCI_VDEVICE(BROADCOM, 0x16d9), .driver_data = BCM57417 },
-   { PCI_VDEVICE(BROADCOM, 0x16de), .driver_data = BCM57414_NPAR },
+   { PCI_VDEVICE(BROADCOM, 0x16de), .driver_data = BCM57412_NPAR },
{ PCI_VDEVICE(BROADCOM, 0x16df), .driver_data = BCM57314 },
+   { PCI_VDEVICE(BROADCOM, 0x16e2), .driver_data = BCM57417_SFP },
+   { PCI_VDEVICE(BROADCOM, 0x16e3), .driver_data = BCM57416_SFP },
+   { PCI_VDEVICE(BROADCOM, 0x16e7), .driver_data = BCM57404_NPAR },
+   { PCI_VDEVICE(BROADCOM, 

[PATCH net-next 1/3] bnxt_en: Improve ntuple filters by checking destination MAC address.

2016-07-24 Thread Michael Chan
Include the destination MAC address in the ntuple filter structure.  The
current code assumes that the destination MAC address is always the MAC
address of the NIC.  This may not be true if there are macvlans, for
example.  Add destination MAC address checking and configure the filter
correctly using the correct index for the destination MAC address.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 25 ++---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  2 ++
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 8a0165b..7de7d7a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3240,7 +3240,7 @@ static int bnxt_hwrm_cfa_ntuple_filter_alloc(struct bnxt 
*bp,
struct bnxt_vnic_info *vnic = &bp->vnic_info[fltr->rxq + 1];
 
bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_CFA_NTUPLE_FILTER_ALLOC, -1, -1);
-   req.l2_filter_id = bp->vnic_info[0].fw_l2_filter_id[0];
+   req.l2_filter_id = bp->vnic_info[0].fw_l2_filter_id[fltr->l2_fltr_idx];
 
req.enables = cpu_to_le32(BNXT_NTP_FLTR_FLAGS);
 
@@ -6299,7 +6299,8 @@ static bool bnxt_fltr_match(struct bnxt_ntuple_filter *f1,
keys1->ports.ports == keys2->ports.ports &&
keys1->basic.ip_proto == keys2->basic.ip_proto &&
keys1->basic.n_proto == keys2->basic.n_proto &&
-   ether_addr_equal(f1->src_mac_addr, f2->src_mac_addr))
+   ether_addr_equal(f1->src_mac_addr, f2->src_mac_addr) &&
+   ether_addr_equal(f1->dst_mac_addr, f2->dst_mac_addr))
return true;
 
return false;
@@ -6312,12 +6313,28 @@ static int bnxt_rx_flow_steer(struct net_device *dev, 
const struct sk_buff *skb,
struct bnxt_ntuple_filter *fltr, *new_fltr;
struct flow_keys *fkeys;
struct ethhdr *eth = (struct ethhdr *)skb_mac_header(skb);
-   int rc = 0, idx, bit_id;
+   int rc = 0, idx, bit_id, l2_idx = 0;
struct hlist_head *head;
 
if (skb->encapsulation)
return -EPROTONOSUPPORT;
 
+   if (!ether_addr_equal(dev->dev_addr, eth->h_dest)) {
+   struct bnxt_vnic_info *vnic = &bp->vnic_info[0];
+   int off = 0, j;
+
+   netif_addr_lock_bh(dev);
+   for (j = 0; j < vnic->uc_filter_count; j++, off += ETH_ALEN) {
+   if (ether_addr_equal(eth->h_dest,
+vnic->uc_list + off)) {
+   l2_idx = j + 1;
+   break;
+   }
+   }
+   netif_addr_unlock_bh(dev);
+   if (!l2_idx)
+   return -EINVAL;
+   }
new_fltr = kzalloc(sizeof(*new_fltr), GFP_ATOMIC);
if (!new_fltr)
return -ENOMEM;
@@ -6335,6 +6352,7 @@ static int bnxt_rx_flow_steer(struct net_device *dev, 
const struct sk_buff *skb,
goto err_free;
}
 
+   memcpy(new_fltr->dst_mac_addr, eth->h_dest, ETH_ALEN);
memcpy(new_fltr->src_mac_addr, eth->h_source, ETH_ALEN);
 
idx = skb_get_hash_raw(skb) & BNXT_NTP_FLTR_HASH_MASK;
@@ -6360,6 +6378,7 @@ static int bnxt_rx_flow_steer(struct net_device *dev, 
const struct sk_buff *skb,
 
new_fltr->sw_id = (u16)bit_id;
new_fltr->flow_id = flow_id;
+   new_fltr->l2_fltr_idx = l2_idx;
new_fltr->rxq = rxq_index;
hlist_add_head_rcu(&new_fltr->hash, head);
bp->ntp_fltr_count++;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 5307a2e..23e04a6 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -785,10 +785,12 @@ struct bnxt_pf_info {
 
 struct bnxt_ntuple_filter {
struct hlist_node   hash;
+   u8  dst_mac_addr[ETH_ALEN];
u8  src_mac_addr[ETH_ALEN];
struct flow_keysfkeys;
__le64  filter_id;
u16 sw_id;
+   u8  l2_fltr_idx;
u16 rxq;
u32 flow_id;
unsigned long   state;
-- 
1.8.3.1
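
The destination-MAC lookup added to bnxt_rx_flow_steer() can be sketched
outside the driver as below. This is a userspace illustration under stated
assumptions, not driver code: l2_filter_index() and the sample_* arrays are
hypothetical, the unicast list is modelled as a flat byte array of ETH_ALEN
sized entries like vnic->uc_list, and -1 stands in for the -EINVAL return.

```c
#include <stddef.h>
#include <string.h>

#define ETH_ALEN 6

/*
 * Hypothetical helper: returns 0 when dest is the device's own address,
 * j + 1 when it matches the j-th entry of the flat unicast list, and -1
 * when no L2 filter exists for it.
 */
static int l2_filter_index(const unsigned char *dev_addr,
			   const unsigned char *uc_list, int uc_count,
			   const unsigned char *dest)
{
	int j;

	if (memcmp(dev_addr, dest, ETH_ALEN) == 0)
		return 0;

	for (j = 0; j < uc_count; j++) {
		if (memcmp(uc_list + (size_t)j * ETH_ALEN, dest, ETH_ALEN) == 0)
			return j + 1;
	}

	return -1;
}

/* Made-up sample data: one primary address and two extra unicast entries. */
static const unsigned char sample_dev[ETH_ALEN] = { 0x02, 0, 0, 0, 0, 0x01 };
static const unsigned char sample_uc[2 * ETH_ALEN] = {
	0x02, 0, 0, 0, 0, 0x10,
	0x02, 0, 0, 0, 0, 0x11,
};
static const unsigned char sample_unknown[ETH_ALEN] = { 0x02, 0, 0, 0, 0, 0x99 };
```

Index 0 is reserved for the NIC's own MAC, which is why a match in the unicast
list is reported as j + 1.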



[PATCH net-next 0/3] bnxt_en: Improve ntuple filters and add new IDs.

2016-07-24 Thread Michael Chan
Improve ntuple filters and add some new PCI device IDs.  Please review
for net-next.

Michael Chan (2):
  bnxt_en: Improve ntuple filters by checking destination MAC address.
  bnxt_en: Add new NPAR and dual media device IDs.

Vasundhara Volam (1):
  bnxt_en: Log a message, if enabling NTUPLE filtering fails.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 74 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  2 +
 2 files changed, 65 insertions(+), 11 deletions(-)

-- 
1.8.3.1



[PATCH net-next 2/3] bnxt_en: Log a message, if enabling NTUPLE filtering fails.

2016-07-24 Thread Michael Chan
From: Vasundhara Volam 

If there are not enough resources to enable ntuple filtering,
log a warning message.

Signed-off-by: Vasundhara Volam 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 7de7d7a..398ecba 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5790,8 +5790,14 @@ static bool bnxt_rfs_capable(struct bnxt *bp)
return false;
 
vnics = 1 + bp->rx_nr_rings;
-   if (vnics > pf->max_rsscos_ctxs || vnics > pf->max_vnics)
+   if (vnics > pf->max_rsscos_ctxs || vnics > pf->max_vnics) {
+   netdev_warn(bp->dev,
+   "Not enough resources to support NTUPLE filters");
+   netdev_warn(bp->dev,
+   "Enough NTUPLE resources for up to %d rx rings",
+   min(pf->max_rsscos_ctxs - 1, pf->max_vnics - 1));
return false;
+   }
 
return true;
 #else
@@ -5804,7 +5810,7 @@ static netdev_features_t bnxt_fix_features(struct 
net_device *dev,
 {
struct bnxt *bp = netdev_priv(dev);
 
-   if (!bnxt_rfs_capable(bp))
+   if ((features & NETIF_F_NTUPLE) && !bnxt_rfs_capable(bp))
features &= ~NETIF_F_NTUPLE;
 
/* Both CTAG and STAG VLAN accelaration on the RX side have to be
-- 
1.8.3.1



Re: [PATCH 1/1] lvs: Use IS_ERR_OR_NULL(svc) instead of IS_ERR(svc) || svc == NULL

2016-07-24 Thread Simon Horman
On Fri, Jul 22, 2016 at 06:52:30PM +0800, Feng Gao wrote:
> Thanks Simon.
> This commit is accepted ?

Hi Feng,

it is now.

I have pushed it to the ipvs-next tree and it should appear in net-next
within the next 24 hours. Barring a calamity it should appear in the v4.9
release.


Re: [PATCH v4 5/7] thunderbolt: Networking state machine

2016-07-24 Thread Lukas Wunner
On Mon, Jul 18, 2016 at 01:00:38PM +0300, Amir Levy wrote:
> + const unique_id_be proto_uuid = APPLE_THUNDERBOLT_IP_PROTOCOL_UUID;
> +
> + if (memcmp(proto_uuid, hdr->apple_tbt_ip_proto_uuid,
> +sizeof(proto_uuid)) != 0) {

You may want to use the uuid_be data type provided by 
instead of rolling your own, as well as the helper uuid_be_cmp()
defined ibidem.

Thanks,

Lukas


Re: kernel panic, __neigh_notify, 4.7.0-rc7, Workqueue: events_power_efficient neigh_periodic_work

2016-07-24 Thread Denys Fedoryshchenko

On 2016-07-24 21:40, nuclear...@nuclearcat.com wrote:

Different hardware, but same workload. Seems different bug, happened
at least twice on this unit (both kernel panic messages here)
As an additional side note that might be useful: this is a PPPoE server with
proxy_arp running on it. I found in the commit history that proxy ARP might
induce this kind of bug, for example in commit "net/neighbour: fix crash at
dumping device-agnostic proxy entries".


Re: [PATCH v10 05/12] net/mlx4_en: add support for fast rx drop bpf program

2016-07-24 Thread Daniel Borkmann

On 07/24/2016 06:57 PM, Tom Herbert wrote:

On Tue, Jul 19, 2016 at 2:16 PM, Brenden Blanco  wrote:

Add support for the BPF_PROG_TYPE_XDP hook in mlx4 driver.

In tc/socket bpf programs, helpers linearize skb fragments as needed
when the program touches the packet data. However, in the pursuit of
speed, XDP programs will not be allowed to use these slower functions,
especially if it involves allocating an skb.

Therefore, disallow MTU settings that would produce a multi-fragment
packet that XDP programs would fail to access. Future enhancements could
be done to increase the allowable MTU.

The xdp program is present as a per-ring data structure, but as of yet
it is not possible to set at that granularity through any ndo.

Signed-off-by: Brenden Blanco 

[...]

+   if (prog) {
+   prog = bpf_prog_add(prog, priv->rx_ring_num - 1);
+   if (IS_ERR(prog))
+   return PTR_ERR(prog);
+   }
+
+   priv->xdp_ring_num = xdp_ring_num;
+
+   /* This xchg is paired with READ_ONCE in the fast path */
+   for (i = 0; i < priv->rx_ring_num; i++) {
+   old_prog = xchg(&priv->rx_ring[i]->xdp_prog, prog);


This can be done under a lock instead of relying on xchg.


+   if (old_prog)
+   bpf_prog_put(old_prog);


I don't see how this can work. Even after setting the new program, the
old program might still be run (pointer dereferenced before xchg).
Either rcu needs to be used or the queue should be stopped and synced
before setting the new program.


It's a strict requirement that all BPF programs must run under RCU.


Re: [Intel-wired-lan] [PATCH net] e1000e: fix PTP on e1000_pch_lpt variants

2016-07-24 Thread kbuild test robot
Hi,

[auto build test ERROR on net/master]

url:
https://github.com/0day-ci/linux/commits/Jarod-Wilson/e1000e-fix-PTP-on-e1000_pch_lpt-variants/20160725-040850
config: i386-randconfig-x001-201630 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All error/warnings (new ones prefixed by >>):

   drivers/net/ethernet/intel/e1000e/netdev.c: In function 
'e1000e_cyclecounter_read':
>> drivers/net/ethernet/intel/e1000e/netdev.c:4342:3: error: a label can only 
>> be part of a statement and a declaration is not a statement
  u64 time_delta, rem, temp;
  ^~~
>> drivers/net/ethernet/intel/e1000e/netdev.c:4343:3: error: expected 
>> expression before 'u32'
  u32 incvalue;
  ^~~
>> drivers/net/ethernet/intel/e1000e/netdev.c:4344:3: warning: ISO C90 forbids 
>> mixed declarations and code [-Wdeclaration-after-statement]
  int i;
  ^~~
>> drivers/net/ethernet/intel/e1000e/netdev.c:4350:3: error: 'incvalue' 
>> undeclared (first use in this function)
  incvalue = er32(TIMINCA) & E1000_TIMINCA_INCVALUE_MASK;
  ^~~~
   drivers/net/ethernet/intel/e1000e/netdev.c:4350:3: note: each undeclared 
identifier is reported only once for each function it appears in
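
The first error above is the classic C rule that a case label must be followed
by a statement, and a declaration is not a statement. A reduced example, with
hypothetical names and values (mac_type, reads_for), shows one common way to
make such code compile: open a compound statement after the label. This is a
sketch of the language rule only, not necessarily the fix applied to the
driver.

```c
enum mac_type { MAC_82574, MAC_82583, MAC_PCH_LPT, MAC_OTHER };

/* Hypothetical reduction of the failing switch: some MAC types need
 * local variables for extra work, the rest do not. */
static unsigned int reads_for(enum mac_type type)
{
	unsigned int reads = 1;

	switch (type) {
	case MAC_82574:
	case MAC_82583:
	case MAC_PCH_LPT: {
		/* The brace opens a compound statement, so a declaration
		 * is allowed here; directly after the label it is not. */
		unsigned int extra = 4;

		reads += extra;
		break;
	}
	default:
		break;
	}

	return reads;
}
```

Hoisting the declarations to the top of the function is the other usual
option, and it also avoids the -Wdeclaration-after-statement warning.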

vim +4342 drivers/net/ethernet/intel/e1000e/netdev.c

ab507c9a Denys Vlasenko 2016-04-20  4336systim |= (cycle_t)systimeh << 
32;
b67e1913 Bruce Allan2012-12-27  4337  
cf608919 Jarod Wilson   2016-07-19  4338switch (hw->mac.type) {
cf608919 Jarod Wilson   2016-07-19  4339case e1000_82574:
cf608919 Jarod Wilson   2016-07-19  4340case e1000_82583:
cf608919 Jarod Wilson   2016-07-19  4341case e1000_pch_lpt:
fb5277f2 Denys Vlasenko 2016-04-20 @4342u64 time_delta, rem, 
temp;
fb5277f2 Denys Vlasenko 2016-04-20 @4343u32 incvalue;
5e7ff970 Todd Fujinaka  2014-05-03 @4344int i;
5e7ff970 Todd Fujinaka  2014-05-03  4345  
5e7ff970 Todd Fujinaka  2014-05-03  4346/* errata for 
82574/82583 possible bad bits read from SYSTIMH/L
5e7ff970 Todd Fujinaka  2014-05-03  4347 * check to see that 
the time is incrementing at a reasonable
5e7ff970 Todd Fujinaka  2014-05-03  4348 * rate and is a 
multiple of incvalue
5e7ff970 Todd Fujinaka  2014-05-03  4349 */
5e7ff970 Todd Fujinaka  2014-05-03 @4350incvalue = 
er32(TIMINCA) & E1000_TIMINCA_INCVALUE_MASK;
5e7ff970 Todd Fujinaka  2014-05-03  4351for (i = 0; i < 
E1000_MAX_82574_SYSTIM_REREADS; i++) {
5e7ff970 Todd Fujinaka  2014-05-03  4352/* latch 
SYSTIMH on read of SYSTIML */
5e7ff970 Todd Fujinaka  2014-05-03  4353systim_next = 
(cycle_t)er32(SYSTIML);

:: The code at line 4342 was first introduced by commit
:: fb5277f2c2e4db4a29740ff071072a688892d2df e1000e: 
e1000e_cyclecounter_read(): incvalue is 32 bits, not 64

:: TO: Denys Vlasenko 
:: CC: Jeff Kirsher 

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation




Re: PROBLEM: network data corruption (bisected to e5a4b0bb803b)

2016-07-24 Thread Al Viro
On Sun, Jul 24, 2016 at 07:45:13PM +0200, Christian Lamparter wrote:

> > The symptom is that downloaded files (http, ftp, and probably other
> > protocols) have small corrupted segments (about 1-2 kilobytes long) in
> > random locations. Only downloads that sustain a high speed for at least a
> > few seconds are corrupted. Anything small enough to be received in less
> > than about 5 seconds is not affected.

Can that sucker be reproduced with netcat?  That would eliminate all issues
with multi-iovec recvmsg(2), narrowing the things down quite bit.

Another thing (and if that works, it's *NOT* a proper fix - it would be
papering over the problem, but at least it would show where to look for
it) - try (on top of mainline) the following delta:

diff --git a/net/core/datagram.c b/net/core/datagram.c
index b7de71f..0ee5995 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -734,7 +734,7 @@ int skb_copy_and_csum_datagram_msg(struct sk_buff *skb,
if (!chunk)
return 0;
 
-   if (msg_data_left(msg) < chunk) {
+   if (iov_iter_single_seg_count(&msg->msg_iter) < chunk) {
if (__skb_checksum_complete(skb))
goto csum_error;
if (skb_copy_datagram_msg(skb, hlen, msg, chunk))
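
To see why this delta only matters for multi-iovec receives, it may help to model the two quantities being compared. The sketch below is a userspace stand-in, not kernel code: struct demo_iter and the demo_* helpers are hypothetical names. demo_data_left() models what msg_data_left() reports (bytes left across all segments), and demo_single_seg_count() models iov_iter_single_seg_count() (bytes left in the current segment only).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical userspace stand-in for the kernel's struct iov_iter. */
struct demo_iter {
    const struct iovec *iov;  /* current segment */
    size_t nr_segs;           /* segments left, including the current one */
    size_t iov_offset;        /* bytes already consumed in the current segment */
    size_t count;             /* bytes left across all remaining segments */
};

/* Models msg_data_left(): total room left in the whole iterator. */
size_t demo_data_left(const struct demo_iter *it)
{
    assert(it != NULL);
    return it->count;
}

/* Models iov_iter_single_seg_count(): room left in the current segment. */
size_t demo_single_seg_count(const struct demo_iter *it)
{
    size_t seg;

    assert(it != NULL);
    if (it->nr_segs == 0)
        return 0;
    seg = it->iov->iov_len - it->iov_offset;
    return seg < it->count ? seg : it->count;
}
```

With a 100-byte first segment followed by a 4000-byte one and a 1400-byte chunk, the original test (4100 < 1400) is false while the test in the delta (100 < 1400) is true, so only the delta takes the verify-then-copy branch shown in the hunk. With a single segment the two helpers return the same value, which is why a netcat test would sidestep the question.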


kernel panic, __neigh_notify, 4.7.0-rc7, Workqueue: events_power_efficient neigh_periodic_work

2016-07-24 Thread nuclearcat
Different hardware, but the same workload. This seems to be a different bug;
it has happened at least twice on this unit (both kernel panic messages are below).


First crash:

[45113.975701] general protection fault:  [#1] SMP
[45113.976110] Modules linked in: cls_fw act_police cls_u32 sch_ingress sch_sfq sch_htb netconsole configfs pppoe pppox ppp_generic slhc xt_nat ts_bm xt_string xt_connmark xt_TCPMSS xt_tcpudp xt_mark iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle ip_tables x_tables 8021q garp mrp stp llc ixgbe vxlan udp_tunnel dca
[45113.979815] CPU: 2 PID: 924 Comm: kworker/2:1 Not tainted 4.6.4-build-0106 #3
[45113.980077] Hardware name: Supermicro X10SLM+-LN4F/X10SLM+-LN4F, BIOS 3.0 04/24/2015
[45113.980528] Workqueue: events_power_efficient neigh_periodic_work
[45113.980862] task: 88040c8dde00 ti: 88040cff4000 task.ti: 88040cff4000
[45113.981310] RIP: 0010:[] [] neigh_periodic_work+0x171/0x18a
[45113.981828] RSP: 0018:88040cff7df8  EFLAGS: 00010202
[45113.982088] RAX: ffa0050402118600 RBX: 8803f8015600 RCX: 000d
[45113.982349] RDX: 000102abcd01 RSI: 0200 RDI: 8803f8015628
[45113.982615] RBP: 88040cff7e20 R08:  R09: 88040f003700
[45113.982877] R10: ea000fc01e80 R11: 323a R12: 820a8300
[45113.983142] R13: 820a83ac R14: 0253 R15: 8803d3871298
[45113.983409] FS:  () GS:88041fc8() knlGS:
[45113.983856] CS:  0010 DS:  ES:  CR0: 80050033
[45113.984118] CR2: 7ff9b9221763 CR3: 02006000 CR4: 001406e0
[45113.984382] Stack:
[45113.984631]  88040d253b40 88041fc93a80  820a8300
[45113.985352]  88041fc98300 88040cff7e68 810d263c 88041fc93a80
[45113.986066]  1fc93a80 88040d253b40 88041fc93a80 88040d253b70
[45113.986781] Call Trace:
[45113.987035]  [] process_one_work+0x193/0x2a1
[45113.987302]  [] worker_thread+0x279/0x362
[45113.987563]  [] ? rescuer_thread+0x29a/0x29a
[45113.987823]  [] kthread+0xcd/0xd5
[45113.988083]  [] ret_from_fork+0x22/0x40
[45113.988343]  [] ? kthread_create_on_node+0x177/0x177
[45113.988607] Code: 00 00 00 41 8b 4f 08 44 89 f0 d3 e8 85 c0 0f 85 47 ff ff ff 49 8b 17 44 89 f0 4c 8d 3c c2 eb bb 48 8b 43 10 48 8b 15 76 92 7a 00  63 40 70 48 29 d0 48 03 83 88 00 00 00 78 87 c6 43 28 00 49
[45113.993643] RIP [] neigh_periodic_work+0x171/0x18a
[45113.993972]  RSP
[45113.994247] ---[ end trace 02e34672899d1b2e ]---
[45113.995481] Kernel panic - not syncing: Fatal exception in interrupt
[45113.995750] Kernel Offset: disabled
[45113.997638] Rebooting in 5 seconds..
Jul 24 12:06:57 10.0.253.19
Jul 24 12:06:57 10.0.253.19 [45119.000564] ACPI MEMORY or I/O RESET_REG.

-
Second crash:



[ 2471.924122] general protection fault:  [#1] SMP
[ 2471.924392] Modules linked in: cls_fw act_police cls_u32 sch_ingress sch_sfq sch_htb netconsole configfs pppoe pppox ppp_generic slhc xt_nat ts_bm xt_string xt_connmark xt_TCPMSS xt_tcpudp xt_mark iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle ip_tables x_tables 8021q garp mrp stp llc ixgbe dca
[ 2471.928066] CPU: 2 PID: 2135 Comm: kworker/2:2 Not tainted 4.7.0-rc7-build-0107 #2
[ 2471.928518] Hardware name: Supermicro X10SLM+-LN4F/X10SLM+-LN4F, BIOS 3.0 04/24/2015

[ 2471.928970] Workqueue: events_power_efficient neigh_periodic_work
[ 2471.929312] task: 88040d2ec680 ti: 88040c828000 task.ti: 88040c828000
[ 2471.929761] RIP: 0010:[] [] __neigh_notify+0x29/0xb3
[ 2471.930293] RSP: :88040c82bdb0  EFLAGS: 00010246
[ 2471.930561] RAX: ffac05040202f000 RBX: 8800d4525c00 RCX:
[ 2471.930828] RDX:  RSI: 02080020 RDI: 0080
[ 2471.931096] RBP: 88040c82bdd0 R08:  R09: 88040f003700
[ 2471.931364] R10: ea00034e5e00 R11: 00019bef R12: 820a90c8
[ 2471.931633] R13: 820a9174 R14: 0b8c R15: 8800d7260a00
[ 2471.931904] FS:  () GS:88041fc8() knlGS:
[ 2471.932355] CS:  0010 DS:  ES:  CR0: 80050033
[ 2471.932620] CR2: 7f2a50c1c840 CR3: 0003e7682000 CR4: 001406e0
[ 2471.932889] Stack:
[ 2471.933147]  001d 8800d4525c00 820a90c8 820a9174
[ 2471.933878]  88040c82bde8 8186935f 8800d4525c00 88040c82be20
[ 2471.934614]  81869493 88040d19ba80 88041fc94a00

[ 2471.935350] Call Trace:
[ 2471.935613]  [] neigh_cleanup_and_release+0x26/0x39
[ 2471.935881]  [] 

[PATCH] gtp: #define _GTP_H_ and not #define _GTP_H

2016-07-24 Thread Colin King
From: Colin Ian King 

Fix clang build warning:

./include/net/gtp.h:1:9: warning: '_GTP_H_' is used as a header
guard here, followed by #define of a different macro [-Wheader-guard]

Fix by defining _GTP_H_ and not _GTP_H.

Signed-off-by: Colin Ian King 
---
 include/net/gtp.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/gtp.h b/include/net/gtp.h
index 894a37b..6398891 100644
--- a/include/net/gtp.h
+++ b/include/net/gtp.h
@@ -1,5 +1,5 @@
 #ifndef _GTP_H_
-#define _GTP_H
+#define _GTP_H_
 
 /* General GTP protocol related definitions. */
 
-- 
2.8.1
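
As a rough illustration of why the mismatch matters beyond the warning, the sketch below simulates two inclusions of the same header text inside one C file. All DEMO_* macros and demo_* functions are hypothetical names made up for this example; the per-inclusion marker macros exist only so the effect can be observed.

```c
#include <assert.h>

/* First "inclusion" of a header whose guard has the typo. */
#ifndef DEMO_BAD_H_
#define DEMO_BAD_H            /* typo: DEMO_BAD_H_ itself is never defined */
#define DEMO_BAD_BODY_1       /* marker: body seen on first inclusion */
#endif

/* Second "inclusion": the guard test still passes, body is seen again. */
#ifndef DEMO_BAD_H_
#define DEMO_BAD_H
#define DEMO_BAD_BODY_2       /* marker: body seen on second inclusion */
#endif

/* Same two inclusions with the guard spelled consistently. */
#ifndef DEMO_GOOD_H_
#define DEMO_GOOD_H_
#define DEMO_GOOD_BODY_1
#endif

#ifndef DEMO_GOOD_H_          /* already defined, so this body is skipped */
#define DEMO_GOOD_H_
#define DEMO_GOOD_BODY_2
#endif

/* Returns 1 if the mis-guarded header body was processed a second time. */
int demo_bad_guard_reparsed(void)
{
#ifdef DEMO_BAD_BODY_2
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the correctly guarded header body was processed twice. */
int demo_good_guard_reparsed(void)
{
#ifdef DEMO_GOOD_BODY_2
    return 1;
#else
    return 0;
#endif
}
```

A header that only holds macros and declarations may survive being processed twice, but any type or enum definition added later would then be a redefinition error, which is the failure mode the guard is meant to prevent.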



kernel panic, qdisc_dequeue_head, htb, 4.7.0-rc7

2016-07-24 Thread nuclearcat

Hi,

I'm still struggling to find the reason for random reboots on BRAS 
servers (PPPoE + HTB egress on ppp and a policer for ppp ingress).
Here it is happening on 4.7.0-rc7; the panic seems to happen at peak 
time, 1-2 times per day now. I have also noticed a rise in kernel panics at 
other locations, but I'm still setting up netconsole there.


Here is panic message:
[71623.328457] general protection fault:  [#1] SMP
[71623.328658] Modules linked in: cls_fw act_police cls_u32 sch_ingress sch_sfq sch_htb netconsole configfs coretemp nf_nat_pptp nf_nat_proto_gre nf_conntrack_pptp nf_conntrack_proto_gre pppoe pppox ppp_generic slhc tun xt_REDIRECT nf_nat_redirect xt_TCPMSS ipt_REJECT nf_reject_ipv4 xt_set ts_bm xt_string xt_connmark xt_DSCP xt_mark xt_tcpudp ip_set_hash_net ip_set_hash_ip ip_set nfnetlink iptable_mangle iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables 8021q garp mrp stp llc
[71623.332554] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.7.0-rc7-build-0107 #2
[71623.332754] Hardware name: HP ProLiant DL320e Gen8 v2, BIOS P80 04/02/2015
[71623.332954] task: 8200b500 ti: 8200 task.ti: 8200
[71623.333289] RIP: 0010:[] [] qdisc_dequeue_head+0x5a/0x7b
[71623.333685] RSP: 0018:880447403e10  EFLAGS: 00010282
[71623.333882] RAX: 880427b67f00 RBX: 4125c7ec4a6c RCX: 0001
[71623.334083] RDX: ffa0050402791340 RSI:  RDI: 8804164f2000
[71623.334281] RBP: 880447403e10 R08:  R09: 880425471868
[71623.334481] R10: 820050c0 R11: 0001 R12: 0007
[71623.334680] R13: 880425471000 R14:  R15: 88040d9ebc00
[71623.334881] FS:  () GS:88044740() knlGS:
[71623.335216] CS:  0010 DS:  ES:  CR0: 80050033
[71623.335415] CR2: 7f184bb2f730 CR3: 02006000 CR4: 001406f0
[71623.335614] Stack:
[71623.335808]  880447403eb0 a01328af 0578 01a0
[71623.336369]  00010440678b 880425471140 00070001 880425471860
[71623.336929]  880425471868 fffe  88040d9ebc00

[71623.340797] Call Trace:
[71623.340987]  

[71623.341050]  [] htb_dequeue+0x33a/0x6fe [sch_htb]
[71623.341433]  [] __qdisc_run+0x9d/0x17b
[71623.341630]  [] ? ktime_get+0x4b/0x9a
[71623.341827]  [] net_tx_action+0xe3/0x148
[71623.342027]  [] __do_softirq+0xb9/0x1a9
[71623.342226]  [] irq_exit+0x37/0x7c
[71623.342425]  [] smp_apic_timer_interrupt+0x3d/0x48
[71623.342627]  [] apic_timer_interrupt+0x7c/0x90
[71623.342825]  

[71623.342887]  [] ? mwait_idle+0x64/0x7a
[71623.343276]  [] ? atomic_notifier_call_chain+0x13/0x15

[71623.343476]  [] arch_cpu_idle+0xa/0xc
[71623.343675]  [] default_idle_call+0x27/0x29
[71623.343872]  [] cpu_startup_entry+0x115/0x1bf
[71623.344072]  [] rest_init+0x72/0x74
[71623.344270]  [] start_kernel+0x3bc/0x3c9
[71623.344467]  [] x86_64_start_reservations+0x2f/0x31
[71623.344668]  [] x86_64_start_kernel+0xbb/0xbe
[71623.344867] Code: 00 48 c7 40 08 00 00 00 00 48 89 51 08 48 89 0a 8b 50 28 b9 01 00 00 00 29 97 c4 00 00 00 8b 90 c0 00 00 00 48 03 90 c8 00 00 00  83 7a 02 00 74 04 0f b7 4a 04 8b 50 28 01 8f b8 00 00 00 48
[71623.349025] RIP [] qdisc_dequeue_head+0x5a/0x7b
[71623.349281]  RSP
[71623.349500] ---[ end trace 0fb7352fcf43439e ]---
[71623.355886] Kernel panic - not syncing: Fatal exception in interrupt
[71623.356093] Kernel Offset: disabled
[71623.359937] ERST: [Firmware Warn]: Firmware does not respond in time.
[71623.388538] Rebooting in 5 seconds.


Re: PROBLEM: network data corruption (bisected to e5a4b0bb803b)

2016-07-24 Thread Christian Lamparter
Hello,

I added Al Viro to the CC (probably not necessary...)

On Sunday, July 24, 2016 3:35:14 AM CEST Alan Curry wrote:
> [1.] One line summary of the problem:
> network data corruption (bisected to e5a4b0bb803b)
> 
> [2.] Full description of the problem/report:
> Note: although my bisect ended at a commit from before 3.19, I have the
> same symptom in all newer kernels I've tried, up to 4.6.4.
> 
> The commit was:
> 
> >commit e5a4b0bb803b39a36478451eae53a880d2663d5b
> >Author: Al Viro 
> >Date:   Mon Nov 24 18:17:55 2014 -0500
> >
> >switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitives
> 
> The symptom is that downloaded files (http, ftp, and probably other
> protocols) have small corrupted segments (about 1-2 kilobytes long) in
> random locations. Only downloads that sustain a high speed for at least a
> few seconds are corrupted. Anything small enough to be received in less
> than about 5 seconds is not affected.
> 
> If I download the same file twice in a row, the corruption is in different
> places in each copy.
> 
> If I try to do a git clone, it fails a few seconds into the "Receiving
> objects" stage with a deflate error.

Thanks for the detailed bug report. I looked around the web to see if it
was already reported or not. I found that this issue was reported before:
[0], [1] and [2] by the same person (CC'ed). One difference is that the 
reporter had this issue with rsync on multiple SPARC systems. I ran a
git grep on a 4.7.0-rc7+ (wt-2016-07-21-15-g97bd3b0). But it didn't find
any patches directly referencing the commit. I'm not sure if this issue
has been fixed by now or not. I would greatly appreciate any comment
about this from the "people of netdev" (Al Viro? Alex Mcwhirter?).

As for carl9170: I'm not sure what the driver or firmware can do about
this at this time. You can try to disable the hardware crypto by setting
nohwcrypt via the module option. However, this might not do anything at all.

> [3.] Keywords: networking, carl9170
> 
> [4.] Kernel information
> [4.1.] Kernel version (from /proc/version):
> Multiple versions are known to be affected, from 3.19 to 4.6.4
> 
> [4.2.] Kernel .config file:
> For testing I built with make x86_64_defconfig followed by enabling the
> carl9170 driver, which adds these lines:
> CONFIG_ATH_COMMON=m
> CONFIG_ATH_CARDS=m
> CONFIG_CARL9170=m
> CONFIG_CARL9170_LEDS=y
> CONFIG_CARL9170_WPC=y
> 
> [5.] Most recent kernel version which did not have the bug:
> That would be the predecessor of e5a4b0bb803b39a36478451eae53a880d2663d5b
> which is v3.18-rc6-1620-g17836394e578
> 
> [6.] no Oops
> 
> [7.] A small shell script or example program which triggers the
>  problem (if possible)
> 
> This command fails reliably for me when running an affected kernel:
> 
> git clone git://git.kernel.org/pub/scm/git/git.git
> 
> (I'm including all the standard format stuff suggested by REPORTING-BUGS,
> but I think you can skip from here to section 8.7 without missing anything
> relevant)
Yes, I removed it for the most part. If anyone is interested in the details:
Here's a link to the original post @LKML [3].

> 
> [8.] Environment
> [8.1.] Software (add the output of the ver_linux script here)
> 
> Mostly Debian 8.5 stable packages here.
> 
> [8.3.] Module information (from /proc/modules):
> 
> When I tested with the x86_64_defconfig + carl9170 kernel, there were
> hardly any modules built, and I reproduced the problem after booting with
> init=/bin/sh, so no unnecessary modules were loaded. Currently running a
> normal 4.6.4 kernel which is showing the bug.
> 
> [...]
> [8.7.] Other information that might be relevant to the problem
>(please look in /proc and include all information that you
>think to be relevant):
> 
> lsusb identifies my network device as:
> 
> Bus 005 Device 004: ID 0cf3:1002 Atheros Communications, Inc. TP-Link TL-WN821N v2 802.11n [Atheros AR9170]
> 
> I have version 1.9.9 of carl9170-1.fw in /lib/firmware
Just one additional question: Is the TL-WN821N connected to a USB3 port?

Regards,
Christian

[0] 
[1] 
[2] 
[3] 





Re: [PATCH v10 05/12] net/mlx4_en: add support for fast rx drop bpf program

2016-07-24 Thread Tom Herbert
On Tue, Jul 19, 2016 at 2:16 PM, Brenden Blanco  wrote:
> Add support for the BPF_PROG_TYPE_XDP hook in mlx4 driver.
>
> In tc/socket bpf programs, helpers linearize skb fragments as needed
> when the program touches the packet data. However, in the pursuit of
> speed, XDP programs will not be allowed to use these slower functions,
> especially if it involves allocating an skb.
>
> Therefore, disallow MTU settings that would produce a multi-fragment
> packet that XDP programs would fail to access. Future enhancements could
> be done to increase the allowable MTU.
>
> The xdp program is present as a per-ring data structure, but as of yet
> it is not possible to set at that granularity through any ndo.
>
> Signed-off-by: Brenden Blanco 
> ---
>  drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 60 
> ++
>  drivers/net/ethernet/mellanox/mlx4/en_rx.c | 40 +++--
>  drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |  6 +++
>  3 files changed, 102 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c 
> b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> index 6083775..c34a33d 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> @@ -31,6 +31,7 @@
>   *
>   */
>
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -2112,6 +2113,11 @@ static int mlx4_en_change_mtu(struct net_device *dev, 
> int new_mtu)
> en_err(priv, "Bad MTU size:%d.\n", new_mtu);
> return -EPERM;
> }
> +   if (priv->xdp_ring_num && MLX4_EN_EFF_MTU(new_mtu) > FRAG_SZ0) {
> +   en_err(priv, "MTU size:%d requires frags but XDP running\n",
> +  new_mtu);
> +   return -EOPNOTSUPP;
> +   }
> dev->mtu = new_mtu;
>
> if (netif_running(dev)) {
> @@ -2520,6 +2526,58 @@ static int mlx4_en_set_tx_maxrate(struct net_device 
> *dev, int queue_index, u32 m
> return err;
>  }
>
> +static int mlx4_xdp_set(struct net_device *dev, struct bpf_prog *prog)
> +{
> +   struct mlx4_en_priv *priv = netdev_priv(dev);
> +   struct bpf_prog *old_prog;
> +   int xdp_ring_num;
> +   int i;
> +
> +   xdp_ring_num = prog ? ALIGN(priv->rx_ring_num, MLX4_EN_NUM_UP) : 0;
> +
> +   if (priv->num_frags > 1) {
> +   en_err(priv, "Cannot set XDP if MTU requires multiple 
> frags\n");
> +   return -EOPNOTSUPP;
> +   }
> +
> +   if (prog) {
> +   prog = bpf_prog_add(prog, priv->rx_ring_num - 1);
> +   if (IS_ERR(prog))
> +   return PTR_ERR(prog);
> +   }
> +
> +   priv->xdp_ring_num = xdp_ring_num;
> +
> +   /* This xchg is paired with READ_ONCE in the fast path */
> +   for (i = 0; i < priv->rx_ring_num; i++) {
> +   old_prog = xchg(&priv->rx_ring[i]->xdp_prog, prog);

This can be done under a lock instead of relying on xchg.

> +   if (old_prog)
> +   bpf_prog_put(old_prog);

I don't see how this can work. Even after setting the new program, the
old program might still be run (pointer dereferenced before xchg).
Either rcu needs to be used or the queue should be stopped and synced
before setting the new program.

> +   }
> +
> +   return 0;
> +}
> +
> +static bool mlx4_xdp_attached(struct net_device *dev)
> +{
> +   struct mlx4_en_priv *priv = netdev_priv(dev);
> +
> +   return !!priv->xdp_ring_num;
> +}
> +
> +static int mlx4_xdp(struct net_device *dev, struct netdev_xdp *xdp)
> +{
> +   switch (xdp->command) {
> +   case XDP_SETUP_PROG:
> +   return mlx4_xdp_set(dev, xdp->prog);
> +   case XDP_QUERY_PROG:
> +   xdp->prog_attached = mlx4_xdp_attached(dev);
> +   return 0;
> +   default:
> +   return -EINVAL;
> +   }
> +}
> +
>  static const struct net_device_ops mlx4_netdev_ops = {
> .ndo_open   = mlx4_en_open,
> .ndo_stop   = mlx4_en_close,
> @@ -2548,6 +2606,7 @@ static const struct net_device_ops mlx4_netdev_ops = {
> .ndo_udp_tunnel_del = mlx4_en_del_vxlan_port,
> .ndo_features_check = mlx4_en_features_check,
> .ndo_set_tx_maxrate = mlx4_en_set_tx_maxrate,
> +   .ndo_xdp= mlx4_xdp,
>  };
>
>  static const struct net_device_ops mlx4_netdev_ops_master = {
> @@ -2584,6 +2643,7 @@ static const struct net_device_ops 
> mlx4_netdev_ops_master = {
> .ndo_udp_tunnel_del = mlx4_en_del_vxlan_port,
> .ndo_features_check = mlx4_en_features_check,
> .ndo_set_tx_maxrate = mlx4_en_set_tx_maxrate,
> +   .ndo_xdp= mlx4_xdp,
>  };
>
>  struct mlx4_en_bond {
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c 
> b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> index c1b3a9c..6729545 100644
> --- 
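
The lifetime concern raised in this reply (the old program being put right after the xchg while the RX path may still hold the pointer) can be modelled deterministically in userspace. This is only a sketch under stated assumptions: struct demo_prog and the demo_* functions are hypothetical, the "reader" and "writer" are interleaved by hand in a single thread, and a freed flag stands in for the memory actually being released.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted program object. */
struct demo_prog {
    int refcnt;
    bool freed;
};

static void demo_prog_put(struct demo_prog *p)
{
    assert(p->refcnt > 0);
    if (--p->refcnt == 0)
        p->freed = true;   /* a real put would release the memory here */
}

/* Swap, then put immediately: returns true if the reader ran a freed program. */
bool demo_swap_immediate_put(void)
{
    struct demo_prog old_prog = { .refcnt = 1, .freed = false };
    struct demo_prog new_prog = { .refcnt = 1, .freed = false };
    struct demo_prog *slot = &old_prog;

    struct demo_prog *in_flight = slot;  /* reader loaded the pointer */
    struct demo_prog *prev = slot;       /* writer: exchange the slot ... */
    slot = &new_prog;
    demo_prog_put(prev);                 /* ... and drop the old program at once */

    return in_flight->freed;             /* reader only now runs the program */
}

/* Swap, wait for the in-flight reader, then put. */
bool demo_swap_deferred_put(void)
{
    struct demo_prog old_prog = { .refcnt = 1, .freed = false };
    struct demo_prog new_prog = { .refcnt = 1, .freed = false };
    struct demo_prog *slot = &old_prog;

    struct demo_prog *in_flight = slot;  /* reader loaded the pointer */
    struct demo_prog *prev = slot;       /* writer: exchange the slot */
    slot = &new_prog;

    bool ran_freed = in_flight->freed;   /* reader finishes first */
    demo_prog_put(prev);                 /* only then drop the old program */

    return ran_freed;
}
```

The second variant corresponds to either remedy suggested above: an RCU grace period or stopping and syncing the queue both amount to "no reader still holds the old pointer when it is put".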

[PATCH net-next V2 2/2] net/mlx5e: Query minimum required header copy during xmit

2016-07-24 Thread Saeed Mahameed
From: Hadar Hen Zion 

Add support for querying the minimum inline mode from the firmware.
It is required for correct TX steering according to L3/L4 packet
headers.

Each send queue (SQ) has an inline mode that defines the minimal required
headers that need to be copied into the SQ WQE.
The driver asks the firmware for the wqe_inline_mode device capability
value.  If the device capability is defined as "vport context", the
driver must check the reported min inline mode from the vport context
before creating its SQs.

Signed-off-by: Hadar Hen Zion 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  7 +++
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 24 +++
 drivers/net/ethernet/mellanox/mlx5/core/vport.c   | 12 
 include/linux/mlx5/mlx5_ifc.h | 10 +++---
 include/linux/mlx5/vport.h|  2 ++
 5 files changed, 52 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h 
b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 2c20c7b..1b495ef 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -129,6 +129,12 @@ static inline int mlx5_max_log_rq_size(int wq_type)
}
 }
 
+enum {
+   MLX5E_INLINE_MODE_L2,
+   MLX5E_INLINE_MODE_VPORT_CONTEXT,
+   MLX5_INLINE_MODE_NOT_REQUIRED,
+};
+
 struct mlx5e_tx_wqe {
struct mlx5_wqe_ctrl_seg ctrl;
struct mlx5_wqe_eth_seg  eth;
@@ -188,6 +194,7 @@ struct mlx5e_params {
bool lro_en;
u32 lro_wqe_sz;
u16 tx_max_inline;
+   u8  tx_min_inline_mode;
u8  rss_hfunc;
u8  toeplitz_hash_key[40];
u32 indirection_rqt[MLX5E_INDIR_RQT_SIZE];
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 611ab55..ca7b1e3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -56,6 +56,7 @@ struct mlx5e_sq_param {
u32sqc[MLX5_ST_SZ_DW(sqc)];
struct mlx5_wq_param   wq;
u16max_inline;
+   u8 min_inline_mode;
bool   icosq;
 };
 
@@ -649,6 +650,9 @@ static int mlx5e_create_sq(struct mlx5e_channel *c,
}
sq->bf_buf_size = (1 << MLX5_CAP_GEN(mdev, log_bf_reg_size)) / 2;
sq->max_inline  = param->max_inline;
+   sq->min_inline_mode =
+   MLX5_CAP_ETH(mdev, wqe_inline_mode) == 
MLX5E_INLINE_MODE_VPORT_CONTEXT ?
+   param->min_inline_mode : 0;
 
err = mlx5e_alloc_sq_db(sq, cpu_to_node(c->cpu));
if (err)
@@ -731,6 +735,7 @@ static int mlx5e_enable_sq(struct mlx5e_sq *sq, struct 
mlx5e_sq_param *param)
 
MLX5_SET(sqc,  sqc, tis_num_0, param->icosq ? 0 : priv->tisn[sq->tc]);
MLX5_SET(sqc,  sqc, cqn,sq->cq.mcq.cqn);
+   MLX5_SET(sqc,  sqc, min_wqe_inline_mode, sq->min_inline_mode);
MLX5_SET(sqc,  sqc, state,  MLX5_SQC_STATE_RST);
MLX5_SET(sqc,  sqc, tis_lst_sz, param->icosq ? 0 : 1);
MLX5_SET(sqc,  sqc, flush_in_error_en,  1);
@@ -1343,6 +1348,7 @@ static void mlx5e_build_sq_param(struct mlx5e_priv *priv,
MLX5_SET(wq, wq, log_wq_sz, priv->params.log_sq_size);
 
param->max_inline = priv->params.tx_max_inline;
+   param->min_inline_mode = priv->params.tx_min_inline_mode;
 }
 
 static void mlx5e_build_common_cq_param(struct mlx5e_priv *priv,
@@ -2967,6 +2973,23 @@ void mlx5e_set_rx_cq_mode_params(struct mlx5e_params 
*params, u8 cq_period_mode)
MLX5E_PARAMS_DEFAULT_RX_CQ_MODERATION_USEC_FROM_CQE;
 }
 
+static void mlx5e_query_min_inline(struct mlx5_core_dev *mdev,
+  u8 *min_inline_mode)
+{
+   switch (MLX5_CAP_ETH(mdev, wqe_inline_mode)) {
+   case MLX5E_INLINE_MODE_L2:
+   *min_inline_mode = MLX5_INLINE_MODE_L2;
+   break;
+   case MLX5E_INLINE_MODE_VPORT_CONTEXT:
+   mlx5_query_nic_vport_min_inline(mdev,
+   min_inline_mode);
+   break;
+   case MLX5_INLINE_MODE_NOT_REQUIRED:
+   *min_inline_mode = MLX5_INLINE_MODE_NONE;
+   break;
+   }
+}
+
 static void mlx5e_build_nic_netdev_priv(struct mlx5_core_dev *mdev,
struct net_device *netdev,
const struct mlx5e_profile *profile,
@@ -3032,6 +3055,7 @@ static void mlx5e_build_nic_netdev_priv(struct 
mlx5_core_dev *mdev,
priv->params.tx_cq_moderation.pkts =
MLX5E_PARAMS_DEFAULT_TX_CQ_MODERATION_PKTS;
priv->params.tx_max_inline = mlx5e_get_max_inline_cap(mdev);
+   

[PATCH net-next V2 0/2] Mellanox 100G mlx5 minimum inline header mode

2016-07-24 Thread Saeed Mahameed
Hi Dave,

This small series from Hadar adds support for querying the minimum inline
header mode in the mlx5e NIC driver.

Today on TX the driver copies only up to the L2 header into the HW
descriptor, which is the default required mode and sufficient for today's
needs.

The header in the HW descriptor is used for the HW loopback steering
decision; without it, packets go directly to the wire with no questions
asked.

For TX loopback steering according to L2/L3/L4 headers, ConnectX-4 requires
the corresponding headers to be copied into the send queue (SQ) WQE HW
descriptor so it can decide whether to loop the packet back or forward it
to the wire.

For legacy E-Switch mode, only an L2 header copy is required.
For advanced steering (E-Switch offloads), more header layers may need to be
copied; the required mode will be advertised by FW to each VF and PF
according to the corresponding E-Switch configuration.

Changes V2:
 - Allocate query_nic_vport_context_out on the stack

Thanks,
Saeed.

Hadar Hen Zion (2):
  net/mlx5e: Check the minimum inline header mode before xmit
  net/mlx5e: Query minimum required header copy during xmit

 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  8 
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 24 +++
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   | 49 +--
 drivers/net/ethernet/mellanox/mlx5/core/vport.c   | 12 ++
 include/linux/mlx5/device.h   |  7 
 include/linux/mlx5/mlx5_ifc.h | 10 +++--
 include/linux/mlx5/vport.h|  2 +
 7 files changed, 105 insertions(+), 7 deletions(-)

-- 
2.8.0
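
The capability handling this series describes (a fixed L2 mode, a mode read from the vport context, or no inline headers required) can be sketched as a small pure function. The enum and function names below are hypothetical stand-ins chosen for this example, and the vport query is reduced to a plain argument rather than a firmware command.

```c
#include <assert.h>

/* Hypothetical mirror of the wqe_inline_mode capability values. */
enum demo_wqe_inline_cap {
    DEMO_CAP_L2,
    DEMO_CAP_VPORT_CONTEXT,
    DEMO_CAP_NOT_REQUIRED,
};

/* Hypothetical mirror of the per-SQ minimum inline modes. */
enum demo_inline_mode {
    DEMO_MODE_NONE,
    DEMO_MODE_L2,
    DEMO_MODE_IP,
    DEMO_MODE_TCP_UDP,
};

/*
 * Resolve the minimum inline mode for new SQs.
 * vport_mode stands in for the value the driver would read from the
 * vport context; it is only consulted when the capability says so.
 */
enum demo_inline_mode demo_query_min_inline(enum demo_wqe_inline_cap cap,
                                            enum demo_inline_mode vport_mode)
{
    switch (cap) {
    case DEMO_CAP_VPORT_CONTEXT:
        return vport_mode;
    case DEMO_CAP_NOT_REQUIRED:
        return DEMO_MODE_NONE;
    case DEMO_CAP_L2:
    default:
        return DEMO_MODE_L2;
    }
}
```

Keeping the resolution in one place means the transmit path only ever sees a final mode and never has to look at the capability itself.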



[PATCH net-next V2 1/2] net/mlx5e: Check the minimum inline header mode before xmit

2016-07-24 Thread Saeed Mahameed
From: Hadar Hen Zion 

Each send queue (SQ) has an inline mode that defines the minimal required
inline headers in the SQ WQE.
Before sending each packet, check that the minimum required headers
are copied into the WQE.

Signed-off-by: Hadar Hen Zion 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h|  1 +
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 49 +++--
 include/linux/mlx5/device.h |  7 
 3 files changed, 53 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h 
b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 4cbd452..2c20c7b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -398,6 +398,7 @@ struct mlx5e_sq {
u32sqn;
u16bf_buf_size;
u16max_inline;
+   u8 min_inline_mode;
u16edge;
struct device *pdev;
struct mlx5e_tstamp   *tstamp;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index 5740b46..e073bf59 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -128,6 +128,50 @@ u16 mlx5e_select_queue(struct net_device *dev, struct 
sk_buff *skb,
return priv->channeltc_to_txq_map[channel_ix][up];
 }
 
+static inline int mlx5e_skb_l2_header_offset(struct sk_buff *skb)
+{
+#define MLX5E_MIN_INLINE (ETH_HLEN + VLAN_HLEN)
+
+   return max(skb_network_offset(skb), MLX5E_MIN_INLINE);
+}
+
+static inline int mlx5e_skb_l3_header_offset(struct sk_buff *skb)
+{
+   struct flow_keys keys;
+
+   if (skb_transport_header_was_set(skb))
+   return skb_transport_offset(skb);
+   else if (skb_flow_dissect_flow_keys(skb, &keys, 0))
+   return keys.control.thoff;
+   else
+   return mlx5e_skb_l2_header_offset(skb);
+}
+
+static inline unsigned int mlx5e_calc_min_inline(enum mlx5_inline_modes mode,
+struct sk_buff *skb)
+{
+   int hlen;
+
+   switch (mode) {
+   case MLX5_INLINE_MODE_TCP_UDP:
+   hlen = eth_get_headlen(skb->data, skb_headlen(skb));
+   if (hlen == ETH_HLEN && !skb_vlan_tag_present(skb))
+   hlen += VLAN_HLEN;
+   return hlen;
+   case MLX5_INLINE_MODE_IP:
+   /* When transport header is set to zero, it means no transport
+* header. When transport header is set to 0xff's, it means
+* transport header wasn't set.
+*/
+   if (skb_transport_offset(skb))
+   return mlx5e_skb_l3_header_offset(skb);
+   /* fall through */
+   case MLX5_INLINE_MODE_L2:
+   default:
+   return mlx5e_skb_l2_header_offset(skb);
+   }
+}
+
 static inline u16 mlx5e_get_inline_hdr_size(struct mlx5e_sq *sq,
struct sk_buff *skb, bool bf)
 {
@@ -135,8 +179,6 @@ static inline u16 mlx5e_get_inline_hdr_size(struct mlx5e_sq 
*sq,
 * headers and occur before the data gather.
 * Therefore these headers must be copied into the WQE
 */
-#define MLX5E_MIN_INLINE (ETH_HLEN + VLAN_HLEN)
-
if (bf) {
u16 ihs = skb_headlen(skb);
 
@@ -146,8 +188,7 @@ static inline u16 mlx5e_get_inline_hdr_size(struct mlx5e_sq 
*sq,
if (ihs <= sq->max_inline)
return skb_headlen(skb);
}
-
-   return max(skb_network_offset(skb), MLX5E_MIN_INLINE);
+   return mlx5e_calc_min_inline(sq->min_inline_mode, skb);
 }
 
 static inline void mlx5e_tx_skb_pull_inline(unsigned char **skb_data,
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index e0a3ed7..0b6d15c 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -129,6 +129,13 @@ __mlx5_mask(typ, fld))
tmp;  \
})
 
+enum mlx5_inline_modes {
+   MLX5_INLINE_MODE_NONE,
+   MLX5_INLINE_MODE_L2,
+   MLX5_INLINE_MODE_IP,
+   MLX5_INLINE_MODE_TCP_UDP,
+};
+
 enum {
MLX5_MAX_COMMANDS   = 32,
MLX5_CMD_DATA_BLOCK_SIZE= 512,
-- 
2.8.0
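
As a simplified model of the per-packet selection this patch adds, the sketch below maps an inline mode to the number of header bytes that would have to be copied. It is not the driver code: struct demo_pkt and the demo_* names are hypothetical, the packet offsets are given directly instead of being parsed from an skb, and the flow-dissector fallback used by the patch for the IP case is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ETH_HLEN   14
#define DEMO_VLAN_HLEN   4
#define DEMO_MIN_INLINE (DEMO_ETH_HLEN + DEMO_VLAN_HLEN)

enum demo_inline_mode {
    DEMO_MODE_NONE,
    DEMO_MODE_L2,
    DEMO_MODE_IP,
    DEMO_MODE_TCP_UDP,
};

/* Hypothetical pre-parsed packet layout. */
struct demo_pkt {
    int network_offset;     /* where the L3 header starts */
    int transport_offset;   /* where the L4 header starts, 0 if none */
    int headers_end;        /* end of the last parsed header */
    bool vlan_tag_present;  /* VLAN tag carried out of band */
};

/* L2 case: at least Ethernet + VLAN, more if L3 starts later. */
static int demo_l2_len(const struct demo_pkt *p)
{
    return p->network_offset > DEMO_MIN_INLINE ?
           p->network_offset : DEMO_MIN_INLINE;
}

int demo_calc_min_inline(enum demo_inline_mode mode, const struct demo_pkt *p)
{
    int hlen;

    assert(p != NULL);
    switch (mode) {
    case DEMO_MODE_TCP_UDP:
        hlen = p->headers_end;
        /* leave room for a VLAN header when only Ethernet was parsed */
        if (hlen == DEMO_ETH_HLEN && !p->vlan_tag_present)
            hlen += DEMO_VLAN_HLEN;
        return hlen;
    case DEMO_MODE_IP:
        if (p->transport_offset)
            return p->transport_offset;
        /* fall through */
    case DEMO_MODE_L2:
    default:
        return demo_l2_len(p);
    }
}
```

For an untagged TCP/IPv4 frame (L3 at byte 14, L4 at byte 34, headers ending at byte 54) this yields 18, 34 and 54 bytes for the L2, IP and TCP_UDP modes respectively.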



Re: [PATCH v10 05/12] net/mlx4_en: add support for fast rx drop bpf program

2016-07-24 Thread Jesper Dangaard Brouer
On Tue, 19 Jul 2016 12:16:50 -0700
Brenden Blanco  wrote:

> The xdp program is present as a per-ring data structure, but as of yet
> it is not possible to set at that granularity through any ndo.

Thank you for doing this! :-)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer



Re: [PATCH] netfilter: x_tables: fix kmemcheck warning.

2016-07-24 Thread Sergei Shtylyov

Hello.

On 7/24/2016 5:31 AM, Tetsuo Handa wrote:


kmemcheck complains that some of struct nf_hook_ops members allocated at
xt_hook_ops_alloc() are not initialized before nf_register_net_hook() is
called. Add __GFP_ZERO to initialize explicitly.

[  367.411936] nf_conntrack version 0.5.0 (6144 buckets, 24576 max)
[  367.458540] ip_tables: (C) 2000-2006 Netfilter Core Team
[  367.463977] WARNING: kmemcheck: Caught 64-bit read from uninitialized memory (88003af7f300)
[  367.465633] 303f5381
[  367.468185]  u u u u u u u u u u u u u u u u i i i i i i i i u u u u u u u u
[  367.470687]  ^
[  367.471079] RIP: 0010:[]  [] nf_register_net_hook+0x2f/0x160
[  367.472846] RSP: 0018:88003f5abcb0  EFLAGS: 00010286
[  367.473821] RAX: 88003adb73c0 RBX: 88003af7f300 RCX: 
[  367.475122] RDX:  RSI: 0001 RDI: 88003adb7400
[  367.476485] RBP: 88003f5abcc8 R08: 0067 R09: 
[  367.477764] R10: 88003adb83c0 R11:  R12: 81876760
[  367.479053] R13: 88003adb73c0 R14: 0003 R15: 88003af7f300
[  367.480351] FS:  () GS:8182c000() knlGS:
[  367.482018] CS:  0010 DS:  ES:  CR0: 80050033
[  367.483069] CR2: 88003f404280 CR3: 3ad3 CR4: 001406f0
[  367.484357]  [] nf_register_net_hooks+0x3c/0xa0
[  367.485617]  [] ipt_register_table+0xef/0x130
[  367.486765]  [] iptable_filter_table_init.part.1+0x51/0x70
[  367.488125]  [] iptable_filter_net_init+0x2a/0x30
[  367.489375]  [] ops_init+0x3c/0x130
[  367.490410]  [] register_pernet_operations+0x108/0x1c0
[  367.491746]  [] register_pernet_subsys+0x23/0x40
[  367.492933]  [] iptable_filter_init+0x33/0x4b
[  367.494053]  [] do_one_initcall+0x4a/0x180
[  367.495168]  [] kernel_init_freeable+0x15b/0x201
[  367.496332]  [] kernel_init+0x9/0x100
[  367.497369]  [] ret_from_fork+0x1f/0x40
[  367.498449]  [] 0x
[  367.499523] Initializing XFRM netlink socket
[  367.500404] NET: Registered protocol family 10
[  367.501792] ip6_tables: (C) 2000-2006 Netfilter Core Team
[  367.502971] NET: Registered protocol family 17

Signed-off-by: Tetsuo Handa 
---
 net/netfilter/x_tables.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)



diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index e0aa7c1..fc261fe 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -1513,7 +1513,7 @@ xt_hook_ops_alloc(const struct xt_table *table, nf_hookfn *fn)
 	if (!num_hooks)
 		return ERR_PTR(-EINVAL);
 
-	ops = kmalloc(sizeof(*ops) * num_hooks, GFP_KERNEL);
+	ops = kmalloc(sizeof(*ops) * num_hooks, GFP_KERNEL | __GFP_ZERO);


   Why not just use kzalloc() or even kcalloc()?

[...]

MBR, Sergei
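
For readers following the kzalloc()/kcalloc() suggestion: in the kernel, kzalloc(size, flags) returns zeroed memory, and kcalloc(n, size, flags) does the same while also checking the n * size multiplication for overflow, so the allocation in the patch could presumably be written as kcalloc(num_hooks, sizeof(*ops), GFP_KERNEL). The sketch below is only a userspace analogue built on calloc(); struct hook_ops and hook_ops_alloc() are hypothetical stand-ins, not the real struct nf_hook_ops or xt_hook_ops_alloc().

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct nf_hook_ops; fields are illustrative only. */
struct hook_ops {
	void *hook;
	void *priv;
	unsigned int hooknum;
	int priority;
};

/*
 * Userspace analogue of the kcalloc() variant: calloc() zeroes the
 * array and fails cleanly if num_hooks * sizeof(*ops) would overflow,
 * so no member is left uninitialized before the caller fills it in.
 */
struct hook_ops *hook_ops_alloc(unsigned int num_hooks)
{
	struct hook_ops *ops;

	if (!num_hooks) {
		errno = EINVAL;
		return NULL;
	}

	ops = calloc(num_hooks, sizeof(*ops));
	if (!ops)
		errno = ENOMEM;
	return ops;
}
```

The design point is the same as in the review comment: a zeroing array allocator states the intent directly instead of OR-ing __GFP_ZERO into the flags of a hand-multiplied kmalloc().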



Re: [ethtool PATCH v2 4/4] ethtool: Enhancing link mode bits to support 25G/50G/100G

2016-07-24 Thread Vidya Sagar Ravipati
Yuval,
I will try to resubmit the patches this week with updated comments

Thanks
Vidya Sagar

On Sat, Jul 23, 2016 at 10:57 PM, Yuval Mintz  wrote:
>> Enhancing link mode bits to support 25G/50G/100G for supported and
>> advertised speed mode bits
>>
>> Signed-off-by: Vidya Sagar Ravipati 
>> ---
>>  ethtool.c | 27 +++
>>  1 file changed, 27 insertions(+)
>
> Hi Vidya,
>
> Are you re-trying your series one anytime soon?
>
> If not, can we simply push this [and ethtool-copy.h], as those are needed
> for querying/setting the recently added new speeds.
>


RE: [ethtool PATCH v2 4/4] ethtool: Enhancing link mode bits to support 25G/50G/100G

2016-07-24 Thread Yuval Mintz
> Enhancing link mode bits to support 25G/50G/100G for supported and
> advertised speed mode bits
> 
> Signed-off-by: Vidya Sagar Ravipati 
> ---
>  ethtool.c | 27 +++
>  1 file changed, 27 insertions(+)

Hi Vidya,

Are you re-trying your series one anytime soon?

If not, can we simply push this [and ethtool-copy.h], as those are needed
for querying/setting the recently added new speeds.
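
As background on why the new speeds need both the ethtool change and the updated ethtool-copy.h: most of the 25G/50G/100G link mode bits sit at position 32 or above, so they cannot be expressed in a single 32-bit supported/advertising mask and have to be handled as a bitmap of 32-bit words. A minimal sketch follows; the LM_* names and helpers are hypothetical, and the bit positions are quoted from memory of include/uapi/linux/ethtool.h, so check them against your headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions (verify against include/uapi/linux/ethtool.h). */
enum {
	LM_25000baseCR_Full   = 31,
	LM_50000baseCR2_Full  = 34,
	LM_100000baseKR4_Full = 36,
};

/* Hypothetical size: two 32-bit words are enough for this sketch. */
#define LM_WORDS 2

/* Set one link mode bit in a bitmap stored as an array of 32-bit words. */
static inline void lm_set(uint32_t *mask, unsigned int bit)
{
	mask[bit / 32] |= UINT32_C(1) << (bit % 32);
}

/* Test one link mode bit in the same layout. */
static inline bool lm_test(const uint32_t *mask, unsigned int bit)
{
	return (mask[bit / 32] >> (bit % 32)) & 1u;
}
```

With this layout, a bit such as 34 lands in the second word, which is exactly the part a legacy 32-bit mask has no room for.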