Re: [PATCH] net: socket: don't set sk_uid to garbage value in ->setattr()

2016-12-31 Thread Lorenzo Colitti
On Sat, Dec 31, 2016 at 8:42 AM, Eric Biggers  wrote:
> ->setattr() was recently implemented for socket files to sync the socket
> inode's uid to the new 'sk_uid' member of struct sock.  It does this by
> copying over the ia_uid member of struct iattr.  However, ia_uid is
> actually only valid when ATTR_UID is set in ia_valid, indicating that
> the uid is being changed, e.g. by chown.
> [...]
> -   if (!err) {
> +   if (!err && (iattr->ia_valid & ATTR_UID)) {

Oops. Thanks for fixing this. Unit tested in
https://android-review.googlesource.com/316594 .

Tested-by: Lorenzo Colitti 
Acked-by: Lorenzo Colitti 


Re: [PATCH] drop_monitor: add missing call to genlmsg_end

2016-12-31 Thread Neil Horman
On Sat, Dec 31, 2016 at 09:11:57PM +0100, Reiter Wolfgang wrote:
> Update nlmsg_len field with genlmsg_end to enable userspace processing
> using nlmsg_next helper. Also adds error handling.
> 
> Signed-off-by: Reiter Wolfgang 
> ---
>  net/core/drop_monitor.c | 33 -
>  1 file changed, 24 insertions(+), 9 deletions(-)
> 
> diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
> index 8e0c063..f465bad 100644
> --- a/net/core/drop_monitor.c
> +++ b/net/core/drop_monitor.c
> @@ -75,6 +75,7 @@ static struct sk_buff *reset_per_cpu_data(struct 
> per_cpu_dm_data *data)
>   struct nlattr *nla;
>   struct sk_buff *skb;
>   unsigned long flags;
> + void *msg_header;
>  
>   al = sizeof(struct net_dm_alert_msg);
>   al += dm_hit_limit * sizeof(struct net_dm_drop_point);
> @@ -82,17 +83,31 @@ static struct sk_buff *reset_per_cpu_data(struct 
> per_cpu_dm_data *data)
>  
>   skb = genlmsg_new(al, GFP_KERNEL);
>  
> - if (skb) {
> - genlmsg_put(skb, 0, 0, &net_drop_monitor_family,
> - 0, NET_DM_CMD_ALERT);
> - nla = nla_reserve(skb, NLA_UNSPEC,
> -   sizeof(struct net_dm_alert_msg));
> - msg = nla_data(nla);
> - memset(msg, 0, al);
> - } else {
> - mod_timer(&data->send_timer, jiffies + HZ / 10);
> + if (!skb)
> + goto err;
> +
> + msg_header = genlmsg_put(skb, 0, 0, &net_drop_monitor_family,
> +  0, NET_DM_CMD_ALERT);
> + if (!msg_header) {
> + nlmsg_free(skb);
> + skb = NULL;
> + goto err;
> + }
> + nla = nla_reserve(skb, NLA_UNSPEC,
> +   sizeof(struct net_dm_alert_msg));
> + if (!nla) {
> + nlmsg_free(skb);
> + skb = NULL;
> + goto err;
>   }
> + msg = nla_data(nla);
> + memset(msg, 0, al);
> + genlmsg_end(skb, msg_header);
> + goto out;
>  
> +err:
> + mod_timer(&data->send_timer, jiffies + HZ / 10);
> +out:
>   spin_lock_irqsave(&data->lock, flags);
>   swap(data->skb, skb);
>   spin_unlock_irqrestore(&data->lock, flags);
> -- 
> 2.9.3
> 
> 
Acked-by: Neil Horman 



Re: [PATCH net 1/2] r8152: fix the sw rx checksum is unavailable

2016-12-31 Thread Ansis Atteka
On Wed, Nov 30, 2016 at 3:58 AM, Hayes Wang  wrote:
> Mark Lord 
> [...]
>> > Not sure why, because there really is no other way for the data to
>> > appear where it does at the beginning of that URB buffer.
>> >
>> > This does seem a rather unexpected burden to place upon someone
>> > reporting a regression in a USB network driver that corrupts user data.
>>
>> If you are the only person who can actively reproduce this, which
>> seems to be the case right now, this is unfortunately the only way to
>> reach a proper analysis and fix.
>
> I have tested it with iperf more than five days without any error.
> I would think if there is any other way to reproduce it.
>

For the past few days I have been debugging a similar data corruption
bug related to r8152 driver, but on x86-64 platform. Also, I think
that this data corruption bug has some serious security implications,
because it appears that "corrupted data" is actually 530 byte fragment
from one of the previous Ethernet frames that Realtek device just
received. See the ping test in the bottom of my email that
demonstrates this.

Besides the data corruption problem I am also experiencing another
serious problem that could be related and manifests itself in XHCI
module when Realtek Ethernet port receives packets at "high" rate (ie
10Mbps or higher). This second problem correlates with error messages
in kern.log printed by xhci-hcd. Ethernet connectivity is completely
lost at this time until I reload r8152 driver:

[ 2540.426240] xhci_hcd :0e:00.0: ERROR Transfer event TRB DMA ptr
not part of current TD ep_index 2 comp_code 13
[ 2540.426258] xhci_hcd :0e:00.0: Looking for event-dma
fff0f010 trb-start ff5c9fe0 trb-end ff5c9fe0
seg-start ff5c9000 seg-end ff5c9ff0
[ 2540.426259] xhci_hcd :0e:00.0: ERROR Transfer event TRB DMA ptr
not part of current TD ep_index 2 comp_code 13
[ 2540.426260] xhci_hcd :0e:00.0: Looking for event-dma
fff0f020 trb-start ff5c9fe0 trb-end ff5c9fe0
seg-start ff5c9000 seg-end ff5c9ff0
[ 2540.426334] xhci_hcd :0e:00.0: ERROR Transfer event TRB DMA ptr
not part of current TD ep_index 2 comp_code 13
[ 2540.426336] xhci_hcd :0e:00.0: Looking for event-dma
fff0f030 trb-start ff5c9fe0 trb-end ff5c9fe0
seg-start ff5c9000 seg-end ff5c9ff0
[ 2540.426372] xhci_hcd :0e:00.0: ERROR Transfer event TRB DMA ptr
not part of current TD ep_index 2 comp_code 13
[ 2540.426373] xhci_hcd :0e:00.0: Looking for event-dma
fff0f040 trb-start ff5c9fe0 trb-end ff5c9fe0
seg-start ff5c9000 seg-end ff5c9ff0
[ 2540.426488] xhci_hcd :0e:00.0: ERROR Transfer event TRB DMA ptr
not part of current TD ep_index 2 comp_code 13
[ 2540.426491] xhci_hcd :0e:00.0: Looking for event-dma
fff0f050 trb-start ff5c9fe0 trb-end ff5c9fe0
seg-start ff5c9000 seg-end ff5c9ff0
[ 2540.437020] xhci_hcd :0e:00.0: ERROR Transfer event TRB DMA ptr
not part of current TD ep_index 2 comp_code 13
[ 2540.437024] xhci_hcd :0e:00.0: Looking for event-dma
fff0f060 trb-start ff5c9fe0 trb-end ff5c9fe0
seg-start ff5c9000 seg-end ff5c9ff0
[ 2540.438239] xhci_hcd :0e:00.0: ERROR Transfer event TRB DMA ptr
not part of current TD ep_index 2 comp_code 13
[ 2540.438246] xhci_hcd :0e:00.0: Looking for event-dma
fff0f070 trb-start ff5c9fe0 trb-end ff5c9fe0
seg-start ff5c9000 seg-end ff5c9ff0
[ 2540.438493] xhci_hcd :0e:00.0: ERROR Transfer event TRB DMA ptr
not part of current TD ep_index 2 comp_code 13
[ 2540.438495] xhci_hcd :0e:00.0: Looking for event-dma
fff0f080 trb-start ff5c9fe0 trb-end ff5c9fe0
seg-start ff5c9000 seg-end ff5c9ff0


All of that is happening on my X86-64  Dell XPS15 9550 laptop that is
connected to Ethernet via Dell TB15 dock. This Dell TB 15 Dock uses
Realtek chip to provide Ethernet connectivity to laptop:

# lsusb
...
Bus 004 Device 003: ID 0bda:8153 Realtek Semiconductor Corp.
Device Descriptor:
  bLength18
  bDescriptorType 1
  bcdUSB   3.00
  bDeviceClass0 (Defined at Interface level)
  bDeviceSubClass 0
  bDeviceProtocol 0
  bMaxPacketSize0 9
  idVendor   0x0bda Realtek Semiconductor Corp.
  idProduct  0x8153
  bcdDevice   30.01
  iManufacturer   1 Realtek
  iProduct2 USB 10/100/1000 LAN
  iSerial 6 0100
  bNumConfigurations  2

This Realtek Ethernet port is connected to a XHCI ASMedia host
controller that also resides on Dell TB15 Dock. The dock itself is
connected via Thunderbolt 3 cable to laptop:

# lspci

0e:00.0 USB controller: ASMedia Technology Inc. ASM1042A USB 3.0 Host Controller


In my case it is easy to reproduce either of those two issues. Here

Re: [PATCH v1 0/2] bpf: add longest prefix match map

2016-12-31 Thread Alexei Starovoitov
On Fri, Dec 30, 2016 at 12:25 PM, David Miller  wrote:
> From: Daniel Mack 
> Date: Thu, 29 Dec 2016 18:28:53 +0100
>
>> This patch set adds a longest prefix match algorithm that can be used
>> to match IP addresses to a stored set of ranges. It is exposed as a
>> bpf map type.
>>
>> Internally, data is stored in an unbalanced tree of nodes that has a
>> maximum height of n, where n is the prefixlen the trie was created
>> with.
>>
>> Not that this has nothing to do with fib or fib6 and is in no way meant
>> to replace or share code with it. It's rather a much simpler
>> implementation that is specifically written with bpf maps in mind.
>>
>> Patch 1/2 adds the implementation, and 2/2 an extensive test suite.
>>
>> Feedback is much appreciated.
>
> I'll give Alexei and Daniel time to provide feedback on this series.

I did a quick glance over and, in general, I'm very much in favor.
Will do a proper review Jan 3rd when I'm back from pto.
Daniel,
could you provide performance numbers for lookups per second
for different key lengths with almost empty and almost
full 100k+ lpm rules?
And the rate of updates per second?
I guess your use case is more traditional 32 or 128 bit lpm
whereas I'd like to use it as 64-bit lpm.
Would be good to add few tests with non-power of 8 key length
(it seems to me they should be supported by the algo).
Thank you for working on it!


Re: [PATCH net-next V3 3/3] tun: rx batching

2016-12-31 Thread Stephen Hemminger
On Fri, 30 Dec 2016 13:20:51 +0800
Jason Wang  wrote:

> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index cd8e02c..a268ed9 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -75,6 +75,10 @@
>  
>  #include 
>  
> +static int rx_batched;
> +module_param(rx_batched, int, 0444);
> +MODULE_PARM_DESC(rx_batched, "Number of packets batched in rx");
> +
>  /* Uncomment to enable debugging */

I like the concept or rx batching. But controlling it via a module parameter
is one of the worst API choices.  Ethtool would be better to use because that is
how other network devices control batching.

If you do ethtool, you could even extend it to have an number of packets
and max latency value.


Re: [PATCH] Introduce a sysctl that modifies the value of PROT_SOCK.

2016-12-31 Thread Stephen Hemminger
On Fri, 30 Dec 2016 20:11:11 -0800
Krister Johansen  wrote:

>  
> +config LOWPORT_SYSCTL
> + bool "Adjust reserved port range via sysctl"
> + depends on SYSCTL
> + help
> +   This allows the administrator to adjust the reserved port range
> +   using a sysctl.

This looks like a good idea, and makes a lot of sense.

Please don't introduce yet another config option. All distro's will enable it 
anyway.
Having more config options doesn't help reliability or testability.

Do or do not, please no new config options.


Re: [PATCH v3] net: ethernet: faraday: To support device tree usage.

2016-12-31 Thread Arnd Bergmann
On Saturday, December 31, 2016 10:48:39 AM CET Florian Fainelli wrote:
> 
> On 12/29/2016 11:37 PM, Greentime Hu wrote:
> > Signed-off-by: Greentime Hu 
> 
> This is not enough, you need to add a Device Tree binding document under
> Documentation/devicetree/bindings/net/ which documents this compatible
> string, as well as additional properties that may be required to
> describe this hardware block.

We already have

Documentation/devicetree/bindings/net/moxa,moxart-mac.txt

for the same hardware (though used by a different driver).

I'd suggest renaming that one to a more generic file name and
adding the new compatible string there.

Aside from that, every patch should also have a changelog comment.

Arnd


Re: Bug w/ (policy) routing

2016-12-31 Thread David Ahern
On 12/30/16 4:00 PM, Olivier Brunel wrote:
> Hi,
> 
> (Please cc me as I'm not subscribed to the list, thanks.)
> 
> I'm trying to set things up using some policy routing, and having some
> weird issues I can't really explain. It looks to me like there might be
> a bug somewhere...
> 
> This is done under Arch Linux 64bits, iproute2 4.9.0 (`ip -V` says ip
> utility, iproute2-ss161212), kernel 4.8.13
> 
> Basically here's what I could reduce it to:
> - create a new network namespace, create a pair of veth devices: one in
> there, one sent back to the original namespace
> - I'm giving them IPs 10.4.0.1 (original namespace) & 10.4.0.2 (new
> namespace)
> - in that new namespace, I'm trying to add a route to 10.4.0.1, but
>   inside a new table. I also want a default route via 10.4.0.1 on the
>   table main. It seems to work, only not really...
> 
> It's not very easy to describe so hopefully this will make things
> clearer:
> 
> $ sudo unshare -n sh

The main and local fib tables start merged into a single fib_table instance.

> sh-4.4# ip rule add table 50 prio 50
> sh-4.4# ip link add test type veth peer name test2
> sh-4.4# ip addr add 10.4.0.2 dev test
> sh-4.4# ip link set dev test up
> sh-4.4# ip link set netns 1 dev test2
> # back in original namespace, we add 10.4.0.1 to test2 and bring it up
> 
> sh-4.4# ip route add 10.4.0.1 dev test table 50
> sh-4.4# ip route add default via 10.4.0.1 dev test
> sh-4.4# ip route flush cache
> sh-4.4# ip rule
> 0:from all lookup local 
> 50:   from all lookup 50 
> 32766:from all lookup main 
> 32767:from all lookup default 
> sh-4.4# ip route show table 50
> 10.4.0.1 dev test scope link 
> sh-4.4# ip route get 10.4.0.1
> 10.4.0.1 via 10.4.0.1 dev test table local src 10.4.0.2 
> cache 
> # !?? why isn't table 50 used as, I believe, it should. And why

The default rule set has the local table at priority 0 so all lookups hit that 
table first:

# ip ru ls
0:  from all lookup local
32766:  from all lookup main
32767:  from all lookup default

For some reason it hits a match doing the lookup in the merged local/main 
fib_table for this ip route get.

> does adding a rule "fixes" it :
> 
> sh-4.4# ip rule add prio 5

Adding this rule causes the local and main tables to be split into actual 
separate fib tables. Doing so causes the lookup in the local table to fail, so 
the lookup continues to the next rule -- which is your prio 50 table 50 rule.

I did not look into why the earlier rule addition did not cause the unmerge to 
happen -- it should have.


> sh-4.4# ip route get 10.4.0.1
> 10.4.0.1 dev test table 50 src 10.4.0.2 
> cache 
> # deleting the new rule makes no difference. It could even have been
> done right after adding it. It's like it just triggered something
> (reload, cache flushed, ...)
> As seen I did flush cached routes as mentionned in the man page, I don't
> know of anything else that would need done to "refresh" things?
> 
> Any ideas as to why this is happening? Should this work as I expect it,
> or is there anything I'm doing wrong?

For your purposes now this should fix the problem for you:

ip ru del from all lookup local
ip ru add prio 32765 from all lookup local




[PATCH] drop_monitor: add missing call to genlmsg_end

2016-12-31 Thread Reiter Wolfgang
Update nlmsg_len field with genlmsg_end to enable userspace processing
using nlmsg_next helper. Also adds error handling.

Signed-off-by: Reiter Wolfgang 
---
 net/core/drop_monitor.c | 33 -
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
index 8e0c063..f465bad 100644
--- a/net/core/drop_monitor.c
+++ b/net/core/drop_monitor.c
@@ -75,6 +75,7 @@ static struct sk_buff *reset_per_cpu_data(struct 
per_cpu_dm_data *data)
struct nlattr *nla;
struct sk_buff *skb;
unsigned long flags;
+   void *msg_header;
 
al = sizeof(struct net_dm_alert_msg);
al += dm_hit_limit * sizeof(struct net_dm_drop_point);
@@ -82,17 +83,31 @@ static struct sk_buff *reset_per_cpu_data(struct 
per_cpu_dm_data *data)
 
skb = genlmsg_new(al, GFP_KERNEL);
 
-   if (skb) {
-   genlmsg_put(skb, 0, 0, &net_drop_monitor_family,
-   0, NET_DM_CMD_ALERT);
-   nla = nla_reserve(skb, NLA_UNSPEC,
- sizeof(struct net_dm_alert_msg));
-   msg = nla_data(nla);
-   memset(msg, 0, al);
-   } else {
-   mod_timer(&data->send_timer, jiffies + HZ / 10);
+   if (!skb)
+   goto err;
+
+   msg_header = genlmsg_put(skb, 0, 0, &net_drop_monitor_family,
+0, NET_DM_CMD_ALERT);
+   if (!msg_header) {
+   nlmsg_free(skb);
+   skb = NULL;
+   goto err;
+   }
+   nla = nla_reserve(skb, NLA_UNSPEC,
+ sizeof(struct net_dm_alert_msg));
+   if (!nla) {
+   nlmsg_free(skb);
+   skb = NULL;
+   goto err;
}
+   msg = nla_data(nla);
+   memset(msg, 0, al);
+   genlmsg_end(skb, msg_header);
+   goto out;
 
+err:
+   mod_timer(&data->send_timer, jiffies + HZ / 10);
+out:
spin_lock_irqsave(&data->lock, flags);
swap(data->skb, skb);
spin_unlock_irqrestore(&data->lock, flags);
-- 
2.9.3



Re: [PATCH v3] net: ethernet: faraday: To support device tree usage.

2016-12-31 Thread Florian Fainelli


On 12/29/2016 11:37 PM, Greentime Hu wrote:
> Signed-off-by: Greentime Hu 

This is not enough, you need to add a Device Tree binding document under
Documentation/devicetree/bindings/net/ which documents this compatible
string, as well as additional properties that may be required to
describe this hardware block.

-- 
Florian


Re: [PATCH] net: socket: don't set sk_uid to garbage value in ->setattr()

2016-12-31 Thread David Miller
From: Eric Biggers 
Date: Fri, 30 Dec 2016 17:42:32 -0600

> From: Eric Biggers 
> 
> ->setattr() was recently implemented for socket files to sync the socket
> inode's uid to the new 'sk_uid' member of struct sock.  It does this by
> copying over the ia_uid member of struct iattr.  However, ia_uid is
> actually only valid when ATTR_UID is set in ia_valid, indicating that
> the uid is being changed, e.g. by chown.  Other metadata operations such
> as chmod or utimes leave ia_uid uninitialized.  Therefore, sk_uid could
> be set to a "garbage" value from the stack.
> 
> Fix this by only copying the uid over when ATTR_UID is set.
> 
> Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
> Signed-off-by: Eric Biggers 

Lorenzo, please review.


Re: [PATCH net-next V3 3/3] tun: rx batching

2016-12-31 Thread David Miller
From: Jason Wang 
Date: Fri, 30 Dec 2016 13:20:51 +0800

> @@ -1283,10 +1314,15 @@ static ssize_t tun_get_user(struct tun_struct *tun, 
> struct tun_file *tfile,
>   skb_probe_transport_header(skb, 0);
>  
>   rxhash = skb_get_hash(skb);
> +
>  #ifndef CONFIG_4KSTACKS
> - local_bh_disable();
> - netif_receive_skb(skb);
> - local_bh_enable();
> + if (!rx_batched) {
> + local_bh_disable();
> + netif_receive_skb(skb);
> + local_bh_enable();
> + } else {
> + tun_rx_batched(tfile, skb, more);
> + }
>  #else
>   netif_rx_ni(skb);
>  #endif

If rx_batched has been set, and we are talking to clients not using
this new MSG_MORE facility (or such clients don't have multiple TX
packets to send to you, thus MSG_MORE is often clear), you are doing a
lot more work per-packet than the existing code.

You take the queue lock, you test state, you splice into a local queue
on the stack, then you walk that local stack queue to submit just one
SKB to netif_receive_skb().

I think you want to streamline this sequence in such cases so that the
cost before and after is similar if not equivalent.


Re: [PATCH net-next] af_packet: Provide a TPACKET_V2 compatible Tx path for TPACKET_V3

2016-12-31 Thread Sowmini Varadhan
On (12/30/16 23:59), Willem de Bruijn wrote:
> > Actually I'm not averse to looking at extensions (or at least,
> > place-holders) to allow variable sized slots- do you have any
> > suggestions?
> 
> It is probably enough to just enforce that tp_next_offset is always
> sane. Either that it points to the start of the next packet, even
> though that currently can also be inferred from frame_size. Or, that
> it is always zero unless the ring is to be interpreted as holding
> variable sized frames. If the field is non-zero, drop the packet.

Ok, let me add this to patchv2, along with extra test cases.

--Sowmini


ping: What's the purpose of rate limit?

2016-12-31 Thread Guan Xin
Hello,

Excuse me, I searched but didn't find an answer --

What's the purpose of setting a limit to the "-i" and "-l" parameters
of ping for non-root users?

It seems that this is only intended to prevent accidental misuse
because these restrictions can be easily worked around by starting
multiple instances or wrapping the program in a loop, e.g.,
while true; do ping -l3 -c3 192.168.0.1; done

(Please Cc to me because I'm not subscribed to this list. Thanks!)

Regards,
Guan