Re: [PATCH v2 net] bonding: don't use stale speed and duplex information

2016-02-28 Thread zhuyj

On 02/29/2016 01:39 PM, Jay Vosburgh wrote:
> zhuyj  wrote:
>
>> On 02/25/2016 09:33 PM, Jay Vosburgh wrote:
>>> zhuyj  wrote:
>>> [...]
>>>> I delved into the source code and Emil's tests. I think that the problem
>>>> that this patch expects to fix occurs very unusually.
>>>>
>>>> Do you agree with me?
>>>>
>>>> If so, maybe the following patch can reduce the performance loss.
>>>> Please comment on it. Thanks a lot.
>>>>
>>>>
>>>> diff --git a/drivers/net/bonding/bond_main.c
>>>> b/drivers/net/bonding/bond_main.c
>>>> index b7f1a99..c4c511a 100644
>>>> --- a/drivers/net/bonding/bond_main.c
>>>> +++ b/drivers/net/bonding/bond_main.c
>>>> @@ -2129,7 +2129,9 @@ static void bond_miimon_commit(struct bonding *bond)
>>>>  continue;
>>>>
>>>>  case BOND_LINK_UP:
>>>> -   bond_update_speed_duplex(slave);
>>>> +   if (slave->speed == SPEED_UNKNOWN)
>>>> +   bond_update_speed_duplex(slave);
>>>> +
>>>>  bond_set_slave_link_state(slave, BOND_LINK_UP,
>>>> BOND_SLAVE_NOTIFY_NOW);
>>>>  slave->last_link_up = jiffies;
>>>
>>> I don't believe the speed is necessarily SPEED_UNKNOWN coming in
>>> here.  If the race occurs at a time later than the initial enslavement,
>>> speed may already be set (and the race manifests if the new speed
>>> changes, i.e., the link changes from 1 Gb/sec to 10 Gb/sec), so I don't
>>> think this is functionally correct.
>>
>> Hi, Jay
>>
>> Thanks for your reply.
>>
>> IMHO, "If the race occurs at a time later than the initial enslavement,
>> speed may already be set (and the race manifests if the new speed
>> changes, i.e., the link changes from 1 Gb/sec to 10 Gb/sec)", from my test,
>> this will not happen because the previous source code make the speed
>> correct.
>
> How, exactly, will "the previous source code make the speed
> correct"?
>
>> This "bond_update_speed_duplex" repeats to get the correct speed.
>>
>> That is, this patch is to fix the error in initial enslavement. The
>> mentioned scenario will not occur.
>
> I see nothing in the code that limits the race to happening only
> at enslavement time.
>
> If the bond_mii_monitor call executes between the device going
> link up and the arrival of the NETDEV_CHANGE or NETDEV_UP callback, the
> stored speed and duplex are stale.  The stale speed value is not
> guaranteed to be SPEED_UNKNOWN, so your patch is not functionally
> correct.

Hi, Jay

In this function bond_slave_netdev_event, the speed is updated.

Best Regards!
Zhu Yanjun

>
> -J
>
>> Even though the performance impact is minimal, if we can avoid this
>> performance
>> impact, why not ?
>>
>> Best Regards!
>> Zhu Yanjun
>>
>>> Also, the call to bond_miimon_commit itself is already gated by
>>> bond_miimon_inspect finding a link state change.  The performance impact
>>> here should be minimal.
>>>
>>> -J
>
> ---
> -Jay Vosburgh, jay.vosbu...@canonical.com




linux-next: manual merge of the target-merge tree with the net-next tree

2016-02-28 Thread Stephen Rothwell
Hi Nicholas,

Today's linux-next merge of the target-merge tree got a conflict in:

  drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h

between commit:

  ba9cee6aa67d ("cxgb4/iw_cxgb4: TOS support")

from the net-next tree and commit:

  c973e2a3ff1b ("cxgb4: add definitions for iSCSI target ULD")

from the target-merge tree.

I fixed it up (the latter was a superset of the former) and can carry
the fix as necessary (no action is required).

-- 
Cheers,
Stephen Rothwell


Re: [PATCH] asm-generic: remove old nonatomic-io wrapper files

2016-02-28 Thread Vinod Koul
On Fri, Feb 26, 2016 at 03:29:05PM +0100, Arnd Bergmann wrote:
> The two header files got moved to include/linux, and most
> users were already converted, this changes the remaining drivers
> and removes the files.
> 
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/dma/idma64.h| 2 +-
For this:

Acked-by: Vinod Koul 

Thanks
-- 
~Vinod


Re: [PATCH v2 net] bonding: don't use stale speed and duplex information

2016-02-28 Thread Jay Vosburgh
zhuyj  wrote:

>On 02/25/2016 09:33 PM, Jay Vosburgh wrote:
>> zhuyj  wrote:
>> [...]
>>> I delved into the source code and Emil's tests. I think that the problem
>>> that this patch expects to fix occurs very unusually.
>>>
>>> Do you agree with me?
>>>
>>> If so, maybe the following patch can reduce the performance loss.
>>> Please comment on it. Thanks a lot.
>>>
>>>
>>> diff --git a/drivers/net/bonding/bond_main.c
>>> b/drivers/net/bonding/bond_main.c
>>> index b7f1a99..c4c511a 100644
>>> --- a/drivers/net/bonding/bond_main.c
>>> +++ b/drivers/net/bonding/bond_main.c
>>> @@ -2129,7 +2129,9 @@ static void bond_miimon_commit(struct bonding *bond)
>>> continue;
>>>
>>> case BOND_LINK_UP:
>>> -   bond_update_speed_duplex(slave);
>>> +   if (slave->speed == SPEED_UNKNOWN)
>>> +   bond_update_speed_duplex(slave);
>>> +
>>> bond_set_slave_link_state(slave, BOND_LINK_UP,
>>> BOND_SLAVE_NOTIFY_NOW);
>>> slave->last_link_up = jiffies;
>>  I don't believe the speed is necessarily SPEED_UNKNOWN coming in
>> here.  If the race occurs at a time later than the initial enslavement,
>> speed may already be set (and the race manifests if the new speed
>> changes, i.e., the link changes from 1 Gb/sec to 10 Gb/sec), so I don't
>> think this is functionally correct.
>Hi, Jay
>
>Thanks for your reply.
>
>IMHO, "If the race occurs at a time later than the initial enslavement,
>speed may already be set (and the race manifests if the new speed
>changes, i.e., the link changes from 1 Gb/sec to 10 Gb/sec)", from my test,
>this will not happen because the previous source code make the speed
>correct.

How, exactly, will "the previous source code make the speed
correct"?

>This "bond_update_speed_duplex" repeats to get the correct speed.
>
>That is, this patch is to fix the error in initial enslavement. The
>mentioned scenario will not occur.

I see nothing in the code that limits the race to happening only
at enslavement time.

If the bond_mii_monitor call executes between the device going
link up and the arrival of the NETDEV_CHANGE or NETDEV_UP callback, the
stored speed and duplex are stale.  The stale speed value is not
guaranteed to be SPEED_UNKNOWN, so your patch is not functionally
correct.

-J

>Even though the performance impact is minimal, if we can avoid this
>performance
>impact, why not ?
>
>Best Regards!
>Zhu Yanjun
>
>>
>>  Also, the call to bond_miimon_commit itself is already gated by
>> bond_miimon_inspect finding a link state change.  The performance impact
>> here should be minimal.
>>
>>  -J

---
-Jay Vosburgh, jay.vosbu...@canonical.com
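
The disagreement above can be illustrated with a small user-space model. This is only a sketch: struct fake_slave and the two commit helpers are hypothetical stand-ins, not the bonding driver's code, and the only thing borrowed from the kernel is the SPEED_UNKNOWN value of -1. It shows that a refresh guarded by a SPEED_UNKNOWN check leaves the cached value stale when the link renegotiates from 1 Gb/sec to 10 Gb/sec, while an unconditional refresh does not.

```c
#include <stdbool.h>

#define SPEED_UNKNOWN (-1)

/* Hypothetical, simplified stand-in for the per-slave state. */
struct fake_slave {
	int cached_speed;	/* what the bond last recorded, in Mb/s */
	int hw_speed;		/* what the device reports right now, in Mb/s */
};

/* Unconditional refresh on link-up commit. */
static void commit_always(struct fake_slave *s)
{
	s->cached_speed = s->hw_speed;
}

/* Refresh only when nothing has been cached yet. */
static void commit_if_unknown(struct fake_slave *s)
{
	if (s->cached_speed == SPEED_UNKNOWN)
		s->cached_speed = s->hw_speed;
}

/* True when the cached value no longer matches the device. */
static bool cache_is_stale(const struct fake_slave *s)
{
	return s->cached_speed != s->hw_speed;
}
```

In this model the guarded variant is only correct for the very first commit after enslavement, which is the case Jay argues is not the only one that can race.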


Re: [PATCH V3 3/3] vhost_net: basic polling support

2016-02-28 Thread Jason Wang


On 02/29/2016 05:56 AM, Christian Borntraeger wrote:
> On 02/26/2016 09:42 AM, Jason Wang wrote:
>> > This patch tries to poll for newly added tx buffers or the socket receive
>> > queue for a while at the end of tx/rx processing. The maximum time
>> > spent on polling is specified through a new kind of vring ioctl.
>> > 
>> > Signed-off-by: Jason Wang 
>> > ---
>> >  drivers/vhost/net.c| 79 +++---
>> >  drivers/vhost/vhost.c  | 14 
>> >  drivers/vhost/vhost.h  |  1 +
>> >  include/uapi/linux/vhost.h |  6 
>> >  4 files changed, 95 insertions(+), 5 deletions(-)
>> > 
>> > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> > index 9eda69e..c91af93 100644
>> > --- a/drivers/vhost/net.c
>> > +++ b/drivers/vhost/net.c
>> > @@ -287,6 +287,44 @@ static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success)
>> >rcu_read_unlock_bh();
>> >  }
>> > 
>> > +static inline unsigned long busy_clock(void)
>> > +{
>> > +  return local_clock() >> 10;
>> > +}
>> > +
>> > +static bool vhost_can_busy_poll(struct vhost_dev *dev,
>> > +  unsigned long endtime)
>> > +{
>> > +  return likely(!need_resched()) &&
>> > + likely(!time_after(busy_clock(), endtime)) &&
>> > + likely(!signal_pending(current)) &&
>> > + !vhost_has_work(dev) &&
>> > + single_task_running();
>> > +}
>> > +
>> > +static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
>> > +  struct vhost_virtqueue *vq,
>> > +  struct iovec iov[], unsigned int iov_size,
>> > +  unsigned int *out_num, unsigned int *in_num)
>> > +{
>> > +  unsigned long uninitialized_var(endtime);
>> > +  int r = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
>> > +  out_num, in_num, NULL, NULL);
>> > +
>> > +  if (r == vq->num && vq->busyloop_timeout) {
>> > +  preempt_disable();
>> > +  endtime = busy_clock() + vq->busyloop_timeout;
>> > +  while (vhost_can_busy_poll(vq->dev, endtime) &&
>> > + vhost_vq_avail_empty(vq->dev, vq))
>> > +  cpu_relax();
> Can you use cpu_relax_lowlatency (which should be the same as cpu_relax for
> almost everybody but s390)? cpu_relax (without low latency) might give up the
> time slice when running under another hypervisor (like LPAR on s390), which
> might not be what we want here.

Ok, will do this in next version.
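
The polling loop in the quoted patch is a bounded busy-wait: compute a deadline, then spin until either work appears or the deadline passes. A minimal user-space sketch of that shape follows. All names here (busy_poll, ticks_after, struct sim and its helpers) are hypothetical, the clock is injected and simulated so the behaviour is deterministic, and the kernel-specific checks (need_resched, signal_pending, vhost_has_work) are deliberately left out.

```c
#include <stdbool.h>

typedef unsigned long (*clock_fn)(void *ctx);
typedef bool (*ready_fn)(void *ctx);

/* Wrap-safe "a is after b", the same idea as the kernel's time_after(). */
static bool ticks_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Spin until ready() reports work or the budget (in clock ticks) is
 * exhausted. Returns true if work showed up before the deadline.
 */
static bool busy_poll(clock_fn now, ready_fn ready, void *ctx,
		      unsigned long budget)
{
	unsigned long endtime = now(ctx) + budget;

	while (!ticks_after(now(ctx), endtime)) {
		if (ready(ctx))
			return true;
	}
	return false;
}

/* Simulated clock for experimenting: every read advances time by one. */
struct sim {
	unsigned long tick;
	unsigned long ready_at;
};

static unsigned long sim_now(void *ctx)
{
	struct sim *s = ctx;

	return s->tick++;
}

static bool sim_ready(void *ctx)
{
	const struct sim *s = ctx;

	return s->tick >= s->ready_at;
}
```

Comparing through a signed difference rather than with a plain `>` keeps the deadline check correct when the tick counter wraps around.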


Re: [PATCH V3 3/3] vhost_net: basic polling support

2016-02-28 Thread Jason Wang


On 02/28/2016 10:09 PM, Michael S. Tsirkin wrote:
> On Fri, Feb 26, 2016 at 04:42:44PM +0800, Jason Wang wrote:
>> > This patch tries to poll for newly added tx buffers or the socket receive
>> > queue for a while at the end of tx/rx processing. The maximum time
>> > spent on polling is specified through a new kind of vring ioctl.
>> > 
>> > Signed-off-by: Jason Wang 
> Looks good overall, but I still see one problem.
>
>> > ---
>> >  drivers/vhost/net.c| 79 +++---
>> >  drivers/vhost/vhost.c  | 14 
>> >  drivers/vhost/vhost.h  |  1 +
>> >  include/uapi/linux/vhost.h |  6 
>> >  4 files changed, 95 insertions(+), 5 deletions(-)
>> > 
>> > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> > index 9eda69e..c91af93 100644
>> > --- a/drivers/vhost/net.c
>> > +++ b/drivers/vhost/net.c
>> > @@ -287,6 +287,44 @@ static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success)
>> >rcu_read_unlock_bh();
>> >  }
>> >  
>> > +static inline unsigned long busy_clock(void)
>> > +{
>> > +  return local_clock() >> 10;
>> > +}
>> > +
>> > +static bool vhost_can_busy_poll(struct vhost_dev *dev,
>> > +  unsigned long endtime)
>> > +{
>> > +  return likely(!need_resched()) &&
>> > + likely(!time_after(busy_clock(), endtime)) &&
>> > + likely(!signal_pending(current)) &&
>> > + !vhost_has_work(dev) &&
>> > + single_task_running();
> So I find it quite unfortunate that this still uses single_task_running.
> This means that for example a SCHED_IDLE task will prevent polling from
> becoming active, and that seems like a bug, or at least
> an undocumented feature :).

Yes, it may need more thought.

>
> Unfortunately this logic affects the behaviour as observed
> by userspace, so we can't merge it like this and tune
> afterwards, since otherwise management tools will start
> depending on this logic.
>
>

How about removing single_task_running() here first and optimizing on top?
We probably need something like this to handle overcommitment.



Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-28 Thread Mike Galbraith
On Sun, 2016-02-28 at 18:01 +0100, Francois Romieu wrote:
> Mike Galbraith  :
> [...]
> > Hrm, relatively new + tasklet woes rings a bell.  Ah, that..
> > 
> > 
> > What's worse is that at the point where this code was written it was
> > already well known that tasklets are a steaming pile of crap and
> > should die.
> > 
> > 
> > Source thereof https://lwn.net/Articles/588457/
> 
> tasklets are ingrained in the dmaengine API (see 
> Documentation/dmaengine/client.txt
> and drivers/dma/virt-dma.h::vchan_cookie_complete).
> 
> Moving everything to irq context or handling his own sub-{jiffy/ms} timer
> while losing async dma doesn't exactly smell like roses either. :o(

https://lwn.net/Articles/239633/

If I'm listening properly, the root cause is that there is a timing
constraint involved, which is being exposed because one softirq raises
another (ew).  Processing timeout happens, freshly raised tasklet
wanders off to SCHED_NORMAL kthread context where its constraint dies.

Given the dma stuff apparently works fine in -rt (or did, see below),
timing constraints can't be super tight, so perhaps we could grow
realtime workqueue support for the truly deserving.  The tricky bit
would be keeping everybody and his brother from abusing it.

WRT -rt: if dma tasklets really do have hard (ish) constraints, -rt
recently "broke" in the same way.. of all softirqs which are deferred
to kthread context, due to a recent change, only timer/hrtimer are
executed at realtime priority by default.

-Mike


Re: [PATCH] 3c59x: Ensure to apply the expires time

2016-02-28 Thread David Miller
From: Stafford Horne 
Date: Sun, 28 Feb 2016 16:49:29 +0900

> In commit 5b6490def9168af6a ("3c59x: Use setup_timer()") Amitoj
> removed add_timer which sets up the expires timer.  In this patch
> the behavior is restored but it uses mod_timer which is a bit more
> compact.
> 
> Signed-off-by: Stafford Horne 

Applied, thanks.


Re: [PATCH net-next V1 09/10] net/mlx5: Fix global UAR mapping

2016-02-28 Thread David Miller
From: Saeed Mahameed 
Date: Sun, 28 Feb 2016 17:09:10 +0200

> We use ARCH_HAS_IOREMAP_WC to know if the current arch supports WC
> (Write combining) IO memory mapping, if it is not supported
> "uar->bf_map" will be NULL, thus we will use NC (Non Cached) mapping
> "uar->map".

This description sucks.

You're just saying what will happen if the CPP is defined or not
(uar->bf_map ends up being NULL).

Well anyone can see that from the code.

You have to explain why.

And BTW, ARCH_HAS_IOREMAP_WC doesn't even tell you if the platform
will actually give you a write-combining mapping.

So if the driver operates properly when a non-WC mapping is used
for uar->bf_map, then get rid of this CPP test altogether PLEASE!

Otherwise your driver is buggy, because ARCH_HAS_IOREMAP_WC only says
whether the default implementation of ioremap_wc() needs to be
provided by include/asm-generic/iomap.h.  It does not guarantee that a
write-combining mapping will be provided.

I really can't think of any reason why you absolutely require a
WC mapping, and the CPP test just makes your driver look more
ugly than it needs to be.

So can you please explain what the hell is happening here and why you
are doing things this way rather than just reading the code to me?

Thanks.


Re: [PATCH v2 2/3] net: ipv4: tcp_probe: Replace timespec with timespec64

2016-02-28 Thread YOSHIFUJI Hideaki


Deepa Dinamani wrote:
> TCP probe log timestamps use struct timespec which is
> not y2038 safe. Even though timespec might be good enough here
> as it is used to represent delta time, the plan is to get rid
> of all uses of timespec in the kernel.
> Replace with struct timespec64 which is y2038 safe.
> 
> Prints still use unsigned long format and type.
> 
> Signed-off-by: Deepa Dinamani 

Acked-by: YOSHIFUJI Hideaki 

> Reviewed-by: Arnd Bergmann 
> Cc: "David S. Miller" 
> Cc: Alexey Kuznetsov 
> Cc: James Morris 
> Cc: Hideaki YOSHIFUJI 
> Cc: Patrick McHardy 
> ---
>  net/ipv4/tcp_probe.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/net/ipv4/tcp_probe.c b/net/ipv4/tcp_probe.c
> index ebf5ff5..f6c50af 100644
> --- a/net/ipv4/tcp_probe.c
> +++ b/net/ipv4/tcp_probe.c
> @@ -187,13 +187,13 @@ static int tcpprobe_sprint(char *tbuf, int n)
>  {
>   const struct tcp_log *p
>   = tcp_probe.log + tcp_probe.tail;
> - struct timespec tv
> - = ktime_to_timespec(ktime_sub(p->tstamp, tcp_probe.start));
> + struct timespec64 ts
> + = ktime_to_timespec64(ktime_sub(p->tstamp, tcp_probe.start));
>  
>   return scnprintf(tbuf, n,
>   "%lu.%09lu %pISpc %pISpc %d %#x %#x %u %u %u %u %u\n",
> - (unsigned long)tv.tv_sec,
> - (unsigned long)tv.tv_nsec,
> + (unsigned long)ts.tv_sec,
> + (unsigned long)ts.tv_nsec,
>   &p->src, &p->dst, p->length, p->snd_nxt, p->snd_una,
>   p->snd_cwnd, p->ssthresh, p->snd_wnd, p->srtt, p->rcv_wnd);
>  }
> 

-- 
Hideaki Yoshifuji 
Technical Division, MIRACLE LINUX CORPORATION


[PATCH v19 04/10] bpf: Mark __bpf_prog_run() stack frame as non-standard

2016-02-28 Thread Josh Poimboeuf
objtool reports the following false positive warnings:

  kernel/bpf/core.o: warning: objtool: __bpf_prog_run()+0x5c: sibling call from 
callable instruction with changed frame pointer
  kernel/bpf/core.o: warning: objtool: __bpf_prog_run()+0x60: function has 
unreachable instruction
  kernel/bpf/core.o: warning: objtool: __bpf_prog_run()+0x64: function has 
unreachable instruction
  [...]

It's confused by the following dynamic jump instruction in
__bpf_prog_run()::

  jmp *(%r12,%rax,8)

which corresponds to the following line in the C code:

  goto *jumptable[insn->code];

There's no way for objtool to deterministically find all possible
branch targets for a dynamic jump, so it can't verify this code.

In this case the jumps all stay within the function, and there's nothing
unusual going on related to the stack, so we can whitelist the function.

Signed-off-by: Josh Poimboeuf 
Acked-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Cc: netdev@vger.kernel.org
---
 kernel/bpf/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 972d9a8..be0abf6 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -649,6 +650,7 @@ load_byte:
WARN_RATELIMIT(1, "unknown opcode %02x\n", insn->code);
return 0;
 }
+STACK_FRAME_NON_STANDARD(__bpf_prog_run); /* jump table */
 
 bool bpf_prog_array_compatible(struct bpf_array *array,
   const struct bpf_prog *fp)
-- 
2.4.3
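
For readers unfamiliar with the construct behind `goto *jumptable[insn->code]`, here is a minimal sketch of jump-table dispatch using the GCC/Clang "labels as values" extension. The opcodes and the run() function are hypothetical and far simpler than the BPF interpreter; the point is only that every target of the indirect jump is a label inside the same function, which is the property the commit message relies on when whitelisting __bpf_prog_run().

```c
/* Hypothetical three-instruction machine, for illustration only. */
enum { OP_INC, OP_DBL, OP_HALT, OP_MAX };

/*
 * Run a program and return the accumulator. Every opcode in code[]
 * must be below OP_MAX and the program must end with OP_HALT.
 */
static int run(const unsigned char *code, int acc)
{
	/* Addresses of local labels; requires GCC or Clang. */
	static const void *const jumptable[OP_MAX] = {
		[OP_INC]  = &&do_inc,
		[OP_DBL]  = &&do_dbl,
		[OP_HALT] = &&do_halt,
	};
	const unsigned char *insn = code;

	goto *jumptable[*insn];

do_inc:
	acc += 1;
	goto *jumptable[*++insn];
do_dbl:
	acc *= 2;
	goto *jumptable[*++insn];
do_halt:
	return acc;
}
```

A compiler typically lowers each `goto *` into an indirect jump through a register, which is why a static checker cannot enumerate the branch targets on its own.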



Re: [PATCH v2 1/3] net: ipv4: Convert IP network timestamps to be y2038 safe

2016-02-28 Thread YOSHIFUJI Hideaki


Deepa Dinamani wrote:
> ICMP timestamp messages and IP source route options require
> timestamps to be in milliseconds modulo 24 hours from
> midnight UT format.
> 
> Add inet_current_timestamp() function to support this. The function
> returns the required timestamp in network byte order.
> 
> Timestamp calculation is also changed to call ktime_get_real_ts64()
> which uses struct timespec64. struct timespec64 is y2038 safe.
> Previously it called getnstimeofday() which uses struct timespec.
> struct timespec is not y2038 safe.
> 
> Signed-off-by: Deepa Dinamani 

Acked-by: YOSHIFUJI Hideaki 

--yoshfuji

> Cc: "David S. Miller" 
> Cc: Alexey Kuznetsov 
> Cc: Hideaki YOSHIFUJI 
> Cc: James Morris 
> Cc: Patrick McHardy 
> ---
>  include/net/ip.h  |  2 ++
>  net/ipv4/af_inet.c| 26 ++
>  net/ipv4/icmp.c   |  5 +
>  net/ipv4/ip_options.c | 14 ++
>  4 files changed, 35 insertions(+), 12 deletions(-)
> 
> diff --git a/include/net/ip.h b/include/net/ip.h
> index 1a98f1c..5d3a9eb 100644
> --- a/include/net/ip.h
> +++ b/include/net/ip.h
> @@ -240,6 +240,8 @@ static inline int inet_is_local_reserved_port(struct net 
> *net, int port)
>  }
>  #endif
>  
> +__be32 inet_current_timestamp(void);
> +
>  /* From inetpeer.c */
>  extern int inet_peer_threshold;
>  extern int inet_peer_minttl;
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index eade66d..408e2b3 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -1386,6 +1386,32 @@ out:
>   return pp;
>  }
>  
> +#define SECONDS_PER_DAY  86400
> +
> +/* inet_current_timestamp - Return IP network timestamp
> + *
> + * Return milliseconds since midnight in network byte order.
> + */
> +__be32 inet_current_timestamp(void)
> +{
> + u32 secs;
> + u32 msecs;
> + struct timespec64 ts;
> +
> + ktime_get_real_ts64(&ts);
> +
> + /* Get secs since midnight. */
> + (void)div_u64_rem(ts.tv_sec, SECONDS_PER_DAY, &secs);
> + /* Convert to msecs. */
> + msecs = secs * MSEC_PER_SEC;
> + /* Convert nsec to msec. */
> + msecs += (u32)ts.tv_nsec / NSEC_PER_MSEC;
> +
> + /* Convert to network byte order. */
> + return htonl(msecs);
> +}
> +EXPORT_SYMBOL(inet_current_timestamp);
> +
>  int inet_recv_error(struct sock *sk, struct msghdr *msg, int len, int 
> *addr_len)
>  {
>   if (sk->sk_family == AF_INET)
> diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
> index 36e2697..6333489 100644
> --- a/net/ipv4/icmp.c
> +++ b/net/ipv4/icmp.c
> @@ -931,7 +931,6 @@ static bool icmp_echo(struct sk_buff *skb)
>   */
>  static bool icmp_timestamp(struct sk_buff *skb)
>  {
> - struct timespec tv;
>   struct icmp_bxm icmp_param;
>   /*
>*  Too short.
> @@ -942,9 +941,7 @@ static bool icmp_timestamp(struct sk_buff *skb)
>   /*
>*  Fill in the current time as ms since midnight UT:
>*/
> - getnstimeofday(&tv);
> - icmp_param.data.times[1] = htonl((tv.tv_sec % 86400) * MSEC_PER_SEC +
> -  tv.tv_nsec / NSEC_PER_MSEC);
> + icmp_param.data.times[1] = inet_current_timestamp();
>   icmp_param.data.times[2] = icmp_param.data.times[1];
>   if (skb_copy_bits(skb, 0, &icmp_param.data.times[0], 4))
>   BUG();
> diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
> index bd24679..4d158ff 100644
> --- a/net/ipv4/ip_options.c
> +++ b/net/ipv4/ip_options.c
> @@ -58,10 +58,9 @@ void ip_options_build(struct sk_buff *skb, struct 
> ip_options *opt,
>   if (opt->ts_needaddr)
>   ip_rt_get_source(iph+opt->ts+iph[opt->ts+2]-9, skb, rt);
>   if (opt->ts_needtime) {
> - struct timespec tv;
>   __be32 midtime;
> - getnstimeofday(&tv);
> - midtime = htonl((tv.tv_sec % 86400) * MSEC_PER_SEC + tv.tv_nsec / NSEC_PER_MSEC);
> +
> + midtime = inet_current_timestamp();
>   memcpy(iph+opt->ts+iph[opt->ts+2]-5, &midtime, 4);
>   }
>   return;
> @@ -415,11 +414,10 @@ int ip_options_compile(struct net *net,
>   break;
>   }
>   if (timeptr) {
> - struct timespec tv;
> - u32  midtime;
> - getnstimeofday(&tv);
> - midtime = (tv.tv_sec % 86400) * MSEC_PER_SEC + tv.tv_nsec / NSEC_PER_MSEC;
> - put_unaligned_be32(midtime, timeptr);
> + __be32 midtime;
> +
> + midtime = inet_current_timestamp();
> + 
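
The timestamp format described in the commit message (milliseconds since midnight UT, in network byte order) can be tried out in user space. The sketch below is not the kernel function: msecs_since_midnight() and net_timestamp() are hypothetical names, the time is passed in as arguments instead of being read from a clock, and the seconds value is assumed to be non-negative. The largest possible value is 86,399,999, which does not fit in 16 bits, so the sketch uses the 32-bit htonl() for the byte-order conversion.

```c
#include <stdint.h>
#include <arpa/inet.h>

#define SECONDS_PER_DAY	86400u
#define MSEC_PER_SEC	1000u
#define NSEC_PER_MSEC	1000000u

/* Host-order milliseconds since midnight UT for a given wall-clock time. */
static uint32_t msecs_since_midnight(uint64_t sec, uint32_t nsec)
{
	uint32_t secs = (uint32_t)(sec % SECONDS_PER_DAY);

	return secs * MSEC_PER_SEC + nsec / NSEC_PER_MSEC;
}

/* Same value converted to network byte order, ready to copy into a packet. */
static uint32_t net_timestamp(uint64_t sec, uint32_t nsec)
{
	return htonl(msecs_since_midnight(sec, nsec));
}
```

Taking the remainder before multiplying keeps the intermediate values inside 32 bits even when the seconds count itself needs 64.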

Re: [PATCH net] be2net: don't {en,dis}able filters on BE3 when transparent tagging is enabled

2016-02-28 Thread Sathya Perla
On Fri, Feb 26, 2016 at 6:43 PM, Ivan Vecera  wrote:
> Should the MULTICAST bit be masked in any be_cmd_rx_filter() call on BE3's
> VFs if the trans. tagging is enabled?

Not on any be_cmd_rx_filter() call, but on the first call in
be_open()->be_if_enable_filters() where the basic filtering flags are
being enabled and the driver is explicitly checking for the return
status. We have a fix for this already in our internal builds that
hasn't been upstreamed yet. Will send out a patch for this asap.
Thanks for the help on this Ivan!


Re: [PATCH RFC v2 18/32] dsa: mv88e6xxx: Prepare for turning this into a library module

2016-02-28 Thread Vivien Didelot
Hi Andrew,

Andrew Lunn  writes:

> Export all the functions so that we can later turn the module into a
> library module.

As I mentioned in the first RFC [1], wouldn't that be preferable to
avoid adding modules and factorize everything into a single one?

The common code would have a single probe function which calls
mv88e6xxx_lookup_name for every model tables until it finds the good
one.

For the rest of the code, a few additional switch family checks should
do the job. What do you think?

[1] https://patchwork.ozlabs.org/patch/560540/

Thanks,

-v
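
The single-probe idea above might look roughly like the following user-space sketch. Everything in it is hypothetical: the structures, the IDs, the names, and the probe_name() and lookup_name() helpers are made up for illustration and are not the mv88e6xxx driver's tables or its mv88e6xxx_lookup_name() signature.

```c
#include <stddef.h>

/* Hypothetical model table entry and per-family table. */
struct switch_id {
	unsigned int id;
	const char *name;
};

struct switch_family {
	const struct switch_id *table;
	size_t count;
};

static const struct switch_id family_a_ids[] = {
	{ 0x1110, "a-small" },
	{ 0x1120, "a-large" },
};

static const struct switch_id family_b_ids[] = {
	{ 0x2220, "b-only" },
};

static const struct switch_family families[] = {
	{ family_a_ids, sizeof(family_a_ids) / sizeof(family_a_ids[0]) },
	{ family_b_ids, sizeof(family_b_ids) / sizeof(family_b_ids[0]) },
};

/* Return the model name for id in one table, or NULL if it is absent. */
static const char *lookup_name(const struct switch_id *table, size_t count,
			       unsigned int id)
{
	for (size_t i = 0; i < count; i++)
		if (table[i].id == id)
			return table[i].name;
	return NULL;
}

/* Single probe: try every family table until one recognises the id. */
static const char *probe_name(unsigned int id,
			      const struct switch_family **family_out)
{
	size_t n = sizeof(families) / sizeof(families[0]);

	for (size_t i = 0; i < n; i++) {
		const char *name = lookup_name(families[i].table,
					       families[i].count, id);
		if (name) {
			*family_out = &families[i];
			return name;
		}
	}
	*family_out = NULL;
	return NULL;
}
```

The family pointer returned alongside the name is what the "additional switch family checks" elsewhere in the code would key on.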


linux-next: manual merge of the net-next tree with the wireless-drivers tree

2016-02-28 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  drivers/net/wireless/intel/iwlwifi/mvm/fw.c

between commit:

  905e36ae172c ("iwlwifi: mvm: Fix paging memory leak")

from the wireless-drivers tree and commit:

  43413a975d06 ("iwlwifi: mvm: support rss queues configuration command")

from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/net/wireless/intel/iwlwifi/mvm/fw.c
index 0ccc697fef76,070e2af05ca2..
--- a/drivers/net/wireless/intel/iwlwifi/mvm/fw.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/fw.c
@@@ -107,7 -108,25 +108,25 @@@ static int iwl_send_tx_ant_cfg(struct i
sizeof(tx_ant_cmd), &tx_ant_cmd);
  }
  
+ static int iwl_send_rss_cfg_cmd(struct iwl_mvm *mvm)
+ {
+   int i;
+   struct iwl_rss_config_cmd cmd = {
+   .flags = cpu_to_le32(IWL_RSS_ENABLE),
+   .hash_mask = IWL_RSS_HASH_TYPE_IPV4_TCP |
+IWL_RSS_HASH_TYPE_IPV4_PAYLOAD |
+IWL_RSS_HASH_TYPE_IPV6_TCP |
+IWL_RSS_HASH_TYPE_IPV6_PAYLOAD,
+   };
+ 
+   for (i = 0; i < ARRAY_SIZE(cmd.indirection_table); i++)
+   cmd.indirection_table[i] = i % mvm->trans->num_rx_queues;
+   memcpy(cmd.secret_key, mvm->secret_key, ARRAY_SIZE(cmd.secret_key));
+ 
+   return iwl_mvm_send_cmd_pdu(mvm, RSS_CONFIG_CMD, 0, sizeof(cmd), &cmd);
+ }
+ 
 -static void iwl_free_fw_paging(struct iwl_mvm *mvm)
 +void iwl_free_fw_paging(struct iwl_mvm *mvm)
  {
int i;
  


Re: [PATCH v4 net-next] net: Implement fast csum_partial for x86_64

2016-02-28 Thread Maciej W. Rozycki
On Sun, 28 Feb 2016, Alexander Duyck wrote:

> I actually found the root cause.  The problem is in add32_with_carry3.
> 
> > +static inline unsigned int add32_with_carry3(unsigned int a, unsigned int 
> > b,
> > +unsigned int c)
> > +{
> > +   asm("addl %2,%0\n\t"
> > +   "adcl %3,%0\n\t"
> > +   "adcl $0,%0"
> > +   : "=r" (a)
> > +   : "" (a), "rm" (b), "rm" (c));
> > +
> > +   return a;
> > +}
> > +
> 
> You need to set the 'a' input variable attribute to "0" instead of ""
> and then things work for me correctly.

 Or alternatively you can reduce the number of operands by one, by using 
`"+r" (a)' as output, and then removing `a' as a separate input and 
renumbering references to `b' and `c' accordingly.  It would IMHO actually 
better match the in-place operation as well.

 FWIW,

  Maciej
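
A sketch of the `"+r"` variant suggested above, written as a stand-alone function. This is an interpretation of the suggestion, not code from the thread: the operand renumbering, the explicit "cc" clobber, and the portable fallback for non-x86 builds are additions made here for illustration.

```c
#include <stdint.h>

/* Add three 32-bit values, folding each carry back into the sum. */
static inline unsigned int add32_with_carry3(unsigned int a, unsigned int b,
					     unsigned int c)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
	/* "+r" makes a both input and output, so b and c become %1 and %2. */
	__asm__("addl %1,%0\n\t"
		"adcl %2,%0\n\t"
		"adcl $0,%0"
		: "+r" (a)
		: "rm" (b), "rm" (c)
		: "cc");
	return a;
#else
	/* Fallback: sum in 64 bits, then fold the high half in twice. */
	uint64_t sum = (uint64_t)a + b + c;

	sum = (sum & 0xffffffffu) + (sum >> 32);
	sum = (sum & 0xffffffffu) + (sum >> 32);
	return (unsigned int)sum;
#endif
}
```

With a read-write operand the compiler knows the initial value of a is consumed, which is the piece of information the empty input constraint in the quoted patch failed to convey.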


Re: [PATCH v4 net-next] net: Implement fast csum_partial for x86_64

2016-02-28 Thread Alexander Duyck
On Sun, Feb 28, 2016 at 11:15 AM, Tom Herbert  wrote:
> On Sun, Feb 28, 2016 at 10:56 AM, Alexander Duyck
>  wrote:
>> On Sat, Feb 27, 2016 at 12:30 AM, Alexander Duyck
>>  wrote:
>>>> +{
>>>> +   asm("lea 40f(, %[slen], 4), %%r11\n\t"
>>>> +   "clc\n\t"
>>>> +   "jmpq *%%r11\n\t"
>>>> +   "adcq 7*8(%[src]),%[res]\n\t"
>>>> +   "adcq 6*8(%[src]),%[res]\n\t"
>>>> +   "adcq 5*8(%[src]),%[res]\n\t"
>>>> +   "adcq 4*8(%[src]),%[res]\n\t"
>>>> +   "adcq 3*8(%[src]),%[res]\n\t"
>>>> +   "adcq 2*8(%[src]),%[res]\n\t"
>>>> +   "adcq 1*8(%[src]),%[res]\n\t"
>>>> +   "adcq 0*8(%[src]),%[res]\n\t"
>>>> +   "nop\n\t"
>>>> +   "40: adcq $0,%[res]"
>>>> +   : [res] "=r" (sum)
>>>> +   : [src] "r" (buff),
>>>> + [slen] "r" (-((unsigned long)(len >> 3))), "[res]" (sum)
>>>> +   : "r11");
>>>> +
>>>
>>> With this patch I cannot mix/match different length checksums without
>>> things failing.  In perf the jmpq in the loop above seems to be set to
>>> a fixed value so perhaps it is something in how the compiler is
>>> interpreting the inline assembler.
>>
>> The perf thing was a red herring.  Turns out the code is working
>> correctly there.
>>
>> I actually found the root cause.  The problem is in add32_with_carry3.
>>
> Thanks for the follow-up. btw are you trying to build csum_partial in
> userspace for testing, or was this all in kernel?

It was in the kernel.  I have been doing some user space work, but all of
the problems I was having were in the kernel.  My guess is that the
original sum value wasn't being used.

- Alex


[PATCH RFC 1/2] rhashtable: accept GFP flags in rhashtable_walk_init

2016-02-28 Thread Bob Copeland
In certain cases, the 802.11 mesh pathtable code wants to
iterate over all of the entries in the forwarding table from
the receive path, which is inside an RCU read-side critical
section.  Enable walks inside atomic sections by allowing
GFP_ATOMIC allocations for the walker state.

Change all existing callsites to pass in GFP_KERNEL.

Cc: Thomas Graf 
Cc: netdev@vger.kernel.org
Signed-off-by: Bob Copeland 
---
 include/linux/rhashtable.h | 3 ++-
 lib/rhashtable.c   | 6 --
 net/ipv6/ila/ila_xlat.c| 3 ++-
 net/netfilter/nft_hash.c   | 4 ++--
 net/netlink/af_netlink.c   | 3 ++-
 net/sctp/proc.c| 3 ++-
 6 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/include/linux/rhashtable.h b/include/linux/rhashtable.h
index 63bd7601b6de..3eef0802a0cd 100644
--- a/include/linux/rhashtable.h
+++ b/include/linux/rhashtable.h
@@ -346,7 +346,8 @@ struct bucket_table *rhashtable_insert_slow(struct 
rhashtable *ht,
struct bucket_table *old_tbl);
 int rhashtable_insert_rehash(struct rhashtable *ht, struct bucket_table *tbl);
 
-int rhashtable_walk_init(struct rhashtable *ht, struct rhashtable_iter *iter);
+int rhashtable_walk_init(struct rhashtable *ht, struct rhashtable_iter *iter,
+gfp_t gfp);
 void rhashtable_walk_exit(struct rhashtable_iter *iter);
 int rhashtable_walk_start(struct rhashtable_iter *iter) __acquires(RCU);
 void *rhashtable_walk_next(struct rhashtable_iter *iter);
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index cc808707d1cf..5d845ffd7982 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -487,6 +487,7 @@ EXPORT_SYMBOL_GPL(rhashtable_insert_slow);
  * rhashtable_walk_init - Initialise an iterator
  * @ht:Table to walk over
  * @iter:  Hash table Iterator
+ * @gfp:   GFP flags for allocations
  *
  * This function prepares a hash table walk.
  *
@@ -504,14 +505,15 @@ EXPORT_SYMBOL_GPL(rhashtable_insert_slow);
  * You must call rhashtable_walk_exit if this function returns
  * successfully.
  */
-int rhashtable_walk_init(struct rhashtable *ht, struct rhashtable_iter *iter)
+int rhashtable_walk_init(struct rhashtable *ht, struct rhashtable_iter *iter,
+gfp_t gfp)
 {
iter->ht = ht;
iter->p = NULL;
iter->slot = 0;
iter->skip = 0;
 
-   iter->walker = kmalloc(sizeof(*iter->walker), GFP_KERNEL);
+   iter->walker = kmalloc(sizeof(*iter->walker), gfp);
if (!iter->walker)
return -ENOMEM;
 
diff --git a/net/ipv6/ila/ila_xlat.c b/net/ipv6/ila/ila_xlat.c
index 295ca29a23c3..0b03533453e4 100644
--- a/net/ipv6/ila/ila_xlat.c
+++ b/net/ipv6/ila/ila_xlat.c
@@ -501,7 +501,8 @@ static int ila_nl_dump_start(struct netlink_callback *cb)
struct ila_net *ilan = net_generic(net, ila_net_id);
struct ila_dump_iter *iter = (struct ila_dump_iter *)cb->args;
 
-   return rhashtable_walk_init(&ilan->rhash_table, &iter->rhiter);
+   return rhashtable_walk_init(&ilan->rhash_table, &iter->rhiter,
+   GFP_KERNEL);
 }
 
 static int ila_nl_dump_done(struct netlink_callback *cb)
diff --git a/net/netfilter/nft_hash.c b/net/netfilter/nft_hash.c
index 3f9d45d3d9b7..6fa016564f90 100644
--- a/net/netfilter/nft_hash.c
+++ b/net/netfilter/nft_hash.c
@@ -192,7 +192,7 @@ static void nft_hash_walk(const struct nft_ctx *ctx, const 
struct nft_set *set,
u8 genmask = nft_genmask_cur(read_pnet(&set->pnet));
int err;
 
-   err = rhashtable_walk_init(&priv->ht, &hti);
+   err = rhashtable_walk_init(&priv->ht, &hti, GFP_KERNEL);
iter->err = err;
if (err)
return;
@@ -248,7 +248,7 @@ static void nft_hash_gc(struct work_struct *work)
priv = container_of(work, struct nft_hash, gc_work.work);
set  = nft_set_container_of(priv);
 
-   err = rhashtable_walk_init(&priv->ht, &hti);
+   err = rhashtable_walk_init(&priv->ht, &hti, GFP_KERNEL);
if (err)
goto schedule;
 
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index c8416792cce0..6e0cbdeb21d3 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2335,7 +2335,8 @@ static int netlink_walk_start(struct nl_seq_iter *iter)
 {
int err;
 
-   err = rhashtable_walk_init(&nl_table[iter->link].hash, &iter->hti);
+   err = rhashtable_walk_init(&nl_table[iter->link].hash, &iter->hti,
+  GFP_KERNEL);
if (err) {
iter->link = MAX_LINKS;
return err;
diff --git a/net/sctp/proc.c b/net/sctp/proc.c
index cfc3c7101a38..c5991e5e5daf 100644
--- a/net/sctp/proc.c
+++ b/net/sctp/proc.c
@@ -319,7 +319,8 @@ static int sctp_transport_walk_start(struct seq_file *seq)
struct sctp_ht_iter *iter = seq->private;
int err;
 
-   err = rhashtable_walk_init(&sctp_transport_hashtable, &iter->hti);
+   err = rhashtable_walk_init(&sctp_transport_hashtable, &iter->hti,
+

[PATCH RFC 2/2] mac80211: mesh: convert path table to rhashtable

2016-02-28 Thread Bob Copeland
In the time since the mesh path table was implemented as an
RCU-traversable, dynamically growing hash table, a generic RCU
hashtable implementation was added to the kernel.

Switch the mesh path table over to rhashtable to remove some code
and also gain some features like automatic shrinking.

Cc: Thomas Graf 
Cc: netdev@vger.kernel.org
Signed-off-by: Bob Copeland 
---
 net/mac80211/ieee80211_i.h  |  11 +-
 net/mac80211/mesh.c |   6 -
 net/mac80211/mesh.h |  31 +-
 net/mac80211/mesh_pathtbl.c | 786 ++--
 4 files changed, 259 insertions(+), 575 deletions(-)

diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index 597f1b6260f4..a2857cd6d479 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -697,17 +697,10 @@ struct ieee80211_if_mesh {
/* offset from skb->data while building IE */
int meshconf_offset;
 
-   struct mesh_table __rcu *mesh_paths;
-   struct mesh_table __rcu *mpp_paths; /* Store paths for MPP */
+   struct mesh_table *mesh_paths;
+   struct mesh_table *mpp_paths; /* Store paths for MPP */
int mesh_paths_generation;
int mpp_paths_generation;
-
-   /* Protects assignment of the mesh_paths/mpp_paths table
-* pointer for resize against reading it for add/delete
-* of individual paths.  Pure readers (lookups) just use
-* RCU.
-*/
-   rwlock_t pathtbl_resize_lock;
 };
 
 #ifdef CONFIG_MAC80211_MESH
diff --git a/net/mac80211/mesh.c b/net/mac80211/mesh.c
index c92af2a7714d..a216c439b6f2 100644
--- a/net/mac80211/mesh.c
+++ b/net/mac80211/mesh.c
@@ -1347,12 +1347,6 @@ void ieee80211_mesh_work(struct ieee80211_sub_if_data *sdata)
   ifmsh->last_preq + msecs_to_jiffies(ifmsh->mshcfg.dot11MeshHWMPpreqMinInterval)))
mesh_path_start_discovery(sdata);
 
-   if (test_and_clear_bit(MESH_WORK_GROW_MPATH_TABLE, &ifmsh->wrkq_flags))
-   mesh_mpath_table_grow(sdata);
-
-   if (test_and_clear_bit(MESH_WORK_GROW_MPP_TABLE, &ifmsh->wrkq_flags))
-   mesh_mpp_table_grow(sdata);
-
if (test_and_clear_bit(MESH_WORK_HOUSEKEEPING, &ifmsh->wrkq_flags))
ieee80211_mesh_housekeeping(sdata);
 
diff --git a/net/mac80211/mesh.h b/net/mac80211/mesh.h
index f3cc3917e048..cc6854db156e 100644
--- a/net/mac80211/mesh.h
+++ b/net/mac80211/mesh.h
@@ -51,10 +51,6 @@ enum mesh_path_flags {
  *
  *
  * @MESH_WORK_HOUSEKEEPING: run the periodic mesh housekeeping tasks
- * @MESH_WORK_GROW_MPATH_TABLE: the mesh path table is full and needs
- * to grow.
- * @MESH_WORK_GROW_MPP_TABLE: the mesh portals table is full and needs to
- * grow
  * @MESH_WORK_ROOT: the mesh root station needs to send a frame
  * @MESH_WORK_DRIFT_ADJUST: time to compensate for clock drift relative to other
  * mesh nodes
@@ -62,8 +58,6 @@ enum mesh_path_flags {
  */
 enum mesh_deferred_task_flags {
MESH_WORK_HOUSEKEEPING,
-   MESH_WORK_GROW_MPATH_TABLE,
-   MESH_WORK_GROW_MPP_TABLE,
MESH_WORK_ROOT,
MESH_WORK_DRIFT_ADJUST,
MESH_WORK_MBSS_CHANGED,
@@ -105,6 +99,7 @@ enum mesh_deferred_task_flags {
 struct mesh_path {
u8 dst[ETH_ALEN];
u8 mpp[ETH_ALEN];   /* used for MPP or MAP */
+   struct rhash_head rhash;
struct hlist_node gate_list;
struct ieee80211_sub_if_data *sdata;
struct sta_info __rcu *next_hop;
@@ -129,34 +124,17 @@ struct mesh_path {
 /**
  * struct mesh_table
  *
- * @hash_buckets: array of hash buckets of the table
- * @hashwlock: array of locks to protect write operations, one per bucket
- * @hash_mask: 2^size_order - 1, used to compute hash idx
- * @hash_rnd: random value used for hash computations
  * @entries: number of entries in the table
- * @free_node: function to free nodes of the table
- * @copy_node: function to copy nodes of the table
- * @size_order: determines size of the table, there will be 2^size_order hash
- * buckets
  * @known_gates: list of known mesh gates and their mpaths by the station. The
  * gate's mpath may or may not be resolved and active.
- *
- * rcu_head: RCU head to free the table
+ * @rhead: the rhashtable containing struct mesh_paths, keyed by dest addr
  */
 struct mesh_table {
-   /* Number of buckets will be 2^N */
-   struct hlist_head *hash_buckets;
-   spinlock_t *hashwlock;  /* One per bucket, for add/del */
-   unsigned int hash_mask; /* (2^size_order) - 1 */
-   __u32 hash_rnd; /* Used for hash generation */
atomic_t entries;   /* Up to MAX_MESH_NEIGHBOURS */
-   void (*free_node) (struct hlist_node *p, bool free_leafs);
-   int (*copy_node) (struct hlist_node *p, struct mesh_table *newtbl);
-   int size_order;
struct hlist_head *known_gates;
spinlock_t gates_lock;
 
-   struct rcu_head rcu_head;
+   struct rhashtable rhead;
 

Re: [net-next][PATCH v2 01/13] RDS: Drop stale iWARP RDMA transport

2016-02-28 Thread santosh.shilim...@oracle.com

On 2/28/16 11:51 AM, Or Gerlitz wrote:

On Sun, Feb 28, 2016 at 4:19 AM, Santosh Shilimkar
 wrote:

RDS iWarp support code has become stale and non-testable. As
indicated earlier, I am dropping the support for it.

If new iWarp user(s) show up in future, we can adapt the RDS IB
transport for the special RDMA READ sink case. iWarp needs an MR
for the RDMA READ sink.

Signed-off-by: Santosh Shilimkar 
Signed-off-by: Santosh Shilimkar 


Hi, just wondered if there's any special reason that all this series
carries double S.O.B signature line with your name appearing twice on
two different email addresses?


Nothing special. I sign off all my patches with my k.org ID and have to
keep my Oracle ID as well, being a paid Oracle employee. ;-)

Regards,
Santosh


Re: [PATCH V3 3/3] vhost_net: basic polling support

2016-02-28 Thread Christian Borntraeger
On 02/26/2016 09:42 AM, Jason Wang wrote:
> This patch tries to poll for newly added tx buffers or the socket receive
> queue for a while at the end of tx/rx processing. The maximum time
> spent on polling is specified through a new kind of vring ioctl.
> 
> Signed-off-by: Jason Wang 
> ---
>  drivers/vhost/net.c| 79 +++---
>  drivers/vhost/vhost.c  | 14 
>  drivers/vhost/vhost.h  |  1 +
>  include/uapi/linux/vhost.h |  6 
>  4 files changed, 95 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 9eda69e..c91af93 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -287,6 +287,44 @@ static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success)
>   rcu_read_unlock_bh();
>  }
> 
> +static inline unsigned long busy_clock(void)
> +{
> + return local_clock() >> 10;
> +}
> +
> +static bool vhost_can_busy_poll(struct vhost_dev *dev,
> + unsigned long endtime)
> +{
> + return likely(!need_resched()) &&
> +likely(!time_after(busy_clock(), endtime)) &&
> +likely(!signal_pending(current)) &&
> +!vhost_has_work(dev) &&
> +single_task_running();
> +}
> +
> +static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
> + struct vhost_virtqueue *vq,
> + struct iovec iov[], unsigned int iov_size,
> + unsigned int *out_num, unsigned int *in_num)
> +{
> + unsigned long uninitialized_var(endtime);
> + int r = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
> + out_num, in_num, NULL, NULL);
> +
> + if (r == vq->num && vq->busyloop_timeout) {
> + preempt_disable();
> + endtime = busy_clock() + vq->busyloop_timeout;
> + while (vhost_can_busy_poll(vq->dev, endtime) &&
> +vhost_vq_avail_empty(vq->dev, vq))
> + cpu_relax();


Can you use cpu_relax_lowlatency() here (which should be the same as
cpu_relax() for almost everybody but s390)? Plain cpu_relax() (without low
latency) might give up the time slice when running under another hypervisor
(like LPAR on s390), which might not be what we want here.



[...] 
> +static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk)
> +{
> + struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
> + struct vhost_virtqueue *vq = &nvq->vq;
> + unsigned long uninitialized_var(endtime);
> + int len = peek_head_len(sk);
> +
> + if (!len && vq->busyloop_timeout) {
> + /* Both tx vq and rx socket were polled here */
> + mutex_lock(&vq->mutex);
> + vhost_disable_notify(&net->dev, vq);
> +
> + preempt_disable();
> + endtime = busy_clock() + vq->busyloop_timeout;
> +
> + while (vhost_can_busy_poll(&net->dev, endtime) &&
> +skb_queue_empty(&sk->sk_receive_queue) &&
> +vhost_vq_avail_empty(&net->dev, vq))
> + cpu_relax();

here as well.
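
For anyone following the deadline logic in the quoted patch, here is a minimal user-space sketch of the two pieces it relies on: the wraparound-safe "is a later than b" comparison that time_after() performs, and the ">> 10" conversion in busy_clock(). The model_* names are made up for illustration and are not kernel symbols; the clock value is assumed to be in nanoseconds, as local_clock() returns.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the kernel's time_after(a, b): true when a is
 * later than b, even if the counter wrapped around in between. */
static int model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Mirrors busy_clock() from the quoted patch: nanoseconds >> 10, i.e.
 * units of 1024 ns (roughly microseconds). */
static unsigned long model_busy_clock(unsigned long long now_ns)
{
	return (unsigned long)(now_ns >> 10);
}

/* Deadline part of vhost_can_busy_poll() only; the need_resched(),
 * signal_pending() and vhost_has_work() conditions are left out because
 * they have no user-space equivalent in this sketch. */
static int model_can_keep_polling(unsigned long long now_ns,
				  unsigned long endtime)
{
	return !model_time_after(model_busy_clock(now_ns), endtime);
}
```

Polling continues while the shifted clock has not passed endtime, so a timeout of N behaves as roughly N microseconds.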



Re: [PATCH v4 net-next] net: Implement fast csum_partial for x86_64

2016-02-28 Thread George Spelvin
I was just noticing that these two:

> +static inline unsigned long add64_with_carry(unsigned long a, unsigned long b)
> +{
> +   asm("addq %2,%0\n\t"
> +   "adcq $0,%0"
> +   : "=r" (a)
> +   : "0" (a), "rm" (b));
> +   return a;
> +}
> +
> +static inline unsigned int add32_with_carry3(unsigned int a, unsigned int b,
> +unsigned int c)
> +{
> +   asm("addl %2,%0\n\t"
> +   "adcl %3,%0\n\t"
> +   "adcl $0,%0"
> +   : "=r" (a)
> +   : "" (a), "rm" (b), "rm" (c));
> +
> +   return a;
> +}

Could use some additional GCC asm wizardry.

There are a few lesser-known inline asm features which
could be brought to bear here:

1) The "%" modifier, meaning "operation is commutative",
2) Multiple alternatives, and the "?" modifier, and
3) The earlyclobber modifier "&".


For the first, I'd make it

> +   asm("addq %2,%0\n\t"
> +   "adcq $0,%0"
> +   : "=%a,r?" (a)
> +   : "%0,0" (a), "g,g" (b));

I also switched "rm" to "g" (general operand), since it's more compact,
and an immediate operand is technically allowed.

By including the "%", this tells GCC that swapping the a and b inputs is
fine if that helps register allocation.

The comma separates two alternative sets of constraints.  "a,r?" says
there are two options: using register %eax, and a second using any
register which is slightly worse (larger code).  The later "g,g" shows
the corresponding constraints for b.

Technically, there's a third option and you could do something like

> +   : "=a,r?,rm!" (a)
> +   : "%0,0,0" (a), "g,g,r" (b));

Where if b is a register, then a could be a memory location (and the !
means "avoid unless it saves a spill"), but that's probably not even worth
telling gcc about.


The three-input form can do the same.  There's a third feature we should add:
an "earlyclobber" notation on the output, since otherwise GCC will turn
add32_with_carry3(a, b, a)
into
addl %ebx,%eax
adcl %eax,%eax
adcl $0,%eax
... as unlikely as gcc is to find an opportunity for such an optimization.

What I'd *like* to write is

> +   asm("addl %2,%0\n\t"
> +   "adcl %3,%0\n\t"
> +   "adcl $0,%0"
> +   : "=&a,&r?" (a)
> +   : "%0,0" (a), "%g,g" (b), "g,g" (c));

... but TFM explains "GCC can only handle one commutative pair in an asm;
if you use more, the compiler may fail."  So I have to pick one.

Also, if the idiom is "add32_with_carry3(sum, result >> 32, result);", then
it would be better to add c then b to match the length of the
dependency chains.  I.e.

> +   asm("addl %2,%0\n\t"
> +   "adcl %3,%0\n\t"
> +   "adcl $0,%0"
> +   : "=&a,&r?" (a)
> +   : "%0,0" (a), "g,g" (c), "g,g" (b));

(I would have also switched to "+a,r?" constraints, but I'm not positive
how + interacts with %.)
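
As a rough, compilable illustration of the earlyclobber point, the sketch below uses the simplest single-alternative form: "=&r" on the output, the "0" matching constraint on the first input, and "rm" for the other two. It deliberately leaves out the "%" and multi-alternative constraints discussed above. The non-x86 branch is only a plain C model of the same three steps so the function can be built anywhere; it is an assumption of this sketch, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch only: a + b + c with end-around carry, as add32_with_carry3()
 * in the quoted patch is meant to compute. */
static inline uint32_t add32_with_carry3(uint32_t a, uint32_t b, uint32_t c)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
	/* "&" keeps b and c out of the output register, so a call such as
	 * add32_with_carry3(x, y, x) cannot have c overwritten by the
	 * first addl before the adcl reads it. */
	__asm__("addl %2,%0\n\t"
		"adcl %3,%0\n\t"
		"adcl $0,%0"
		: "=&r" (a)
		: "0" (a), "rm" (b), "rm" (c)
		: "cc");
	return a;
#else
	/* Portable model of the same sequence. */
	uint64_t t = (uint64_t)a + b;			/* addl */
	t = (t & 0xffffffffu) + (t >> 32) + c;		/* adcl c, plus carry */
	return (uint32_t)((t & 0xffffffffu) + (t >> 32));	/* adcl $0 */
#endif
}
```

Passing the same variable as both a and c is the case the earlyclobber protects.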


Re: [PATCH v4 net-next] net: Implement fast csum_partial for x86_64

2016-02-28 Thread Eric Dumazet
On ven., 2016-02-26 at 12:03 -0800, Tom Herbert wrote:
> +
> + /*
> +  * Length is greater than 64. Sum to eight byte alignment before
> +  * proceeding with main loop.
> +  */
> + aligned = !!((unsigned long)buff & 0x1);
> + if (aligned) {
> + unsigned int align = 7 & -(unsigned long)buff;
> +
> + result = csum_partial_lt8_head(*(unsigned long *)buff, align);
> + buff += align;
> + len -= align;
> + result = rotate_by8_if_odd(result, align);
> + }
> +

This looks like you wanted to test the 3 low-order bits, not only the
lowest one.

aligned = !((unsigned long)buff & 0x7);

if (!aligned) {
 ...
}

Or rename the variable to notaligned
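
To make the difference concrete, here is a small user-space sketch of the two expressions involved: the 8-byte alignment test on the three low bits, and the "7 & -(unsigned long)buff" count of bytes up to the next 8-byte boundary. The helper names are hypothetical and uintptr_t stands in for the kernel's unsigned long cast.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when addr sits on an 8-byte boundary (tests the three low bits). */
static int is_aligned8(uintptr_t addr)
{
	return (addr & 0x7) == 0;
}

/* Bytes to consume before addr reaches the next 8-byte boundary; this is
 * the "7 & -(unsigned long)buff" expression from the quoted patch. */
static unsigned int bytes_to_align8(uintptr_t addr)
{
	return (unsigned int)(7 & -addr);
}
```

An address such as 0x1002 has its lowest bit clear, so a test on "& 0x1" alone would treat it as aligned even though six bytes remain before the boundary.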






Re: [net-next][PATCH v2 01/13] RDS: Drop stale iWARP RDMA transport

2016-02-28 Thread Or Gerlitz
On Sun, Feb 28, 2016 at 4:19 AM, Santosh Shilimkar
 wrote:
> RDS iWarp support code has become stale and non-testable. As
> indicated earlier, I am dropping the support for it.
>
> If new iWarp user(s) show up in future, we can adapt the RDS IB
> transport for the special RDMA READ sink case. iWarp needs an MR
> for the RDMA READ sink.
>
> Signed-off-by: Santosh Shilimkar 
> Signed-off-by: Santosh Shilimkar 

Hi, just wondered if there's any special reason that all this series
carries double S.O.B signature line with your name appearing twice on
two different email addresses?


[PATCH] of_mdio: fix kernel-doc for of_phy_connect()

2016-02-28 Thread Sergei Shtylyov
The 'flags' parameter of the of_phy_connect() function wasn't described
in the kernel-doc comment...

Signed-off-by: Sergei Shtylyov 

---
The patch is against DaveM's 'net.git' repo.

 drivers/of/of_mdio.c |1 +
 1 file changed, 1 insertion(+)

Index: net/drivers/of/of_mdio.c
===
--- net.orig/drivers/of/of_mdio.c
+++ net/drivers/of/of_mdio.c
@@ -305,6 +305,7 @@ EXPORT_SYMBOL(of_phy_find_device);
  * @dev: pointer to net_device claiming the phy
  * @phy_np: Pointer to device tree node for the PHY
  * @hndlr: Link state callback for the network device
+ * @flags: flags to pass to the PHY
  * @iface: PHY data interface type
  *
  * If successful, returns a pointer to the phy_device with the embedded



Re: [PATCH] socket.7: Document some BPF-related socket options

2016-02-28 Thread Michael Kerrisk (man-pages)
Hello Craig,

Thanks for putting this together. I have a few comments.
Would you please amend your patch and resend? (And include Alexei
in a "Reviewed-by" tag.)

On 02/25/2016 09:27 PM, Craig Gallek wrote:
> From: Craig Gallek 
> 
> Document the behavior and the first kernel version for each of the
> following socket options:
> SO_ATTACH_FILTER
> SO_ATTACH_BPF
> SO_ATTACH_REUSEPORT_CBPF
> SO_ATTACH_REUSEPORT_EBPF
> SO_DETACH_FILTER
> SO_DETACH_BPF
> 
> Signed-off-by: Craig Gallek 
> ---
>  man7/socket.7 | 104 --
>  1 file changed, 86 insertions(+), 18 deletions(-)
> 
> diff --git a/man7/socket.7 b/man7/socket.7
> index db7cb8324dde..79b4f3158541 100644
> --- a/man7/socket.7
> +++ b/man7/socket.7
> @@ -53,13 +53,6 @@
>  .\" SO_BPF_EXTENSIONS (3.14)
>  .\" commit ea02f9411d9faa3553ed09ce0ec9f00ceae9885e
>  .\"  Author: Michal Sekletar 
> -.\" SO_ATTACH_BPF (3.19)
> -.\" and SO_DETACH_BPF as synonym for SO_DETACH_FILTER
> -.\" commit 89aa075832b0da4402acebd698d0411dcc82d03e
> -.\"  Author: Alexei Starovoitov 
> -.\"  SO_ATTACH_REUSEPORT_CBPF, SO_ATTACH_REUSEPORT_EBPF (4.5)
> -.\"  commit 538950a1b7527a0a52ccd9337e3fcd304f027f13
> -.\"  Author: Craig Gallek 
>  .\"
>  .TH SOCKET 7 2015-05-07 Linux "Linux Programmer's Manual"
>  .SH NAME
> @@ -311,6 +304,80 @@ The value 0 indicates that this is not a listening socket,
>  the value 1 indicates that this is a listening socket.
>  This socket option is read-only.
>  .TP
> +.BR SO_ATTACH_FILTER " and " SO_ATTACH_BPF
> +Attach a classic or extended BPF program (respectively) to the socket
> +for use as a filter of incoming packets.  A packet will be dropped if
> +the filter returns zero or have its data truncated to the non-zero
> +length returned.  

I find that last sentence hard to parse. How about something like:

A packet will be dropped if the filter program returns zero or will 
have its data truncated to the non-zero length returned [returned by 
what? The filter? Make this clearer please.]

>If the value returned is greater or equal to the
> +packet's data length, the packet is allowed to proceed unmodified.
> +
> +The argument for
> +.BR SO_ATTACH_FILTER
> +is a
> +.I sock_fprog
> +structure in
> +.B <linux/filter.h>.
> +.sp
> +.in +4n
> +.nf
> +struct sock_fprog {
> +unsigned short  len;
> +struct sock_filter *filter;
> +};
> +.fi
> +.in
> +.IP
> +The argument for
> +.BR SO_ATTACH_BPF
> +is a file descriptor returned by the
> +.BR bpf (2)
> +system call and must represent a program of type

s/represent/refer to/

> +.BR BPF_PROG_TYPE_SOCKET_FILTER.
> +
> +.BR SO_ATTACH_FILTER
> +is available in Linux 2.2.

s/in/since/

> +.BR SO_ATTACH_BPF
> +is available in Linux 3.19.  Both classic and extended BPF are

s/in/since/

> +explained in the kernel source file
> +.I Documentation/networking/filter.txt

Presumably, it is not possible to attach multiple filters to a socket.
This should be stated explicitly somewhere here, as well as an
explanation of what happens if you try to add a filter to a socket
that already has one. Does it replace the existing filter, or does
an error result.

Seems like SOCK_FILTER_LOCKED also needs documenting here somewhere...

> +.TP
> +.BR SO_ATTACH_REUSEPORT_CBPF " and " SO_ATTACH_REUSEPORT_EBPF " (since Linux 4.5)"
> +For use with the
> +.BR SO_REUSEPORT
> +option, these options allow the user to define a classic or extended
> +BPF program (respectively) which defines how packets are assigned to
> +the sockets in the reuseport group.  The program must return an index

Is there some documentation on "reuseport groups" that we can refer
to here? If yes, please add a reference.

s/program/BPF program/

> +between 0 and N-1 representing the socket which should receive the
> +packet (where N is the number of sockets in the group). If the BPF
> +program returns an invalid index, socket selection will fall back to
> +the plain
> +.BR SO_REUSEPORT
> +mechanism.
> +
> +Sockets are numbered in the order in which they are added to the group
> +(that is, the order of
> +.BR bind (2)
> +calls for UDP sockets or the order of
> +.BR listen (2)
> +calls for TCP sockets).  New sockets added to the group will inherit
> +the program.  When a socket is removed from the group (via

s/program/BPF program/

s/the group/a reuseport group/

> +.BR close (2))
> +the last socket in the group will be moved into the closed socket's
> +position.

Wow! That's interesting behavior that seems like it could easily 
trip up users!
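
To illustrate why that can surprise users, here is a toy user-space model of the numbering described in the quoted text: sockets get indexes in the order they join, and on removal the last socket is moved into the vacated slot. This is only a model of the documented behavior with hypothetical names, not the kernel's data structure.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_SOCKS 8

/* Hypothetical model of a reuseport group: slot index -> socket id. */
struct model_group {
	int socks[MODEL_MAX_SOCKS];
	size_t num;
};

/* Append a socket; its index is the current group size. Returns the
 * index, or -1 if the model is full. */
static int model_group_add(struct model_group *g, int sock_id)
{
	if (g->num >= MODEL_MAX_SOCKS)
		return -1;
	g->socks[g->num] = sock_id;
	return (int)g->num++;
}

/* Remove the socket at idx by moving the last socket into its slot. */
static void model_group_remove(struct model_group *g, size_t idx)
{
	if (idx >= g->num)
		return;
	g->socks[idx] = g->socks[g->num - 1];
	g->num--;
}
```

After closing the socket at index 1 in a group of four, the socket that used to be at index 3 answers to index 1, so a BPF program that hard-codes index assignments would need to account for the move.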

> +
> +These options may be set repeatedly at any time on any single socket
> +in the group to replace the current BPF program used by all sockets in
> +the group.
> +.BR SO_ATTACH_REUSEPORT_CBPF
> +takes the same socket argument type as
> +.BR SO_ATTACH_FILTER
> +and
> +.BR 

Re: [PATCH v4 net-next] net: Implement fast csum_partial for x86_64

2016-02-28 Thread Tom Herbert
On Sun, Feb 28, 2016 at 10:56 AM, Alexander Duyck
 wrote:
> On Sat, Feb 27, 2016 at 12:30 AM, Alexander Duyck
>  wrote:
>>> +{
>>> +   asm("lea 40f(, %[slen], 4), %%r11\n\t"
>>> +   "clc\n\t"
>>> +   "jmpq *%%r11\n\t"
>>> +   "adcq 7*8(%[src]),%[res]\n\t"
>>> +   "adcq 6*8(%[src]),%[res]\n\t"
>>> +   "adcq 5*8(%[src]),%[res]\n\t"
>>> +   "adcq 4*8(%[src]),%[res]\n\t"
>>> +   "adcq 3*8(%[src]),%[res]\n\t"
>>> +   "adcq 2*8(%[src]),%[res]\n\t"
>>> +   "adcq 1*8(%[src]),%[res]\n\t"
>>> +   "adcq 0*8(%[src]),%[res]\n\t"
>>> +   "nop\n\t"
>>> +   "40: adcq $0,%[res]"
>>> +   : [res] "=r" (sum)
>>> +   : [src] "r" (buff),
>>> + [slen] "r" (-((unsigned long)(len >> 3))), "[res]" (sum)
>>> +   : "r11");
>>> +
>>
>> With this patch I cannot mix/match different length checksums without
>> things failing.  In perf the jmpq in the loop above seems to be set to
>> a fixed value so perhaps it is something in how the compiler is
>> interpreting the inline assembler.
>
> The perf thing was a red herring.  Turns out the code is working
> correctly there.
>
> I actually found the root cause.  The problem is in add32_with_carry3.
>
Thanks for the follow-up. btw are you trying to build csum_partial in
userspace for testing, or was this all in kernel?


>> +static inline unsigned int add32_with_carry3(unsigned int a, unsigned int b,
>> +unsigned int c)
>> +{
>> +   asm("addl %2,%0\n\t"
>> +   "adcl %3,%0\n\t"
>> +   "adcl $0,%0"
>> +   : "=r" (a)
>> +   : "" (a), "rm" (b), "rm" (c));
>> +
>> +   return a;
>> +}
>> +
>
> You need to set the 'a' input variable attribute to "0" instead of ""
> and then things work for me correctly.
>
> - Alex


Re: [PATCH v4 net-next] net: Implement fast csum_partial for x86_64

2016-02-28 Thread Alexander Duyck
On Sat, Feb 27, 2016 at 12:30 AM, Alexander Duyck
 wrote:
>> +{
>> +   asm("lea 40f(, %[slen], 4), %%r11\n\t"
>> +   "clc\n\t"
>> +   "jmpq *%%r11\n\t"
>> +   "adcq 7*8(%[src]),%[res]\n\t"
>> +   "adcq 6*8(%[src]),%[res]\n\t"
>> +   "adcq 5*8(%[src]),%[res]\n\t"
>> +   "adcq 4*8(%[src]),%[res]\n\t"
>> +   "adcq 3*8(%[src]),%[res]\n\t"
>> +   "adcq 2*8(%[src]),%[res]\n\t"
>> +   "adcq 1*8(%[src]),%[res]\n\t"
>> +   "adcq 0*8(%[src]),%[res]\n\t"
>> +   "nop\n\t"
>> +   "40: adcq $0,%[res]"
>> +   : [res] "=r" (sum)
>> +   : [src] "r" (buff),
>> + [slen] "r" (-((unsigned long)(len >> 3))), "[res]" (sum)
>> +   : "r11");
>> +
>
> With this patch I cannot mix/match different length checksums without
> things failing.  In perf the jmpq in the loop above seems to be set to
> a fixed value so perhaps it is something in how the compiler is
> interpreting the inline assembler.

The perf thing was a red herring.  Turns out the code is working
correctly there.

I actually found the root cause.  The problem is in add32_with_carry3.

> +static inline unsigned int add32_with_carry3(unsigned int a, unsigned int b,
> +unsigned int c)
> +{
> +   asm("addl %2,%0\n\t"
> +   "adcl %3,%0\n\t"
> +   "adcl $0,%0"
> +   : "=r" (a)
> +   : "" (a), "rm" (b), "rm" (c));
> +
> +   return a;
> +}
> +

You need to set the 'a' input variable attribute to "0" instead of ""
and then things work for me correctly.

- Alex


[PATCH] of_mdio: kill useless variable in of_mdiobus_register()

2016-02-28 Thread Sergei Shtylyov
of_mdiobus_register() declares the 'paddr' variable to hold the result of
of_get_property() but uses it only once, although the function can be
called directly from the *if* statement. Remove that variable and
switch to calling of_find_property() instead, since we don't care about
the "reg" property's value anyway...

Signed-off-by: Sergei Shtylyov 

---
The patch is against DaveM's 'net-next.git' repo.

drivers/of/of_mdio.c |4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Index: net-next/drivers/of/of_mdio.c
===
--- net-next.orig/drivers/of/of_mdio.c
+++ net-next/drivers/of/of_mdio.c
@@ -211,7 +211,6 @@ static bool of_mdiobus_child_is_phy(stru
 int of_mdiobus_register(struct mii_bus *mdio, struct device_node *np)
 {
struct device_node *child;
-   const __be32 *paddr;
bool scanphys = false;
int addr, rc;
 
@@ -246,8 +245,7 @@ int of_mdiobus_register(struct mii_bus *
/* auto scan for PHYs with empty reg property */
for_each_available_child_of_node(np, child) {
/* Skip PHYs with reg property set */
-   paddr = of_get_property(child, "reg", NULL);
-   if (paddr)
+   if (of_find_property(child, "reg", NULL))
continue;
 
for (addr = 0; addr < PHY_MAX_ADDR; addr++) {



Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-28 Thread Francois Romieu
Mike Galbraith  :
[...]
> Hrm, relatively new + tasklet woes rings a bell.  Ah, that..
> 
> 
> What's worse is that at the point where this code was written it was
> already well known that tasklets are a steaming pile of crap and
> should die.
> 
> 
> Source thereof https://lwn.net/Articles/588457/

tasklets are ingrained in the dmaengine API (see
Documentation/dmaengine/client.txt and
drivers/dma/virt-dma.h::vchan_cookie_complete).

Moving everything to irq context or handling his own sub-{jiffy/ms} timer
while losing async dma doesn't exactly smell like roses either. :o(

-- 
Ueimor


[PATCH RFC v2 00/32] Make DSA switches linux devices.

2016-02-28 Thread Andrew Lunn
This is the second RFC for rearchitecting how DSA devices are
probed. This patchset allows switches to be Linux devices, probed by
the usual mechanism for whatever bus they hang off. They then
register with the DSA core.

DSA has been limited to devices which hang off an MDIO bus, or with a
bit of work, memory mapped devices. This refactoring generalizes DSA
so that switches on other sorts of buses, e.g. SPI, can be supported.

The code should remain backwards compatible. The old device tree
bindings are still supported. They are extended with phandles to switch
devices.

The changes also make it easier for the drivers to be kernel modules,
and the patches contain cleanups and fixes so that the modules can be
unloaded and loaded.

Patches can be found in

https://github.com/lunn/linux.git v4.5-rc2-net-next-dsa-proposal-4


Andrew Lunn (30):
  dsa: Rename mv88e6123_61_65 to mv88e6123 to be consistent
  dsa: Make setup and finish more symmetrical
  net: dsa: Pass the dsa device to the switch drivers
  net: dsa: Have the switch driver allocate there own private memory
  net: dsa: Remove allocation of driver private memory
  net: dsa: Keep the mii bus and address in the private structure
  net: dsa: dsa.c: Refactor to increase symmetry
  driver: component: Add support for empty match table
  net: dsa: Add basic support for component master support
  net: dsa: Keep a reference to the switch device for component matching
  net: dsa: Add slave component matches based on a phandle to the slave.
  net: dsa: Make dsa,mii-bus optional
  net: dsa: Add register/unregister functions for switch drivers
  net: dsa: Rename DSA probe function.
  dsa: mv88e6xxx: Use bus in mv88e6xxx_lookup_name()
  dsa: mv88e6xxx: Add shared code for binding/unbinding a switch driver.
  dsa: mv88e6xxx: Prepare for turning this into a library module
  dsa: mv88e6xxx: Add macro for registering the drivers
  dsa: Add mdio device support to Marvell switches
  net: mdio: Add mdiodev_{read|write} helpers
  net: dsa: Better integrate the drivers with mdio device
  net: dsa: Add some debug prints for error cases
  net: dsa: Setup the switches after all have been probed
  net: dsa: Only setup platform switches, not device switches
  net: dsa: If a switch fails to probe, defer probing
  Documentation: DSA: Describe how probe of DSA and switches work.
  dsa: slave: Don't reference NULL pointer during phy_disconnect
  dsa: Destroy fixed link phys after the phy has been disconnected
  dsa: dsa: Fix freeing of fixed-phys from user ports.
  phy: fixed: Fix removal of phys.

Florian Fainelli (2):
  net: dsa: Move platform data allocation for OF
  net: dsa: bcm_sf2: make it a real platform driver

 .../devicetree/bindings/net/dsa/broadcom.txt   |  54 +++
 Documentation/devicetree/bindings/net/dsa/dsa.txt  |   5 +-
 .../devicetree/bindings/net/dsa/marvell.txt|  29 ++
 Documentation/networking/dsa/dsa.txt   |  48 +++
 drivers/base/component.c   |  33 +-
 drivers/net/dsa/Kconfig|   2 +-
 drivers/net/dsa/Makefile   |  19 +-
 drivers/net/dsa/bcm_sf2.c  | 292 +---
 drivers/net/dsa/mv88e6060.c| 138 +++-
 drivers/net/dsa/mv88e6060.h|  10 +
 drivers/net/dsa/mv88e6123.c| 173 ++
 drivers/net/dsa/mv88e6123_61_65.c  | 124 ---
 drivers/net/dsa/mv88e6131.c|  70 +++-
 drivers/net/dsa/mv88e6171.c|  70 +++-
 drivers/net/dsa/mv88e6352.c|  72 +++-
 drivers/net/dsa/mv88e6xxx.c| 200 +++
 drivers/net/dsa/mv88e6xxx.h|  40 ++-
 drivers/net/phy/fixed_phy.c|  10 +-
 drivers/net/phy/mdio_device.c  |  68 
 include/linux/mdio.h   |   5 +
 include/linux/phy_fixed.h  |   2 +-
 include/net/dsa.h  |  17 +-
 net/dsa/dsa.c  | 372 ++---
 net/dsa/slave.c|  12 +-
 24 files changed, 1339 insertions(+), 526 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/net/dsa/broadcom.txt
 create mode 100644 Documentation/devicetree/bindings/net/dsa/marvell.txt
 create mode 100644 drivers/net/dsa/mv88e6123.c
 delete mode 100644 drivers/net/dsa/mv88e6123_61_65.c

-- 
2.7.0



[PATCH RFC v2 21/32] net: mdio: Add mdiodev_{read|write} helpers

2016-02-28 Thread Andrew Lunn
An mdio driver will want to perform reads/writes on its device.
Provide an abstraction over the mdiobus functions, which require more
knowledge of bus addresses.

Signed-off-by: Andrew Lunn 
---
 drivers/net/phy/mdio_device.c | 68 +++
 include/linux/mdio.h  |  5 
 2 files changed, 73 insertions(+)

diff --git a/drivers/net/phy/mdio_device.c b/drivers/net/phy/mdio_device.c
index 9c88e6749b9a..5bf6a2b93773 100644
--- a/drivers/net/phy/mdio_device.c
+++ b/drivers/net/phy/mdio_device.c
@@ -23,6 +23,74 @@
 #include 
 #include 
 
+/**
+ * mdiodev_read_nested - Nested version of the mdiodev_read function
+ * @mdiodev: the mdio device to read
+ * @regnum: register number to read
+ *
+ * In case of nested MDIO bus access avoid lockdep false positives by
+ * using mutex_lock_nested().
+ *
+ * NOTE: MUST NOT be called from interrupt context,
+ * because the bus read/write functions may wait for an interrupt
+ * to conclude the operation.
+ */
+int mdiodev_read_nested(struct mdio_device *mdiodev, u32 regnum)
+{
+   return mdiobus_read_nested(mdiodev->bus, mdiodev->addr, regnum);
+}
+EXPORT_SYMBOL(mdiodev_read_nested);
+
+/**
+ * mdiodev_read - Convenience function for reading a given MII mgmt register
+ * @mdiodev: the mdio device to read
+ * @regnum: register number to read
+ *
+ * NOTE: MUST NOT be called from interrupt context,
+ * because the bus read/write functions may wait for an interrupt
+ * to conclude the operation.
+ */
+int mdiodev_read(struct mdio_device *mdiodev, u32 regnum)
+{
+   return mdiobus_read(mdiodev->bus, mdiodev->addr, regnum);
+}
+EXPORT_SYMBOL(mdiodev_read);
+
+/**
+ * mdiodev_write_nested - Nested version of the mdiodev_write function
+ * @mdiodev: the mdio device to write
+ * @regnum: register number to write
+ * @val: value to write to @regnum
+ *
+ * In case of nested MDIO bus access avoid lockdep false positives by
+ * using mutex_lock_nested().
+ *
+ * NOTE: MUST NOT be called from interrupt context,
+ * because the bus read/write functions may wait for an interrupt
+ * to conclude the operation.
+ */
+int mdiodev_write_nested(struct mdio_device *mdiodev, u32 regnum, u16 val)
+{
+   return mdiobus_write_nested(mdiodev->bus, mdiodev->addr, regnum, val);
+}
+EXPORT_SYMBOL(mdiodev_write_nested);
+
+/**
+ * mdiodev_write - Convenience function for writing a given MII mgmt register
+ * @mdiodev: the mdio device to write
+ * @regnum: register number to write
+ * @val: value to write to @regnum
+ *
+ * NOTE: MUST NOT be called from interrupt context,
+ * because the bus read/write functions may wait for an interrupt
+ * to conclude the operation.
+ */
+int mdiodev_write(struct mdio_device *mdiodev, u32 regnum, u16 val)
+{
+   return mdiobus_write(mdiodev->bus, mdiodev->addr, regnum, val);
+}
+EXPORT_SYMBOL(mdiodev_write);
+
 void mdio_device_free(struct mdio_device *mdiodev)
 {
put_device(&mdiodev->dev);
diff --git a/include/linux/mdio.h b/include/linux/mdio.h
index 5bfd99d1a40a..58e39fbaa3e8 100644
--- a/include/linux/mdio.h
+++ b/include/linux/mdio.h
@@ -227,6 +227,11 @@ int mdiobus_read_nested(struct mii_bus *bus, int addr, u32 regnum);
 int mdiobus_write(struct mii_bus *bus, int addr, u32 regnum, u16 val);
 int mdiobus_write_nested(struct mii_bus *bus, int addr, u32 regnum, u16 val);
 
+int mdiodev_read(struct mdio_device *mdiodev, u32 regnum);
+int mdiodev_read_nested(struct mdio_device *mdiodev, u32 regnum);
+int mdiodev_write(struct mdio_device *mdiodev, u32 regnum, u16 val);
+int mdiodev_write_nested(struct mdio_device *mdiodev, u32 regnum, u16 val);
+
 int mdiobus_register_device(struct mdio_device *mdiodev);
 int mdiobus_unregister_device(struct mdio_device *mdiodev);
 bool mdiobus_is_registered_device(struct mii_bus *bus, int addr);
-- 
2.7.0
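
The shape of these helpers can be tried out in user space with mock types. In the sketch below, struct mock_bus and struct mock_mdiodev are hypothetical stand-ins for struct mii_bus and struct mdio_device, and errors are reported as a plain -1 rather than a kernel errno. The point is only that the device-level wrapper supplies the bus and address itself, so the caller passes just the register number.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct mii_bus and struct mdio_device. */
struct mock_bus {
	uint16_t regs[32][32];	/* [device address][register number] */
};

struct mock_mdiodev {
	struct mock_bus *bus;
	int addr;
};

/* Bus-level accessors: the caller must know the device address. */
static int mock_bus_read(struct mock_bus *bus, int addr, uint32_t regnum)
{
	if (addr < 0 || addr >= 32 || regnum >= 32)
		return -1;
	return bus->regs[addr][regnum];
}

static int mock_bus_write(struct mock_bus *bus, int addr, uint32_t regnum,
			  uint16_t val)
{
	if (addr < 0 || addr >= 32 || regnum >= 32)
		return -1;
	bus->regs[addr][regnum] = val;
	return 0;
}

/* Device-level wrappers: bus and address come from the device itself. */
static int mock_dev_read(struct mock_mdiodev *dev, uint32_t regnum)
{
	return mock_bus_read(dev->bus, dev->addr, regnum);
}

static int mock_dev_write(struct mock_mdiodev *dev, uint32_t regnum,
			  uint16_t val)
{
	return mock_bus_write(dev->bus, dev->addr, regnum, val);
}
```

A driver written against the device-level calls never has to carry the bus pointer and address around separately.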



[PATCH RFC v2 09/32] driver: component: Add support for empty match table

2016-02-28 Thread Andrew Lunn
Before calling component_master_add_with_match(), matches should be
added using component_match_add() to the opaque match. How many
matches are added typically depends on the contents of the device
tree. It is not inconceivable that the number is zero, for example when
the components are optional.

This results in calling component_master_add_with_match() passing a
NULL pointer for the match structure. The component infrastructure
does not like this. So handle the case by allocating a match with zero
entries.

Signed-off-by: Andrew Lunn 
---
 drivers/base/component.c | 33 ++---
 1 file changed, 26 insertions(+), 7 deletions(-)

diff --git a/drivers/base/component.c b/drivers/base/component.c
index 89f5cf68d80a..36c4cf626fa8 100644
--- a/drivers/base/component.c
+++ b/drivers/base/component.c
@@ -237,6 +237,24 @@ static int component_match_realloc(struct device *dev,
 }
 
 /*
+ * Allocate a match array.
+ *
+ */
+static struct component_match *component_match_alloc(struct device *master)
+{
+   struct component_match *match;
+
+   match = devres_alloc(devm_component_match_release,
+sizeof(*match), GFP_KERNEL);
+   if (!match)
+   return ERR_PTR(-ENOMEM);
+
+   devres_add(master, match);
+
+   return match;
+}
+
+/*
  * Add a component to be matched, with a release function.
  *
  * The match array is first created or extended if necessary.
@@ -252,14 +270,9 @@ void component_match_add_release(struct device *master,
return;
 
if (!match) {
-   match = devres_alloc(devm_component_match_release,
-sizeof(*match), GFP_KERNEL);
-   if (!match) {
-   *matchptr = ERR_PTR(-ENOMEM);
+   match = component_match_alloc(master);
+   if (IS_ERR(match))
return;
-   }
-
-   devres_add(master, match);
 
*matchptr = match;
}
@@ -290,6 +303,12 @@ int component_master_add_with_match(struct device *dev,
struct master *master;
int ret;
 
+   if (!match) {
+   match = component_match_alloc(dev);
+   if (IS_ERR(match))
+   return PTR_ERR(match);
+   }
+
/* Reallocate the match array for its true size */
ret = component_match_realloc(dev, match, match->num);
if (ret)
-- 
2.7.0



[PATCH RFC v2 05/32] net: dsa: Have the switch drivers allocate their own private memory

2016-02-28 Thread Andrew Lunn
Now that the switch devices have a dev pointer, make use of it to allocate
the driver's private data structures using devm_kzalloc().
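
A rough userspace model of the pattern, for readers unfamiliar with
device-managed allocations. Everything here (device_model,
managed_zalloc(), setup_model() and friends) is hypothetical and only
imitates the idea behind devm_kzalloc(): the allocation is tied to the
device and freed when the device goes away, and the driver publishes
its private state through ds->priv itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every managed allocation (hypothetical). */
struct res_node {
	struct res_node *next;
	max_align_t payload[];	/* caller's memory, suitably aligned */
};

/* Hypothetical device that owns a list of managed allocations. */
struct device_model {
	struct res_node *resources;
};

struct switch_model {
	void *priv;	/* driver-private state, owned by the device */
};

struct priv_model {
	int id;
	int num_ports;
};

/* Zeroed allocation tied to the lifetime of dev. */
static void *managed_zalloc(struct device_model *dev, size_t size)
{
	struct res_node *node;

	assert(dev != NULL);
	node = calloc(1, sizeof(*node) + size);
	if (!node)
		return NULL;
	node->next = dev->resources;
	dev->resources = node;
	return node->payload;
}

/* Free everything the device owns; returns the number of allocations freed. */
static int device_release_model(struct device_model *dev)
{
	int freed = 0;

	while (dev->resources) {
		struct res_node *node = dev->resources;

		dev->resources = node->next;
		free(node);
		freed++;
	}
	return freed;
}

/* Model of a driver setup(): allocate private state, then publish it. */
static int setup_model(struct switch_model *ds, struct device_model *dev)
{
	struct priv_model *priv = managed_zalloc(dev, sizeof(*priv));

	if (!priv)
		return -ENOMEM;

	ds->priv = priv;	/* the assignment the v2 note refers to */
	priv->num_ports = 3;
	return 0;
}
```

The driver never frees priv explicitly in this model; releasing the
device does it.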

Signed-off-by: Andrew Lunn 
Acked-by: Florian Fainelli 
---
v2: Added missing assignment of priv to ds->priv.
---
 drivers/net/dsa/bcm_sf2.c   |  9 +++--
 drivers/net/dsa/mv88e6123.c |  6 +++---
 drivers/net/dsa/mv88e6131.c |  6 +++---
 drivers/net/dsa/mv88e6171.c |  6 +++---
 drivers/net/dsa/mv88e6352.c |  6 +++---
 drivers/net/dsa/mv88e6xxx.c | 13 ++---
 drivers/net/dsa/mv88e6xxx.h |  5 -
 include/net/dsa.h   |  8 +++-
 8 files changed, 40 insertions(+), 19 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 6925b3c13895..fbb17e042e7b 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -929,7 +929,7 @@ static void bcm_sf2_identify_ports(struct bcm_sf2_priv *priv,
 static int bcm_sf2_sw_setup(struct dsa_switch *ds, struct device *dev)
 {
const char *reg_names[BCM_SF2_REGS_NUM] = BCM_SF2_REGS_NAME;
-   struct bcm_sf2_priv *priv = ds_to_priv(ds);
+   struct bcm_sf2_priv *priv;
struct device_node *dn;
void __iomem **base;
unsigned int port;
@@ -937,6 +937,12 @@ static int bcm_sf2_sw_setup(struct dsa_switch *ds, struct device *dev)
u32 reg, rev;
int ret;
 
+   priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+   if (!priv)
+   return -ENOMEM;
+
+   ds->priv = priv;
+
spin_lock_init(&priv->indir_lock);
mutex_init(&priv->stats_mutex);
 
@@ -1365,7 +1371,6 @@ static int bcm_sf2_sw_set_wol(struct dsa_switch *ds, int port,
 
 static struct dsa_switch_driver bcm_sf2_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_BRCM,
-   .priv_size  = sizeof(struct bcm_sf2_priv),
.probe  = bcm_sf2_sw_probe,
.setup  = bcm_sf2_sw_setup,
.set_addr   = bcm_sf2_sw_set_addr,
diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index fab428bb7545..9d39f108793b 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -70,13 +70,14 @@ static int mv88e6123_setup_global(struct dsa_switch *ds)
 
 static int mv88e6123_setup(struct dsa_switch *ds, struct device *dev)
 {
-   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
+   struct mv88e6xxx_priv_state *ps;
int ret;
 
-   ret = mv88e6xxx_setup_common(ds);
+   ret = mv88e6xxx_setup_common(ds, dev);
if (ret < 0)
return ret;
 
+   ps = ds_to_priv(ds);
switch (ps->id) {
case PORT_SWITCH_ID_6123:
ps->num_ports = 3;
@@ -102,7 +103,6 @@ static int mv88e6123_setup(struct dsa_switch *ds, struct device *dev)
 
 struct dsa_switch_driver mv88e6123_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_EDSA,
-   .priv_size  = sizeof(struct mv88e6xxx_priv_state),
.probe  = mv88e6123_probe,
.setup  = mv88e6123_setup,
.set_addr   = mv88e6xxx_set_addr_indirect,
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index d82cf3d38455..3103b4953af4 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -88,13 +88,14 @@ static int mv88e6131_setup_global(struct dsa_switch *ds)
 
 static int mv88e6131_setup(struct dsa_switch *ds, struct device *dev)
 {
-   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
+   struct mv88e6xxx_priv_state *ps;
int ret;
 
-   ret = mv88e6xxx_setup_common(ds);
+   ret = mv88e6xxx_setup_common(ds, dev);
if (ret < 0)
return ret;
 
+   ps = ds_to_priv(ds);
mv88e6xxx_ppu_state_init(ds);
 
switch (ps->id) {
@@ -159,7 +160,6 @@ mv88e6131_phy_write(struct dsa_switch *ds,
 
 struct dsa_switch_driver mv88e6131_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_DSA,
-   .priv_size  = sizeof(struct mv88e6xxx_priv_state),
.probe  = mv88e6131_probe,
.setup  = mv88e6131_setup,
.set_addr   = mv88e6xxx_set_addr_direct,
diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index 9635f14ec1fb..29a77366afc6 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -66,13 +66,14 @@ static int mv88e6171_setup_global(struct dsa_switch *ds)
 
 static int mv88e6171_setup(struct dsa_switch *ds, struct device *dev)
 {
-   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
+   struct mv88e6xxx_priv_state *ps;
int ret;
 
-   ret = mv88e6xxx_setup_common(ds);
+   ret = mv88e6xxx_setup_common(ds, dev);
if (ret < 0)
return ret;
 
+   ps = ds_to_priv(ds);
ps->num_ports = 7;
 
ret = mv88e6xxx_switch_reset(ds, true);
@@ -88,7 +89,6 @@ static int 

[PATCH RFC v2 01/32] net: dsa: Move platform data allocation for OF

2016-02-28 Thread Andrew Lunn
From: Florian Fainelli 

Do not have dsa_of_probe() allocate and assign struct dsa_platform_data,
but instead do this outside of this function such that we can control
exactly the storage of that data structure.

This is a preliminary change to allow multiple callers of dsa_of_probe()
not to clobber an existing dsa_platform_data and later, control where
this data structure is coming from.

Signed-off-by: Florian Fainelli 
---
 net/dsa/dsa.c | 48 +++-
 1 file changed, 19 insertions(+), 29 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index fa4daba8db55..be68c00b8cfd 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -651,13 +651,12 @@ static void dsa_of_free_platform_data(struct dsa_platform_data *pd)
kfree(pd->chip);
 }
 
-static int dsa_of_probe(struct device *dev)
+static int dsa_of_probe(struct device *dev, struct dsa_platform_data *pd)
 {
struct device_node *np = dev->of_node;
struct device_node *child, *mdio, *ethernet, *port;
struct mii_bus *mdio_bus, *mdio_bus_switch;
struct net_device *ethernet_dev;
-   struct dsa_platform_data *pd;
struct dsa_chip_data *cd;
const char *port_name;
int chip_index, port_index;
@@ -666,7 +665,7 @@ static int dsa_of_probe(struct device *dev)
enum of_gpio_flags of_flags;
unsigned long flags;
u32 eeprom_len;
-   int ret;
+   int ret = 0;
 
mdio = of_parse_phandle(np, "dsa,mii-bus", 0);
if (!mdio)
@@ -688,13 +687,6 @@ static int dsa_of_probe(struct device *dev)
goto out_put_mdio;
}
 
-   pd = kzalloc(sizeof(*pd), GFP_KERNEL);
-   if (!pd) {
-   ret = -ENOMEM;
-   goto out_put_ethernet;
-   }
-
-   dev->platform_data = pd;
pd->of_netdev = ethernet_dev;
pd->nr_chips = of_get_available_child_count(np);
if (pd->nr_chips > DSA_MAX_SWITCHES)
@@ -702,10 +694,8 @@ static int dsa_of_probe(struct device *dev)
 
pd->chip = kcalloc(pd->nr_chips, sizeof(struct dsa_chip_data),
   GFP_KERNEL);
-   if (!pd->chip) {
-   ret = -ENOMEM;
-   goto out_free;
-   }
+   if (!pd->chip)
+   goto out_put_mdio;
 
chip_index = -1;
for_each_available_child_of_node(np, child) {
@@ -795,34 +785,29 @@ static int dsa_of_probe(struct device *dev)
 
 out_free_chip:
dsa_of_free_platform_data(pd);
-out_free:
-   kfree(pd);
-   dev->platform_data = NULL;
-out_put_ethernet:
-   put_device(&ethernet_dev->dev);
+
 out_put_mdio:
put_device(&mdio_bus->dev);
return ret;
 }
 
-static void dsa_of_remove(struct device *dev)
+static void dsa_of_remove(struct device *dev, struct dsa_platform_data *pd)
 {
-   struct dsa_platform_data *pd = dev->platform_data;
-
if (!dev->of_node)
return;
 
dsa_of_free_platform_data(pd);
put_device(&pd->of_netdev->dev);
-   kfree(pd);
 }
 #else
-static inline int dsa_of_probe(struct device *dev)
+static inline int dsa_of_probe(struct device *dev,
+  struct dsa_platform_data *pd)
 {
return 0;
 }
 
-static inline void dsa_of_remove(struct device *dev)
+static inline void dsa_of_remove(struct device *dev,
+struct dsa_platform_data *pd)
 {
 }
 #endif
@@ -881,11 +866,15 @@ static int dsa_probe(struct platform_device *pdev)
   dsa_driver_version);
 
if (pdev->dev.of_node) {
-   ret = dsa_of_probe(&pdev->dev);
+   pd = devm_kzalloc(&pdev->dev, sizeof(*pd), GFP_KERNEL);
+   if (!pd)
+   return -ENOMEM;
+
+   ret = dsa_of_probe(&pdev->dev, pd);
if (ret)
return ret;
 
-   pd = pdev->dev.platform_data;
+   pdev->dev.platform_data = pd;
}
 
if (pd == NULL || (pd->netdev == NULL && pd->of_netdev == NULL))
@@ -926,7 +915,7 @@ static int dsa_probe(struct platform_device *pdev)
return 0;
 
 out:
-   dsa_of_remove(&pdev->dev);
+   dsa_of_remove(&pdev->dev, pd);
 
return ret;
 }
@@ -948,9 +937,10 @@ static void dsa_remove_dst(struct dsa_switch_tree *dst)
 static int dsa_remove(struct platform_device *pdev)
 {
struct dsa_switch_tree *dst = platform_get_drvdata(pdev);
+   struct dsa_platform_data *pd = pdev->dev.platform_data;
 
dsa_remove_dst(dst);
-   dsa_of_remove(&pdev->dev);
+   dsa_of_remove(&pdev->dev, pd);
 
return 0;
 }
-- 
2.7.0



[PATCH RFC v2 28/32] Documentation: DSA: Describe how probe of DSA and switches work.

2016-02-28 Thread Andrew Lunn
With the introduction of switches as Linux devices and the use of the
component framework, probing has become more complex. Add some
documentation.
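
To make the documented call flow easier to follow, here is a small
userspace model of it. This is a sketch under stated assumptions, not
the component framework: tree_model, slave_model and all function names
are hypothetical, an integer handle stands in for a phandle, and the
tree is assumed to exist before any slave adds itself. It shows slaves
adding themselves, the master bind running only once every expected
slave is present, and each slave's bind registering the switch at its
position in the cluster.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SWITCHES 4

/* A slave switch, identified by a handle (stands in for a phandle). */
struct slave_model {
	int handle;
	bool bound;
};

/* The master: expected handles plus the slots they resolve to. */
struct tree_model {
	int expected[MAX_SWITCHES];			/* from platform data */
	struct slave_model *present[MAX_SWITCHES];	/* slaves seen so far */
	struct slave_model *ds[MAX_SWITCHES];		/* filled on bind */
	size_t nr_chips;
	bool bound;
};

/* Find the cluster position for a handle, or -1 if it is not expected. */
static int position_of(const struct tree_model *dst, int handle)
{
	for (size_t i = 0; i < dst->nr_chips; i++)
		if (dst->expected[i] == handle)
			return (int)i;
	return -1;
}

/* Models the register step: attach the switch at its position. */
static int switch_register_model(struct tree_model *dst, struct slave_model *s)
{
	int pos = position_of(dst, s->handle);

	if (pos < 0)
		return -1;
	dst->ds[pos] = s;
	return 0;
}

/* Models the slave's bind function. */
static int slave_bind_model(struct tree_model *dst, struct slave_model *s)
{
	s->bound = true;
	return switch_register_model(dst, s);
}

/* Models the master's bind: runs only when every expected slave is present. */
static bool try_master_bind(struct tree_model *dst)
{
	assert(dst->nr_chips <= MAX_SWITCHES);

	for (size_t i = 0; i < dst->nr_chips; i++)
		if (!dst->present[i])
			return false;

	for (size_t i = 0; i < dst->nr_chips; i++)
		if (slave_bind_model(dst, dst->present[i]))
			return false;

	dst->bound = true;
	return true;
}

/* A slave adds itself from its probe; returns true once the tree is bound. */
static bool slave_add_model(struct tree_model *dst, struct slave_model *s)
{
	int pos = position_of(dst, s->handle);

	if (pos < 0)
		return false;
	dst->present[pos] = s;
	return try_master_bind(dst);
}
```

Note that the order in which slaves probe does not matter in this
model; the position comes from the expected-handle table, not from the
probe order.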

Signed-off-by: Andrew Lunn 
---
 Documentation/networking/dsa/dsa.txt | 48 
 1 file changed, 48 insertions(+)

diff --git a/Documentation/networking/dsa/dsa.txt b/Documentation/networking/dsa/dsa.txt
index aa9c1f9313cd..376afa135a81 100644
--- a/Documentation/networking/dsa/dsa.txt
+++ b/Documentation/networking/dsa/dsa.txt
@@ -398,6 +398,54 @@ Switch configuration
   on the management interface and "hardcode"/"force" this MAC address for the
   CPU/management interface as an optimization
 
+Call flow
+-
+
+With the ability for switch devices to be true Linux devices, the call
+flow is somewhat complex. The component framework is used to link the
+DSA framework, as the master, with switch devices, as slaves.
+
+A switch device should add itself as a component in its probe
+function.
+
+The DSA framework can either be configured using a platform_data
+structure or from the device tree. If device tree is being used, the
+dsa framework probe function will allocate a platform_data structure,
+and populate it using the device tree, via the dsa_of_probe()
+function.  Within the DSA device tree, switch devices are represented
+by a phandle to the switch device. These phandles are saved into the
+platform data so that when switch slaves register themselves, they can
+be correctly positioned in the DSA cluster.
+
+The DSA probe function then creates a dsa_switch_tree structure which
+is the overarching structure representing a switch cluster. The probe
+function then looks in the platform data for the phandles to slave
+devices, and adds a component match based on the phandle. The
+component master is then created. This causes the component framework
+to link slaves to the master.
+
+If all the slave switches can be found, the master's bind function,
+dsa_bind(), is called. This in turn causes the switch slaves' bind
+functions to be called.
+
+The switch's bind function allocates memory for its own private use,
+and for a dsa_switch structure, which represents one switch in a DSA
+cluster. The switch then registers with the DSA framework using
+dsa_switch_register().
+
+dsa_switch_register() looks in the platform data and finds the
+position within the cluster for the switch which is registering. The
+switch's dsa_switch structure is then attached to the dsa_switch_tree
+structure in the correct place.
+
+Once all slave switches have registered, dsa_setup_dst() is used to
+complete the construction of the dsa_switch_tree structure. This
+starts by setting up switches which are not slave devices. The MDIO
+address of the switch is passed to each switch driver to see if it can
+drive the switch. If it can, a dsa_switch structure is allocated to
+represent the switch and linked into the dsa_switch_tree at the
+correct location.
+
 PHY devices and link management
 ---
 
-- 
2.7.0



[PATCH RFC v2 08/32] net: dsa: dsa.c: Refactor to increase symmetry

2016-02-28 Thread Andrew Lunn
Create a dsa_switch_finish() which does the opposite of dsa_switch_setup().
Create a dsa_finish_dst() which does the opposite of dsa_setup_dst().

Signed-off-by: Andrew Lunn 
---
 net/dsa/dsa.c | 44 +---
 1 file changed, 29 insertions(+), 15 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 6c9d1d812873..5062ca91852d 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -462,6 +462,13 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
return ds;
 }
 
+static void dsa_switch_finish(struct dsa_switch *ds, struct device *parent)
+{
+   dsa_switch_finish_one(ds);
+
+   devm_kfree(parent, ds);
+}
+
 #ifdef CONFIG_PM_SLEEP
 static int dsa_switch_suspend(struct dsa_switch *ds)
 {
@@ -855,6 +862,26 @@ static int dsa_setup_dst(struct dsa_switch_tree *dst, struct net_device *dev,
return 0;
 }
 
+static void dsa_finish_dst(struct dsa_switch_tree *dst, struct device *parent,
+  struct dsa_platform_data *pd)
+{
+   struct net_device *dev = dst->master_netdev;
+   int i;
+
+   dev->dsa_ptr = NULL;
+   /* Ensure no more packets get sent to the tag receive
+* function.
+*/
+   wmb();
+
+   for (i = 0; i < pd->nr_chips; i++) {
+   struct dsa_switch *ds = dst->ds[i];
+
+   dsa_switch_finish(ds, parent);
+   dst->ds[i] = NULL;
+   }
+}
+
 static int dsa_probe(struct platform_device *pdev)
 {
struct dsa_platform_data *pd = pdev->dev.platform_data;
@@ -920,27 +947,14 @@ out:
return ret;
 }
 
-static void dsa_remove_dst(struct dsa_switch_tree *dst)
-{
-   int i;
-
-   for (i = 0; i < dst->pd->nr_chips; i++) {
-   struct dsa_switch *ds = dst->ds[i];
-
-   if (ds)
-   dsa_switch_finish_one(ds);
-   }
-
-   dev_put(dst->master_netdev);
-}
-
 static int dsa_remove(struct platform_device *pdev)
 {
struct dsa_switch_tree *dst = platform_get_drvdata(pdev);
struct dsa_platform_data *pd = pdev->dev.platform_data;
 
-   dsa_remove_dst(dst);
+   dsa_finish_dst(dst, &pdev->dev, pd);
dsa_of_remove(&pdev->dev, pd);
+   dev_put(dst->master_netdev);
 
return 0;
 }
-- 
2.7.0



[PATCH RFC v2 26/32] net: dsa: Only setup platform switches, not device switches

2016-02-28 Thread Andrew Lunn
Switches which are Linux devices will register themselves with DSA.
Such switches already have a dsa_switch in the dsa_switch_tree, and
don't need to be probed using the old mechanism.

Similarly, it is the responsibility of the switch driver to free its
resources when it unregisters.
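
The resulting loop can be sketched in userspace C as follows. The names
(chip_model, setup_platform_switches()) and the choice of -ENODEV are
assumptions made for the illustration; only the of_chip test and the
early return on failure mirror the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-chip description. */
struct chip_model {
	bool of_chip;		/* true: a Linux device that registers itself */
	bool probe_fails;	/* test knob: make the legacy probe fail */
};

/*
 * Model of the setup loop: only chips without of_chip are created here.
 * created[i] records which slots this loop populated.
 * Returns 0 on success or a negative errno on the first failure.
 */
static int setup_platform_switches(const struct chip_model *chips, size_t n,
				   bool *created)
{
	assert(chips != NULL && created != NULL);

	for (size_t i = 0; i < n; i++)
		created[i] = false;

	for (size_t i = 0; i < n; i++) {
		if (chips[i].of_chip)
			continue;	/* already registered by its own driver */
		if (chips[i].probe_fails)
			return -ENODEV;	/* abort instead of skipping the chip */
		created[i] = true;
	}
	return 0;
}
```

The teardown side would apply the same of_chip test, so that only
switches created by this loop are finished by it.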

Signed-off-by: Andrew Lunn 
---
 net/dsa/dsa.c | 37 ++---
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index b9b10e050927..f3ffc937b152 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -830,19 +830,25 @@ static int dsa_setup_dst(struct dsa_switch_tree *dst, struct net_device *dev,
 struct device *parent)
 {
int i, ret;
-   unsigned configured = 0;
struct dsa_switch *ds;
+   struct dsa_chip_data *cd;
struct dsa_platform_data *pd = dst->pd;
 
dst->cpu_switch = -1;
dst->cpu_port = -1;
 
for (i = 0; i < pd->nr_chips; i++) {
-   ds = dsa_switch_setup(dst, i, parent, pd->chip[i].host_dev);
-   if (IS_ERR(ds)) {
-   netdev_err(dev, "[%d]: couldn't create dsa switch instance (error %ld)\n",
-  i, PTR_ERR(ds));
-   continue;
+   cd = &pd->chip[i];
+   if (!cd->of_chip) {
+   ds = dsa_switch_setup(dst, i, parent,
+ pd->chip[i].host_dev);
+   if (IS_ERR(ds)) {
+   netdev_err(dev, "[%d]: couldn't create dsa switch instance (error %ld)\n",
+  i, PTR_ERR(ds));
+   return PTR_ERR(ds);
+   }
+
+   dst->ds[i] = ds;
}
}
 
@@ -851,19 +857,9 @@ static int dsa_setup_dst(struct dsa_switch_tree *dst, struct net_device *dev,
ret = dsa_switch_setup_one(ds, parent);
if (ret)
return ret;
-
-   dst->ds[i] = ds;
-
-   ++configured;
}
 
/*
-* If no switch was found, exit cleanly
-*/
-   if (!configured)
-   return -EPROBE_DEFER;
-
-   /*
 * If we use a tagging format that doesn't have an ethertype
 * field, make sure that all packets from this point on get
 * sent to the tag format's receive function.
@@ -878,6 +874,7 @@ static void dsa_finish_dst(struct dsa_switch_tree *dst, struct device *parent,
   struct dsa_platform_data *pd)
 {
struct net_device *dev = dst->master_netdev;
+   struct dsa_chip_data *cd;
struct dsa_switch *ds;
int i;
 
@@ -893,9 +890,11 @@ static void dsa_finish_dst(struct dsa_switch_tree *dst, struct device *parent,
}
 
for (i = 0; i < pd->nr_chips; i++) {
-   ds = dst->ds[i];
-
-   dsa_switch_finish(ds, parent);
+   cd = &pd->chip[i];
+   if (!cd->of_chip) {
+   ds = dst->ds[i];
+   dsa_switch_finish(ds, parent);
+   }
dst->ds[i] = NULL;
}
 }
-- 
2.7.0



[PATCH RFC v2 31/32] dsa: dsa: Fix freeing of fixed-phys from user ports.

2016-02-28 Thread Andrew Lunn
All port types can have a fixed PHY associated with them. Remove the
check which limits removal to only CPU and DSA ports.

Signed-off-by: Andrew Lunn 
---
 net/dsa/dsa.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 73f962d81d40..e969a43f7a21 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -400,9 +400,6 @@ static void dsa_switch_finish_one(struct dsa_switch *ds)
 
/* Remove any fixed link PHYs */
for (port = 0; port < DSA_MAX_PORTS; port++) {
-   if (!(dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port)))
-   continue;
-
port_dn = cd->port_dn[port];
if (of_phy_is_fixed_link(port_dn)) {
phydev = of_phy_find_device(port_dn);
-- 
2.7.0



[PATCH RFC v2 30/32] dsa: Destroy fixed link phys after the phy has been disconnected

2016-02-28 Thread Andrew Lunn
The phy is disconnected from the slave in dsa_slave_destroy(). Don't
destroy fixed link phys until after this, since there can be fixed
link phys connected to ports.

Signed-off-by: Andrew Lunn 
---
 net/dsa/dsa.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index c61b7ab092f5..73f962d81d40 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -387,7 +387,18 @@ static void dsa_switch_finish_one(struct dsa_switch *ds)
hwmon_device_unregister(ds->hwmon_dev);
 #endif
 
-   /* Disable configuration of the CPU and DSA ports */
+   /* Destroy network devices for physical switch ports. */
+   for (port = 0; port < DSA_MAX_PORTS; port++) {
+   if (!(ds->phys_port_mask & (1 << port)))
+   continue;
+
+   if (!ds->ports[port])
+   continue;
+
+   dsa_slave_destroy(ds->ports[port]);
+   }
+
+   /* Remove any fixed link PHYs */
for (port = 0; port < DSA_MAX_PORTS; port++) {
if (!(dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port)))
continue;
@@ -405,17 +416,6 @@ static void dsa_switch_finish_one(struct dsa_switch *ds)
}
}
 
-   /* Destroy network devices for physical switch ports. */
-   for (port = 0; port < DSA_MAX_PORTS; port++) {
-   if (!(ds->phys_port_mask & (1 << port)))
-   continue;
-
-   if (!ds->ports[port])
-   continue;
-
-   dsa_slave_destroy(ds->ports[port]);
-   }
-
mdiobus_unregister(ds->slave_mii_bus);
 }
 
-- 
2.7.0



[PATCH RFC v2 23/32] net: dsa: bcm_sf2: make it a real platform driver

2016-02-28 Thread Andrew Lunn
From: Florian Fainelli 

The Broadcom Starfighter 2 switch driver should be a proper platform
driver. Now that the DSA code has been updated to allow that, register
a switch device, feed it with the proper configuration data coming
from Device Tree, and register our switch device with DSA.

The bulk of the changes consists of moving what bcm_sf2_sw_setup() did
into the component slave bind function.

This change does not however prevent the old DSA binding from working.

Signed-off-by: Florian Fainelli 
Signed-off-by: Andrew Lunn 
---
 .../devicetree/bindings/net/dsa/broadcom.txt   |  54 
 drivers/net/dsa/bcm_sf2.c  | 293 -
 2 files changed, 224 insertions(+), 123 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/net/dsa/broadcom.txt

diff --git a/Documentation/devicetree/bindings/net/dsa/broadcom.txt b/Documentation/devicetree/bindings/net/dsa/broadcom.txt
new file mode 100644
index ..ea7c40b611fc
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/dsa/broadcom.txt
@@ -0,0 +1,54 @@
+* Broadcom Starfighter 2 integrated switch device
+
+WARNING: This binding is currently UNSTABLE. Do not write it into a
+FLASH never to be upgraded. Once it is stable, this warning will be
+removed.
+
+If you need a stable binding, see the old dsa.txt.
+
+Required properties:
+
+- compatible: should be "brcm,brcm-sf2"
+- reg: addresses and length of the register sets for the device, must be 6
+  pairs of register addresses and lengths
+- interrupts: interrupts for the devices, must be two interrupts
+
+- reg-names: literal names for the device base register addresses,
+  when present must be: "core", "reg", "intrl2_0", "intrl2_1", "fcb",
+  "acb"
+
+- interrupt-names: literal names for the device interrupt lines,
+  when present must be: "switch_0" and "switch_1"
+
+- brcm,num-gphy: specify the maximum number of integrated gigabit PHYs
+  in the switch
+
+Optional properties:
+
+- brcm,num-rgmii-ports: specify the maximum number of RGMII interfaces
+  supported by the switch
+
+- brcm,fcb-pause-override: boolean property, if present indicates that
+  the switch supports Failover Control Block pause override capability
+
+- brcm,acb-packets-inflight: boolean property, if present indicates
+  that the switch Admission Control Block supports reporting the
+  number of packets in-flight in a switch queue
+
+Example:
+
+   switchdev0: switchdev0 {
+   compatible = "brcm,brcm-sf2";
+   reg = <0x0 0x4
+   0x4 0x110
+   0x40340 0x30
+   0x40380 0x30
+   0x40400 0x34
+   0x40600 0x208>;
+   interrupts = <0 0x18 0
+   0 0x19 0>;
+   brcm,num-gphy = <1>;
+   brcm,num-rgmii-ports = <2>;
+   brcm,fcb-pause-override;
+   brcm,acb-packets-inflight;
+   };
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index c0c83c2f2691..b23b044d8401 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -9,6 +9,7 @@
  * (at your option) any later version.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -928,83 +929,8 @@ static void bcm_sf2_identify_ports(struct bcm_sf2_priv *priv,
 
 static int bcm_sf2_sw_setup(struct dsa_switch *ds, struct device *dev)
 {
-   const char *reg_names[BCM_SF2_REGS_NUM] = BCM_SF2_REGS_NAME;
-   struct bcm_sf2_priv *priv;
-   struct device_node *dn;
-   void __iomem **base;
-   unsigned int port;
-   unsigned int i;
-   u32 reg, rev;
-   int ret;
-
-   priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
-   if (!priv)
-   return -ENOMEM;
-
-   ds->priv = priv;
-
-   spin_lock_init(&priv->indir_lock);
-   mutex_init(&priv->stats_mutex);
-
-   /* All the interesting properties are at the parent device_node
-* level
-*/
-   dn = ds->pd->of_node->parent;
-   bcm_sf2_identify_ports(priv, ds->pd->of_node);
-
-   priv->irq0 = irq_of_parse_and_map(dn, 0);
-   priv->irq1 = irq_of_parse_and_map(dn, 1);
-
-   base = &priv->core;
-   for (i = 0; i < BCM_SF2_REGS_NUM; i++) {
-   *base = of_iomap(dn, i);
-   if (*base == NULL) {
-   pr_err("unable to find register: %s\n", reg_names[i]);
-   ret = -ENOMEM;
-   goto out_unmap;
-   }
-   base++;
-   }
-
-   ret = bcm_sf2_sw_rst(priv);
-   if (ret) {
-   pr_err("unable to software reset switch: %d\n", ret);
-   goto out_unmap;
-   }
-
-   /* Disable all interrupts and request them */
-   bcm_sf2_intr_disable(priv);
-
-   ret = request_irq(priv->irq0, bcm_sf2_switch_0_isr, 0,
- "switch_0", priv);
-   if 

[PATCH RFC v2 04/32] net: dsa: Pass the dsa device to the switch drivers

2016-02-28 Thread Andrew Lunn
By passing a device structure to the switch devices, it allows them
to use devm_* methods for resource management.

Signed-off-by: Andrew Lunn 
Acked-by: Florian Fainelli 
---
 drivers/net/dsa/bcm_sf2.c   | 2 +-
 drivers/net/dsa/mv88e6060.c | 2 +-
 drivers/net/dsa/mv88e6123.c | 2 +-
 drivers/net/dsa/mv88e6131.c | 2 +-
 drivers/net/dsa/mv88e6171.c | 2 +-
 drivers/net/dsa/mv88e6352.c | 2 +-
 include/net/dsa.h   | 2 +-
 net/dsa/dsa.c   | 2 +-
 8 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 6f946fedbb77..6925b3c13895 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -926,7 +926,7 @@ static void bcm_sf2_identify_ports(struct bcm_sf2_priv *priv,
}
 }
 
-static int bcm_sf2_sw_setup(struct dsa_switch *ds)
+static int bcm_sf2_sw_setup(struct dsa_switch *ds, struct device *dev)
 {
const char *reg_names[BCM_SF2_REGS_NUM] = BCM_SF2_REGS_NAME;
struct bcm_sf2_priv *priv = ds_to_priv(ds);
diff --git a/drivers/net/dsa/mv88e6060.c b/drivers/net/dsa/mv88e6060.c
index 0527f485c3dc..34bc374882c7 100644
--- a/drivers/net/dsa/mv88e6060.c
+++ b/drivers/net/dsa/mv88e6060.c
@@ -172,7 +172,7 @@ static int mv88e6060_setup_port(struct dsa_switch *ds, int p)
return 0;
 }
 
-static int mv88e6060_setup(struct dsa_switch *ds)
+static int mv88e6060_setup(struct dsa_switch *ds, struct device *dev)
 {
int i;
int ret;
diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index 69a6f79dcb10..fab428bb7545 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -68,7 +68,7 @@ static int mv88e6123_setup_global(struct dsa_switch *ds)
return 0;
 }
 
-static int mv88e6123_setup(struct dsa_switch *ds)
+static int mv88e6123_setup(struct dsa_switch *ds, struct device *dev)
 {
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
int ret;
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index a92ca651c399..d82cf3d38455 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -86,7 +86,7 @@ static int mv88e6131_setup_global(struct dsa_switch *ds)
return 0;
 }
 
-static int mv88e6131_setup(struct dsa_switch *ds)
+static int mv88e6131_setup(struct dsa_switch *ds, struct device *dev)
 {
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
int ret;
diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index 6e18213b9c04..9635f14ec1fb 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -64,7 +64,7 @@ static int mv88e6171_setup_global(struct dsa_switch *ds)
return 0;
 }
 
-static int mv88e6171_setup(struct dsa_switch *ds)
+static int mv88e6171_setup(struct dsa_switch *ds, struct device *dev)
 {
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
int ret;
diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index cc6c54553418..c2c4153e3423 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -75,7 +75,7 @@ static int mv88e6352_setup_global(struct dsa_switch *ds)
return 0;
 }
 
-static int mv88e6352_setup(struct dsa_switch *ds)
+static int mv88e6352_setup(struct dsa_switch *ds, struct device *dev)
 {
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
int ret;
diff --git a/include/net/dsa.h b/include/net/dsa.h
index 26a0e86e611e..f5b4f1bcfdf3 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -213,7 +213,7 @@ struct dsa_switch_driver {
 * Probing and setup.
 */
char*(*probe)(struct device *host_dev, int sw_addr);
-   int (*setup)(struct dsa_switch *ds);
+   int (*setup)(struct dsa_switch *ds, struct device *dev);
int (*set_addr)(struct dsa_switch *ds, u8 *addr);
u32 (*get_phy_flags)(struct dsa_switch *ds, int port);
 
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 6e9176545dda..d6ea1f1a1a34 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -300,7 +300,7 @@ static int dsa_switch_setup_one(struct dsa_switch *ds, struct device *parent)
/*
 * Do basic register setup.
 */
-   ret = drv->setup(ds);
+   ret = drv->setup(ds, parent);
if (ret < 0)
goto out;
 
-- 
2.7.0



[PATCH RFC v2 29/32] dsa: slave: Don't reference NULL pointer during phy_disconnect

2016-02-28 Thread Andrew Lunn
When the phy is disconnected, the parent pointer to the netdev it was
attached to is set to NULL. The code then tries to suspend the phy,
but dsa_slave_fixed_link_update needs the parent pointer to determine
which switch the phy is connected to. So it dereferenced a NULL
pointer. Check for this condition.
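
A minimal userspace sketch of the guard, with hypothetical types
(netdev_model, driver_model, status_model) standing in for the real
structures. It only demonstrates that a NULL device turns the update
into a no-op, while a valid device still reaches the optional driver
hook.

```c
#include <assert.h>
#include <stddef.h>

struct status_model {
	int link;
};

struct driver_model {
	/* Optional hook; may be NULL. */
	void (*fixed_link_update)(int port, struct status_model *status);
};

struct netdev_model {
	const struct driver_model *drv;
	int port;
};

/* Example hook used for the illustration: force the link up. */
static void force_link_up(int port, struct status_model *status)
{
	(void)port;
	status->link = 1;
}

/* Always returns 0; a NULL dev (phy already disconnected) is a no-op. */
static int fixed_link_update_model(struct netdev_model *dev,
				   struct status_model *status)
{
	assert(status != NULL);

	if (dev) {
		if (dev->drv->fixed_link_update)
			dev->drv->fixed_link_update(dev->port, status);
	}

	return 0;
}
```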

Signed-off-by: Andrew Lunn 
---
 net/dsa/slave.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 40b9ca72aae3..d0d29f73e2f2 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -982,11 +982,15 @@ static void dsa_slave_adjust_link(struct net_device *dev)
 static int dsa_slave_fixed_link_update(struct net_device *dev,
   struct fixed_phy_status *status)
 {
-   struct dsa_slave_priv *p = netdev_priv(dev);
-   struct dsa_switch *ds = p->parent;
+   struct dsa_slave_priv *p;
+   struct dsa_switch *ds;
 
-   if (ds->drv->fixed_link_update)
-   ds->drv->fixed_link_update(ds, p->port, status);
+   if (dev) {
+   p = netdev_priv(dev);
+   ds = p->parent;
+   if (ds->drv->fixed_link_update)
+   ds->drv->fixed_link_update(ds, p->port, status);
+   }
 
return 0;
 }
-- 
2.7.0



[PATCH RFC v2 02/32] dsa: Rename mv88e6123_61_65 to mv88e6123 to be consistent

2016-02-28 Thread Andrew Lunn
All the drivers support multiple chips, but mv88e6123_61_65 is the
only one that reflects this in its naming. Change it to be consistent
with the other drivers.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/Kconfig   |   2 +-
 drivers/net/dsa/Makefile  |   4 +-
 drivers/net/dsa/mv88e6123.c   | 124 ++
 drivers/net/dsa/mv88e6123_61_65.c | 124 --
 drivers/net/dsa/mv88e6xxx.c   |   4 +-
 drivers/net/dsa/mv88e6xxx.h   |   2 +-
 6 files changed, 130 insertions(+), 130 deletions(-)
 create mode 100644 drivers/net/dsa/mv88e6123.c
 delete mode 100644 drivers/net/dsa/mv88e6123_61_65.c

diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
index 4c483d937481..90ba003d8fdf 100644
--- a/drivers/net/dsa/Kconfig
+++ b/drivers/net/dsa/Kconfig
@@ -27,7 +27,7 @@ config NET_DSA_MV88E6131
  This enables support for the Marvell 88E6085/6095/6095F/6131
  ethernet switch chips.
 
-config NET_DSA_MV88E6123_61_65
+config NET_DSA_MV88E6123
tristate "Marvell 88E6123/6161/6165 ethernet switch chip support"
depends on NET_DSA
select NET_DSA_MV88E6XXX
diff --git a/drivers/net/dsa/Makefile b/drivers/net/dsa/Makefile
index e2d51c4b9382..a6e09939be65 100644
--- a/drivers/net/dsa/Makefile
+++ b/drivers/net/dsa/Makefile
@@ -1,8 +1,8 @@
 obj-$(CONFIG_NET_DSA_MV88E6060) += mv88e6060.o
 obj-$(CONFIG_NET_DSA_MV88E6XXX) += mv88e6xxx_drv.o
 mv88e6xxx_drv-y += mv88e6xxx.o
-ifdef CONFIG_NET_DSA_MV88E6123_61_65
-mv88e6xxx_drv-y += mv88e6123_61_65.o
+ifdef CONFIG_NET_DSA_MV88E6123
+mv88e6xxx_drv-y += mv88e6123.o
 endif
 ifdef CONFIG_NET_DSA_MV88E6131
 mv88e6xxx_drv-y += mv88e6131.o
diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
new file mode 100644
index ..69a6f79dcb10
--- /dev/null
+++ b/drivers/net/dsa/mv88e6123.c
@@ -0,0 +1,124 @@
+/*
+ * net/dsa/mv88e6123_61_65.c - Marvell 88e6123/6161/6165 switch chip support
+ * Copyright (c) 2008-2009 Marvell Semiconductor
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "mv88e6xxx.h"
+
+static const struct mv88e6xxx_switch_id mv88e6123_table[] = {
+   { PORT_SWITCH_ID_6123, "Marvell 88E6123" },
+   { PORT_SWITCH_ID_6123_A1, "Marvell 88E6123 (A1)" },
+   { PORT_SWITCH_ID_6123_A2, "Marvell 88E6123 (A2)" },
+   { PORT_SWITCH_ID_6161, "Marvell 88E6161" },
+   { PORT_SWITCH_ID_6161_A1, "Marvell 88E6161 (A1)" },
+   { PORT_SWITCH_ID_6161_A2, "Marvell 88E6161 (A2)" },
+   { PORT_SWITCH_ID_6165, "Marvell 88E6165" },
+   { PORT_SWITCH_ID_6165_A1, "Marvell 88E6165 (A1)" },
+   { PORT_SWITCH_ID_6165_A2, "Marvell 88e6165 (A2)" },
+};
+
+static char *mv88e6123_probe(struct device *host_dev, int sw_addr)
+{
+   return mv88e6xxx_lookup_name(host_dev, sw_addr, mv88e6123_table,
+ARRAY_SIZE(mv88e6123_table));
+}
+
+static int mv88e6123_setup_global(struct dsa_switch *ds)
+{
+   u32 upstream_port = dsa_upstream_port(ds);
+   int ret;
+   u32 reg;
+
+   ret = mv88e6xxx_setup_global(ds);
+   if (ret)
+   return ret;
+
+   /* Disable the PHY polling unit (since there won't be any
+* external PHYs to poll), don't discard packets with
+* excessive collisions, and mask all interrupt sources.
+*/
+   REG_WRITE(REG_GLOBAL, GLOBAL_CONTROL, 0x);
+
+   /* Configure the upstream port, and configure the upstream
+* port as the port to which ingress and egress monitor frames
+* are to be sent.
+*/
+   reg = upstream_port << GLOBAL_MONITOR_CONTROL_INGRESS_SHIFT |
+   upstream_port << GLOBAL_MONITOR_CONTROL_EGRESS_SHIFT |
+   upstream_port << GLOBAL_MONITOR_CONTROL_ARP_SHIFT;
+   REG_WRITE(REG_GLOBAL, GLOBAL_MONITOR_CONTROL, reg);
+
+   /* Disable remote management for now, and set the switch's
+* DSA device number.
+*/
+   REG_WRITE(REG_GLOBAL, GLOBAL_CONTROL_2, ds->index & 0x1f);
+
+   return 0;
+}
+
+static int mv88e6123_setup(struct dsa_switch *ds)
+{
+   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
+   int ret;
+
+   ret = mv88e6xxx_setup_common(ds);
+   if (ret < 0)
+   return ret;
+
+   switch (ps->id) {
+   case PORT_SWITCH_ID_6123:
+   ps->num_ports = 3;
+   break;
+   case PORT_SWITCH_ID_6161:
+   case PORT_SWITCH_ID_6165:
+   ps->num_ports = 6;
+   break;
+   default:
+   return -ENODEV;
+   }
+
+   ret = mv88e6xxx_switch_reset(ds, false);
+   if (ret < 0)
+   return 

[PATCH RFC v2 25/32] net: dsa: Setup the switches after all have been probed

2016-02-28 Thread Andrew Lunn
Some switches register themselves with DSA, while others are probed by
DSA itself. Move the setup call to after all switches have been
successfully probed. Similarly, finish each switch before releasing any
of them.

Signed-off-by: Andrew Lunn 
---
 net/dsa/dsa.c | 27 ---
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 9acfbe7c34f7..b9b10e050927 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -426,7 +426,6 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
struct dsa_chip_data *pd = dst->pd->chip + index;
struct dsa_switch_driver *drv;
struct dsa_switch *ds;
-   int ret;
char *name;
 
/*
@@ -456,17 +455,11 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
ds->tag_protocol = drv->tag_protocol;
ds->master_dev = host_dev;
 
-   ret = dsa_switch_setup_one(ds, parent);
-   if (ret)
-   return ERR_PTR(ret);
-
return ds;
 }
 
 static void dsa_switch_finish(struct dsa_switch *ds, struct device *parent)
 {
-   dsa_switch_finish_one(ds);
-
devm_kfree(parent, ds);
 }
 
@@ -836,22 +829,28 @@ static inline void dsa_of_remove(struct device *dev,
 static int dsa_setup_dst(struct dsa_switch_tree *dst, struct net_device *dev,
 struct device *parent)
 {
-   int i;
+   int i, ret;
unsigned configured = 0;
+   struct dsa_switch *ds;
struct dsa_platform_data *pd = dst->pd;
 
dst->cpu_switch = -1;
dst->cpu_port = -1;
 
for (i = 0; i < pd->nr_chips; i++) {
-   struct dsa_switch *ds;
-
ds = dsa_switch_setup(dst, i, parent, pd->chip[i].host_dev);
if (IS_ERR(ds)) {
netdev_err(dev, "[%d]: couldn't create dsa switch instance (error %ld)\n",
   i, PTR_ERR(ds));
continue;
}
+   }
+
+   for (i = 0; i < pd->nr_chips; i++) {
+   ds = dst->ds[i];
+   ret = dsa_switch_setup_one(ds, parent);
+   if (ret)
+   return ret;
 
dst->ds[i] = ds;
 
@@ -879,6 +878,7 @@ static void dsa_finish_dst(struct dsa_switch_tree *dst, struct device *parent,
   struct dsa_platform_data *pd)
 {
struct net_device *dev = dst->master_netdev;
+   struct dsa_switch *ds;
int i;
 
dev->dsa_ptr = NULL;
@@ -888,7 +888,12 @@ static void dsa_finish_dst(struct dsa_switch_tree *dst, struct device *parent,
wmb();
 
for (i = 0; i < pd->nr_chips; i++) {
-   struct dsa_switch *ds = dst->ds[i];
+   ds = dst->ds[i];
+   dsa_switch_finish_one(ds);
+   }
+
+   for (i = 0; i < pd->nr_chips; i++) {
+   ds = dst->ds[i];
 
dsa_switch_finish(ds, parent);
dst->ds[i] = NULL;
-- 
2.7.0
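
The two-pass ordering described in this patch can be sketched in plain userspace C. This is only a model, not the kernel code: `struct sw`, `struct tree`, `sw_setup_one()` and `tree_setup()` are hypothetical stand-ins for the DSA structures and helpers, and skipping slots that failed to probe is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHIPS 4

/* Hypothetical stand-in for struct dsa_switch. */
struct sw {
	int present;     /* 1 if the probe pass can find this chip */
	int fail_setup;  /* 1 to force the setup pass to fail */
	int set_up;      /* set to 1 once setup has succeeded */
};

/* Hypothetical stand-in for struct dsa_switch_tree. */
struct tree {
	struct sw *ds[MAX_CHIPS];
	int nr_chips;
};

int sw_setup_one(struct sw *s)
{
	if (s->fail_setup)
		return -1;
	s->set_up = 1;
	return 0;
}

int tree_setup(struct tree *t, struct sw *chips, int nr_chips)
{
	int i, ret;

	assert(t != NULL && chips != NULL);
	if (nr_chips < 0 || nr_chips > MAX_CHIPS)
		return -1;
	t->nr_chips = nr_chips;

	/* Pass 1: probe only; remember which switches were found. */
	for (i = 0; i < nr_chips; i++)
		t->ds[i] = chips[i].present ? &chips[i] : NULL;

	/* Pass 2: set up every switch that was found; stop on first error. */
	for (i = 0; i < nr_chips; i++) {
		if (!t->ds[i])
			continue;	/* assumption: skip slots that failed to probe */
		ret = sw_setup_one(t->ds[i]);
		if (ret)
			return ret;
	}
	return 0;
}
```

The point of the split is that no switch is set up until every slot has been filled in, so a setup routine can rely on its neighbours in the tree already being known.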



[PATCH RFC v2 19/32] dsa: mv88e6xxx: Add macro for registering the drivers

2016-02-28 Thread Andrew Lunn
The macro cuts down on boilerplate. The switch driver needs to both
register itself as an MDIO driver and register itself as a switch
driver. This second registration is needed to retain backwards
compatibility with the old binding.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx.h | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h
index a4ae99b7cfd0..ce05964da85f 100644
--- a/drivers/net/dsa/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx.h
@@ -546,3 +546,26 @@ extern struct dsa_switch_driver mv88e6171_switch_driver;
 
 
 #endif
+
+/**
+ * mv88e6xxx_module_driver() - Helper macro for registering mv88e6xxx drivers
+ *
+ * Helper macro for mv88e6xxx drivers which do not do anything special
+ * in module init/exit. Each module may only use this macro once, and
+ * calling it replaces module_init() and module_exit().
+ */
+#define mv88e6xxx_module_driver(_mdio_driver, _switch_driver)  \
+static int __init mv88e6xxx_module_init(void)  \
+{  \
+   register_switch_driver(&_switch_driver);\
+   return mdio_driver_register(&_mdio_driver); \
+}  \
+module_init(mv88e6xxx_module_init);\
+   \
+static void __exit mv88e6xxx_module_exit(void) \
+{  \
+   mdio_driver_unregister(&_mdio_driver);  \
+   unregister_switch_driver(&_switch_driver);  \
+}  \
+module_exit(mv88e6xxx_module_exit)
+
-- 
2.7.0
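
The shape of the helper macro can be tried outside the kernel with a small userspace sketch. Everything here is hypothetical: `struct drv`, `drv_register()` and `drv_unregister()` stand in for the MDIO and switch driver registration calls, and the generated functions are not static so that they can be called directly. Only the ordering mirrors the patch: register the switch driver first, then the MDIO driver, and unregister in reverse.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for both the MDIO driver and the switch driver. */
struct drv {
	const char *name;
	int registered;
};

int drv_register(struct drv *d)
{
	assert(d != NULL);
	d->registered = 1;
	return 0;
}

void drv_unregister(struct drv *d)
{
	assert(d != NULL);
	d->registered = 0;
}

/* Generates an init/exit pair, as mv88e6xxx_module_driver() does. */
#define demo_module_driver(_mdio_driver, _switch_driver)		\
int demo_module_init(void)						\
{									\
	drv_register(&_switch_driver);					\
	return drv_register(&_mdio_driver);				\
}									\
									\
void demo_module_exit(void)						\
{									\
	drv_unregister(&_mdio_driver);					\
	drv_unregister(&_switch_driver);				\
}									\
struct drv	/* swallows the semicolon after the macro call */

struct drv demo_mdio = { "demo-mdio", 0 };
struct drv demo_switch = { "demo-switch", 0 };

demo_module_driver(demo_mdio, demo_switch);
```

Because the macro defines functions with fixed names, it can be expanded only once per translation unit, which matches the "each module may only use this macro once" note in the kernel-doc comment.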



[PATCH RFC v2 14/32] net: dsa: Add register/unregister functions for switch drivers

2016-02-28 Thread Andrew Lunn
A switch device driver registers with DSA as part of the component
slave bind and unregisters on component slave unbind.

Signed-off-by: Andrew Lunn 
---
 include/net/dsa.h |  3 +++
 net/dsa/dsa.c | 53 +
 2 files changed, 56 insertions(+)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index ea4cfdf1b549..dbb90f2c475b 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -345,6 +345,9 @@ struct dsa_switch_driver {
 void register_switch_driver(struct dsa_switch_driver *type);
 void unregister_switch_driver(struct dsa_switch_driver *type);
 struct mii_bus *dsa_host_dev_to_mii_bus(struct device *dev);
+int dsa_switch_register(struct dsa_switch_tree *dst, struct dsa_switch *ds,
+   struct device_node *np, const char *name);
+void dsa_switch_unregister(struct dsa_switch *ds);
 
 static inline void *ds_to_priv(struct dsa_switch *ds)
 {
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index fb6d390503e1..0be85a14a835 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -997,6 +997,59 @@ static const struct component_master_ops dsa_ops = {
.unbind = dsa_unbind,
 };
 
+static int dsa_find_chip_index(struct dsa_switch_tree *dst,
+  struct device_node *np)
+{
+   struct dsa_platform_data *pd = dst->pd;
+   struct dsa_chip_data *cd;
+   int i;
+
+   for (i = 0; i < pd->nr_chips; i++) {
+   cd = &pd->chip[i];
+   if (cd->of_chip == np)
+   return i;
+   }
+   return -ENODEV;
+}
+
+int dsa_switch_register(struct dsa_switch_tree *dst, struct dsa_switch *ds,
+   struct device_node *np, const char *name)
+{
+   struct dsa_platform_data *pd = dst->pd;
+   int index = dsa_find_chip_index(dst, np);
+
+   if (index < 0)
+   return index;
+
+   netdev_info(dst->master_netdev, "[%d]: detected a %s switch\n",
+   index, name);
+
+   if (dst->ds[index])
+   return -EINVAL;
+
+   ds->index = index;
+   ds->pd = &pd->chip[index];
+   ds->dst = dst;
+   dst->ds[index] = ds;
+   ds->tag_protocol = ds->drv->tag_protocol;
+   ds->master_dev = &dst->master_netdev->dev;
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(dsa_switch_register);
+
+void dsa_switch_unregister(struct dsa_switch *ds)
+{
+   struct dsa_switch_tree *dst = ds->dst;
+   int index = ds->index;
+
+#ifdef CONFIG_PM_SLEEP
+   dsa_switch_suspend(ds);
+#endif
+   dst->ds[index] = NULL;
+}
+EXPORT_SYMBOL_GPL(dsa_switch_unregister);
+
 static void dsa_shutdown(struct platform_device *pdev)
 {
 }
-- 
2.7.0
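
The registration logic above boils down to a table lookup plus a slot check. A minimal userspace sketch follows, with hypothetical types (`struct chip_data`, `struct sw`, `struct tree`) and plain pointers standing in for device tree nodes. It is not the kernel API, and unlike `dsa_switch_unregister()` the sketch takes the tree as an explicit argument.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_CHIPS 4

struct chip_data { const void *of_chip; };	/* node the chip was declared with */
struct sw { int index; };

struct tree {
	struct chip_data chip[MAX_CHIPS];
	int nr_chips;
	struct sw *ds[MAX_CHIPS];
};

/* Find the slot whose node matches np, or -ENODEV if there is none. */
int find_chip_index(const struct tree *t, const void *np)
{
	int i;

	for (i = 0; i < t->nr_chips; i++)
		if (t->chip[i].of_chip == np)
			return i;
	return -ENODEV;
}

int sw_register(struct tree *t, struct sw *s, const void *np)
{
	int index;

	assert(t != NULL && s != NULL);
	index = find_chip_index(t, np);
	if (index < 0)
		return index;
	if (t->ds[index])
		return -EINVAL;	/* slot already taken */

	s->index = index;
	t->ds[index] = s;
	return 0;
}

void sw_unregister(struct tree *t, struct sw *s)
{
	assert(t != NULL && s != NULL);
	t->ds[s->index] = NULL;
}
```

A second registration for the same node is rejected with -EINVAL, and a node that is not listed in the tree yields -ENODEV, matching the two error paths in the patch.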



[PATCH RFC v2 15/32] net: dsa: Rename DSA probe function.

2016-02-28 Thread Andrew Lunn
Rename the function that the DSA core calls to probe for the switch.
This makes the normal _probe() name available for a standard Linux
device driver probe function.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/bcm_sf2.c   | 4 ++--
 drivers/net/dsa/mv88e6060.c | 4 ++--
 drivers/net/dsa/mv88e6123.c | 4 ++--
 drivers/net/dsa/mv88e6131.c | 4 ++--
 drivers/net/dsa/mv88e6171.c | 4 ++--
 drivers/net/dsa/mv88e6352.c | 4 ++--
 6 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index fbb17e042e7b..c0c83c2f2691 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -135,7 +135,7 @@ static int bcm_sf2_sw_get_sset_count(struct dsa_switch *ds)
return BCM_SF2_STATS_SIZE;
 }
 
-static char *bcm_sf2_sw_probe(struct device *host_dev, int sw_addr)
+static char *bcm_sf2_sw_drv_probe(struct device *host_dev, int sw_addr)
 {
return "Broadcom Starfighter 2";
 }
@@ -1371,7 +1371,7 @@ static int bcm_sf2_sw_set_wol(struct dsa_switch *ds, int port,
 
 static struct dsa_switch_driver bcm_sf2_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_BRCM,
-   .probe  = bcm_sf2_sw_probe,
+   .probe  = bcm_sf2_sw_drv_probe,
.setup  = bcm_sf2_sw_setup,
.set_addr   = bcm_sf2_sw_set_addr,
.get_phy_flags  = bcm_sf2_sw_get_phy_flags,
diff --git a/drivers/net/dsa/mv88e6060.c b/drivers/net/dsa/mv88e6060.c
index faf9834fe1cc..59e8b0fb8431 100644
--- a/drivers/net/dsa/mv88e6060.c
+++ b/drivers/net/dsa/mv88e6060.c
@@ -51,7 +51,7 @@ static int reg_write(struct dsa_switch *ds, int addr, int reg, u16 val)
return __ret;   \
})
 
-static char *mv88e6060_probe(struct device *host_dev, int sw_addr)
+static char *mv88e6060_drv_probe(struct device *host_dev, int sw_addr)
 {
struct mii_bus *bus = dsa_host_dev_to_mii_bus(host_dev);
int ret;
@@ -245,7 +245,7 @@ mv88e6060_phy_write(struct dsa_switch *ds, int port, int regnum, u16 val)
 
 static struct dsa_switch_driver mv88e6060_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_TRAILER,
-   .probe  = mv88e6060_probe,
+   .probe  = mv88e6060_drv_probe,
.setup  = mv88e6060_setup,
.set_addr   = mv88e6060_set_addr,
.phy_read   = mv88e6060_phy_read,
diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index 9d39f108793b..6d6fca62e8b1 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -29,7 +29,7 @@ static const struct mv88e6xxx_switch_id mv88e6123_table[] = {
{ PORT_SWITCH_ID_6165_A2, "Marvell 88e6165 (A2)" },
 };
 
-static char *mv88e6123_probe(struct device *host_dev, int sw_addr)
+static char *mv88e6123_drv_probe(struct device *host_dev, int sw_addr)
 {
return mv88e6xxx_lookup_name(host_dev, sw_addr, mv88e6123_table,
 ARRAY_SIZE(mv88e6123_table));
@@ -103,7 +103,7 @@ static int mv88e6123_setup(struct dsa_switch *ds, struct device *dev)
 
 struct dsa_switch_driver mv88e6123_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_EDSA,
-   .probe  = mv88e6123_probe,
+   .probe  = mv88e6123_drv_probe,
.setup  = mv88e6123_setup,
.set_addr   = mv88e6xxx_set_addr_indirect,
.phy_read   = mv88e6xxx_phy_read,
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index 3103b4953af4..e0aa3be7f5a9 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -25,7 +25,7 @@ static const struct mv88e6xxx_switch_id mv88e6131_table[] = {
{ PORT_SWITCH_ID_6185, "Marvell 88E6185" },
 };
 
-static char *mv88e6131_probe(struct device *host_dev, int sw_addr)
+static char *mv88e6131_drv_probe(struct device *host_dev, int sw_addr)
 {
return mv88e6xxx_lookup_name(host_dev, sw_addr, mv88e6131_table,
 ARRAY_SIZE(mv88e6131_table));
@@ -160,7 +160,7 @@ mv88e6131_phy_write(struct dsa_switch *ds,
 
 struct dsa_switch_driver mv88e6131_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_DSA,
-   .probe  = mv88e6131_probe,
+   .probe  = mv88e6131_drv_probe,
.setup  = mv88e6131_setup,
.set_addr   = mv88e6xxx_set_addr_direct,
.phy_read   = mv88e6131_phy_read,
diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index 29a77366afc6..8fc4db23744e 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -24,7 +24,7 @@ static const struct mv88e6xxx_switch_id mv88e6171_table[] = {
{ PORT_SWITCH_ID_6351, "Marvell 88E6351" },
 };
 
-static char *mv88e6171_probe(struct device *host_dev, int sw_addr)
+static char 

[PATCH RFC v2 20/32] dsa: Add mdio device support to Marvell switches

2016-02-28 Thread Andrew Lunn
Allow Marvell switches to be mdio devices, which probe and then
register with the DSA framework, as component slaves.

At the same time, make them separate modules, and make mv88e6xxx a
library module.

Signed-off-by: Andrew Lunn 
---
v2: s/Copywrite/Copyright
---
 .../devicetree/bindings/net/dsa/marvell.txt|  29 +
 drivers/net/dsa/Makefile   |  19 +---
 drivers/net/dsa/mv88e6060.c| 122 ++---
 drivers/net/dsa/mv88e6123.c|  50 -
 drivers/net/dsa/mv88e6131.c|  51 -
 drivers/net/dsa/mv88e6171.c|  51 -
 drivers/net/dsa/mv88e6352.c|  53 -
 drivers/net/dsa/mv88e6xxx.c|  35 --
 8 files changed, 329 insertions(+), 81 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/net/dsa/marvell.txt

diff --git a/Documentation/devicetree/bindings/net/dsa/marvell.txt b/Documentation/devicetree/bindings/net/dsa/marvell.txt
new file mode 100644
index ..51b7cd9408f2
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/dsa/marvell.txt
@@ -0,0 +1,29 @@
+Marvell DSA Switch Device Tree Bindings
+---
+
+WARNING: This binding is currently unstable. Do not program it into a
+FLASH never to be changed again. Once this binding is stable, this
+warning will be removed.
+
+If you need a stable binding, use the old dsa.txt binding.
+
+Marvell Switches are MDIO devices. The following properties should be
+placed as a child node of an mdio device.
+
+Required properties:
+- compatible   : Should be one of "marvell,mv88e6123",
+ "marvell,mv88e6131", "marvell,mv88e6171",
+ "marvell,mv88e6352" or "marvell,mv88e6060"
+- reg  : Address on the MII bus for the switch.
+
+Example:
+
+   mdio {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   switch0: switch@1 {
+   reg = <0>;
+   compatible = "marvell,mv88e6131";
+   };
+   };
diff --git a/drivers/net/dsa/Makefile b/drivers/net/dsa/Makefile
index a6e09939be65..3e1f36120e02 100644
--- a/drivers/net/dsa/Makefile
+++ b/drivers/net/dsa/Makefile
@@ -1,16 +1,7 @@
 obj-$(CONFIG_NET_DSA_MV88E6060) += mv88e6060.o
-obj-$(CONFIG_NET_DSA_MV88E6XXX) += mv88e6xxx_drv.o
-mv88e6xxx_drv-y += mv88e6xxx.o
-ifdef CONFIG_NET_DSA_MV88E6123
-mv88e6xxx_drv-y += mv88e6123.o
-endif
-ifdef CONFIG_NET_DSA_MV88E6131
-mv88e6xxx_drv-y += mv88e6131.o
-endif
-ifdef CONFIG_NET_DSA_MV88E6352
-mv88e6xxx_drv-y += mv88e6352.o
-endif
-ifdef CONFIG_NET_DSA_MV88E6171
-mv88e6xxx_drv-y += mv88e6171.o
-endif
+obj-$(CONFIG_NET_DSA_MV88E6XXX) += mv88e6xxx.o
+obj-$(CONFIG_NET_DSA_MV88E6123) += mv88e6123.o
+obj-$(CONFIG_NET_DSA_MV88E6131) += mv88e6131.o
+obj-$(CONFIG_NET_DSA_MV88E6352) += mv88e6352.o
+obj-$(CONFIG_NET_DSA_MV88E6171) += mv88e6171.o
 obj-$(CONFIG_NET_DSA_BCM_SF2)  += bcm_sf2.o
diff --git a/drivers/net/dsa/mv88e6060.c b/drivers/net/dsa/mv88e6060.c
index 59e8b0fb8431..723273c8ff32 100644
--- a/drivers/net/dsa/mv88e6060.c
+++ b/drivers/net/dsa/mv88e6060.c
@@ -1,6 +1,7 @@
 /*
  * net/dsa/mv88e6060.c - Driver for Marvell 88e6060 switch chips
  * Copyright (c) 2008-2009 Marvell Semiconductor
+ * Copyright (c) 2015 Andrew Lunn 
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -8,12 +9,14 @@
  * (at your option) any later version.
  */
 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "mv88e6060.h"
 
@@ -51,14 +54,10 @@ static int reg_write(struct dsa_switch *ds, int addr, int reg, u16 val)
return __ret;   \
})
 
-static char *mv88e6060_drv_probe(struct device *host_dev, int sw_addr)
+static char *mv88e6060_name(struct mii_bus *bus, int sw_addr)
 {
-   struct mii_bus *bus = dsa_host_dev_to_mii_bus(host_dev);
int ret;
 
-   if (bus == NULL)
-   return NULL;
-
ret = mdiobus_read(bus, sw_addr + REG_PORT(0), PORT_SWITCH_ID);
if (ret >= 0) {
if (ret == PORT_SWITCH_ID_6060)
@@ -73,6 +72,16 @@ static char *mv88e6060_drv_probe(struct device *host_dev, int sw_addr)
return NULL;
 }
 
+static char *mv88e6060_drv_probe(struct device *host_dev, int sw_addr)
+{
+   struct mii_bus *bus = dsa_host_dev_to_mii_bus(host_dev);
+
+   if (!bus)
+   return NULL;
+
+   return mv88e6060_name(bus, sw_addr);
+}
+
 static int mv88e6060_switch_reset(struct dsa_switch *ds)
 {
int i;
@@ -170,19 +179,22 @@ static int mv88e6060_setup(struct dsa_switch *ds, struct device *dev)
 {
int i;
int ret;
-   struct mv88e6060_priv *priv;
+   

[PATCH RFC v2 27/32] net: dsa: If a switch fails to probe, defer probing

2016-02-28 Thread Andrew Lunn
Switches are listed either in the device tree or in platform_data, so
they should exist. If the probe fails, defer it: the likely cause is
that the switch cannot be probed yet, not a broken device tree or
broken platform data.

Signed-off-by: Andrew Lunn 
Acked-by: Florian Fainelli 
---
 net/dsa/dsa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index f3ffc937b152..c61b7ab092f5 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -435,7 +435,7 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
if (drv == NULL) {
netdev_err(dst->master_netdev, "[%d]: could not detect attached switch\n",
   index);
-   return ERR_PTR(-EINVAL);
+   return ERR_PTR(-EPROBE_DEFER);
}
netdev_info(dst->master_netdev, "[%d]: detected a %s switch\n",
index, name);
-- 
2.7.0
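
The effect of returning a deferral code instead of a hard error can be modelled in userspace. In this sketch `DEMO_EPROBE_DEFER` mirrors the kernel-internal EPROBE_DEFER value (517, which is not exported to userspace errno.h), while `struct dev`, `demo_probe()` and `demo_probe_pass()` are hypothetical and only imitate what the driver core does with deferred devices.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the kernel-internal EPROBE_DEFER; defined locally for the sketch. */
#define DEMO_EPROBE_DEFER 517

struct dev {
	int dep_ready;	/* 1 once the thing this device depends on exists */
	int bound;	/* 1 once the probe has succeeded */
};

int demo_probe(struct dev *d)
{
	assert(d != NULL);
	if (!d->dep_ready)
		return -DEMO_EPROBE_DEFER;	/* try again later */
	d->bound = 1;
	return 0;
}

/* One pass over all unbound devices; returns how many are still deferred. */
int demo_probe_pass(struct dev *devs, int n)
{
	int i, deferred = 0;

	for (i = 0; i < n; i++) {
		if (devs[i].bound)
			continue;
		if (demo_probe(&devs[i]) == -DEMO_EPROBE_DEFER)
			deferred++;
	}
	return deferred;
}
```

With a hard error such as -EINVAL the device would stay unbound for good; with a deferral it is simply retried on a later pass, once its dependency has appeared.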



[PATCH RFC v2 16/32] dsa: mv88e6xxx: Use bus in mv88e6xxx_lookup_name()

2016-02-28 Thread Andrew Lunn
mv88e6xxx_lookup_name() returns the model name of a switch at a given
address on an MII bus. Using mii_bus to identify the bus rather than
the host device is more logical, so change the parameter.

Signed-off-by: Andrew Lunn 
---
v2: Check bus is valid before dereferencing it.
---
 drivers/net/dsa/mv88e6123.c | 4 +++-
 drivers/net/dsa/mv88e6131.c | 4 +++-
 drivers/net/dsa/mv88e6171.c | 4 +++-
 drivers/net/dsa/mv88e6352.c | 4 +++-
 drivers/net/dsa/mv88e6xxx.c | 6 +++---
 drivers/net/dsa/mv88e6xxx.h | 2 +-
 6 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index 6d6fca62e8b1..2c23762cbed8 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -31,7 +31,9 @@ static const struct mv88e6xxx_switch_id mv88e6123_table[] = {
 
 static char *mv88e6123_drv_probe(struct device *host_dev, int sw_addr)
 {
-   return mv88e6xxx_lookup_name(host_dev, sw_addr, mv88e6123_table,
+   struct mii_bus *bus = dsa_host_dev_to_mii_bus(host_dev);
+
+   return mv88e6xxx_lookup_name(bus, sw_addr, mv88e6123_table,
 ARRAY_SIZE(mv88e6123_table));
 }
 
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index e0aa3be7f5a9..02d2bca095af 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -27,7 +27,9 @@ static const struct mv88e6xxx_switch_id mv88e6131_table[] = {
 
 static char *mv88e6131_drv_probe(struct device *host_dev, int sw_addr)
 {
-   return mv88e6xxx_lookup_name(host_dev, sw_addr, mv88e6131_table,
+   struct mii_bus *bus = dsa_host_dev_to_mii_bus(host_dev);
+
+   return mv88e6xxx_lookup_name(bus, sw_addr, mv88e6131_table,
 ARRAY_SIZE(mv88e6131_table));
 }
 
diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index 8fc4db23744e..d557be12feb7 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -26,7 +26,9 @@ static const struct mv88e6xxx_switch_id mv88e6171_table[] = {
 
 static char *mv88e6171_drv_probe(struct device *host_dev, int sw_addr)
 {
-   return mv88e6xxx_lookup_name(host_dev, sw_addr, mv88e6171_table,
+   struct mii_bus *bus = dsa_host_dev_to_mii_bus(host_dev);
+
+   return mv88e6xxx_lookup_name(bus, sw_addr, mv88e6171_table,
 ARRAY_SIZE(mv88e6171_table));
 }
 
diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index 2877ad8acefa..959835d69af6 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -38,7 +38,9 @@ static const struct mv88e6xxx_switch_id mv88e6352_table[] = {
 
 static char *mv88e6352_drv_probe(struct device *host_dev, int sw_addr)
 {
-   return mv88e6xxx_lookup_name(host_dev, sw_addr, mv88e6352_table,
+   struct mii_bus *bus = dsa_host_dev_to_mii_bus(host_dev);
+
+   return mv88e6xxx_lookup_name(bus, sw_addr, mv88e6352_table,
 ARRAY_SIZE(mv88e6352_table));
 }
 
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 2e945f325db1..7f67bf47cdb6 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -2622,11 +2622,10 @@ int mv88e6xxx_get_temp_alarm(struct dsa_switch *ds, bool *alarm)
 }
 #endif /* CONFIG_NET_DSA_HWMON */
 
-char *mv88e6xxx_lookup_name(struct device *host_dev, int sw_addr,
+char *mv88e6xxx_lookup_name(struct mii_bus *bus, int sw_addr,
const struct mv88e6xxx_switch_id *table,
unsigned int num)
 {
-   struct mii_bus *bus = dsa_host_dev_to_mii_bus(host_dev);
int i, ret;
 
if (!bus)
@@ -2644,7 +2643,8 @@ char *mv88e6xxx_lookup_name(struct device *host_dev, int sw_addr,
/* Look up only the product number */
for (i = 0; i < num; ++i) {
if (table[i].id == (ret & PORT_SWITCH_ID_PROD_NUM_MASK)) {
-   dev_warn(host_dev, "unknown revision %d, using base switch 0x%x\n",
+   dev_warn(&bus->dev,
+"unknown revision %d, using base switch 0x%x\n",
 ret & PORT_SWITCH_ID_REV_MASK,
 ret & PORT_SWITCH_ID_PROD_NUM_MASK);
return table[i].name;
diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h
index 53f2fc82069b..dce72d1007c2 100644
--- a/drivers/net/dsa/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx.h
@@ -444,7 +444,7 @@ struct mv88e6xxx_hw_stat {
 };
 
 int mv88e6xxx_switch_reset(struct dsa_switch *ds, bool ppu_active);
-char *mv88e6xxx_lookup_name(struct device *host_dev, int sw_addr,
+char *mv88e6xxx_lookup_name(struct mii_bus *bus, int sw_addr,
const struct mv88e6xxx_switch_id *table,
unsigned int num);
 int mv88e6xxx_setup_ports(struct dsa_switch *ds);
-- 
2.7.0
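
The lookup that mv88e6xxx_lookup_name() performs, an exact ID match first and then a fallback to the product number alone, can be sketched without any bus access. In this userspace model the ID is passed in directly instead of being read over MDIO. The mask values and the IDs used when exercising it are assumptions chosen for illustration, and `struct switch_id` and `lookup_name()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: low 4 bits are the revision, the rest the product number. */
#define DEMO_PROD_NUM_MASK 0xfff0u
#define DEMO_REV_MASK      0x000fu

struct switch_id {
	unsigned int id;
	const char *name;
};

const char *lookup_name(unsigned int id, const struct switch_id *table,
			size_t num)
{
	size_t i;

	assert(table != NULL);

	/* First pass: exact match on product number and revision. */
	for (i = 0; i < num; i++)
		if (table[i].id == id)
			return table[i].name;

	/* Second pass: unknown revision, fall back to the base product. */
	for (i = 0; i < num; i++)
		if (table[i].id == (id & DEMO_PROD_NUM_MASK))
			return table[i].name;

	return NULL;
}
```

The fallback is why an unlisted revision of a known chip still gets a name (the kernel version additionally logs an "unknown revision" warning), while a completely unknown product returns NULL.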



[PATCH RFC v2 13/32] net: dsa: Make dsa,mii-bus optional

2016-02-28 Thread Andrew Lunn
When all the switches are devices and register with the DSA framework,
a dsa,mii-bus property is not required.

Signed-off-by: Andrew Lunn 
Acked-by: Florian Fainelli 
---
 Documentation/devicetree/bindings/net/dsa/dsa.txt |  3 +-
 net/dsa/dsa.c | 36 +--
 2 files changed, 23 insertions(+), 16 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/dsa/dsa.txt b/Documentation/devicetree/bindings/net/dsa/dsa.txt
index f99e5694a61f..625b2e5d8ae2 100644
--- a/Documentation/devicetree/bindings/net/dsa/dsa.txt
+++ b/Documentation/devicetree/bindings/net/dsa/dsa.txt
@@ -8,11 +8,12 @@ Required properties:
  Second cell is used only when cascading/chaining.
 - #size-cells  : Must be 0
 - dsa,ethernet : Should be a phandle to a valid Ethernet device node
-- dsa,mii-bus  : Should be a phandle to a valid MDIO bus device node
 
 Optional properties:
 - interrupts   : property with a value describing the switch
  interrupt number (not supported by the driver)
+- dsa,mii-bus  : Should be a phandle to a valid MDIO bus device node.
+ Required when not all switches are devices.
 
 A DSA node can contain multiple switch chips which are therefore child nodes of
 the parent DSA node. The maximum number of allowed child nodes is 4
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 73a3fd561ef3..fb6d390503e1 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -662,8 +662,8 @@ static void dsa_of_free_platform_data(struct dsa_platform_data *pd)
 static int dsa_of_probe(struct device *dev, struct dsa_platform_data *pd)
 {
struct device_node *np = dev->of_node;
-   struct device_node *child, *chip, *mdio, *ethernet, *port;
-   struct mii_bus *mdio_bus, *mdio_bus_switch;
+   struct device_node *child, *chip, *mdio, *switch_mdio, *ethernet, *port;
+   struct mii_bus *mdio_bus = NULL, *mdio_bus_switch;
struct net_device *ethernet_dev;
struct dsa_chip_data *cd;
const char *port_name;
@@ -676,12 +676,6 @@ static int dsa_of_probe(struct device *dev, struct dsa_platform_data *pd)
int ret = 0;
 
mdio = of_parse_phandle(np, "dsa,mii-bus", 0);
-   if (!mdio)
-   return -EINVAL;
-
-   mdio_bus = of_mdio_find_bus(mdio);
-   if (!mdio_bus)
-   return -EPROBE_DEFER;
 
ethernet = of_parse_phandle(np, "dsa,ethernet", 0);
if (!ethernet) {
@@ -713,11 +707,22 @@ static int dsa_of_probe(struct device *dev, struct dsa_platform_data *pd)
cd->of_node = child;
 
chip = of_parse_phandle(child, "switch", 0);
-   if (chip)
+   if (chip) {
cd->of_chip = chip;
+   } else {
+   if (!mdio)
+   return -EINVAL;
 
-   /* When assigning the host device, increment its refcount */
-   cd->host_dev = get_device(&mdio_bus->dev);
+   mdio_bus = of_mdio_find_bus(mdio);
+   if (!mdio_bus)
+   return -EPROBE_DEFER;
+
+   /*
+* When assigning the host device, increment
+* its refcount
+*/
+   cd->host_dev = get_device(&mdio_bus->dev);
+   }
 
sw_addr = of_get_property(child, "reg", NULL);
if (!sw_addr)
@@ -730,9 +735,9 @@ static int dsa_of_probe(struct device *dev, struct dsa_platform_data *pd)
if (!of_property_read_u32(child, "eeprom-length", &eeprom_len))
cd->eeprom_len = eeprom_len;
 
-   mdio = of_parse_phandle(child, "mii-bus", 0);
-   if (mdio) {
-   mdio_bus_switch = of_mdio_find_bus(mdio);
+   switch_mdio = of_parse_phandle(child, "mii-bus", 0);
+   if (switch_mdio) {
+   mdio_bus_switch = of_mdio_find_bus(switch_mdio);
if (!mdio_bus_switch) {
ret = -EPROBE_DEFER;
goto out_free_chip;
@@ -791,7 +796,8 @@ static int dsa_of_probe(struct device *dev, struct dsa_platform_data *pd)
 
/* The individual chips hold their own refcount on the mdio bus,
 * so drop ours */
-   put_device(&mdio_bus->dev);
+   if (mdio_bus)
+   put_device(&mdio_bus->dev);
 
return 0;
 
-- 
2.7.0



[PATCH RFC v2 22/32] net: dsa: Better integrate the drivers with mdio device

2016-02-28 Thread Andrew Lunn
Don't unpack the mdiodev into its bus and address values. Rather, keep
it as a core data structure for addressing. This does, however, mean
that when the driver is instantiated the old way, we have to create a
dummy mdiodev structure.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6060.c | 27 ++
 drivers/net/dsa/mv88e6060.h |  3 +-
 drivers/net/dsa/mv88e6123.c |  7 +++--
 drivers/net/dsa/mv88e6131.c |  7 +++--
 drivers/net/dsa/mv88e6171.c |  7 +++--
 drivers/net/dsa/mv88e6352.c |  7 +++--
 drivers/net/dsa/mv88e6xxx.c | 68 +++--
 drivers/net/dsa/mv88e6xxx.h |  9 ++
 8 files changed, 74 insertions(+), 61 deletions(-)

diff --git a/drivers/net/dsa/mv88e6060.c b/drivers/net/dsa/mv88e6060.c
index 723273c8ff32..27834cac422f 100644
--- a/drivers/net/dsa/mv88e6060.c
+++ b/drivers/net/dsa/mv88e6060.c
@@ -24,7 +24,8 @@ static int reg_read(struct dsa_switch *ds, int addr, int reg)
 {
struct mv88e6060_priv *priv = ds_to_priv(ds);
 
-   return mdiobus_read_nested(priv->bus, priv->sw_addr + addr, reg);
+   return mdiobus_read_nested(priv->mdiodev->bus,
+  priv->mdiodev->addr + addr, reg);
 }
 
 #define REG_READ(addr, reg)\
@@ -42,7 +43,8 @@ static int reg_write(struct dsa_switch *ds, int addr, int reg, u16 val)
 {
struct mv88e6060_priv *priv = ds_to_priv(ds);
 
-   return mdiobus_write_nested(priv->bus, priv->sw_addr + addr, reg, val);
+   return mdiobus_write_nested(priv->mdiodev->bus,
+   priv->mdiodev->addr + addr, reg, val);
 }
 
 #define REG_WRITE(addr, reg, val)  \
@@ -179,6 +181,7 @@ static int mv88e6060_setup(struct dsa_switch *ds, struct device *dev)
 {
int i;
int ret;
+   struct mdio_device *mdiodev;
struct mv88e6060_priv *priv = ds_to_priv(ds);
 
if (!priv) {
@@ -187,13 +190,18 @@ static int mv88e6060_setup(struct dsa_switch *ds, struct device *dev)
if (!priv)
return -ENOMEM;
 
+   mdiodev = devm_kzalloc(dev, sizeof(*mdiodev), GFP_KERNEL);
+   if (!mdiodev)
+   return -ENOMEM;
+
ds->priv = priv;
 
-   priv->bus = dsa_host_dev_to_mii_bus(ds->master_dev);
-   if (!priv->bus)
+   mdiodev->bus = dsa_host_dev_to_mii_bus(ds->master_dev);
+   if (!mdiodev->bus)
return -ENODEV;
 
-   priv->sw_addr = ds->pd->sw_addr;
+   mdiodev->addr = ds->pd->sw_addr;
+   priv->mdiodev = mdiodev;
}
 
ret = mv88e6060_switch_reset(ds);
@@ -280,14 +288,11 @@ static int mv88e6060_bind(struct device *dev,
 
priv = (struct mv88e6060_priv *)(ds + 1);
ds->priv = priv;
-   priv->bus = mdiodev->bus;
-   priv->sw_addr = mdiodev->addr;
-
-   get_device(>bus->dev);
+   priv->mdiodev = mdiodev;
 
ds->drv = &mv88e6060_switch_driver;
 
-   name = mv88e6060_name(priv->bus, priv->sw_addr);
+   name = mv88e6060_name(priv->mdiodev->bus, priv->mdiodev->addr);
if (!name) {
dev_err(dev, "Failed to find switch");
return -ENODEV;
@@ -303,10 +308,8 @@ static void mv88e6060_unbind(struct device *dev, struct device *master,
 void *data)
 {
struct dsa_switch *ds = dev_get_drvdata(dev);
-   struct mv88e6060_priv *priv = ds_to_priv(ds);
 
dsa_switch_unregister(ds);
-   put_device(&priv->bus->dev);
 }
 
 static const struct component_ops mv88e6060_component_ops = {
diff --git a/drivers/net/dsa/mv88e6060.h b/drivers/net/dsa/mv88e6060.h
index 10249bd16292..bf0b8d5bde11 100644
--- a/drivers/net/dsa/mv88e6060.h
+++ b/drivers/net/dsa/mv88e6060.h
@@ -115,8 +115,7 @@ struct mv88e6060_priv {
 * single address which contains two registers used for
 * indirect access to more registers.
 */
-   struct mii_bus *bus;
-   int sw_addr;
+   struct mdio_device *mdiodev;
 };
 
 #endif
diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index 4c488f9f2a34..76e88e037311 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -34,9 +34,12 @@ static const struct mv88e6xxx_switch_id mv88e6123_table[] = {
 
 static char *mv88e6123_drv_probe(struct device *host_dev, int sw_addr)
 {
-   struct mii_bus *bus = dsa_host_dev_to_mii_bus(host_dev);
+   struct mdio_device mdiodev;
 
-   return mv88e6xxx_lookup_name(bus, sw_addr, mv88e6123_table,
+   mdiodev.bus = dsa_host_dev_to_mii_bus(host_dev);
+   mdiodev.addr = sw_addr;
+
+   return mv88e6xxx_lookup_name(&mdiodev, mv88e6123_table,
 ARRAY_SIZE(mv88e6123_table));
 }
 
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index e5a4e2b11322..6696ce638bbc 100644
--- 

[PATCH RFC v2 32/32] phy: fixed: Fix removal of phys.

2016-02-28 Thread Andrew Lunn
The fixed phys delete function simply removed the fixed phy from the
internal linked list and freed the memory. It however did not
unregister the associated phy device. This meant it was still possible
to find the phy device on the mdio bus.

Make fixed_phy_del() an internal function and add
fixed_phy_unregister(), which unregisters the phy device and then uses
fixed_phy_del() to free resources.

Modify DSA to use this new API function. It then becomes possible to
unload and load driver modules with setups that use fixed phys.

Signed-off-by: Andrew Lunn 
---
 drivers/net/phy/fixed_phy.c | 10 --
 include/linux/phy_fixed.h   |  5 ++---
 net/dsa/dsa.c   |  4 +---
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/net/phy/fixed_phy.c b/drivers/net/phy/fixed_phy.c
index ab9c473d75ea..a6caf99a3904 100644
--- a/drivers/net/phy/fixed_phy.c
+++ b/drivers/net/phy/fixed_phy.c
@@ -285,7 +285,7 @@ err_regs:
 }
 EXPORT_SYMBOL_GPL(fixed_phy_add);
 
-void fixed_phy_del(int phy_addr)
+static void fixed_phy_del(int phy_addr)
 {
struct fixed_mdio_bus *fmb = &platform_fmb;
struct fixed_phy *fp, *tmp;
@@ -300,7 +300,6 @@ void fixed_phy_del(int phy_addr)
}
}
 }
-EXPORT_SYMBOL_GPL(fixed_phy_del);
 
 static int phy_fixed_addr;
 static DEFINE_SPINLOCK(phy_fixed_addr_lock);
@@ -371,6 +370,13 @@ struct phy_device *fixed_phy_register(unsigned int irq,
 }
 EXPORT_SYMBOL_GPL(fixed_phy_register);
 
+void fixed_phy_unregister(struct phy_device *phy)
+{
+   phy_device_remove(phy);
+
+   fixed_phy_del(phy->mdio.addr);
+}
+
 static int __init fixed_mdio_bus_init(void)
 {
struct fixed_mdio_bus *fmb = &platform_fmb;
diff --git a/include/linux/phy_fixed.h b/include/linux/phy_fixed.h
index 2400d2ea4f34..1d41ec44e39d 100644
--- a/include/linux/phy_fixed.h
+++ b/include/linux/phy_fixed.h
@@ -19,7 +19,7 @@ extern struct phy_device *fixed_phy_register(unsigned int irq,
 struct fixed_phy_status *status,
 int link_gpio,
 struct device_node *np);
-extern void fixed_phy_del(int phy_addr);
+extern void fixed_phy_unregister(struct phy_device *phydev);
 extern int fixed_phy_set_link_update(struct phy_device *phydev,
int (*link_update)(struct net_device *,
   struct fixed_phy_status *));
@@ -40,9 +40,8 @@ static inline struct phy_device *fixed_phy_register(unsigned 
int irq,
 {
return ERR_PTR(-ENODEV);
 }
-static inline int fixed_phy_del(int phy_addr)
+static inline void fixed_phy_unregister(struct phy_device *phydev)
 {
-   return -ENODEV;
 }
 static inline int fixed_phy_set_link_update(struct phy_device *phydev,
int (*link_update)(struct net_device *,
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index e969a43f7a21..445097927131 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -404,11 +404,9 @@ static void dsa_switch_finish_one(struct dsa_switch *ds)
if (of_phy_is_fixed_link(port_dn)) {
phydev = of_phy_find_device(port_dn);
if (phydev) {
-   int addr = phydev->mdio.addr;
-
phy_device_free(phydev);
of_node_put(port_dn);
-   fixed_phy_del(addr);
+   fixed_phy_unregister(phydev);
}
}
}
-- 
2.7.0
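
The ordering problem described in the commit message above can be modelled in user space. This is only a rough sketch under stated assumptions: struct demo_state and every demo_* function are hypothetical stand-ins rather than the kernel API, and the two arrays stand for the MDIO bus registration and the driver's internal list.

```c
#include <stdbool.h>

#define DEMO_MAX_ADDR 32

/* Hypothetical model: one flag per address for each of the two registries. */
struct demo_state {
	bool on_bus[DEMO_MAX_ADDR];  /* device is registered on the bus */
	bool in_list[DEMO_MAX_ADDR]; /* entry exists in the internal list */
};

static bool demo_addr_valid(int addr)
{
	return addr >= 0 && addr < DEMO_MAX_ADDR;
}

/* Internal helper: drops the list entry only, as the old exported call did. */
static void demo_fixed_phy_del(struct demo_state *s, int addr)
{
	if (demo_addr_valid(addr))
		s->in_list[addr] = false;
}

/* Public teardown: remove the device from the bus first, then free the entry. */
static void demo_fixed_phy_unregister(struct demo_state *s, int addr)
{
	if (!demo_addr_valid(addr))
		return;
	s->on_bus[addr] = false;
	demo_fixed_phy_del(s, addr);
}

/* Lookup as a bus user would perform it. */
static bool demo_bus_find(const struct demo_state *s, int addr)
{
	return demo_addr_valid(addr) && s->on_bus[addr];
}
```

Calling only demo_fixed_phy_del() leaves demo_bus_find() returning true for that address, which is the stale lookup the patch removes.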



[PATCH RFC v2 06/32] net: dsa: Remove allocation of driver private memory

2016-02-28 Thread Andrew Lunn
The drivers now allocate their own memory for private usage. Remove
the allocation from the core code.

Signed-off-by: Andrew Lunn 
Acked-by: Florian Fainelli 
---
 include/net/dsa.h | 1 -
 net/dsa/dsa.c | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 05067b030962..f6b8001a500f 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -213,7 +213,6 @@ struct dsa_switch_driver {
struct list_headlist;
 
enum dsa_tag_protocol   tag_protocol;
-   int priv_size;
 
/*
 * Probing and setup.
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index d6ea1f1a1a34..6c9d1d812873 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -444,7 +444,7 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
/*
 * Allocate and initialise switch state.
 */
-   ds = devm_kzalloc(parent, sizeof(*ds) + drv->priv_size, GFP_KERNEL);
+   ds = devm_kzalloc(parent, sizeof(*ds), GFP_KERNEL);
if (ds == NULL)
return ERR_PTR(-ENOMEM);
 
-- 
2.7.0



[PATCH RFC v2 12/32] net: dsa: Add slave component matches based on a phandle to the slave.

2016-02-28 Thread Andrew Lunn
Switch devices are component slaves. Such devices have a "switch"
property in the DSA device tree which is a phandle to the switch
device. Add a component match on the device node.

Signed-off-by: Andrew Lunn 
---
 net/dsa/dsa.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index f2af801554ec..73a3fd561ef3 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -886,18 +886,25 @@ static void dsa_finish_dst(struct dsa_switch_tree *dst, 
struct device *parent,
}
 }
 
+static int compare_of(struct device *dev, void *data)
+{
+   return dev->of_node == data;
+}
+
 static int dsa_probe(struct platform_device *pdev)
 {
struct dsa_platform_data *pd = pdev->dev.platform_data;
+   struct device_node *np = pdev->dev.of_node;
struct component_match *match = NULL;
+   struct device_node *chip;
struct net_device *dev;
struct dsa_switch_tree *dst;
-   int ret;
+   int i, ret;
 
pr_notice_once("Distributed Switch Architecture driver version %s\n",
   dsa_driver_version);
 
-   if (pdev->dev.of_node) {
+   if (np) {
pd = devm_kzalloc(&pdev->dev, sizeof(*pd), GFP_KERNEL);
if (!pd)
return -ENOMEM;
@@ -941,6 +948,12 @@ static int dsa_probe(struct platform_device *pdev)
 
platform_set_drvdata(pdev, dst);
 
+   for (i = 0; i < pd->nr_chips; i++) {
+   chip = pd->chip[i].of_chip;
+   if (chip)
+   component_match_add(&pdev->dev, &match, compare_of,
+   chip);
+   }
 
return component_master_add_with_match(&pdev->dev, &dsa_ops, match);
 
-- 
2.7.0
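
The matching step added by this patch can be illustrated outside the kernel. This is a minimal sketch, assuming hypothetical demo_* types in place of struct device and struct device_node: the compare callback receives a candidate device plus an opaque key and reports a match when the device's node pointer equals that key.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct device_node and struct device. */
struct demo_node {
	const char *name;
};

struct demo_device {
	struct demo_node *of_node;
};

typedef int (*demo_compare_fn)(struct demo_device *dev, void *data);

/* Same shape as compare_of() in the patch: match on node pointer identity. */
static int demo_compare_of(struct demo_device *dev, void *data)
{
	return dev->of_node == data;
}

/* Walk the candidate devices and return the first one the callback accepts. */
static struct demo_device *demo_find_match(struct demo_device *devs, size_t n,
					   demo_compare_fn compare, void *data)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (compare(&devs[i], data))
			return &devs[i];

	return NULL;
}
```

Matching on pointer identity works here because the phandle lookup and the device both refer to the same node object; no name comparison is needed.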



[PATCH RFC v2 17/32] dsa: mv88e6xxx: Add shared code for binding/unbinding a switch driver.

2016-02-28 Thread Andrew Lunn
Switch drivers are component slaves. When they are bound to a master
component, the bind function is called and resources can be reserved.
Add the shared code.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx.c | 77 ++---
 drivers/net/dsa/mv88e6xxx.h |  5 +++
 2 files changed, 71 insertions(+), 11 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 7f67bf47cdb6..4a4af245b0eb 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -4,6 +4,7 @@
  *
  * Copyright (c) 2015 CMC Electronics, Inc.
  * Added support for VLAN Table Unit operations
+ * Copyright (c) 2015 Andrew Lunn 
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -12,14 +13,16 @@
  */
 
 #include 
+#include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -2196,19 +2199,22 @@ int mv88e6xxx_setup_ports(struct dsa_switch *ds)
 
 int mv88e6xxx_setup_common(struct dsa_switch *ds, struct device *dev)
 {
-   struct mv88e6xxx_priv_state *ps;
+   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
 
-   ps = devm_kzalloc(dev, sizeof(*ps), GFP_KERNEL);
-   if (!ps)
-   return -ENOMEM;
+   if (!ps) {
+   ps = devm_kzalloc(dev, sizeof(*ps), GFP_KERNEL);
+   if (!ps)
+   return -ENOMEM;
 
-   ds->priv = ps;
-   ps->ds = ds;
-   ps->bus = dsa_host_dev_to_mii_bus(ds->master_dev);
-   if (!ps->bus)
-   return -ENODEV;
+   ds->priv = ps;
+   ps->ds = ds;
 
-   ps->sw_addr = ds->pd->sw_addr;
+   ps->bus = dsa_host_dev_to_mii_bus(ds->master_dev);
+   if (!ps->bus)
+   return -ENODEV;
+
+   ps->sw_addr = ds->pd->sw_addr;
+   }
 
mutex_init(&ps->smi_mutex);
 
@@ -2654,6 +2660,55 @@ char *mv88e6xxx_lookup_name(struct mii_bus *bus, int 
sw_addr,
return NULL;
 }
 
+int mv88e6xxx_bind(struct device *dev,
+  struct dsa_switch_tree *dst,
+  struct dsa_switch_driver *ops,
+  const struct mv88e6xxx_switch_id *table,
+  unsigned int table_size)
+{
+   struct mdio_device *mdiodev = to_mdio_device(dev);
+   struct mv88e6xxx_priv_state *ps;
+   struct device_node *np = dev->of_node;
+   struct dsa_switch *ds;
+   const char *name;
+
+   ds = devm_kzalloc(dev, sizeof(*ds) + sizeof(*ps), GFP_KERNEL);
+   if (!ds)
+   return -ENOMEM;
+
+   ps = (struct mv88e6xxx_priv_state *)(ds + 1);
+   ds->priv = ps;
+   ps->ds = ds;
+   ps->bus = mdiodev->bus;
+   ps->sw_addr = mdiodev->addr;
+
+   get_device(&ps->bus->dev);
+
+   ds->drv = ops;
+
+   name = mv88e6xxx_lookup_name(ps->bus, ps->sw_addr, table, table_size);
+   if (!name) {
+   dev_err(dev, "Failed to find switch");
+   return -ENODEV;
+   }
+
+   dev_set_drvdata(dev, ds);
+   dsa_switch_register(dst, ds, np, name);
+
+   return 0;
+}
+
+void mv88e6xxx_unbind(struct device *dev, struct device *master, void *data)
+{
+   struct dsa_switch *ds = dev_get_drvdata(dev);
+   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
+
+   dsa_switch_unregister(ds);
+   devm_kfree(dev, ds);
+
+   put_device(&ps->bus->dev);
+}
+
 static int __init mv88e6xxx_init(void)
 {
 #if IS_ENABLED(CONFIG_NET_DSA_MV88E6131)
diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h
index dce72d1007c2..a4ae99b7cfd0 100644
--- a/drivers/net/dsa/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx.h
@@ -447,6 +447,11 @@ int mv88e6xxx_switch_reset(struct dsa_switch *ds, bool 
ppu_active);
 char *mv88e6xxx_lookup_name(struct mii_bus *bus, int sw_addr,
const struct mv88e6xxx_switch_id *table,
unsigned int num);
+int mv88e6xxx_bind(struct device *dev, struct dsa_switch_tree *dst,
+  struct dsa_switch_driver *ops,
+  const struct mv88e6xxx_switch_id *table,
+  unsigned int table_size);
+void mv88e6xxx_unbind(struct device *dev, struct device *master, void *data);
 int mv88e6xxx_setup_ports(struct dsa_switch *ds);
 int mv88e6xxx_setup_common(struct dsa_switch *ds, struct device *dev);
 int mv88e6xxx_setup_global(struct dsa_switch *ds);
-- 
2.7.0
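
mv88e6xxx_bind() above places the private state directly behind the switch structure in a single allocation and reaches it with (ds + 1). Below is a user-space sketch of that layout, assuming hypothetical demo_* types and calloc() in place of devm_kzalloc(); it is an illustration, not the driver code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct dsa_switch. */
struct demo_switch {
	void *priv; /* points at the private area that follows this struct */
	int index;
};

/* Hypothetical stand-in for the driver's private state. */
struct demo_priv {
	struct demo_switch *ds; /* back-pointer to the owning switch */
	int sw_addr;
};

/*
 * Allocate the switch and its private state as one zeroed block.
 * (ds + 1) is the first byte past the switch structure. This sketch
 * assumes demo_priv needs no stricter alignment than demo_switch,
 * which holds here because both start with a pointer.
 */
static struct demo_switch *demo_switch_alloc(int sw_addr)
{
	struct demo_switch *ds;
	struct demo_priv *ps;

	ds = calloc(1, sizeof(*ds) + sizeof(*ps));
	if (!ds)
		return NULL;

	ps = (struct demo_priv *)(ds + 1);
	ds->priv = ps;
	ps->ds = ds;
	ps->sw_addr = sw_addr;

	return ds;
}

/* One allocation means one free releases both structures. */
static void demo_switch_free(struct demo_switch *ds)
{
	free(ds);
}
```

Because both structures share one allocation, they also share one lifetime, which is why a single free of ds is enough on the unbind path.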



[PATCH RFC v2 07/32] net: dsa: Keep the mii bus and address in the private structure

2016-02-28 Thread Andrew Lunn
Rather than looking up the mii bus and address every time, do it once
at setup, and keep it in the private structure.

Signed-off-by: Andrew Lunn 
Acked-by: Florian Fainelli 
---
v2: Add check for valid mii_bus.
---
 drivers/net/dsa/mv88e6060.c | 27 +--
 drivers/net/dsa/mv88e6060.h | 11 +++
 drivers/net/dsa/mv88e6xxx.c | 19 +--
 drivers/net/dsa/mv88e6xxx.h |  6 ++
 4 files changed, 43 insertions(+), 20 deletions(-)

diff --git a/drivers/net/dsa/mv88e6060.c b/drivers/net/dsa/mv88e6060.c
index 34bc374882c7..faf9834fe1cc 100644
--- a/drivers/net/dsa/mv88e6060.c
+++ b/drivers/net/dsa/mv88e6060.c
@@ -19,12 +19,9 @@
 
 static int reg_read(struct dsa_switch *ds, int addr, int reg)
 {
-   struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev);
+   struct mv88e6060_priv *priv = ds_to_priv(ds);
 
-   if (bus == NULL)
-   return -EINVAL;
-
-   return mdiobus_read_nested(bus, ds->pd->sw_addr + addr, reg);
+   return mdiobus_read_nested(priv->bus, priv->sw_addr + addr, reg);
 }
 
 #define REG_READ(addr, reg)\
@@ -40,12 +37,9 @@ static int reg_read(struct dsa_switch *ds, int addr, int reg)
 
 static int reg_write(struct dsa_switch *ds, int addr, int reg, u16 val)
 {
-   struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev);
-
-   if (bus == NULL)
-   return -EINVAL;
+   struct mv88e6060_priv *priv = ds_to_priv(ds);
 
-   return mdiobus_write_nested(bus, ds->pd->sw_addr + addr, reg, val);
+   return mdiobus_write_nested(priv->bus, priv->sw_addr + addr, reg, val);
 }
 
 #define REG_WRITE(addr, reg, val)  \
@@ -176,6 +170,19 @@ static int mv88e6060_setup(struct dsa_switch *ds, struct 
device *dev)
 {
int i;
int ret;
+   struct mv88e6060_priv *priv;
+
+   priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+   if (!priv)
+   return -ENOMEM;
+
+   ds->priv = priv;
+
+   priv->bus = dsa_host_dev_to_mii_bus(ds->master_dev);
+   if (!priv->bus)
+   return -ENODEV;
+
+   priv->sw_addr = ds->pd->sw_addr;
 
ret = mv88e6060_switch_reset(ds);
if (ret < 0)
diff --git a/drivers/net/dsa/mv88e6060.h b/drivers/net/dsa/mv88e6060.h
index cc9b2ed4aff4..10249bd16292 100644
--- a/drivers/net/dsa/mv88e6060.h
+++ b/drivers/net/dsa/mv88e6060.h
@@ -108,4 +108,15 @@
 #define GLOBAL_ATU_MAC_23  0x0e
 #define GLOBAL_ATU_MAC_45  0x0f
 
+struct mv88e6060_priv {
+   /* MDIO bus and address on bus to use. When in single chip
+* mode, address is 0, and the switch uses multiple addresses
+* on the bus.  When in multi-chip mode, the switch uses a
+* single address which contains two registers used for
+* indirect access to more registers.
+*/
+   struct mii_bus *bus;
+   int sw_addr;
+};
+
 #endif
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index c2d26168fdf8..2e945f325db1 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -94,15 +94,12 @@ static int __mv88e6xxx_reg_read(struct mii_bus *bus, int 
sw_addr, int addr,
 
 static int _mv88e6xxx_reg_read(struct dsa_switch *ds, int addr, int reg)
 {
-   struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev);
+   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
int ret;
 
assert_smi_lock(ds);
 
-   if (bus == NULL)
-   return -EINVAL;
-
-   ret = __mv88e6xxx_reg_read(bus, ds->pd->sw_addr, addr, reg);
+   ret = __mv88e6xxx_reg_read(ps->bus, ps->sw_addr, addr, reg);
if (ret < 0)
return ret;
 
@@ -159,17 +156,14 @@ static int __mv88e6xxx_reg_write(struct mii_bus *bus, int 
sw_addr, int addr,
 static int _mv88e6xxx_reg_write(struct dsa_switch *ds, int addr, int reg,
u16 val)
 {
-   struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev);
+   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
 
assert_smi_lock(ds);
 
-   if (bus == NULL)
-   return -EINVAL;
-
dev_dbg(ds->master_dev, "-> addr: 0x%.2x reg: 0x%.2x val: 0x%.4x\n",
addr, reg, val);
 
-   return __mv88e6xxx_reg_write(bus, ds->pd->sw_addr, addr, reg, val);
+   return __mv88e6xxx_reg_write(ps->bus, ps->sw_addr, addr, reg, val);
 }
 
 int mv88e6xxx_reg_write(struct dsa_switch *ds, int addr, int reg, u16 val)
@@ -2210,6 +2204,11 @@ int mv88e6xxx_setup_common(struct dsa_switch *ds, struct 
device *dev)
 
ds->priv = ps;
ps->ds = ds;
+   ps->bus = dsa_host_dev_to_mii_bus(ds->master_dev);
+   if (!ps->bus)
+   return -ENODEV;
+
+   ps->sw_addr = ds->pd->sw_addr;
 
mutex_init(&ps->smi_mutex);
 
diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h
index 8a13162353cd..53f2fc82069b 100644
--- 

[PATCH RFC v2 24/32] net: dsa: Add some debug prints for error cases

2016-02-28 Thread Andrew Lunn
Due to the complexity it can be hard to know why DSA fails to probe.
Add some debug prints for the common error cases.

Signed-off-by: Andrew Lunn 
Acked-by: Florian Fainelli 
---
 net/dsa/dsa.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 0be85a14a835..9acfbe7c34f7 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -679,6 +679,7 @@ static int dsa_of_probe(struct device *dev, struct 
dsa_platform_data *pd)
 
ethernet = of_parse_phandle(np, "dsa,ethernet", 0);
if (!ethernet) {
+   dev_dbg(dev, "Missing mandatory dsa,ethernet property\n");
ret = -EINVAL;
goto out_put_mdio;
}
@@ -710,8 +711,10 @@ static int dsa_of_probe(struct device *dev, struct 
dsa_platform_data *pd)
if (chip) {
cd->of_chip = chip;
} else {
-   if (!mdio)
+   if (!mdio) {
+   dev_dbg(dev, "Missing required dsa,mii-bus property\n");
return -EINVAL;
+   }
 
mdio_bus = of_mdio_find_bus(mdio);
if (!mdio_bus)
@@ -1018,14 +1021,18 @@ int dsa_switch_register(struct dsa_switch_tree *dst, 
struct dsa_switch *ds,
struct dsa_platform_data *pd = dst->pd;
int index = dsa_find_chip_index(dst, np);
 
-   if (index < 0)
+   if (index < 0) {
+   netdev_dbg(dst->master_netdev, "Registration for unknown switch\n");
return index;
+   }
 
netdev_info(dst->master_netdev, "[%d]: detected a %s switch\n",
index, name);
 
-   if (dst->ds[index])
+   if (dst->ds[index]) {
+   netdev_dbg(dst->master_netdev, "Device already registered\n");
return -EINVAL;
+   }
 
ds->index = index;
ds->pd = &pd->chip[index];
-- 
2.7.0



[PATCH RFC v2 18/32] dsa: mv88e6xxx: Prepare for turning this into a library module

2016-02-28 Thread Andrew Lunn
Export all the functions so that we can later turn the module into a
library module.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx.c | 46 +
 1 file changed, 46 insertions(+)

diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 4a4af245b0eb..b0838bf77fd9 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -123,6 +123,7 @@ int mv88e6xxx_reg_read(struct dsa_switch *ds, int addr, int 
reg)
 
return ret;
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_reg_read);
 
 static int __mv88e6xxx_reg_write(struct mii_bus *bus, int sw_addr, int addr,
 int reg, u16 val)
@@ -180,6 +181,7 @@ int mv88e6xxx_reg_write(struct dsa_switch *ds, int addr, 
int reg, u16 val)
 
return ret;
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_reg_write);
 
 int mv88e6xxx_set_addr_direct(struct dsa_switch *ds, u8 *addr)
 {
@@ -189,6 +191,7 @@ int mv88e6xxx_set_addr_direct(struct dsa_switch *ds, u8 
*addr)
 
return 0;
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_set_addr_direct);
 
 int mv88e6xxx_set_addr_indirect(struct dsa_switch *ds, u8 *addr)
 {
@@ -214,6 +217,7 @@ int mv88e6xxx_set_addr_indirect(struct dsa_switch *ds, u8 
*addr)
 
return 0;
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_set_addr_indirect);
 
 static int _mv88e6xxx_phy_read(struct dsa_switch *ds, int addr, int regnum)
 {
@@ -339,6 +343,7 @@ void mv88e6xxx_ppu_state_init(struct dsa_switch *ds)
ps->ppu_timer.data = (unsigned long)ps;
ps->ppu_timer.function = mv88e6xxx_ppu_reenable_timer;
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_ppu_state_init);
 
 int mv88e6xxx_phy_read_ppu(struct dsa_switch *ds, int addr, int regnum)
 {
@@ -352,6 +357,7 @@ int mv88e6xxx_phy_read_ppu(struct dsa_switch *ds, int addr, 
int regnum)
 
return ret;
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_phy_read_ppu);
 
 int mv88e6xxx_phy_write_ppu(struct dsa_switch *ds, int addr,
int regnum, u16 val)
@@ -366,6 +372,7 @@ int mv88e6xxx_phy_write_ppu(struct dsa_switch *ds, int addr,
 
return ret;
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_phy_write_ppu);
 #endif
 
 static bool mv88e6xxx_6065_family(struct dsa_switch *ds)
@@ -546,6 +553,7 @@ void mv88e6xxx_adjust_link(struct dsa_switch *ds, int port,
 out:
mutex_unlock(&ps->smi_mutex);
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_adjust_link);
 
 static int _mv88e6xxx_stats_wait(struct dsa_switch *ds)
 {
@@ -742,6 +750,7 @@ void mv88e6xxx_get_strings(struct dsa_switch *ds, int port, 
uint8_t *data)
}
}
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_get_strings);
 
 int mv88e6xxx_get_sset_count(struct dsa_switch *ds)
 {
@@ -755,6 +764,7 @@ int mv88e6xxx_get_sset_count(struct dsa_switch *ds)
}
return j;
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_get_sset_count);
 
 void
 mv88e6xxx_get_ethtool_stats(struct dsa_switch *ds,
@@ -782,11 +792,13 @@ mv88e6xxx_get_ethtool_stats(struct dsa_switch *ds,
 
mutex_unlock(&ps->smi_mutex);
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_get_ethtool_stats);
 
 int mv88e6xxx_get_regs_len(struct dsa_switch *ds, int port)
 {
return 32 * sizeof(u16);
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_get_regs_len);
 
 void mv88e6xxx_get_regs(struct dsa_switch *ds, int port,
struct ethtool_regs *regs, void *_p)
@@ -806,6 +818,7 @@ void mv88e6xxx_get_regs(struct dsa_switch *ds, int port,
p[i] = ret;
}
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_get_regs);
 
 static int _mv88e6xxx_wait(struct dsa_switch *ds, int reg, int offset,
   u16 mask)
@@ -849,12 +862,14 @@ int mv88e6xxx_eeprom_load_wait(struct dsa_switch *ds)
return mv88e6xxx_wait(ds, REG_GLOBAL2, GLOBAL2_EEPROM_OP,
  GLOBAL2_EEPROM_OP_LOAD);
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_eeprom_load_wait);
 
 int mv88e6xxx_eeprom_busy_wait(struct dsa_switch *ds)
 {
return mv88e6xxx_wait(ds, REG_GLOBAL2, GLOBAL2_EEPROM_OP,
  GLOBAL2_EEPROM_OP_BUSY);
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_eeprom_busy_wait);
 
 static int _mv88e6xxx_atu_wait(struct dsa_switch *ds)
 {
@@ -921,6 +936,7 @@ out:
mutex_unlock(&ps->smi_mutex);
return reg;
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_get_eee);
 
 int mv88e6xxx_set_eee(struct dsa_switch *ds, int port,
  struct phy_device *phydev, struct ethtool_eee *e)
@@ -947,6 +963,7 @@ out:
 
return ret;
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_set_eee);
 
 static int _mv88e6xxx_atu_cmd(struct dsa_switch *ds, u16 cmd)
 {
@@ -1134,6 +1151,7 @@ int mv88e6xxx_port_stp_update(struct dsa_switch *ds, int 
port, u8 state)
 
return 0;
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_port_stp_update);
 
 static int _mv88e6xxx_port_pvid_get(struct dsa_switch *ds, int port, u16 *pvid)
 {
@@ -1160,6 +1178,7 @@ int mv88e6xxx_port_pvid_get(struct dsa_switch *ds, int 
port, u16 *pvid)
 
return 0;
 }
+EXPORT_SYMBOL_GPL(mv88e6xxx_port_pvid_get);
 
 static int _mv88e6xxx_port_pvid_set(struct 

[PATCH RFC v2 11/32] net: dsa: Keep a reference to the switch device for component matching

2016-02-28 Thread Andrew Lunn
The switch devices are component slaves. Such devices have a "switch"
property in the DSA device tree which is a phandle to the switch
device. When the slaves bind they register to DSA. So the DSA can
match the switch to the correct switch instance in the device tree,
keep a reference to the device tree node during parsing of the device
tree.

Signed-off-by: Andrew Lunn 
---
 Documentation/devicetree/bindings/net/dsa/dsa.txt | 2 ++
 include/net/dsa.h | 3 +++
 net/dsa/dsa.c | 6 +-
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/net/dsa/dsa.txt 
b/Documentation/devicetree/bindings/net/dsa/dsa.txt
index 5fdbbcdf8c4b..f99e5694a61f 100644
--- a/Documentation/devicetree/bindings/net/dsa/dsa.txt
+++ b/Documentation/devicetree/bindings/net/dsa/dsa.txt
@@ -27,6 +27,8 @@ Each of these switch child nodes should have the following 
required properties:
 
 A switch child node has the following optional property:
 
+- switch   : A phandle to a switch device.
+
 - eeprom-length: Set to the length of an EEPROM connected to 
the
  switch. Must be set if the switch can not detect
  the presence and/or size of a connected EEPROM,
diff --git a/include/net/dsa.h b/include/net/dsa.h
index f6b8001a500f..ea4cfdf1b549 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -48,6 +48,9 @@ struct dsa_chip_data {
 */
struct device_node *of_node;
 
+   /* Device tree node pointer for the switch chip device. */
+   struct device_node *of_chip;
+
/*
 * The names of the switch's ports.  Use "cpu" to
 * designate the switch port that the cpu is connected to,
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index a139c35061a1..f2af801554ec 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -662,7 +662,7 @@ static void dsa_of_free_platform_data(struct 
dsa_platform_data *pd)
 static int dsa_of_probe(struct device *dev, struct dsa_platform_data *pd)
 {
struct device_node *np = dev->of_node;
-   struct device_node *child, *mdio, *ethernet, *port;
+   struct device_node *child, *chip, *mdio, *ethernet, *port;
struct mii_bus *mdio_bus, *mdio_bus_switch;
struct net_device *ethernet_dev;
struct dsa_chip_data *cd;
@@ -712,6 +712,10 @@ static int dsa_of_probe(struct device *dev, struct 
dsa_platform_data *pd)
 
cd->of_node = child;
 
+   chip = of_parse_phandle(child, "switch", 0);
+   if (chip)
+   cd->of_chip = chip;
+
/* When assigning the host device, increment its refcount */
cd->host_dev = get_device(&mdio_bus->dev);
 
-- 
2.7.0



[PATCH RFC v2 03/32] dsa: Make setup and finish more symmetrical

2016-02-28 Thread Andrew Lunn
Rename and reposition dsa_switch_destroy() to dsa_switch_finish_one()
to make it clear it is the opposite of dsa_switch_setup_one().

Signed-off-by: Andrew Lunn 
---
 net/dsa/dsa.c | 90 +--
 1 file changed, 45 insertions(+), 45 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index be68c00b8cfd..6e9176545dda 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -374,6 +374,50 @@ out:
return ret;
 }
 
+static void dsa_switch_finish_one(struct dsa_switch *ds)
+{
+   struct device_node *port_dn;
+   struct phy_device *phydev;
+   struct dsa_chip_data *cd = ds->pd;
+   int port;
+
+#ifdef CONFIG_NET_DSA_HWMON
+   if (ds->hwmon_dev)
+   hwmon_device_unregister(ds->hwmon_dev);
+#endif
+
+   /* Disable configuration of the CPU and DSA ports */
+   for (port = 0; port < DSA_MAX_PORTS; port++) {
+   if (!(dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port)))
+   continue;
+
+   port_dn = cd->port_dn[port];
+   if (of_phy_is_fixed_link(port_dn)) {
+   phydev = of_phy_find_device(port_dn);
+   if (phydev) {
+   int addr = phydev->mdio.addr;
+
+   phy_device_free(phydev);
+   of_node_put(port_dn);
+   fixed_phy_del(addr);
+   }
+   }
+   }
+
+   /* Destroy network devices for physical switch ports. */
+   for (port = 0; port < DSA_MAX_PORTS; port++) {
+   if (!(ds->phys_port_mask & (1 << port)))
+   continue;
+
+   if (!ds->ports[port])
+   continue;
+
+   dsa_slave_destroy(ds->ports[port]);
+   }
+
+   mdiobus_unregister(ds->slave_mii_bus);
+}
+
 static struct dsa_switch *
 dsa_switch_setup(struct dsa_switch_tree *dst, int index,
 struct device *parent, struct device *host_dev)
@@ -418,50 +462,6 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
return ds;
 }
 
-static void dsa_switch_destroy(struct dsa_switch *ds)
-{
-   struct device_node *port_dn;
-   struct phy_device *phydev;
-   struct dsa_chip_data *cd = ds->pd;
-   int port;
-
-#ifdef CONFIG_NET_DSA_HWMON
-   if (ds->hwmon_dev)
-   hwmon_device_unregister(ds->hwmon_dev);
-#endif
-
-   /* Disable configuration of the CPU and DSA ports */
-   for (port = 0; port < DSA_MAX_PORTS; port++) {
-   if (!(dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port)))
-   continue;
-
-   port_dn = cd->port_dn[port];
-   if (of_phy_is_fixed_link(port_dn)) {
-   phydev = of_phy_find_device(port_dn);
-   if (phydev) {
-   int addr = phydev->mdio.addr;
-
-   phy_device_free(phydev);
-   of_node_put(port_dn);
-   fixed_phy_del(addr);
-   }
-   }
-   }
-
-   /* Destroy network devices for physical switch ports. */
-   for (port = 0; port < DSA_MAX_PORTS; port++) {
-   if (!(ds->phys_port_mask & (1 << port)))
-   continue;
-
-   if (!ds->ports[port])
-   continue;
-
-   dsa_slave_destroy(ds->ports[port]);
-   }
-
-   mdiobus_unregister(ds->slave_mii_bus);
-}
-
 #ifdef CONFIG_PM_SLEEP
 static int dsa_switch_suspend(struct dsa_switch *ds)
 {
@@ -928,7 +928,7 @@ static void dsa_remove_dst(struct dsa_switch_tree *dst)
struct dsa_switch *ds = dst->ds[i];
 
if (ds)
-   dsa_switch_destroy(ds);
+   dsa_switch_finish_one(ds);
}
 
dev_put(dst->master_netdev);
-- 
2.7.0



[PATCH RFC v2 10/32] net: dsa: Add basic support for component master support

2016-02-28 Thread Andrew Lunn
Start using the component framework. The DSA device will be the
master, and the switch drivers will be slaves. Add basic component
master support to the DSA framework.

Signed-off-by: Andrew Lunn 
---
 net/dsa/dsa.c | 54 --
 1 file changed, 40 insertions(+), 14 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 5062ca91852d..a139c35061a1 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -9,6 +9,7 @@
  * (at your option) any later version.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -28,7 +29,7 @@
 #include "dsa_priv.h"
 
 char dsa_driver_version[] = "0.1";
-
+static const struct component_master_ops dsa_ops;
 
 /* switch driver registration ***/
 static DEFINE_MUTEX(dsa_switch_drivers_mutex);
@@ -820,13 +821,12 @@ static inline void dsa_of_remove(struct device *dev,
 #endif
 
 static int dsa_setup_dst(struct dsa_switch_tree *dst, struct net_device *dev,
-struct device *parent, struct dsa_platform_data *pd)
+struct device *parent)
 {
int i;
unsigned configured = 0;
+   struct dsa_platform_data *pd = dst->pd;
 
-   dst->pd = pd;
-   dst->master_netdev = dev;
dst->cpu_switch = -1;
dst->cpu_port = -1;
 
@@ -885,6 +885,7 @@ static void dsa_finish_dst(struct dsa_switch_tree *dst, 
struct device *parent,
 static int dsa_probe(struct platform_device *pdev)
 {
struct dsa_platform_data *pd = pdev->dev.platform_data;
+   struct component_match *match = NULL;
struct net_device *dev;
struct dsa_switch_tree *dst;
int ret;
@@ -931,15 +932,13 @@ static int dsa_probe(struct platform_device *pdev)
goto out;
}
 
+   dst->pd = pd;
+   dst->master_netdev = dev;
+
platform_set_drvdata(pdev, dst);
 
-   ret = dsa_setup_dst(dst, dev, &pdev->dev, pd);
-   if (ret) {
-   dev_put(dev);
-   goto out;
-   }
 
-   return 0;
+   return component_master_add_with_match(&pdev->dev, &dsa_ops, match);
 
 out:
dsa_of_remove(&pdev->dev, pd);
@@ -947,22 +946,49 @@ out:
return ret;
 }
 
-static int dsa_remove(struct platform_device *pdev)
+static int dsa_bind(struct device *dev)
 {
+   struct dsa_switch_tree *dst = dev_get_drvdata(dev);
+   int ret;
+
+   ret = component_bind_all(dev, dst);
+   if (ret)
+   return ret;
+
+   return dsa_setup_dst(dst, dst->master_netdev, dev);
+}
+
+static void dsa_unbind(struct device *dev)
+{
+   struct platform_device *pdev = to_platform_device(dev);
struct dsa_switch_tree *dst = platform_get_drvdata(pdev);
struct dsa_platform_data *pd = pdev->dev.platform_data;
 
dsa_finish_dst(dst, &pdev->dev, pd);
-   dsa_of_remove(&pdev->dev, pd);
-   dev_put(dst->master_netdev);
 
-   return 0;
+   dev_put(dst->master_netdev);
 }
 
+static const struct component_master_ops dsa_ops = {
+   .bind = dsa_bind,
+   .unbind = dsa_unbind,
+};
+
 static void dsa_shutdown(struct platform_device *pdev)
 {
 }
 
+static int dsa_remove(struct platform_device *pdev)
+{
+   struct dsa_platform_data *pd = pdev->dev.platform_data;
+
+   component_master_del(&pdev->dev, &dsa_ops);
+
+   dsa_of_remove(&pdev->dev, pd);
+
+   return 0;
+}
+
 static int dsa_switch_rcv(struct sk_buff *skb, struct net_device *dev,
  struct packet_type *pt, struct net_device *orig_dev)
 {
-- 
2.7.0
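
The master/slave arrangement introduced above can be pictured with a small user-space model. This is a simplified sketch under stated assumptions: struct demo_master and all demo_* functions are hypothetical and do not reproduce the kernel component API. It models one property only, namely that the master's bind callback runs once every component it waits for has registered, and that its unbind callback runs when one of them goes away.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_COMPONENTS 4

/* Hypothetical, simplified model of a component master. */
struct demo_master {
	const void *wanted[DEMO_MAX_COMPONENTS]; /* keys the master waits for */
	bool present[DEMO_MAX_COMPONENTS];       /* which keys have registered */
	size_t nr_wanted;
	bool bound;
	int bind_calls;
	int unbind_calls;
	int (*bind)(struct demo_master *m);
	void (*unbind)(struct demo_master *m);
};

/* Record one component the master must see before it can bind. */
static int demo_match_add(struct demo_master *m, const void *key)
{
	if (m->nr_wanted >= DEMO_MAX_COMPONENTS)
		return -1;
	m->wanted[m->nr_wanted++] = key;
	return 0;
}

/* Bind the master only when every wanted component is present. */
static void demo_try_bind(struct demo_master *m)
{
	size_t i;

	if (m->bound)
		return;
	for (i = 0; i < m->nr_wanted; i++)
		if (!m->present[i])
			return;
	if (m->bind(m) == 0)
		m->bound = true;
}

/* A component (a switch driver in DSA terms) announces itself. */
static void demo_component_add(struct demo_master *m, const void *key)
{
	size_t i;

	for (i = 0; i < m->nr_wanted; i++)
		if (m->wanted[i] == key)
			m->present[i] = true;
	demo_try_bind(m);
}

/* A component goes away: tear the master down before forgetting it. */
static void demo_component_del(struct demo_master *m, const void *key)
{
	size_t i;

	if (m->bound) {
		m->unbind(m);
		m->bound = false;
	}
	for (i = 0; i < m->nr_wanted; i++)
		if (m->wanted[i] == key)
			m->present[i] = false;
}

/* Example callbacks that only count invocations. */
static int demo_dsa_bind(struct demo_master *m)
{
	m->bind_calls++;
	return 0;
}

static void demo_dsa_unbind(struct demo_master *m)
{
	m->unbind_calls++;
}
```

In this model the order in which the switch drivers register does not matter; the master binds when the last one appears.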



Re: [PATCH] net/9p: convert to new CQ API

2016-02-28 Thread Christoph Hellwig
On Sun, Feb 28, 2016 at 01:03:47PM +0200, Sagi Grimberg wrote:
>
>> Trivial conversion to the new RDMA CQ API.
>
> Looks nice and simple :)
>
> But I think that the fact that CQ processing is now
> done in soft-IRQ (which is an improvement!) needs to
> be documented.

Yeah, I meant to do that, but ended up being lazy as usual :)


[PATCH/RFC v6 net-next] ravb: Add dma queue interrupt support

2016-02-28 Thread Yoshihiro Kaneko
From: Kazuya Mizuguchi 

This patch supports the following interrupts.

- One interrupt for multiple (timestamp, error, gPTP)
- One interrupt for emac
- Four interrupts for dma queue (best effort rx/tx, network control rx/tx)

This patch improves the efficiency of interrupt handling by adding an
interrupt handler for each interrupt source described above.
Additionally, it reduces the number of accesses to the EthernetAVB IF.
This patch also stops the driver from depending on the whim of a boot loader.

[ykaneko0...@gmail.com: define bit names of registers]
[ykaneko0...@gmail.com: add comment for gen3 only registers]
[ykaneko0...@gmail.com: fix coding style]
[ykaneko0...@gmail.com: update changelog]
[ykaneko0...@gmail.com: gen3: fix initialization of interrupts]
[ykaneko0...@gmail.com: gen3: fix clearing interrupts]
[ykaneko0...@gmail.com: gen3: add helper function for request_irq()]
[ykaneko0...@gmail.com: gen3: remove IRQF_SHARED flag for request_irq()]
[ykaneko0...@gmail.com: revert ravb_close() and ravb_ptp_stop()]
[ykaneko0...@gmail.com: avoid calling free_irq() to non-hooked interrupts]
[ykaneko0...@gmail.com: make NC/BE interrupt handler a function]
[ykaneko0...@gmail.com: make timestamp interrupt handler a function]
[ykaneko0...@gmail.com: timestamp interrupt is handled in multiple
 interrupt handler instead of dma queue interrupt handler]
Signed-off-by: Kazuya Mizuguchi 
Signed-off-by: Yoshihiro Kaneko 
---

This patch is based on the master branch of David Miller's next networking
tree.

v6 [Yoshihiro Kaneko]
* As suggested by Sergei Shtylyov
  drivers/net/ethernet/renesas/ravb_main.c:
- rename ravb_nc_be_interrupt to ravb_queue_interrupt, change the type
   of return value to 'bool', rename ravb_queue to 'q'
- stop use of 'for' loop for queue interrupt in ravb_interrupt()
- fix comment for ravb_multi_interrupt()
- rename ravb_dmaq_interrupt to ravb_rx_tx_interrupt
- move timestamp interrupt handler into ravb_multi_interrupt()
- make timestamp interrupt handler a function
- rename out_free_irq2 label to out_free_irq_nc_tx
- remove IRQF_SHARED flag for request_irq()
  drivers/net/ethernet/renesas/ravb_ptp.c:
- fix coding style

v5 [Yoshihiro Kaneko]
* As suggested by Sergei Shtylyov
  drivers/net/ethernet/renesas/ravb_main.c:
- stop copying ravb_queue parameter in ravb_dmaq_interrupt()
- clear TFUF instead of disabling
- factored out NC/BE interrupt handler
- rename hook_irq() to ravb_hook_irq()
- add calling free_irq() for the EMAC IRQ
- stop using a loop for free_irq() to avoid calling free_irq() for
  non-hooked interrupt handlers
- add test for failure of devm_kasprintf in ravb_hook_irq()
- update changelog

v4 [Yoshihiro Kaneko]
* compile tested only
* As suggested by Sergei Shtylyov
  drivers/net/ethernet/renesas/ravb.h:
- make two lines of comment into one line.
- remove unused definition of xxx_ALL.
  drivers/net/ethernet/renesas/ravb_main.c:
- remove unrelated change (fix indentation).
- output warning messages when napi_schedule_prep() fails in ravb_dmaq_
  interrupt() like ravb_interrupt().
- change the function name from req_irq to hook_irq.
- fix programming error in hook_irq().
- do free_irq() for rx_irqs[] and tx_irqs[] for only gen3 in out_free_
  irq label in ravb_open().

v3 [Yoshihiro Kaneko]
* compile tested only
* As suggested by Sergei Shtylyov
  - update changelog
  drivers/net/ethernet/renesas/ravb.h:
- add comments to the additional registers like CIE
  drivers/net/ethernet/renesas/ravb_main.c:
- fix the initialization of the interrupt in ravb_dmac_init()
- revert ravb_error_interrupt() because gen3 code is wrong
- change the comment "Management" in ravb_multi_interrupt()
- add a helper function for request_irq() in ravb_open()
- revert ravb_close() because atomicity is not necessary here
  drivers/net/ethernet/renesas/ravb_ptp.c:
- revert ravb_ptp_stop() because atomicity is not necessary here

v2 [Yoshihiro Kaneko]
* compile tested only
* As suggested by Sergei Shtylyov
  - add comment to CIE
  - remove comments from CIE bits
  - fix value of TIx_ALL
  - define each bits for CIE, GIE, GID, RIE0, RID0, RIE2, RID2, TIE, TID
  - reversed Christmas tree declaration ordered
  - rename _ravb_emac_interrupt() to ravb_emac_interrupt_unlocked()
  - remove unnecessary clearing of CIE
  - use a bit name corresponding to the target register, RIE0, RIE2, TIE,
TID, RID2, GID, GIE

 drivers/net/ethernet/renesas/ravb.h  | 204 +++
 drivers/net/ethernet/renesas/ravb_main.c | 274 ++-
 drivers/net/ethernet/renesas/ravb_ptp.c  |  17 +-
 3 files changed, 447 insertions(+), 48 deletions(-)

diff --git a/drivers/net/ethernet/renesas/ravb.h 
b/drivers/net/ethernet/renesas/ravb.h
index b2160d1..5c16241 

[PATCH net-next V1 04/10] net/mlx5e: Move common case counters within sq_stats struct

2016-02-28 Thread Saeed Mahameed
From: Tariq Toukan 

For data cache locality considerations, we moved the nop and
csum_offload_inner within sq_stats struct as they are more
commonly accessed in xmit path.

Signed-off-by: Tariq Toukan 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h |   10 ++
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h 
b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 4511984..b289660 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -260,26 +260,28 @@ static const char sq_stats_strings[][ETH_GSTRING_LEN] = {
"tso_bytes",
"tso_inner_packets",
"tso_inner_bytes",
-   "csum_offload_none",
"csum_offload_inner",
+   "nop",
+   "csum_offload_none",
"stopped",
"wake",
"dropped",
-   "nop"
 };
 
 struct mlx5e_sq_stats {
+   /* commonly accessed in data path */
u64 packets;
u64 tso_packets;
u64 tso_bytes;
u64 tso_inner_packets;
u64 tso_inner_bytes;
-   u64 csum_offload_none;
u64 csum_offload_inner;
+   u64 nop;
+   /* less likely accessed in data path */
+   u64 csum_offload_none;
u64 stopped;
u64 wake;
u64 dropped;
-   u64 nop;
 #define NUM_SQ_STATS 11
 };
 
-- 
1.7.1
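
A rough way to see the effect of this reordering is to compute which cache
line each counter lands in. The sketch below is not driver code: it mirrors
the eleven u64 counters from the patch in two user-space structs and assumes
a 64-byte cache line and a struct that starts on a line boundary. The names
sq_stats_old, sq_stats_new and cache_line_of() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ASSUMED_CACHE_LINE 64 /* assumption: 64-byte cache lines */

/* Field order before the patch. */
struct sq_stats_old {
	uint64_t packets;
	uint64_t tso_packets;
	uint64_t tso_bytes;
	uint64_t tso_inner_packets;
	uint64_t tso_inner_bytes;
	uint64_t csum_offload_none;
	uint64_t csum_offload_inner;
	uint64_t stopped;
	uint64_t wake;
	uint64_t dropped;
	uint64_t nop;
};

/* Field order after the patch. */
struct sq_stats_new {
	uint64_t packets;
	uint64_t tso_packets;
	uint64_t tso_bytes;
	uint64_t tso_inner_packets;
	uint64_t tso_inner_bytes;
	uint64_t csum_offload_inner;
	uint64_t nop;
	uint64_t csum_offload_none;
	uint64_t stopped;
	uint64_t wake;
	uint64_t dropped;
};

/* Index of the cache line holding a field at the given byte offset,
 * assuming the struct itself begins on a cache-line boundary.
 */
static size_t cache_line_of(size_t offset)
{
	assert(offset < 11 * sizeof(uint64_t));
	return offset / ASSUMED_CACHE_LINE;
}
```

Under these assumptions, cache_line_of(offsetof(struct sq_stats_old, nop))
falls in the second line, while in the new layout nop shares the first line
with packets and the TSO counters.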



[PATCH net-next V1 09/10] net/mlx5: Fix global UAR mapping

2016-02-28 Thread Saeed Mahameed
From: Moshe Lazer 

Remove the global WC mapping of the total UARs since
UAR mapping should be decided per UAR (e.g. we want
different mappings for EQs, CQs vs QPs).

We use ARCH_HAS_IOREMAP_WC to know whether the current arch supports
WC (write-combining) IO memory mapping. If it is not supported,
"uar->bf_map" will be NULL, so we will use the NC (non-cached)
mapping "uar->map".

Fixes: 88a85f99e51f ('TX latency optimization to save DMA reads')
Signed-off-by: Moshe Lazer 
Reviewed-by: Achiad Shochat 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/main.c |   28 +---
 drivers/net/ethernet/mellanox/mlx5/core/uar.c  |   12 +-
 include/linux/mlx5/driver.h|2 -
 3 files changed, 7 insertions(+), 35 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c 
b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 1545a94..8b7133d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -767,22 +767,6 @@ static int mlx5_core_set_issi(struct mlx5_core_dev *dev)
return -ENOTSUPP;
 }
 
-static int map_bf_area(struct mlx5_core_dev *dev)
-{
-   resource_size_t bf_start = pci_resource_start(dev->pdev, 0);
-   resource_size_t bf_len = pci_resource_len(dev->pdev, 0);
-
-   dev->priv.bf_mapping = io_mapping_create_wc(bf_start, bf_len);
-
-   return dev->priv.bf_mapping ? 0 : -ENOMEM;
-}
-
-static void unmap_bf_area(struct mlx5_core_dev *dev)
-{
-   if (dev->priv.bf_mapping)
-   io_mapping_free(dev->priv.bf_mapping);
-}
-
 static void mlx5_add_device(struct mlx5_interface *intf, struct mlx5_priv 
*priv)
 {
struct mlx5_device_context *dev_ctx;
@@ -1103,14 +1087,9 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, 
struct mlx5_priv *priv)
goto err_stop_eqs;
}
 
-   if (map_bf_area(dev))
-   dev_err(>dev, "Failed to map blue flame area\n");
-
err = mlx5_irq_set_affinity_hints(dev);
-   if (err) {
+   if (err)
dev_err(>dev, "Failed to alloc affinity hint cpumask\n");
-   goto err_unmap_bf_area;
-   }
 
MLX5_INIT_DOORBELL_LOCK(>cq_uar_lock);
 
@@ -1169,10 +1148,6 @@ err_fs:
mlx5_cleanup_qp_table(dev);
mlx5_cleanup_cq_table(dev);
mlx5_irq_clear_affinity_hints(dev);
-
-err_unmap_bf_area:
-   unmap_bf_area(dev);
-
free_comp_eqs(dev);
 
 err_stop_eqs:
@@ -1242,7 +1217,6 @@ static int mlx5_unload_one(struct mlx5_core_dev *dev, 
struct mlx5_priv *priv)
mlx5_cleanup_qp_table(dev);
mlx5_cleanup_cq_table(dev);
mlx5_irq_clear_affinity_hints(dev);
-   unmap_bf_area(dev);
free_comp_eqs(dev);
mlx5_stop_eqs(dev);
mlx5_free_uuars(dev, >uuari);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/uar.c 
b/drivers/net/ethernet/mellanox/mlx5/core/uar.c
index eb05c84..d287bcb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/uar.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/uar.c
@@ -246,11 +246,11 @@ int mlx5_alloc_map_uar(struct mlx5_core_dev *mdev, struct 
mlx5_uar *uar)
err = -ENOMEM;
goto err_free_uar;
}
-
-   if (mdev->priv.bf_mapping)
-   uar->bf_map = io_mapping_map_wc(mdev->priv.bf_mapping,
-   uar->index << PAGE_SHIFT);
-
+#ifdef ARCH_HAS_IOREMAP_WC
+   uar->bf_map = ioremap_wc(pfn << PAGE_SHIFT, PAGE_SIZE);
+   if (!uar->bf_map)
+   mlx5_core_warn(mdev, "ioremap_wc() failed\n");
+#endif
return 0;
 
 err_free_uar:
@@ -262,7 +262,7 @@ EXPORT_SYMBOL(mlx5_alloc_map_uar);
 
 void mlx5_unmap_free_uar(struct mlx5_core_dev *mdev, struct mlx5_uar *uar)
 {
-   io_mapping_unmap(uar->bf_map);
+   iounmap(uar->bf_map);
iounmap(uar->map);
mlx5_cmd_free_uar(mdev, uar->index);
 }
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 3388a43..335d43a 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -460,8 +460,6 @@ struct mlx5_priv {
struct mlx5_uuar_info   uuari;
MLX5_DECLARE_DOORBELL_LOCK(cq_uar_lock);
 
-   struct io_mapping   *bf_mapping;
-
/* pages stuff */
struct workqueue_struct *pg_wq;
struct rb_root  page_root;
-- 
1.7.1
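
The fallback described in the commit message (use the WC mapping when it
exists, otherwise the NC one) can be sketched in user space as below. This
is only an illustration: struct uar_sketch and uar_doorbell_addr() are
hypothetical names, and plain pointers stand in for the __iomem mappings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two mappings kept per UAR. */
struct uar_sketch {
	void *map;    /* NC mapping, always present */
	void *bf_map; /* WC mapping, NULL when WC is unavailable */
};

/* Prefer the WC mapping; fall back to the NC mapping otherwise. */
static void *uar_doorbell_addr(const struct uar_sketch *uar)
{
	assert(uar != NULL && uar->map != NULL);
	return uar->bf_map ? uar->bf_map : uar->map;
}
```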



[PATCH net-next V1 06/10] net/mlx5e: Don't try to modify CQ moderation if it is not supported

2016-02-28 Thread Saeed Mahameed
From: Gal Pressman 

If CQ moderation is not supported by the device, print a warning on
netdevice load, and return error when trying to modify/query cq
moderation via ethtool.

Fixes: f62b8bb8f2d3 ('net/mlx5: Extend mlx5_core to support ConnectX-4
Ethernet functionality')
Signed-off-by: Gal Pressman 

Signed-off-by: Saeed Mahameed 
---
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |6 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |   12 ++--
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 577b4b1..a1b3bb4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -400,6 +400,9 @@ static int mlx5e_get_coalesce(struct net_device *netdev,
 {
struct mlx5e_priv *priv = netdev_priv(netdev);
 
+   if (!MLX5_CAP_GEN(priv->mdev, cq_moderation))
+   return -ENOTSUPP;
+
coal->rx_coalesce_usecs   = priv->params.rx_cq_moderation_usec;
coal->rx_max_coalesced_frames = priv->params.rx_cq_moderation_pkts;
coal->tx_coalesce_usecs   = priv->params.tx_cq_moderation_usec;
@@ -417,6 +420,9 @@ static int mlx5e_set_coalesce(struct net_device *netdev,
int tc;
int i;
 
+   if (!MLX5_CAP_GEN(mdev, cq_moderation))
+   return -ENOTSUPP;
+
priv->params.tx_cq_moderation_usec = coal->tx_coalesce_usecs;
priv->params.tx_cq_moderation_pkts = coal->tx_max_coalesced_frames;
priv->params.rx_cq_moderation_usec = coal->rx_coalesce_usecs;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 30fd971..b20a35b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -870,12 +870,10 @@ static int mlx5e_open_cq(struct mlx5e_channel *c,
if (err)
goto err_destroy_cq;
 
-   err = mlx5_core_modify_cq_moderation(mdev, >mcq,
-moderation_usecs,
-moderation_frames);
-   if (err)
-   goto err_destroy_cq;
-
+   if (MLX5_CAP_GEN(mdev, cq_moderation))
+   mlx5_core_modify_cq_moderation(mdev, >mcq,
+  moderation_usecs,
+  moderation_frames);
return 0;
 
 err_destroy_cq:
@@ -2218,6 +2216,8 @@ static int mlx5e_check_required_hca_cap(struct 
mlx5_core_dev *mdev)
}
if (!MLX5_CAP_ETH(mdev, self_lb_en_modifiable))
mlx5_core_warn(mdev, "Self loop back prevention is not 
supported\n");
+   if (!MLX5_CAP_GEN(mdev, cq_moderation))
+   mlx5_core_warn(mdev, "CQ moderation is not supported\n");
 
return 0;
 }
-- 
1.7.1



[PATCH net-next V1 10/10] net/mlx5: Avoid double mapping of io mapped memory

2016-02-28 Thread Saeed Mahameed
From: Moshe Lazer 

A device page may be mapped as either non-cached (NC) or write-combining (WC).
The code before this fix tries to map it as both WC and NC,
contrary to what is stated in Intel's software developer manual.

Fixes: 88a85f99e51f ('TX latency optimization to save DMA reads')
Signed-off-by: Moshe Lazer 
Reviewed-by: Achiad Shochat 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |   16 --
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |   12 +--
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   |2 +-
 drivers/net/ethernet/mellanox/mlx5/core/uar.c |   33 +---
 include/linux/mlx5/driver.h   |3 +-
 5 files changed, 38 insertions(+), 28 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h 
b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index b289660..9c0e80e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -388,6 +388,7 @@ struct mlx5e_sq_dma {
 
 enum {
MLX5E_SQ_STATE_WAKE_TXQ_ENABLE,
+   MLX5E_SQ_STATE_BF_ENABLE,
 };
 
 struct mlx5e_sq {
@@ -416,7 +417,6 @@ struct mlx5e_sq {
struct mlx5_wq_cyc wq;
u32dma_fifo_mask;
void __iomem  *uar_map;
-   void __iomem  *uar_bf_map;
struct netdev_queue   *txq;
u32sqn;
u16bf_buf_size;
@@ -664,16 +664,12 @@ static inline void mlx5e_tx_notify_hw(struct mlx5e_sq *sq,
 * doorbell
 */
wmb();
-
-   if (bf_sz) {
-   __iowrite64_copy(sq->uar_bf_map + ofst, >ctrl, bf_sz);
-
-   /* flush the write-combining mapped buffer */
-   wmb();
-
-   } else {
+   if (bf_sz)
+   __iowrite64_copy(sq->uar_map + ofst, >ctrl, bf_sz);
+   else
mlx5_write64((__be32 *)>ctrl, sq->uar_map + ofst, NULL);
-   }
+   /* flush the write-combining mapped buffer */
+   wmb();
 
sq->bf_offset ^= sq->bf_buf_size;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index b20a35b..5063c0e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -548,7 +548,7 @@ static int mlx5e_create_sq(struct mlx5e_channel *c,
int txq_ix;
int err;
 
-   err = mlx5_alloc_map_uar(mdev, >uar);
+   err = mlx5_alloc_map_uar(mdev, >uar, true);
if (err)
return err;
 
@@ -560,8 +560,12 @@ static int mlx5e_create_sq(struct mlx5e_channel *c,
goto err_unmap_free_uar;
 
sq->wq.db   = >wq.db[MLX5_SND_DBR];
-   sq->uar_map = sq->uar.map;
-   sq->uar_bf_map  = sq->uar.bf_map;
+   if (sq->uar.bf_map) {
+   set_bit(MLX5E_SQ_STATE_BF_ENABLE, >state);
+   sq->uar_map = sq->uar.bf_map;
+   } else {
+   sq->uar_map = sq->uar.map;
+   }
sq->bf_buf_size = (1 << MLX5_CAP_GEN(mdev, log_bf_reg_size)) / 2;
sq->max_inline  = param->max_inline;
 
@@ -2418,7 +2422,7 @@ static void *mlx5e_create_netdev(struct mlx5_core_dev 
*mdev)
 
priv = netdev_priv(netdev);
 
-   err = mlx5_alloc_map_uar(mdev, >cq_uar);
+   err = mlx5_alloc_map_uar(mdev, >cq_uar, false);
if (err) {
mlx5_core_err(mdev, "alloc_map uar failed, %d\n", err);
goto err_free_netdev;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index a05c070..c34f4f3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -303,7 +303,7 @@ static netdev_tx_t mlx5e_sq_xmit(struct mlx5e_sq *sq, 
struct sk_buff *skb)
if (!skb->xmit_more || netif_xmit_stopped(sq->txq)) {
int bf_sz = 0;
 
-   if (bf && sq->uar_bf_map)
+   if (bf && test_bit(MLX5E_SQ_STATE_BF_ENABLE, >state))
bf_sz = wi->num_wqebbs << 3;
 
cseg->fm_ce_se = MLX5_WQE_CTRL_CQ_UPDATE;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/uar.c 
b/drivers/net/ethernet/mellanox/mlx5/core/uar.c
index d287bcb..512f9cb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/uar.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/uar.c
@@ -226,7 +226,8 @@ int mlx5_free_uuars(struct mlx5_core_dev *dev, struct 
mlx5_uuar_info *uuari)
return 0;
 }
 
-int mlx5_alloc_map_uar(struct mlx5_core_dev *mdev, struct mlx5_uar *uar)
+int mlx5_alloc_map_uar(struct mlx5_core_dev *mdev, struct mlx5_uar *uar,
+  bool map_wc)
 {
phys_addr_t pfn;
phys_addr_t uar_bar_start;
@@ -240,20 +241,28 @@ int mlx5_alloc_map_uar(struct mlx5_core_dev *mdev, 

[PATCH net-next V1 07/10] net/mlx5e: Don't modify CQ before it was created

2016-02-28 Thread Saeed Mahameed
From: Gal Pressman 

Calling mlx5e_set_coalesce while the interface is down will result in
modifying CQs that don't exist.

Fixes: f62b8bb8f2d3 ('net/mlx5: Extend mlx5_core to support ConnectX-4
Ethernet functionality')
Signed-off-by: Gal Pressman 

Signed-off-by: Saeed Mahameed 
---
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index a1b3bb4..0959656 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -423,11 +423,15 @@ static int mlx5e_set_coalesce(struct net_device *netdev,
if (!MLX5_CAP_GEN(mdev, cq_moderation))
return -ENOTSUPP;
 
+   mutex_lock(>state_lock);
priv->params.tx_cq_moderation_usec = coal->tx_coalesce_usecs;
priv->params.tx_cq_moderation_pkts = coal->tx_max_coalesced_frames;
priv->params.rx_cq_moderation_usec = coal->rx_coalesce_usecs;
priv->params.rx_cq_moderation_pkts = coal->rx_max_coalesced_frames;
 
+   if (!test_bit(MLX5E_STATE_OPENED, >state))
+   goto out;
+
for (i = 0; i < priv->params.num_channels; ++i) {
c = priv->channel[i];
 
@@ -443,6 +447,8 @@ static int mlx5e_set_coalesce(struct net_device *netdev,
   coal->rx_max_coalesced_frames);
}
 
+out:
+   mutex_unlock(>state_lock);
return 0;
 }
 
-- 
1.7.1
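
The pattern in this fix (always cache the new settings under the state
lock, but touch the hardware only when the interface is opened) can be
modelled in user space as follows. It is a sketch under assumptions: a
pthread mutex stands in for state_lock, a counter stands in for the
per-channel CQ modification, and all names ending in _sketch are
hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct priv_sketch {
	pthread_mutex_t state_lock;
	bool opened;             /* models MLX5E_STATE_OPENED */
	unsigned int tx_usecs;   /* cached coalescing parameter */
	unsigned int hw_updates; /* counts simulated CQ modifications */
};

static int set_coalesce_sketch(struct priv_sketch *p, unsigned int tx_usecs)
{
	assert(p != NULL);

	pthread_mutex_lock(&p->state_lock);
	/* Always remember the setting so a later open can apply it. */
	p->tx_usecs = tx_usecs;

	/* No CQs exist while the interface is down: skip the hardware. */
	if (!p->opened)
		goto out;

	p->hw_updates++; /* stands in for modifying each channel's CQs */
out:
	pthread_mutex_unlock(&p->state_lock);
	return 0;
}
```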



[PATCH net-next V1 08/10] net/mlx5: Make command timeout way shorter

2016-02-28 Thread Saeed Mahameed
From: Or Gerlitz 

The command timeout is terribly long, whole two hours. Make it 60s so if
things do go wrong, the user gets feedback in relatively short time, so
they can take corrective actions and/or investigate using tools and such.

Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Or Gerlitz 
Signed-off-by: Leon Romanovsky 
Signed-off-by: Saeed Mahameed 
---
 include/linux/mlx5/driver.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index a815da9..3388a43 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -54,7 +54,7 @@ enum {
/* one minute for the sake of bringup. Generally, commands must always
 * complete and we may need to increase this timeout value
 */
-   MLX5_CMD_TIMEOUT_MSEC   = 7200 * 1000,
+   MLX5_CMD_TIMEOUT_MSEC   = 60 * 1000,
MLX5_CMD_WQ_MAX_NAME= 32,
 };
 
-- 
1.7.1



[PATCH net-next V1 02/10] net/mlx5e: Placement changed for carrier state updates

2016-02-28 Thread Saeed Mahameed
From: Tariq Toukan 

It is more proper to declare carrier state UP only after the channels
are ready for traffic.

Signed-off-by: Tariq Toukan 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 38944b8..013be09 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1451,8 +1451,8 @@ int mlx5e_open_locked(struct net_device *netdev)
goto err_close_channels;
}
 
-   mlx5e_update_carrier(priv);
mlx5e_redirect_rqts(priv);
+   mlx5e_update_carrier(priv);
mlx5e_timestamp_init(priv);
 
schedule_delayed_work(>update_stats_work, 0);
@@ -1491,8 +1491,8 @@ int mlx5e_close_locked(struct net_device *netdev)
clear_bit(MLX5E_STATE_OPENED, >state);
 
mlx5e_timestamp_cleanup(priv);
-   mlx5e_redirect_rqts(priv);
netif_carrier_off(priv->netdev);
+   mlx5e_redirect_rqts(priv);
mlx5e_close_channels(priv);
 
return 0;
-- 
1.7.1



[PATCH net-next V1 00/10] mlx5 driver updates

2016-02-28 Thread Saeed Mahameed
Hi Dave, 

This series includes some bug fixes and updates for the mlx5 core
and ethernet driver.

From Gal, two fixes that protect the CQ moderation update flows
when they are not allowed.

From Moshe, two fixes for the core and ethernet driver in
non-cached (NC) and write-combining (WC) buffer mappings,
which prevent the driver from double-mapping memory.

From Or, reduce the firmware command completion timeout.

From Tariq, several small trivial fixes.

Changes from v0:
- "Fix global UAR mapping" commit messages updated to explain 
ARCH_HAS_IOREMAP_WC usage.
- rebased to commit 8d3f2806f8fb 'Merge branch ethtool-ksettings'

Thanks,
Saeed

Gal Pressman (2):
  net/mlx5e: Don't try to modify CQ moderation if it is not supported
  net/mlx5e: Don't modify CQ before it was created

Moshe Lazer (2):
  net/mlx5: Fix global UAR mapping
  net/mlx5: Avoid double mapping of io mapped memory

Or Gerlitz (1):
  net/mlx5: Make command timeout way shorter

Tariq Toukan (5):
  net/mlx5e: Replace async events spinlock with synchronize_irq()
  net/mlx5e: Placement changed for carrier state updates
  net/mlx5e: Changed naming convention of tx queues in ethtool stats
  net/mlx5e: Move common case counters within sq_stats struct
  net/mlx5e: Set drop RQ's necessary parameters only

 drivers/net/ethernet/mellanox/mlx5/core/en.h   |   27 -
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |   27 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |   64 ++-
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c|2 +-
 drivers/net/ethernet/mellanox/mlx5/core/eq.c   |5 ++
 drivers/net/ethernet/mellanox/mlx5/core/main.c |   28 +
 .../net/ethernet/mellanox/mlx5/core/mlx5_core.h|1 +
 drivers/net/ethernet/mellanox/mlx5/core/uar.c  |   31 ++
 include/linux/mlx5/driver.h|7 +-
 9 files changed, 97 insertions(+), 95 deletions(-)



[PATCH net-next V1 01/10] net/mlx5e: Replace async events spinlock with synchronize_irq()

2016-02-28 Thread Saeed Mahameed
From: Tariq Toukan 

We only need to flush the irq handler to make sure it does not
queue a work into the global work queue after we start to flush it.
So using synchronize_irq() is more appropriate than a spin lock.

Signed-off-by: Tariq Toukan 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h   |1 -
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |   24 ++-
 drivers/net/ethernet/mellanox/mlx5/core/eq.c   |5 
 .../net/ethernet/mellanox/mlx5/core/mlx5_core.h|1 +
 4 files changed, 14 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h 
b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 1dca3dc..4511984 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -555,7 +555,6 @@ struct mlx5e_priv {
struct mlx5e_vxlan_db  vxlan;
 
struct mlx5e_paramsparams;
-   spinlock_t async_events_spinlock; /* sync hw events */
struct work_struct update_carrier_work;
struct work_struct set_rx_mode_work;
struct delayed_workupdate_stats_work;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 0d45f35..38944b8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -275,9 +275,14 @@ static void mlx5e_update_stats_work(struct work_struct 
*work)
mutex_unlock(>state_lock);
 }
 
-static void __mlx5e_async_event(struct mlx5e_priv *priv,
-   enum mlx5_dev_event event)
+static void mlx5e_async_event(struct mlx5_core_dev *mdev, void *vpriv,
+ enum mlx5_dev_event event, unsigned long param)
 {
+   struct mlx5e_priv *priv = vpriv;
+
+   if (!test_bit(MLX5E_STATE_ASYNC_EVENTS_ENABLE, >state))
+   return;
+
switch (event) {
case MLX5_DEV_EVENT_PORT_UP:
case MLX5_DEV_EVENT_PORT_DOWN:
@@ -289,17 +294,6 @@ static void __mlx5e_async_event(struct mlx5e_priv *priv,
}
 }
 
-static void mlx5e_async_event(struct mlx5_core_dev *mdev, void *vpriv,
- enum mlx5_dev_event event, unsigned long param)
-{
-   struct mlx5e_priv *priv = vpriv;
-
-   spin_lock(>async_events_spinlock);
-   if (test_bit(MLX5E_STATE_ASYNC_EVENTS_ENABLE, >state))
-   __mlx5e_async_event(priv, event);
-   spin_unlock(>async_events_spinlock);
-}
-
 static void mlx5e_enable_async_events(struct mlx5e_priv *priv)
 {
set_bit(MLX5E_STATE_ASYNC_EVENTS_ENABLE, >state);
@@ -307,9 +301,8 @@ static void mlx5e_enable_async_events(struct mlx5e_priv 
*priv)
 
 static void mlx5e_disable_async_events(struct mlx5e_priv *priv)
 {
-   spin_lock_irq(>async_events_spinlock);
clear_bit(MLX5E_STATE_ASYNC_EVENTS_ENABLE, >state);
-   spin_unlock_irq(>async_events_spinlock);
+   synchronize_irq(mlx5_get_msix_vec(priv->mdev, MLX5_EQ_VEC_ASYNC));
 }
 
 #define MLX5E_HW2SW_MTU(hwmtu) (hwmtu - (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN))
@@ -2290,7 +2283,6 @@ static void mlx5e_build_netdev_priv(struct mlx5_core_dev 
*mdev,
mlx5e_ets_init(priv);
 #endif
 
-   spin_lock_init(>async_events_spinlock);
mutex_init(>state_lock);
 
INIT_WORK(>update_carrier_work, mlx5e_update_carrier_work);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c 
b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 647a3ca..18fccec 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -442,6 +442,11 @@ int mlx5_destroy_unmap_eq(struct mlx5_core_dev *dev, 
struct mlx5_eq *eq)
 }
 EXPORT_SYMBOL_GPL(mlx5_destroy_unmap_eq);
 
+u32 mlx5_get_msix_vec(struct mlx5_core_dev *dev, int vecidx)
+{
+   return dev->priv.msix_arr[vecidx].vector;
+}
+
 int mlx5_eq_init(struct mlx5_core_dev *dev)
 {
int err;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h 
b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index 0336847..0b0b226 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -99,6 +99,7 @@ int mlx5_core_enable_hca(struct mlx5_core_dev *dev, u16 
func_id);
 int mlx5_core_disable_hca(struct mlx5_core_dev *dev, u16 func_id);
 int mlx5_wait_for_vf_pages(struct mlx5_core_dev *dev);
 cycle_t mlx5_read_internal_timer(struct mlx5_core_dev *dev);
+u32 mlx5_get_msix_vec(struct mlx5_core_dev *dev, int vecidx);
 
 void mlx5e_init(void);
 void mlx5e_cleanup(void);
-- 
1.7.1



[PATCH net-next V1 05/10] net/mlx5e: Set drop RQ's necessary parameters only

2016-02-28 Thread Saeed Mahameed
From: Tariq Toukan 

By its role, there is no need to set all the other parameters
for the drop RQ.

Signed-off-by: Tariq Toukan 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |   12 ++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 013be09..30fd971 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1064,6 +1064,15 @@ static void mlx5e_build_rq_param(struct mlx5e_priv *priv,
param->wq.linear = 1;
 }
 
+static void mlx5e_build_drop_rq_param(struct mlx5e_rq_param *param)
+{
+   void *rqc = param->rqc;
+   void *wq = MLX5_ADDR_OF(rqc, rqc, wq);
+
+   MLX5_SET(wq, wq, wq_type, MLX5_WQ_TYPE_LINKED_LIST);
+   MLX5_SET(wq, wq, log_wq_stride,ilog2(sizeof(struct mlx5e_rx_wqe)));
+}
+
 static void mlx5e_build_sq_param(struct mlx5e_priv *priv,
 struct mlx5e_sq_param *param)
 {
@@ -1574,8 +1583,7 @@ static int mlx5e_open_drop_rq(struct mlx5e_priv *priv)
 
memset(_param, 0, sizeof(cq_param));
memset(_param, 0, sizeof(rq_param));
-   mlx5e_build_rx_cq_param(priv, _param);
-   mlx5e_build_rq_param(priv, _param);
+   mlx5e_build_drop_rq_param(_param);
 
err = mlx5e_create_drop_cq(priv, cq, _param);
if (err)
-- 
1.7.1



[PATCH net-next V1 03/10] net/mlx5e: Changed naming convention of tx queues in ethtool stats

2016-02-28 Thread Saeed Mahameed
From: Tariq Toukan 

Instead of the pair (channel, tc), we now use a single number that
goes over all tx queues of a TC, for all TCs.

Signed-off-by: Tariq Toukan 
Signed-off-by: Saeed Mahameed 
---
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |   15 ---
 1 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index e9760f8..577b4b1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -211,13 +211,14 @@ static void mlx5e_get_strings(struct net_device *dev,
sprintf(data + (idx++) * ETH_GSTRING_LEN,
"rx%d_%s", i, rq_stats_strings[j]);
 
-   for (i = 0; i < priv->params.num_channels; i++)
-   for (tc = 0; tc < priv->params.num_tc; tc++)
+   for (tc = 0; tc < priv->params.num_tc; tc++)
+   for (i = 0; i < priv->params.num_channels; i++)
for (j = 0; j < NUM_SQ_STATS; j++)
sprintf(data +
-   (idx++) * ETH_GSTRING_LEN,
-   "tx%d_%d_%s", i, tc,
-   sq_stats_strings[j]);
+ (idx++) * ETH_GSTRING_LEN,
+ "tx%d_%s",
+ priv->channeltc_to_txq_map[i][tc],
+ sq_stats_strings[j]);
break;
}
 }
@@ -249,8 +250,8 @@ static void mlx5e_get_ethtool_stats(struct net_device *dev,
>state) ? 0 :
   ((u64 *)>channel[i]->rq.stats)[j];
 
-   for (i = 0; i < priv->params.num_channels; i++)
-   for (tc = 0; tc < priv->params.num_tc; tc++)
+   for (tc = 0; tc < priv->params.num_tc; tc++)
+   for (i = 0; i < priv->params.num_channels; i++)
for (j = 0; j < NUM_SQ_STATS; j++)
data[idx++] = !test_bit(MLX5E_STATE_OPENED,
>state) ? 0 :
-- 
1.7.1
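
To illustrate the new naming, the sketch below flattens a (channel, tc)
pair into one tx queue number and formats a stat name from it. The formula
tc * num_channels + channel is an assumption about what
channeltc_to_txq_map holds; the patch itself only shows the lookup.
txq_index_sketch() and tx_stat_name_sketch() are hypothetical helpers.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed flattening: all queues of TC 0 first, then TC 1, and so on. */
static int txq_index_sketch(int ch, int tc, int num_channels)
{
	assert(num_channels > 0 && ch >= 0 && ch < num_channels && tc >= 0);
	return tc * num_channels + ch;
}

/* Format "tx<N>_<stat>" and return the length snprintf() reports. */
static int tx_stat_name_sketch(char *buf, size_t len, int ch, int tc,
			       int num_channels, const char *stat)
{
	return snprintf(buf, len, "tx%d_%s",
			txq_index_sketch(ch, tc, num_channels), stat);
}
```

With four channels, channel 1 of TC 2 would be reported under a name
starting with "tx9_" where the old scheme used "tx1_2_".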



Re: [PATCH/RFC v5 net-next] ravb: Add dma queue interrupt support

2016-02-28 Thread Yoshihiro Kaneko
2016-02-22 4:05 GMT+09:00 Sergei Shtylyov :
> On 02/14/2016 10:39 PM, Yoshihiro Kaneko wrote:
>
>> From: Kazuya Mizuguchi 
>>
>> This patch supports the following interrupts.
>>
>> - One interrupt for multiple (error, gPTP)
>> - One interrupt for emac
>> - Four interrupts for dma queue (best effort rx/tx, network control rx/tx)
>>
>> This patch improve efficiency of the interrupt handler by adding the
>> interrupt handler corresponding to each interrupt source described
>> above. Additionally, it reduces the number of times of the access to
>> EthernetAVB IF.
>> Also this patch prevent this driver depends on the whim of a boot loader.
>>
>> [ykaneko0...@gmail.com: define bit names of registers]
>> [ykaneko0...@gmail.com: add comment for gen3 only registers]
>> [ykaneko0...@gmail.com: fix coding style]
>> [ykaneko0...@gmail.com: update changelog]
>> [ykaneko0...@gmail.com: gen3: fix initialization of interrupts]
>> [ykaneko0...@gmail.com: gen3: fix clearing interrupts]
>> [ykaneko0...@gmail.com: gen3: add helper function for request_irq()]
>> [ykaneko0...@gmail.com: revert ravb_close() and ravb_ptp_stop()]
>> [ykaneko0...@gmail.com: avoid calling free_irq() to non-hooked interrupts]
>> [ykaneko0...@gmail.com: make NC/BE interrupt handler a function]
>> Signed-off-by: Kazuya Mizuguchi 
>> Signed-off-by: Yoshihiro Kaneko 
>
>
> [...]
>
>> diff --git a/drivers/net/ethernet/renesas/ravb_main.c
>> b/drivers/net/ethernet/renesas/ravb_main.c
>> index c936682..1bec71e 100644
>> --- a/drivers/net/ethernet/renesas/ravb_main.c
>> +++ b/drivers/net/ethernet/renesas/ravb_main.c
>
> [...]
>>
>> @@ -697,6 +726,39 @@ static void ravb_error_interrupt(struct net_device
>> *ndev)
>> }
>>   }
>>
>> +static int ravb_nc_be_interrupt(struct net_device *ndev, int ravb_queue,
>
>
>I'd call this function e.g. ravb_queue_interrupt(). And make it return
> 'bool' or even 'irqreturn_t' directly. And I'd suggest a shorter name for
> the 'ravb_queue' parameter, like 'queue' or even 'q'...

Agreed.

>
>> +   u32 ris0, u32 *ric0, u32 tis, u32 *tic)
>
>
>You don't seem to need 'ric0' and 'tic' past the call sites, so no real
> need to pass them by reference.

When Rx/Tx interrupt for NC and BE is issued at the same time,
this function is called twice (for NC, BE) from ravb_interrupt.
The interrupt mask of NC set in the first call will be reset in the next
call for BE. So it is necessary to keep the modified value of "ric0" and
"tic".

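A minimal model of the point above, assuming two queues and a fake register
in place of ravb_write(): because the masks are passed by pointer, the bit
cleared for NC in the first call is still cleared when the second call
writes the mask back for BE. Q_BE, Q_NC, fake_ric0_reg and
queue_interrupt_sketch() are hypothetical names, not the driver's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

enum { Q_BE = 0, Q_NC = 1 }; /* hypothetical queue numbers */

static uint32_t fake_ric0_reg; /* stands in for the RIC0 register */

/* Mask RX/TX interrupts for queue q if it has a pending, unmasked one. */
static bool queue_interrupt_sketch(int q, uint32_t ris0, uint32_t *ric0,
				   uint32_t tis, uint32_t *tic)
{
	assert(q >= 0 && q < 32);

	if (!(((ris0 & *ric0) | (tis & *tic)) & BIT(q)))
		return false;

	*ric0 &= ~BIT(q);
	*tic &= ~BIT(q);
	fake_ric0_reg = *ric0; /* the caller's mask is written back as a whole */
	return true;
}
```

If ric0 were passed by value, the second call would write back a mask that
still has the NC bit set, undoing the masking done by the first call.
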
>
> [...]
>>
>> @@ -725,31 +787,15 @@ static irqreturn_t ravb_interrupt(int irq, void
>> *dev_id)
>>
>> /* Network control and best effort queue RX/TX */
>> for (q = RAVB_NC; q >= RAVB_BE; q--) {
>> -   if (((ris0 & ric0) & BIT(q)) ||
>> -   ((tis  & tic)  & BIT(q))) {
>> -   if (napi_schedule_prep(>napi[q])) {
>> -   /* Mask RX and TX interrupts */
>> -   ric0 &= ~BIT(q);
>> -   tic &= ~BIT(q);
>> -   ravb_write(ndev, ric0, RIC0);
>> -   ravb_write(ndev, tic, TIC);
>> -   __napi_schedule(>napi[q]);
>> -   } else {
>> -   netdev_warn(ndev,
>> -   "ignoring interrupt,
>> rx status 0x%08x, rx mask 0x%08x,\n",
>> -   ris0, ric0);
>> -   netdev_warn(ndev,
>> -   "
>> tx status 0x%08x, tx mask 0x%08x.\n",
>> -   tis, tic);
>> -   }
>> +   if (ravb_nc_be_interrupt(ndev, q, ris0, ,
>> tis,
>> +))
>> result = IRQ_HANDLED;
>> -   }
>> }
>
>
>Unroll this *for* loop please...

OK.

>
>> }
>
> [...]
>>
>> @@ -767,6 +813,73 @@ static irqreturn_t ravb_interrupt(int irq, void
>> *dev_id)
>> return result;
>>   }
>>
>> +/* Descriptor IRQ/Error/Management interrupt handler */
>
>
>You don't handle the descriptor interrupt, do you?

Oops. I will fix this.

>
>> +static irqreturn_t ravb_multi_interrupt(int irq, void *dev_id)
>
> [...]
>>
>> +static irqreturn_t ravb_dmaq_interrupt(int irq, void *dev_id, int
>> ravb_queue)
>
>
>Perhaps, ravb_rx_tx_interrupt()?

Agreed.

>
>> +{
>> +   struct net_device *ndev = dev_id;
>> +   struct ravb_private *priv = netdev_priv(ndev);
>> +   irqreturn_t result = IRQ_NONE;
>> +   u32 ris0, ric0, tis, tic;
>> +
>> +   

Re: [net-next PATCH v2 2/8] net: rework setup_tc ndo op to consume general tc operand

2016-02-28 Thread Alaa Hleihel
Hi John,

On 2/16/2016 08:00, John Fastabend wrote:
> @@ -140,9 +141,11 @@ static int mqprio_init(struct Qdisc *sch, struct nlattr 
> *opt)
>* supplied and verified mapping
>*/
>   if (qopt->hw) {
> + struct tc_to_netdev tc = {.type = TC_SETUP_MQPRIO,
> +   .tc = qopt->num_tc};
> +
>

Using gcc 4.4.7, the compilation failed on these lines (that compiler does
not support designated initializers for members of anonymous unions):

  CC [M]  net/sched/sch_mqprio.o
net/sched/sch_mqprio.c: In function 'mqprio_init':
net/sched/sch_mqprio.c:145: error: unknown field 'tc' specified in initializer
net/sched/sch_mqprio.c:145: warning: missing braces around initializer
net/sched/sch_mqprio.c:145: warning: (near initialization for 'tc.<anonymous>')
make[2]: *** [net/sched/sch_mqprio.o] Error 1
make[1]: *** [net/sched] Error 2
make: *** [net] Error 2


It can be fixed by surrounding it with braces; something like:

diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index f9947d1..9b844f7 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -142,7 +142,7 @@ static int mqprio_init(struct Qdisc *sch, struct nlattr 
*opt)
 */
if (qopt->hw) {
struct tc_to_netdev tc = {.type = TC_SETUP_MQPRIO,
- .tc = qopt->num_tc};
+{.tc = qopt->num_tc}};

priv->hw_owned = 1;
err = dev->netdev_ops->ndo_setup_tc(dev, sch->handle, 0, &tc);
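
For reference, the brace workaround can be tried outside the kernel with a minimal sketch. The struct below is a hypothetical stand-in, not the real struct tc_to_netdev; only the shape (a type field followed by an anonymous union) is taken from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the setup type constant. */
enum demo_tc_type { DEMO_TC_SETUP_MQPRIO = 1 };

/* Hypothetical stand-in: a type tag followed by an anonymous union. */
struct demo_tc_to_netdev {
    unsigned int type;
    union {
        uint8_t tc;
        void *other;
    };
};

static struct demo_tc_to_netdev demo_make_mqprio(uint8_t num_tc)
{
    /*
     * The inner braces initialize the anonymous union positionally,
     * and .tc is then resolved inside that union, so the compiler
     * never has to look up a union member by name from the outer
     * initializer list.
     */
    struct demo_tc_to_netdev tc = { .type = DEMO_TC_SETUP_MQPRIO,
                                    { .tc = num_tc } };
    return tc;
}
```

Newer compilers accept both spellings, so the braced form should not cost anything there.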


Would you please fix it ?

Regards,
Alaa



Re: [PATCH V3 3/3] vhost_net: basic polling support

2016-02-28 Thread Michael S. Tsirkin
On Fri, Feb 26, 2016 at 04:42:44PM +0800, Jason Wang wrote:
> This patch tries to poll for newly added tx buffers or the socket receive
> queue for a while at the end of tx/rx processing. The maximum time
> spent on polling is specified through a new kind of vring ioctl.
> 
> Signed-off-by: Jason Wang 

Looks good overall, but I still see one problem.

> ---
>  drivers/vhost/net.c| 79 
> +++---
>  drivers/vhost/vhost.c  | 14 
>  drivers/vhost/vhost.h  |  1 +
>  include/uapi/linux/vhost.h |  6 
>  4 files changed, 95 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 9eda69e..c91af93 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -287,6 +287,44 @@ static void vhost_zerocopy_callback(struct ubuf_info 
> *ubuf, bool success)
>   rcu_read_unlock_bh();
>  }
>  
> +static inline unsigned long busy_clock(void)
> +{
> + return local_clock() >> 10;
> +}
> +
> +static bool vhost_can_busy_poll(struct vhost_dev *dev,
> + unsigned long endtime)
> +{
> + return likely(!need_resched()) &&
> +likely(!time_after(busy_clock(), endtime)) &&
> +likely(!signal_pending(current)) &&
> +!vhost_has_work(dev) &&
> +single_task_running();

So I find it quite unfortunate that this still uses single_task_running.
This means that for example a SCHED_IDLE task will prevent polling from
becoming active, and that seems like a bug, or at least
an undocumented feature :).

Unfortunately this logic affects the behaviour as observed
by userspace, so we can't merge it like this and tune
afterwards, since otherwise management tools will start
depending on this logic.
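
To make the concern concrete, here is a user-space model of the predicate. It is only a sketch: the `demo_` names and the state struct are invented, and it assumes single_task_running() amounts to "exactly one runnable task on this CPU".

```c
#include <stdbool.h>

/* Wrap-safe "a is after b", written the way the kernel's time_after() is. */
static bool demo_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* Hypothetical snapshot of the conditions the predicate looks at. */
struct demo_poll_state {
    bool need_resched;
    bool signal_pending;
    bool has_work;
    unsigned int nr_running;   /* runnable tasks on this CPU, poller included */
    unsigned long now;         /* current busy_clock() value */
};

static bool demo_can_busy_poll(const struct demo_poll_state *s,
                               unsigned long endtime)
{
    return !s->need_resched &&
           !demo_time_after(s->now, endtime) &&
           !s->signal_pending &&
           !s->has_work &&
           s->nr_running == 1;   /* models single_task_running() */
}
```

With nr_running == 2 the model refuses to poll no matter how unimportant the second task is, which is the SCHED_IDLE case described above.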


> +}
> +
> +static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
> + struct vhost_virtqueue *vq,
> + struct iovec iov[], unsigned int iov_size,
> + unsigned int *out_num, unsigned int *in_num)
> +{
> + unsigned long uninitialized_var(endtime);
> + int r = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
> + out_num, in_num, NULL, NULL);
> +
> + if (r == vq->num && vq->busyloop_timeout) {
> + preempt_disable();
> + endtime = busy_clock() + vq->busyloop_timeout;
> + while (vhost_can_busy_poll(vq->dev, endtime) &&
> +vhost_vq_avail_empty(vq->dev, vq))
> + cpu_relax();
> + preempt_enable();
> + r = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
> + out_num, in_num, NULL, NULL);
> + }
> +
> + return r;
> +}
> +
>  /* Expects to be always run from workqueue - which acts as
>   * read-size critical section for our kind of RCU. */
>  static void handle_tx(struct vhost_net *net)
> @@ -331,10 +369,9 @@ static void handle_tx(struct vhost_net *net)
> % UIO_MAXIOV == nvq->done_idx))
>   break;
>  
> - head = vhost_get_vq_desc(vq, vq->iov,
> -  ARRAY_SIZE(vq->iov),
> -  , ,
> -  NULL, NULL);
> + head = vhost_net_tx_get_vq_desc(net, vq, vq->iov,
> + ARRAY_SIZE(vq->iov),
> + , );
>   /* On error, stop handling until the next kick. */
>   if (unlikely(head < 0))
>   break;
> @@ -435,6 +472,38 @@ static int peek_head_len(struct sock *sk)
>   return len;
>  }
>  
> +static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk)
> +{
> + struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
> + struct vhost_virtqueue *vq = &nvq->vq;
> + unsigned long uninitialized_var(endtime);
> + int len = peek_head_len(sk);
> +
> + if (!len && vq->busyloop_timeout) {
> + /* Both tx vq and rx socket were polled here */
> + mutex_lock(&vq->mutex);
> + vhost_disable_notify(&net->dev, vq);
> +
> + preempt_disable();
> + endtime = busy_clock() + vq->busyloop_timeout;
> +
> + while (vhost_can_busy_poll(&net->dev, endtime) &&
> +skb_queue_empty(&sk->sk_receive_queue) &&
> +vhost_vq_avail_empty(&net->dev, vq))
> + cpu_relax();
> +
> + preempt_enable();
> +
> + if (vhost_enable_notify(&net->dev, vq))
> + vhost_poll_queue(&vq->poll);
> + mutex_unlock(&vq->mutex);
> +
> + len = peek_head_len(sk);
> + }
> +
> + return len;
> +}
> +
>  /* This is a multi-buffer version of vhost_get_desc, that works if
>   *   vq has read descriptors only.
>   * @vq   - the relevant virtqueue

Re: [PATCH] net/mlx5e: make VXLAN support conditional

2016-02-28 Thread Saeed Mahameed
On Fri, Feb 26, 2016 at 11:13 PM, Arnd Bergmann  wrote:
> VXLAN can be disabled at compile-time or it can be a loadable
> module while mlx5 is built-in, which leads to a link error:
>
> drivers/net/built-in.o: In function `mlx5e_create_netdev':
> ntb_netdev.c:(.text+0x106de4): undefined reference to `vxlan_get_rx_port'
>
> This avoids the link error and makes the vxlan code optional,
> like the other ethernet drivers do as well.
>
> Signed-off-by: Arnd Bergmann 
> Fixes: b3f63c3d5e2c ("net/mlx5e: Add netdev support for VXLAN tunneling")
Hi Arnd,

Thanks for the patch, I will suggest some slight modifications and we
will handle it and re-post the patch.

>
> struct mlx5e_paramsparams;
> spinlock_t async_events_spinlock; /* sync hw events */
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
> b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> index 0d45f35aee72..44fc4bc35ffd 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> @@ -2116,6 +2116,9 @@ static netdev_features_t 
> mlx5e_vxlan_features_check(struct mlx5e_priv *priv,
> u16 proto;
> u16 port = 0;
>
> +   if (!IS_ENABLED(CONFIG_MLX5_CORE_EN_VXLAN))
> +   goto out;
> +

I would rather wrap the whole mlx5e_features_check with the suggested
config flag and disable it in case CONFIG_MLX5_CORE_EN_VXLAN is OFF,
since this function is only needed for when vxlan is supported.
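
Roughly what I have in mind, as a stand-alone sketch rather than driver code. All `demo_` names are made up, and DEMO_CONFIG_VXLAN stands in for CONFIG_MLX5_CORE_EN_VXLAN. The function and its ops entry are compiled only when the option is set, so with the option off the callback simply does not exist.

```c
#include <stddef.h>

/*
 * Hypothetical build-time switch. It is left undefined here, which
 * models a build with VXLAN support turned off.
 */
/* #define DEMO_CONFIG_VXLAN 1 */

typedef unsigned int demo_features_t;

struct demo_netdev_ops {
    const char *name;
    demo_features_t (*features_check)(demo_features_t features);
};

#ifdef DEMO_CONFIG_VXLAN
/* Only built when VXLAN support is enabled. */
static demo_features_t demo_features_check(demo_features_t features)
{
    return features;
}
#endif

static const struct demo_netdev_ops demo_ops = {
    .name = "demo",
#ifdef DEMO_CONFIG_VXLAN
    .features_check = demo_features_check,
#endif
};

/* Caller side: a missing callback means "leave the features untouched". */
static demo_features_t demo_run_features_check(demo_features_t features)
{
    if (!demo_ops.features_check)
        return features;
    return demo_ops.features_check(features);
}
```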

>
>  static inline bool mlx5e_vxlan_allowed(struct mlx5_core_dev *mdev)
>  {
> -   return (MLX5_CAP_ETH(mdev, tunnel_stateless_vxlan) &&
> +   return IS_ENABLED(CONFIG_MLX5_CORE_EN_VXLAN) &&
> +   (MLX5_CAP_ETH(mdev, tunnel_stateless_vxlan) &&
> mlx5_core_is_pf(mdev));
>  }

Same here.


Re: [PATCH] net/9p: convert to new CQ API

2016-02-28 Thread Sagi Grimberg



Trivial conversion to the new RDMA CQ API.


Looks nice and simple :)

But I think that the fact that CQ processing is now
done in soft-IRQ (which is an improvement!) needs to
be documented.

Other than that, looks great

Reviewed-by: Sagi Grimberg 

P.S.
I was also confused in the past about 9p and if anyone
actually uses it...


[PATCH net-next 2/4] qed: Add support for HW attentions

2016-02-28 Thread Yuval Mintz
HW is capable of generating attentions for a multitude of reasons,
but current driver is enabling attention generation only for management
firmware [required for link notifications].

This patch enables almost all of the possible reasons for HW attentions,
logging the HW block generating the attention and preventing further
attentions from that source [to prevent possible attention flood].
It also lays the infrastructure for additional exploration of the various
attentions.

Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_int.c  | 395 ++---
 drivers/net/ethernet/qlogic/qed/qed_reg_addr.h |   2 +
 2 files changed, 357 insertions(+), 40 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_int.c 
b/drivers/net/ethernet/qlogic/qed/qed_int.c
index 7fd1be6..c914ac5 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_int.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_int.c
@@ -42,21 +42,210 @@ struct qed_sb_sp_info {
 #define SB_ATTN_ALIGNED_SIZE(p_hwfn) \
ALIGNED_TYPE_SIZE(struct atten_status_block, p_hwfn)
 
-#define ATTN_STATE_BITS (0xfff)
+struct aeu_invert_reg_bit {
+   char bit_name[30];
+
+#define ATTENTION_PARITY        (1 << 0)
+
+#define ATTENTION_LENGTH_MASK   (0x0ff0)
+#define ATTENTION_LENGTH_SHIFT  (4)
+#define ATTENTION_LENGTH(flags) (((flags) & ATTENTION_LENGTH_MASK) >> \
+                                 ATTENTION_LENGTH_SHIFT)
+#define ATTENTION_SINGLE        (1 << ATTENTION_LENGTH_SHIFT)
+#define ATTENTION_PAR           (ATTENTION_SINGLE | ATTENTION_PARITY)
+#define ATTENTION_PAR_INT       ((2 << ATTENTION_LENGTH_SHIFT) | \
+                                 ATTENTION_PARITY)
+
+/* Multiple bits start with this offset */
+#define ATTENTION_OFFSET_MASK   (0x000ff000)
+#define ATTENTION_OFFSET_SHIFT  (12)
+   unsigned int flags;
+};
+
+struct aeu_invert_reg {
+   struct aeu_invert_reg_bit bits[32];
+};
+
+#define MAX_ATTN_GRPS   (8)
+#define NUM_ATTN_REGS   (9)
+
+/* Notice aeu_invert_reg must be defined in the same order of bits as HW;  */
+static struct aeu_invert_reg aeu_descs[NUM_ATTN_REGS] = {
+   {
+   {   /* After Invert 1 */
+   {"GPIO0 function%d",
+(32 << ATTENTION_LENGTH_SHIFT)},
+   }
+   },
+
+   {
+   {   /* After Invert 2 */
+   {"PGLUE config_space", ATTENTION_SINGLE},
+   {"PGLUE misc_flr", ATTENTION_SINGLE},
+   {"PGLUE B RBC", ATTENTION_PAR_INT},
+   {"PGLUE misc_mctp", ATTENTION_SINGLE},
+   {"Flash event", ATTENTION_SINGLE},
+   {"SMB event", ATTENTION_SINGLE},
+   {"Main Power", ATTENTION_SINGLE},
+   {"SW timers #%d", (8 << ATTENTION_LENGTH_SHIFT) |
+ (1 << ATTENTION_OFFSET_SHIFT)},
+   {"PCIE glue/PXP VPD %d",
+(16 << ATTENTION_LENGTH_SHIFT)},
+   }
+   },
+
+   {
+   {   /* After Invert 3 */
+   {"General Attention %d",
+(32 << ATTENTION_LENGTH_SHIFT)},
+   }
+   },
+
+   {
+   {   /* After Invert 4 */
+   {"General Attention 32", ATTENTION_SINGLE},
+   {"General Attention %d",
+(2 << ATTENTION_LENGTH_SHIFT) |
+(33 << ATTENTION_OFFSET_SHIFT)},
+   {"General Attention 35", ATTENTION_SINGLE},
+   {"CNIG port %d", (4 << ATTENTION_LENGTH_SHIFT)},
+   {"MCP CPU", ATTENTION_SINGLE},
+   {"MCP Watchdog timer", ATTENTION_SINGLE},
+   {"MCP M2P", ATTENTION_SINGLE},
+   {"AVS stop status ready", ATTENTION_SINGLE},
+   {"MSTAT", ATTENTION_PAR_INT},
+   {"MSTAT per-path", ATTENTION_PAR_INT},
+   {"Reserved %d", (6 << ATTENTION_LENGTH_SHIFT)},
+   {"NIG", ATTENTION_PAR_INT},
+   {"BMB/OPTE/MCP", ATTENTION_PAR_INT},
+   {"BTB", ATTENTION_PAR_INT},
+   {"BRB", ATTENTION_PAR_INT},
+   {"PRS", ATTENTION_PAR_INT},
+   }
+   },
+
+   {
+   {   /* After Invert 5 */
+   {"SRC", ATTENTION_PAR_INT},
+   {"PB Client1", ATTENTION_PAR_INT},
+   {"PB Client2", ATTENTION_PAR_INT},
+   {"RPB", ATTENTION_PAR_INT},
+   {"PBF", ATTENTION_PAR_INT},
+   {"QM", ATTENTION_PAR_INT},
+   {"TM", ATTENTION_PAR_INT},
+  
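
The flags word used by the descriptor table above packs a parity bit, a bit count and a starting offset. A stand-alone sketch of decoding it follows; the `DEMO_` macros mirror the mask and shift values from the patch, while the helper functions (in particular the offset extraction) are assumptions added for illustration.

```c
#include <stdbool.h>

#define DEMO_ATTENTION_PARITY        (1u << 0)
#define DEMO_ATTENTION_LENGTH_MASK   (0x0ff0u)
#define DEMO_ATTENTION_LENGTH_SHIFT  (4)
#define DEMO_ATTENTION_OFFSET_MASK   (0x000ff000u)
#define DEMO_ATTENTION_OFFSET_SHIFT  (12)

/* Number of consecutive attention bits covered by this descriptor. */
static unsigned int demo_attn_length(unsigned int flags)
{
    return (flags & DEMO_ATTENTION_LENGTH_MASK) >> DEMO_ATTENTION_LENGTH_SHIFT;
}

/* First index substituted into a "%d" style name, e.g. "SW timers #%d". */
static unsigned int demo_attn_offset(unsigned int flags)
{
    return (flags & DEMO_ATTENTION_OFFSET_MASK) >> DEMO_ATTENTION_OFFSET_SHIFT;
}

/* True if the descriptor includes a parity bit. */
static bool demo_attn_is_parity(unsigned int flags)
{
    return (flags & DEMO_ATTENTION_PARITY) != 0;
}
```

Under this reading, the "SW timers #%d" entry describes 8 bits numbered from 1, and an ATTENTION_PAR_INT entry describes 2 bits of which one is parity.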

[PATCH net-next 4/4] qed: Print additional HW attention info

2016-02-28 Thread Yuval Mintz
This patch utilizes the attention infrastructure to log additional
information that relates only to specific HW blocks.
For some of those HW blocks, it also stops automatically disabling the
attention generation as the attention is considered benign and thus
should only be logged; No fear of it flooding the system.

Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_int.c  | 526 -
 drivers/net/ethernet/qlogic/qed/qed_reg_addr.h |  58 +++
 2 files changed, 479 insertions(+), 105 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_int.c 
b/drivers/net/ethernet/qlogic/qed/qed_int.c
index c8bca77..58fe664 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_int.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_int.c
@@ -66,6 +66,9 @@ struct aeu_invert_reg_bit {
 #define ATTENTION_OFFSET_SHIFT  (12)
unsigned int flags;
 
+   /* Callback to call if attention will be triggered */
+   int (*cb)(struct qed_hwfn *p_hwfn);
+
enum block_id block_index;
 };
 
@@ -1285,170 +1288,463 @@ static struct attn_hw_block attn_blocks[] = {
{"misc_aeu", { {0, 0, NULL, NULL} } },
{"bar0_map", { {0, 0, NULL, NULL} } },};
 
+/* Specific HW attention callbacks */
+static int qed_mcp_attn_cb(struct qed_hwfn *p_hwfn)
+{
+   u32 tmp = qed_rd(p_hwfn, p_hwfn->p_dpc_ptt, MCP_REG_CPU_STATE);
+
+   /* This might occur on certain instances; Log it once then mask it */
+   DP_INFO(p_hwfn->cdev, "MCP_REG_CPU_STATE: %08x - Masking...\n",
+   tmp);
+   qed_wr(p_hwfn, p_hwfn->p_dpc_ptt, MCP_REG_CPU_EVENT_MASK,
+  0x);
+
+   return 0;
+}
+
+#define QED_PSWHST_ATTENTION_INCORRECT_ACCESS      (0x1)
+#define ATTENTION_INCORRECT_ACCESS_WR_MASK         (0x1)
+#define ATTENTION_INCORRECT_ACCESS_WR_SHIFT        (0)
+#define ATTENTION_INCORRECT_ACCESS_CLIENT_MASK     (0xf)
+#define ATTENTION_INCORRECT_ACCESS_CLIENT_SHIFT    (1)
+#define ATTENTION_INCORRECT_ACCESS_VF_VALID_MASK   (0x1)
+#define ATTENTION_INCORRECT_ACCESS_VF_VALID_SHIFT  (5)
+#define ATTENTION_INCORRECT_ACCESS_VF_ID_MASK      (0xff)
+#define ATTENTION_INCORRECT_ACCESS_VF_ID_SHIFT     (6)
+#define ATTENTION_INCORRECT_ACCESS_PF_ID_MASK      (0xf)
+#define ATTENTION_INCORRECT_ACCESS_PF_ID_SHIFT     (14)
+#define ATTENTION_INCORRECT_ACCESS_BYTE_EN_MASK    (0xff)
+#define ATTENTION_INCORRECT_ACCESS_BYTE_EN_SHIFT   (18)
+static int qed_pswhst_attn_cb(struct qed_hwfn *p_hwfn)
+{
+   u32 tmp = qed_rd(p_hwfn, p_hwfn->p_dpc_ptt,
+PSWHST_REG_INCORRECT_ACCESS_VALID);
+
+   if (tmp & QED_PSWHST_ATTENTION_INCORRECT_ACCESS) {
+   u32 addr, data, length;
+
+   addr = qed_rd(p_hwfn, p_hwfn->p_dpc_ptt,
+ PSWHST_REG_INCORRECT_ACCESS_ADDRESS);
+   data = qed_rd(p_hwfn, p_hwfn->p_dpc_ptt,
+ PSWHST_REG_INCORRECT_ACCESS_DATA);
+   length = qed_rd(p_hwfn, p_hwfn->p_dpc_ptt,
+   PSWHST_REG_INCORRECT_ACCESS_LENGTH);
+
+   DP_INFO(p_hwfn->cdev,
+   "Incorrect access to %08x of length %08x - PF [%02x] VF 
[%04x] [valid %02x] client [%02x] write [%02x] Byte-Enable [%04x] [%08x]\n",
+   addr, length,
+   (u8) GET_FIELD(data, ATTENTION_INCORRECT_ACCESS_PF_ID),
+   (u8) GET_FIELD(data, ATTENTION_INCORRECT_ACCESS_VF_ID),
+   (u8) GET_FIELD(data,
+  ATTENTION_INCORRECT_ACCESS_VF_VALID),
+   (u8) GET_FIELD(data,
+  ATTENTION_INCORRECT_ACCESS_CLIENT),
+   (u8) GET_FIELD(data, ATTENTION_INCORRECT_ACCESS_WR),
+   (u8) GET_FIELD(data,
+  ATTENTION_INCORRECT_ACCESS_BYTE_EN),
+   data);
+   }
+
+   return 0;
+}
+
+#define QED_GRC_ATTENTION_VALID_BIT        (1 << 0)
+#define QED_GRC_ATTENTION_ADDRESS_MASK     (0x7f)
+#define QED_GRC_ATTENTION_ADDRESS_SHIFT    (0)
+#define QED_GRC_ATTENTION_RDWR_BIT         (1 << 23)
+#define QED_GRC_ATTENTION_MASTER_MASK      (0xf)
+#define QED_GRC_ATTENTION_MASTER_SHIFT     (24)
+#define QED_GRC_ATTENTION_PF_MASK          (0xf)
+#define QED_GRC_ATTENTION_PF_SHIFT         (0)
+#define QED_GRC_ATTENTION_VF_MASK          (0xff)
+#define QED_GRC_ATTENTION_VF_SHIFT         (4)
+#define QED_GRC_ATTENTION_PRIV_MASK        (0x3)
+#define QED_GRC_ATTENTION_PRIV_SHIFT       (14)
+#define QED_GRC_ATTENTION_PRIV_VF          (0)
+static const char *attn_master_to_str(u8 master)
+{
+   switch (master) {
+   case 1: return "PXP";
+   case 2: return "MCP";
+   case 3: return "MSDM";
+   case 4: return "PSDM";
+   case 5: return "YSDM";
+   case 6: return "USDM";
+   case 7: return "TSDM";
+   case 8: return "XSDM";
+
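
The incorrect-access callback above decodes one data word with GET_FIELD and per-field MASK/SHIFT pairs. A user-space sketch of that decoding follows. It assumes GET_FIELD shifts first and then applies an unshifted mask, which is what the mask values in the patch suggest; the `DEMO_` names and the result struct are invented for illustration.

```c
#include <stdint.h>

/* Assumed behaviour: shift the value down, then apply the unshifted mask. */
#define DEMO_GET_FIELD(value, name) \
    (((value) >> (name##_SHIFT)) & (name##_MASK))

#define DEMO_ACCESS_WR_MASK         (0x1)
#define DEMO_ACCESS_WR_SHIFT        (0)
#define DEMO_ACCESS_CLIENT_MASK     (0xf)
#define DEMO_ACCESS_CLIENT_SHIFT    (1)
#define DEMO_ACCESS_VF_VALID_MASK   (0x1)
#define DEMO_ACCESS_VF_VALID_SHIFT  (5)
#define DEMO_ACCESS_VF_ID_MASK      (0xff)
#define DEMO_ACCESS_VF_ID_SHIFT     (6)
#define DEMO_ACCESS_PF_ID_MASK      (0xf)
#define DEMO_ACCESS_PF_ID_SHIFT     (14)
#define DEMO_ACCESS_BYTE_EN_MASK    (0xff)
#define DEMO_ACCESS_BYTE_EN_SHIFT   (18)

/* Hypothetical container for the decoded fields. */
struct demo_incorrect_access {
    uint8_t wr, client, vf_valid, vf_id, pf_id, byte_en;
};

static struct demo_incorrect_access demo_decode_access(uint32_t data)
{
    struct demo_incorrect_access a = {
        .wr       = (uint8_t)DEMO_GET_FIELD(data, DEMO_ACCESS_WR),
        .client   = (uint8_t)DEMO_GET_FIELD(data, DEMO_ACCESS_CLIENT),
        .vf_valid = (uint8_t)DEMO_GET_FIELD(data, DEMO_ACCESS_VF_VALID),
        .vf_id    = (uint8_t)DEMO_GET_FIELD(data, DEMO_ACCESS_VF_ID),
        .pf_id    = (uint8_t)DEMO_GET_FIELD(data, DEMO_ACCESS_PF_ID),
        .byte_en  = (uint8_t)DEMO_GET_FIELD(data, DEMO_ACCESS_BYTE_EN),
    };
    return a;
}
```

Note that this convention differs from the ATTENTION_LENGTH macro in patch 2/4, which masks first and shifts afterwards, so its mask is stored pre-shifted.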

[PATCH net-next 3/4] qed: Print HW attention reasons

2016-02-28 Thread Yuval Mintz
Each HW block contains common information about attention reasons,
raising a bit for each one of the different sub-reasons that caused it
to raise an attention.

This patch extends the infrastructure by allowing logging of the various
reasons causing the HW blocks to generate an attention.

Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_int.c | 1543 +++--
 1 file changed, 1436 insertions(+), 107 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_int.c 
b/drivers/net/ethernet/qlogic/qed/qed_int.c
index c914ac5..c8bca77 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_int.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_int.c
@@ -39,6 +39,11 @@ struct qed_sb_sp_info {
struct qed_pi_info  pi_info_arr[PIS_PER_SB];
 };
 
+enum qed_attention_type {
+   QED_ATTN_TYPE_ATTN,
+   QED_ATTN_TYPE_PARITY,
+};
+
 #define SB_ATTN_ALIGNED_SIZE(p_hwfn) \
ALIGNED_TYPE_SIZE(struct atten_status_block, p_hwfn)
 
@@ -60,6 +65,8 @@ struct aeu_invert_reg_bit {
 #define ATTENTION_OFFSET_MASK   (0x000ff000)
 #define ATTENTION_OFFSET_SHIFT  (12)
unsigned int flags;
+
+   enum block_id block_index;
 };
 
 struct aeu_invert_reg {
@@ -69,158 +76,1379 @@ struct aeu_invert_reg {
 #define MAX_ATTN_GRPS   (8)
 #define NUM_ATTN_REGS   (9)
 
+/* HW Attention register */
+struct attn_hw_reg {
+   u16 reg_idx; /* Index of this register in its block */
+   u16 num_of_bits; /* number of valid attention bits */
+   u32 sts_addr;/* Address of the STS register */
+   u32 sts_clr_addr;/* Address of the STS_CLR register */
+   u32 sts_wr_addr; /* Address of the STS_WR register */
+   u32 mask_addr;   /* Address of the MASK register */
+};
+
+/* HW block attention registers */
+struct attn_hw_regs {
+   u16 num_of_int_regs;/* Number of interrupt regs */
+   u16 num_of_prty_regs;   /* Number of parity regs */
+   struct attn_hw_reg **int_regs;  /* interrupt regs */
+   struct attn_hw_reg **prty_regs; /* parity regs */
+};
+
+/* HW block attention registers */
+struct attn_hw_block {
+   const char *name; /* Block name */
+   struct attn_hw_regs chip_regs[1];
+};
+
+static struct attn_hw_reg grc_int0_bb_b0 = {
+   0, 4, 0x50180, 0x5018c, 0x50188, 0x50184};
+
+static struct attn_hw_reg *grc_int_bb_b0_regs[1] = {
+   &grc_int0_bb_b0};
+
+static struct attn_hw_reg grc_prty1_bb_b0 = {
+   0, 2, 0x50200, 0x5020c, 0x50208, 0x50204};
+
+static struct attn_hw_reg *grc_prty_bb_b0_regs[1] = {
+   &grc_prty1_bb_b0};
+
+static struct attn_hw_reg miscs_int0_bb_b0 = {
+   0, 3, 0x9180, 0x918c, 0x9188, 0x9184};
+
+static struct attn_hw_reg miscs_int1_bb_b0 = {
+   1, 11, 0x9190, 0x919c, 0x9198, 0x9194};
+
+static struct attn_hw_reg *miscs_int_bb_b0_regs[2] = {
+   &miscs_int0_bb_b0, &miscs_int1_bb_b0};
+
+static struct attn_hw_reg miscs_prty0_bb_b0 = {
+   0, 1, 0x91a0, 0x91ac, 0x91a8, 0x91a4};
+
+static struct attn_hw_reg *miscs_prty_bb_b0_regs[1] = {
+   &miscs_prty0_bb_b0};
+
+static struct attn_hw_reg misc_int0_bb_b0 = {
+   0, 1, 0x8180, 0x818c, 0x8188, 0x8184};
+
+static struct attn_hw_reg *misc_int_bb_b0_regs[1] = {
+   &misc_int0_bb_b0};
+
+static struct attn_hw_reg pglue_b_int0_bb_b0 = {
+   0, 23, 0x2a8180, 0x2a818c, 0x2a8188, 0x2a8184};
+
+static struct attn_hw_reg *pglue_b_int_bb_b0_regs[1] = {
+   &pglue_b_int0_bb_b0};
+
+static struct attn_hw_reg pglue_b_prty0_bb_b0 = {
+   0, 1, 0x2a8190, 0x2a819c, 0x2a8198, 0x2a8194};
+
+static struct attn_hw_reg pglue_b_prty1_bb_b0 = {
+   1, 22, 0x2a8200, 0x2a820c, 0x2a8208, 0x2a8204};
+
+static struct attn_hw_reg *pglue_b_prty_bb_b0_regs[2] = {
+   &pglue_b_prty0_bb_b0, &pglue_b_prty1_bb_b0};
+
+static struct attn_hw_reg cnig_int0_bb_b0 = {
+   0, 6, 0x2182e8, 0x2182f4, 0x2182f0, 0x2182ec};
+
+static struct attn_hw_reg *cnig_int_bb_b0_regs[1] = {
+   &cnig_int0_bb_b0};
+
+static struct attn_hw_reg cnig_prty0_bb_b0 = {
+   0, 2, 0x218348, 0x218354, 0x218350, 0x21834c};
+
+static struct attn_hw_reg *cnig_prty_bb_b0_regs[1] = {
+   &cnig_prty0_bb_b0};
+
+static struct attn_hw_reg cpmu_int0_bb_b0 = {
+   0, 1, 0x303e0, 0x303ec, 0x303e8, 0x303e4};
+
+static struct attn_hw_reg *cpmu_int_bb_b0_regs[1] = {
+   &cpmu_int0_bb_b0};
+
+static struct attn_hw_reg ncsi_int0_bb_b0 = {
+   0, 1, 0x404cc, 0x404d8, 0x404d4, 0x404d0};
+
+static struct attn_hw_reg *ncsi_int_bb_b0_regs[1] = {
+   &ncsi_int0_bb_b0};
+
+static struct attn_hw_reg ncsi_prty1_bb_b0 = {
+   0, 1, 0x4, 0x4000c, 0x40008, 0x40004};
+
+static struct attn_hw_reg *ncsi_prty_bb_b0_regs[1] = {
+   &ncsi_prty1_bb_b0};
+
+static struct attn_hw_reg opte_prty1_bb_b0 = {
+   0, 11, 0x53000, 0x5300c, 0x53008, 0x53004};
+
+static struct attn_hw_reg opte_prty0_bb_b0 = {
+   1, 1, 0x53208, 0x53214, 0x53210, 0x5320c};
+
+static struct attn_hw_reg 

[PATCH net-next 1/4] qed: Semantic refactoring of interrupt code

2016-02-28 Thread Yuval Mintz
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/qed_dev.c  |   6 +-
 drivers/net/ethernet/qlogic/qed/qed_int.c  | 155 -
 drivers/net/ethernet/qlogic/qed/qed_int.h  |   6 +-
 drivers/net/ethernet/qlogic/qed/qed_main.c |  15 +--
 include/linux/qed/qed_if.h |   6 ++
 5 files changed, 111 insertions(+), 77 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c 
b/drivers/net/ethernet/qlogic/qed/qed_dev.c
index acfe7be..d9a5175 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_dev.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c
@@ -1011,13 +1011,17 @@ static void qed_hw_get_resc(struct qed_hwfn *p_hwfn)
 {
u32 *resc_start = p_hwfn->hw_info.resc_start;
u32 *resc_num = p_hwfn->hw_info.resc_num;
+   struct qed_sb_cnt_info sb_cnt_info;
int num_funcs, i;
 
num_funcs = MAX_NUM_PFS_BB;
 
+   memset(&sb_cnt_info, 0, sizeof(sb_cnt_info));
+   qed_int_get_num_sbs(p_hwfn, &sb_cnt_info);
+
resc_num[QED_SB] = min_t(u32,
 (MAX_SB_PER_PATH_BB / num_funcs),
-qed_int_get_num_sbs(p_hwfn, NULL));
+sb_cnt_info.sb_cnt);
resc_num[QED_L2_QUEUE] = MAX_NUM_L2_QUEUES_BB / num_funcs;
resc_num[QED_VPORT] = MAX_NUM_VPORTS_BB / num_funcs;
resc_num[QED_RSS_ENG] = ETH_RSS_ENGINE_NUM_BB / num_funcs;
diff --git a/drivers/net/ethernet/qlogic/qed/qed_int.c 
b/drivers/net/ethernet/qlogic/qed/qed_int.c
index fa73daa..7fd1be6 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_int.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_int.c
@@ -343,17 +343,17 @@ void qed_int_sp_dpc(unsigned long hwfn_cookie)
 
 static void qed_int_sb_attn_free(struct qed_hwfn *p_hwfn)
 {
-   struct qed_dev *cdev   = p_hwfn->cdev;
-   struct qed_sb_attn_info *p_sb   = p_hwfn->p_sb_attn;
-
-   if (p_sb) {
-   if (p_sb->sb_attn)
-   dma_free_coherent(&cdev->pdev->dev,
- SB_ATTN_ALIGNED_SIZE(p_hwfn),
- p_sb->sb_attn,
- p_sb->sb_phys);
-   kfree(p_sb);
-   }
+   struct qed_sb_attn_info *p_sb = p_hwfn->p_sb_attn;
+
+   if (!p_sb)
+   return;
+
+   if (p_sb->sb_attn)
+   dma_free_coherent(&p_hwfn->cdev->pdev->dev,
+ SB_ATTN_ALIGNED_SIZE(p_hwfn),
+ p_sb->sb_attn,
+ p_sb->sb_phys);
+   kfree(p_sb);
 }
 
 static void qed_int_sb_attn_setup(struct qed_hwfn *p_hwfn,
@@ -433,6 +433,7 @@ void qed_init_cau_sb_entry(struct qed_hwfn *p_hwfn,
   u16 vf_number,
   u8 vf_valid)
 {
+   struct qed_dev *cdev = p_hwfn->cdev;
u32 cau_state;
 
memset(p_sb_entry, 0, sizeof(*p_sb_entry));
@@ -451,14 +452,12 @@ void qed_init_cau_sb_entry(struct qed_hwfn *p_hwfn,
 
cau_state = CAU_HC_DISABLE_STATE;
 
-   if (p_hwfn->cdev->int_coalescing_mode == QED_COAL_MODE_ENABLE) {
+   if (cdev->int_coalescing_mode == QED_COAL_MODE_ENABLE) {
cau_state = CAU_HC_ENABLE_STATE;
-   if (!p_hwfn->cdev->rx_coalesce_usecs)
-   p_hwfn->cdev->rx_coalesce_usecs =
-   QED_CAU_DEF_RX_USECS;
-   if (!p_hwfn->cdev->tx_coalesce_usecs)
-   p_hwfn->cdev->tx_coalesce_usecs =
-   QED_CAU_DEF_TX_USECS;
+   if (!cdev->rx_coalesce_usecs)
+   cdev->rx_coalesce_usecs = QED_CAU_DEF_RX_USECS;
+   if (!cdev->tx_coalesce_usecs)
+   cdev->tx_coalesce_usecs = QED_CAU_DEF_TX_USECS;
}
 
SET_FIELD(p_sb_entry->data, CAU_SB_ENTRY_STATE0, cau_state);
@@ -638,8 +637,10 @@ int qed_int_sb_release(struct qed_hwfn *p_hwfn,
sb_info->sb_ack = 0;
memset(sb_info->sb_virt, 0, sizeof(*sb_info->sb_virt));
 
-   p_hwfn->sbs_info[sb_id] = NULL;
-   p_hwfn->num_sbs--;
+   if (p_hwfn->sbs_info[sb_id] != NULL) {
+   p_hwfn->sbs_info[sb_id] = NULL;
+   p_hwfn->num_sbs--;
+   }
 
return 0;
 }
@@ -648,14 +649,15 @@ static void qed_int_sp_sb_free(struct qed_hwfn *p_hwfn)
 {
struct qed_sb_sp_info *p_sb = p_hwfn->p_sp_sb;
 
-   if (p_sb) {
-   if (p_sb->sb_info.sb_virt)
-   dma_free_coherent(&p_hwfn->cdev->pdev->dev,
- SB_ALIGNED_SIZE(p_hwfn),
- p_sb->sb_info.sb_virt,
- p_sb->sb_info.sb_phys);
-   kfree(p_sb);
-   }
+   if (!p_sb)
+   return;
+
+   if (p_sb->sb_info.sb_virt)
+   dma_free_coherent(&p_hwfn->cdev->pdev->dev,
+ 

[PATCH net-next 0/4] qed: Attention support patch series

2016-02-28 Thread Yuval Mintz
Until now we've only enabled attention generation for the sake of
management firmware indications [required for link notifications].

This series enables [almost] all the attention sources of the HW,
currently for the sake of logging information relating to issues
experienced by HW. In future, infrastructure laid here would also be used
for the sake of the recovery process.

The first patch in the series is a semantic alignment of the code.
The later 3 patches incrementally create said infrastructure and enrich
the logged information.
Notice #3 contains quite a bit of structures [consisting of ~1K lines]
that will eventually be removed and incorporated in the binary fw file.

Dave,

Please consider applying this series to `net-next'.

Thanks,
Yuval

Yuval Mintz (4):
  qed: Semantic refactoring of interrupt code
  qed: Add support for HW attentions
  qed: Print HW attention reasons
  qed: Print additional HW attention info

 drivers/net/ethernet/qlogic/qed/qed_dev.c  |6 +-
 drivers/net/ethernet/qlogic/qed/qed_int.c  | 2195 ++--
 drivers/net/ethernet/qlogic/qed/qed_int.h  |6 +-
 drivers/net/ethernet/qlogic/qed/qed_main.c |   15 +-
 drivers/net/ethernet/qlogic/qed/qed_reg_addr.h |   60 +
 include/linux/qed/qed_if.h |6 +
 6 files changed, 2171 insertions(+), 117 deletions(-)

-- 
1.9.3



Re: [net-next][PATCH v2 11/13] RDS: IB: add Fastreg MR (FRMR) detection support

2016-02-28 Thread santosh.shilim...@oracle.com


On 2/28/16 1:08 AM, Christoph Hellwig wrote:

On Sat, Feb 27, 2016 at 06:19:48PM -0800, Santosh Shilimkar wrote:

Discover Fast Memory Registration support using IB device
IB_DEVICE_MEM_MGT_EXTENSIONS. Certain HCA might support just FRMR
or FMR or both FMR and FRWR. In case both MR types are supported,
default FMR is used.

Default MR is still kept as FMR against what everyone else
is following. Default will be changed to FRMR once the
RDS performance with FRMR is comparable with FMR. The
work is in progress for the same.

Signed-off-by: Santosh Shilimkar 
Signed-off-by: Santosh Shilimkar 
---
v2: Dropped the module parameter as suggested by David Miller


This means we only use the safer method if the HCA doesn't support
the other one.  All other RDMA ULP that support both methods have
a module_param so the veto from Dave is a bit unfortunate.  Anyway,
let's get the code in for now and figure out what to use later.


Indeed. It wasn't really deal breaker for RDS so I agreed to drop it.


Just curious:  where / how do you see worse performance using FRs?


I wouldn't call it worse, but it's not comparable. The use case
is multi-threaded RDS RDMA perf test(s). Hopefully the follow
up series which I am working on should minimise the gap. I am
leaving the details for later, but one of the main issues I
saw was contention on the driver post_send() lock from send, MR reg
and MR invalidation.

Regards,
Santosh


Re: [net-next][PATCH v2 01/13] RDS: Drop stale iWARP RDMA transport

2016-02-28 Thread santosh.shilim...@oracle.com



On 2/28/16 1:05 AM, Christoph Hellwig wrote:

On Sat, Feb 27, 2016 at 06:19:38PM -0800, Santosh Shilimkar wrote:

RDS iWarp support code has become stale and non testable. As
indicated earlier, am dropping the support for it.

If new iWarp user(s) shows up in future, we can adapt the RDS IB
transport for the special RDMA READ sink case. iWarp needs an MR
for the RDMA READ sink.


Please take a look at the RDMA RW API series I posted yesterday - if
you can adopt RDS to that you should get iWarp support for free.


Will have a look. Thanks for the pointer.


But having two different codebases for IB/RoCE vs iWarp was always a bad
idea, so great to see the second one retired!

Acked-by: Christoph Hellwig 


Thanks !!


Re: [PATCH V3 0/3] basic busy polling support for vhost_net

2016-02-28 Thread Michael S. Tsirkin
On Fri, Feb 26, 2016 at 11:45:02AM -0500, David Miller wrote:
> From: Jason Wang 
> Date: Fri, 26 Feb 2016 16:42:41 +0800
> 
> > This series tries to add basic busy polling for vhost net. The idea is
> > simple: at the end of tx/rx processing, busy poll for newly added tx
> > descriptors and the rx receive socket for a while. The maximum amount of
> > time (in us) that can be spent on busy polling is specified via ioctl.
> 
> I'm assuming this will go through Michael's tree.

Definitely.


Re: [PATCH] asm-generic: remove old nonatomic-io wrapper files

2016-02-28 Thread Christoph Hellwig
Thanks Arnd, this looks good to me:

Reviewed-by: Christoph Hellwig 


Re: [net-next][PATCH v2 11/13] RDS: IB: add Fastreg MR (FRMR) detection support

2016-02-28 Thread Christoph Hellwig
On Sat, Feb 27, 2016 at 06:19:48PM -0800, Santosh Shilimkar wrote:
> Discover Fast Memory Registration support using IB device
> IB_DEVICE_MEM_MGT_EXTENSIONS. Certain HCA might support just FRMR
> or FMR or both FMR and FRWR. In case both MR types are supported,
> default FMR is used.
> 
> Default MR is still kept as FMR against what everyone else
> is following. Default will be changed to FRMR once the
> RDS performance with FRMR is comparable with FMR. The
> work is in progress for the same.
> 
> Signed-off-by: Santosh Shilimkar 
> Signed-off-by: Santosh Shilimkar 
> ---
> v2: Dropped the module parameter as suggested by David Miller

This means we only use the safer method if the HCA doesn't support
the other one.  All other RDMA ULP that support both methods have
a module_param so the veto from Dave is a bit unfortunate.  Anyway,
let's get the code in for now and figure out what to use later.
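
The selection policy described in the commit message boils down to something like the sketch below. The `demo_` names are invented and the capability flags are reduced to two booleans; this is not the RDS code.

```c
#include <stdbool.h>

enum demo_mr_type { DEMO_MR_NONE, DEMO_MR_FMR, DEMO_MR_FRMR };

/*
 * FMR stays the default whenever the device offers it; FRMR is used
 * only when it is the sole registration method available.
 */
static enum demo_mr_type demo_pick_mr_type(bool has_fmr, bool has_frmr)
{
    if (has_fmr)
        return DEMO_MR_FMR;
    if (has_frmr)
        return DEMO_MR_FRMR;
    return DEMO_MR_NONE;
}
```

A module parameter would have turned the first branch into a tunable; without it, flipping the default later means changing this order in the code.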

Just curious:  where / how do you see worse performance using FRs?


Re: [net-next][PATCH v2 01/13] RDS: Drop stale iWARP RDMA transport

2016-02-28 Thread Christoph Hellwig
On Sat, Feb 27, 2016 at 06:19:38PM -0800, Santosh Shilimkar wrote:
> RDS iWarp support code has become stale and non testable. As
> indicated earlier, am dropping the support for it.
> 
> If new iWarp user(s) shows up in future, we can adapt the RDS IB
> transport for the special RDMA READ sink case. iWarp needs an MR
> for the RDMA READ sink.

Please take a look at the RDMA RW API series I posted yesterday - if
you can adopt RDS to that you should get iWarp support for free.

But having two different codebases for IB/RoCE vs iWarp was always a bad
idea, so great to see the second one retired!

Acked-by: Christoph Hellwig 


Re: [PATCH 3/3] 3c59x: Use setup_timer()

2016-02-28 Thread Amitoj Kaur Chawla
On Sun, Feb 28, 2016 at 1:23 PM, Stafford Horne  wrote:
>
>
> On Sun, 28 Feb 2016, Amitoj Kaur Chawla wrote:
>
>> On Sun, Feb 28, 2016 at 12:18 AM, Stafford Horne  wrote:
>> >
>> >
>> > On Thu, 25 Feb 2016, David Miller wrote:
>> >
>> >> From: Amitoj Kaur Chawla 
>> >> Date: Wed, 24 Feb 2016 19:28:19 +0530
>> >>
>> >>> Convert a call to init_timer and accompanying initializations of
>> >>> the timer's data and function fields to a call to setup_timer.
>> >>>
>> >>> The Coccinelle semantic patch that fixes this problem is
>> >>> as follows:
>> >>>
>> >>> // 
>> >>> @@
>> >>> expression t,f,d;
>> >>> @@
>> >>>
>> >>> -init_timer(&t);
>> >>> +setup_timer(&t,f,d);
>> >>>  ...
>> >>> -t.data = d;
>> >>> -t.function = f;
>> >>> // 
>> >>>
>> >>> Signed-off-by: Amitoj Kaur Chawla 
>> >>
>> >>
>> >> Applied.
>> >
>> >
>> > Hi David, Amitoj,
>> >
>> > The patch here seemed to remove the call to add_timer(>timer) which
>> > applies the expires time. Would that be an issue?
>> >
>> > -Stafford
>>
>> I'm sorry. This is my mistake. How can I rectify it now that the patch
>> is applied?
>>
>> Should I send a patch adding it back?
>
>
> I sent a patch just now which could help to restore the behavior.
>
> This is applied on top of your patch which I pulled from Dave's
> tree here:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git
>
> -Stafford

Thanks!

Amitoj
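
For anyone following the thread, the intended shape of the conversion can be modelled in user space as below. This is a mock with invented `demo_` names, not the kernel timer API or the 3c59x code; it only shows that the setup call replaces the initialization and the two field assignments, while arming the timer remains a separate step that has to be kept.

```c
#include <stdbool.h>

/* Mock timer: just enough state to show what each step touches. */
struct demo_timer {
    void (*function)(unsigned long);
    unsigned long data;
    unsigned long expires;
    bool pending;
};

/* Replaces "init + set data + set function" in a single call. */
static void demo_setup_timer(struct demo_timer *t,
                             void (*fn)(unsigned long), unsigned long data)
{
    t->function = fn;
    t->data = data;
    t->expires = 0;
    t->pending = false;
}

/* Arms the timer; nothing fires unless this is called. */
static void demo_add_timer(struct demo_timer *t)
{
    t->pending = true;
}

static void demo_timer_fn(unsigned long data)
{
    (void)data;
}

static void demo_start_timer(struct demo_timer *t, unsigned long expires,
                             unsigned long data)
{
    demo_setup_timer(t, demo_timer_fn, data);
    t->expires = expires;
    demo_add_timer(t);   /* the step that must survive the conversion */
}
```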