date:20170722

Re: [PATCH] EDAC: remove unnecessary static in edac_fake_inject_write()

2017-07-22 Thread Julia Lawall



On Sun, 23 Jul 2017, Gustavo A. R. Silva wrote:

> Hi Julia,
>
> On 07/23/2017 12:07 AM, Julia Lawall wrote:
> >
> >
> > On Sat, 22 Jul 2017, Gustavo A. R. Silva wrote:
> >
> > > Hi Julia, Borislav,
> > >
> > > On 07/22/2017 11:22 AM, Gustavo A. R. Silva wrote:
> > > > Hi all,
> > > >
> > > > On 07/22/2017 01:36 AM, Borislav Petkov wrote:
> > > > > On Fri, Jul 21, 2017 at 10:08:12PM +0200, Julia Lawall wrote:
> > > > > > Someone pointed out that the rule is probably not OK when the
> > > > > > address of
> > > > > > the static variable is taken, because then it is likely being used
> > > > > > as
> > > > > > permanent storage.
> > > > >
> > > > > Makes sense to me.
> > > > >
> > > > > > An improved rule is:
> > > > >
> > > > > Do you think it is worth having it in scripts/coccinelle/ ?
> > > > >
> > > > > I don't think Gustavo would mind putting it there :)
> > > > >
> > > >
> > > > Absolutely, I'd be glad to help out. :)
> > > >
> > >
> > > I've been working on this issue today and, in my opinion, this script is
> > > even
> > > better:
> > >
> > > @bad exists@
> > > position p;
> > > identifier x;
> > > expression e;
> > > type T;
> > > @@
> > >
> > > static T x@p;
> > > ... when != x = e
> > > x = <+...x...+>
> > >
> > > @worse1 exists@
> > > position p;
> > > identifier x;
> > > type T;
> > > @@
> > >
> > > static T x@p;
> > > ...
> > > return &x;
> > >
> > > @worse2 exists@
> > > position p;
> > > identifier x;
> > > type T;
> > > @@
> > >
> > > static T *x@p;
> > > ...
> > > return x;
> > >
> > > @@
> > > identifier x;
> > > expression e;
> > > type T;
> > > position p != {bad.p,worse1.p,worse2.p};
> > > @@
> > >
> > > -static
> > >   T x@p;
> > >   ... when != x
> > >   when strict
> > > ?x = e;
> > >
> > > It ignores all the cases in which the address of the static variable is
> > > returned to the caller function.
> >
> > I don't understand why you want to restrict the address of a variable case
> > to returns.  Storing the address in a field of a structure that has a
> > lifetime beyond the function body is a problem as well.
> >
>
> Yeah, I totally agree and, personally I consider that a bad coding practice.
> But I think those kinds of issues should be addressed in a different script.

I don't understand the response at all.  My point was that you have taken
a very general reason to not apply the change, ie the presence of &x
anywhere, and limited it to a special case: you don't apply the change
when there exists return &x; and you do apply the script when there exists
a->b = &x;  But the change is not safe to apply in either case.
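
A minimal C sketch of the second case, where the address escapes through a
structure field instead of a return statement. The names (struct holder,
attach, slot) are hypothetical and not taken from any kernel source:

```c
/* Hypothetical structure whose lifetime extends beyond the call below. */
struct holder {
	int *b;
};

/*
 * The address of the static variable is stored in a->b. Nothing is
 * returned, yet the storage must still outlive the function body, so
 * removing "static" here would leave a->b dangling.
 */
void attach(struct holder *a)
{
	static int slot;

	slot = 5;
	a->b = &slot;	/* the address escapes via a field, not via return */
}
```

A rule that only excludes return &x; would still strip static from slot here.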


>
> > On the other hand returning the value stored in a static variable is not a
> > problem.  That value exists independently of the variable that contains
> > it.  The variable that contains it doesn't need to live on in any way.
> >
>
> Yeah, I agree, but I don't see exactly where this argument is coming from?
>
> Notice that for both worse1 and worse2, what is returned is the address, not
> the value of the static variable. At least that was my intention, unless I
> may be missing something?

return x returns the value of x.  It does not return the address of x.
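
The distinction can be sketched in C as follows. This is only an
illustration; all function and variable names are hypothetical:

```c
/* File-scope object that the static pointer below refers to. */
static int backing = 7;

/* Returns the VALUE of x: the caller receives a copy, so x itself does
 * not need to outlive the call. */
int value_of_static(void)
{
	static int x;

	x = 42;
	return x;
}

/* Returns the ADDRESS of x: the caller may dereference it later, so the
 * storage must persist and "static" is required. */
int *address_of_static(void)
{
	static int x;

	x = 42;
	return &x;
}

/* The worse2 shape: x is a static pointer and "return x" returns the
 * value held in x (here the address of backing), not the address of x. */
int *value_of_static_pointer(void)
{
	static int *x;

	x = &backing;
	return x;
}
```

Only address_of_static() depends on the static storage of its local variable.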

julia

> > >
> > > Also, there are some cases in which the maintainer can argue something
> > > like
> > > the following:
> > >
> > > https://lkml.org/lkml/2017/7/19/1381
> > >
> > > but that depends on the particular conditions in which the code is
> > > intended to
> > > be executed.
> > >
> > > What do you think?
> >
> > The preserving values argument is not relevant.  The rule checks that the
> > value is never used.  DMA accesses should involve taking an address, which
> > we now disallow.  It seems likely that anything large would have its
> > address taken too, but one could check manually for that.  spgen provides
> > a section where you can describe such issues.
> >
>
> Yeah, those cases should be analyzed manually and in case of doubt check with
> the maintainers.
>
> Thanks
> --
> Gustavo A. R. Silva
>

Re: [PATCH] EDAC: remove unnecessary static in edac_fake_inject_write()

2017-07-22 Thread Gustavo A. R. Silva


Hi Julia,

On 07/23/2017 12:07 AM, Julia Lawall wrote:
>
>
> On Sat, 22 Jul 2017, Gustavo A. R. Silva wrote:
>
> > Hi Julia, Borislav,
> >
> > On 07/22/2017 11:22 AM, Gustavo A. R. Silva wrote:
> > > Hi all,
> > >
> > > On 07/22/2017 01:36 AM, Borislav Petkov wrote:
> > > > On Fri, Jul 21, 2017 at 10:08:12PM +0200, Julia Lawall wrote:
> > > > > Someone pointed out that the rule is probably not OK when the address of
> > > > > the static variable is taken, because then it is likely being used as
> > > > > permanent storage.
> > > >
> > > > Makes sense to me.
> > > >
> > > > > An improved rule is:
> > > >
> > > > Do you think it is worth having it in scripts/coccinelle/ ?
> > > >
> > > > I don't think Gustavo would mind putting it there :)
> > > >
> > >
> > > Absolutely, I'd be glad to help out. :)
> > >
> >
> > I've been working on this issue today and, in my opinion, this script is even
> > better:
> >
> > @bad exists@
> > position p;
> > identifier x;
> > expression e;
> > type T;
> > @@
> >
> > static T x@p;
> > ... when != x = e
> > x = <+...x...+>
> >
> > @worse1 exists@
> > position p;
> > identifier x;
> > type T;
> > @@
> >
> > static T x@p;
> > ...
> > return &x;
> >
> > @worse2 exists@
> > position p;
> > identifier x;
> > type T;
> > @@
> >
> > static T *x@p;
> > ...
> > return x;
> >
> > @@
> > identifier x;
> > expression e;
> > type T;
> > position p != {bad.p,worse1.p,worse2.p};
> > @@
> >
> > -static
> >   T x@p;
> >   ... when != x
> >   when strict
> > ?x = e;
> >
> > It ignores all the cases in which the address of the static variable is
> > returned to the caller function.
>
> I don't understand why you want to restrict the address of a variable case
> to returns.  Storing the address in a field of a structure that has a
> lifetime beyond the function body is a problem as well.
>

Yeah, I totally agree and, personally, I consider that a bad coding
practice. But I think those kinds of issues should be addressed in a
different script.

> On the other hand returning the value stored in a static variable is not a
> problem.  That value exists independently of the variable that contains
> it.  The variable that contains it doesn't need to live on in any way.
>

Yeah, I agree, but I don't see exactly where this argument is coming from?

Notice that for both worse1 and worse2, what is returned is the address,
not the value of the static variable. At least that was my intention,
unless I may be missing something?

> >
> > Also, there are some cases in which the maintainer can argue something like
> > the following:
> >
> > https://lkml.org/lkml/2017/7/19/1381
> >
> > but that depends on the particular conditions in which the code is intended to
> > be executed.
> >
> > What do you think?
>
> The preserving values argument is not relevant.  The rule checks that the
> value is never used.  DMA accesses should involve taking an address, which
> we now disallow.  It seems likely that anything large would have its
> address taken too, but one could check manually for that.  spgen provides
> a section where you can describe such issues.
>

Yeah, those cases should be analyzed manually and in case of doubt check
with the maintainers.


Thanks
--
Gustavo A. R. Silva

fs/binfmt_flat.c:828:9: error: void value not ignored as it ought to be

2017-07-22 Thread kbuild test robot

Hi Al,

FYI, the error/warning still remains.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   4b162c530d9c101381500e586fedb1340595a6ff
commit: 468138d78510688fb5476f98d23f11ac6a63229a binfmt_flat: flat_{get,put}_addr_from_rp() should be able to fail
date:   3 weeks ago
config: m32r-mappi.nommu_defconfig (attached as .config)
compiler: m32r-linux-gcc (GCC) 6.2.0
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout 468138d78510688fb5476f98d23f11ac6a63229a
        # save the attached .config to linux build tree
        make.cross ARCH=m32r

All errors (new ones prefixed by >>):

   In file included from include/linux/kernel.h:13:0,
                    from fs/binfmt_flat.c:20:
   fs/binfmt_flat.c: In function 'load_flat_file':
   include/linux/kern_levels.h:4:18: warning: format '%ld' expects argument of type 'long int', but argument 2 has type 'u32 {aka unsigned int}' [-Wformat=]
#define KERN_SOH "\001"  /* ASCII Start Of Header */
 ^
   include/linux/printk.h:136:11: note: in definition of macro 'no_printk'
   printk(fmt, ##__VA_ARGS__); \
  ^~~
   include/linux/kern_levels.h:14:20: note: in expansion of macro 'KERN_SOH'
#define KERN_DEBUG KERN_SOH "7" /* debug-level messages */
   ^~~~
   include/linux/printk.h:339:12: note: in expansion of macro 'KERN_DEBUG'
 no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
   ^~
   fs/binfmt_flat.c:577:3: note: in expansion of macro 'pr_debug'
  pr_debug("Allocated data+bss+stack (%ld bytes): %lx\n",
  ^~~~
>> fs/binfmt_flat.c:828:9: error: void value not ignored as it ought to be
ret = flat_put_addr_at_rp(rp, addr, relval);
^
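
For readers unfamiliar with this diagnostic: GCC emits it when the result of
a function declared void is used as a value. The error line suggests that on
this configuration flat_put_addr_at_rp() is still declared void while the
caller now assigns its result. The sketch below uses hypothetical stand-in
helpers (put_addr_void, put_addr_checked), not the kernel's actual
definitions:

```c
#include <errno.h>
#include <stddef.h>

/*
 * Old-style helper: it cannot report failure. Writing
 *     ret = put_addr_void(rp, addr);
 * would fail to compile with "void value not ignored as it ought to be".
 */
void put_addr_void(unsigned long *rp, unsigned long addr)
{
	*rp = addr;
}

/*
 * Converted helper: returns 0 on success or a negative errno value, so
 * the caller can assign the result and check it.
 */
int put_addr_checked(unsigned long *rp, unsigned long addr)
{
	if (rp == NULL)
		return -EFAULT;
	*rp = addr;
	return 0;
}
```

Every definition of such a helper has to be converted together with its
callers; a single remaining void variant is enough to break the build.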

vim +828 fs/binfmt_flat.c

   506  
   507  	/*
   508  	 * Check initial limits. This avoids letting people circumvent
   509  	 * size limits imposed on them by creating programs with large
   510  	 * arrays in the data or bss.
   511  	 */
   512  	rlim = rlimit(RLIMIT_DATA);
   513  	if (rlim >= RLIM_INFINITY)
   514  		rlim = ~0;
   515  	if (data_len + bss_len > rlim) {
   516  		ret = -ENOMEM;
   517  		goto err;
   518  	}
   519  
   520  	/* Flush all traces of the currently running executable */
   521  	if (id == 0) {
   522  		ret = flush_old_exec(bprm);
   523  		if (ret)
   524  			goto err;
   525  
   526  		/* OK, This is the point of no return */
   527  		set_personality(PER_LINUX_32BIT);
   528  		setup_new_exec(bprm);
   529  	}
   530  
   531  	/*
   532  	 * calculate the extra space we need to map in
   533  	 */
   534  	extra = max_t(unsigned long, bss_len + stack_len,
   535  			relocs * sizeof(unsigned long));
   536  
   537  	/*
   538  	 * there are a couple of cases here,  the separate code/data
   539  	 * case,  and then the fully copied to RAM case which lumps
   540  	 * it all together.
   541  	 */
   542  	if (!IS_ENABLED(CONFIG_MMU) && !(flags & (FLAT_FLAG_RAM|FLAT_FLAG_GZIP))) {
   543  		/*
   544  		 * this should give us a ROM ptr,  but if it doesn't we don't
   545  		 * really care
   546  		 */
   547  		pr_debug("ROM mapping of file (we hope)\n");
   548  
   549  		textpos = vm_mmap(bprm->file, 0, text_len, PROT_READ|PROT_EXEC,
   550  				  MAP_PRIVATE|MAP_EXECUTABLE, 0);
   551  		if (!textpos || IS_ERR_VALUE(textpos)) {
   552  			ret = textpos;
   553  			if (!textpos)
   554  				ret = -ENOMEM;
   555  			pr_err("Unable to mmap process text, errno %d\n", ret);
   556  			goto err;
   557  		}
   558  
   559  		len = data_len + extra + MAX_SHARED_LIBS * sizeof(unsigned long);
   560  		len = PAGE_ALIGN(len);
   561  		realdatastart = vm_mmap(NULL, 0, len,
   562  			PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE, 0);
   563  
   564  		if (realdatastart == 0 || IS_ERR_VALUE(realdatastart)) {
   565  			ret = realdatastart;
   566  			if (!realdatastart)
   567  				ret = -ENOMEM;
   568  			pr_err("Unable to allocate RAM for process data, "
   569  			       "errno %d\n", ret);
   570  			vm_munmap(textpos, text_len);
   571

Re: undefined reference to `_GLOBAL_OFFSET_TABLE_'

2017-07-22 Thread Nicholas Piggin

On Sun, 23 Jul 2017 08:20:30 +0800
kbuild test robot  wrote:

> Hi Nicholas,
> 
> FYI, the error/warning still remains.

FYI, I still suspect it is this bug

https://sourceware.org/bugzilla/show_bug.cgi?id=21017

If so, then I'm not sure if it can be worked around in Linux.

Thanks,
Nick


> 
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> head:   4b162c530d9c101381500e586fedb1340595a6ff
> commit: 799c43415442414b1032580c47684cb709dfed6d kbuild: thin archives make default for all archs
> date:   3 weeks ago
> config: microblaze-allnoconfig (attached as .config)
> compiler: microblaze-linux-gcc (GCC) 6.2.0
> reproduce:
>         wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         git checkout 799c43415442414b1032580c47684cb709dfed6d
>         # save the attached .config to linux build tree
>         make.cross ARCH=microblaze
>
> All errors (new ones prefixed by >>):
>
>    mm/slub.o: In function `__slab_free.isra.13':
> >> (.text+0x1038): undefined reference to `_GLOBAL_OFFSET_TABLE_'
>    scripts/link-vmlinux.sh: line 93: 56533 Segmentation fault  ${LD} ${LDFLAGS} ${LDFLAGS_vmlinux} -o ${2} -T ${lds} ${objects}
>
> ---
> 0-DAY kernel test infrastructure                Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

Re: [PATCH] EDAC: remove unnecessary static in edac_fake_inject_write()

2017-07-22 Thread Julia Lawall

On Sat, 22 Jul 2017, Gustavo A. R. Silva wrote:

> Hi Julia, Borislav,
>
> On 07/22/2017 11:22 AM, Gustavo A. R. Silva wrote:
> > Hi all,
> >
> > On 07/22/2017 01:36 AM, Borislav Petkov wrote:
> > > On Fri, Jul 21, 2017 at 10:08:12PM +0200, Julia Lawall wrote:
> > > > Someone pointed out that the rule is probably not OK when the address of
> > > > the static variable is taken, because then it is likely being used as
> > > > permanent storage.
> > >
> > > Makes sense to me.
> > >
> > > > An improved rule is:
> > >
> > > Do you think it is worth having it in scripts/coccinelle/ ?
> > >
> > > I don't think Gustavo would mind putting it there :)
> > >
> >
> > Absolutely, I'd be glad to help out. :)
> >
>
> I've been working on this issue today and, in my opinion, this script is even
> better:
>
> @bad exists@
> position p;
> identifier x;
> expression e;
> type T;
> @@
>
> static T x@p;
> ... when != x = e
> x = <+...x...+>
>
> @worse1 exists@
> position p;
> identifier x;
> type T;
> @@
>
> static T x@p;
> ...
> return &x;
>
> @worse2 exists@
> position p;
> identifier x;
> type T;
> @@
>
> static T *x@p;
> ...
> return x;
>
> @@
> identifier x;
> expression e;
> type T;
> position p != {bad.p,worse1.p,worse2.p};
> @@
>
> -static
>   T x@p;
>   ... when != x
>   when strict
> ?x = e;
>
> It ignores all the cases in which the address of the static variable is
> returned to the caller function.

I don't understand why you want to restrict the address of a variable case
to returns.  Storing the address in a field of a structure that has a
lifetime beyond the function body is a problem as well.

On the other hand returning the value stored in a static variable is not a
problem.  That value exists independently of the variable that contains
it.  The variable that contains it doesn't need to live on in any way.

>
> Also, there are some cases in which the maintainer can argue something like
> the following:
>
> https://lkml.org/lkml/2017/7/19/1381
>
> but that depends on the particular conditions in which the code is intended to
> be executed.
>
> What do you think?

The preserving values argument is not relevant.  The rule checks that the
value is never used.  DMA accesses should involve taking an address, which
we now disallow.  It seems likely that anything large would have its
address taken too, but one could check manually for that.  spgen provides
a section where you can describe such issues.
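
To illustrate what "the value is never used" means here, a small C sketch
with hypothetical functions (neither is taken from the EDAC code):

```c
/*
 * Candidate for the rule: tmp is written on every call before it is
 * read, and its address is never taken. Nothing is carried over from
 * one call to the next, so "static" could be dropped.
 */
int scaled(int n)
{
	static int tmp;

	tmp = n * 2;
	return tmp + 1;
}

/*
 * Not a candidate: the first assignment to calls reads its previous
 * value, which is the shape the "bad" rule matches with
 * x = <+...x...+>. The value is preserved across calls, so "static"
 * is required.
 */
int counter(void)
{
	static int calls;

	calls = calls + 1;
	return calls;
}
```

In scaled() the old value of tmp is overwritten before any read, which is
why an argument about preserving values does not apply to it.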

julia

> Thank you
> --
> Gustavo A. R. Silva
>

include/linux/kernel.h:860:32: error: dereferencing pointer to incomplete type 'struct clock_event_device'

2017-07-22 Thread kbuild test robot

Hi Ian,

FYI, the error/warning still remains.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   96080f697786e0a30006fcbcc5b53f350fcb3e9f
commit: c7acec713d14c6ce8a20154f9dfda258d6bcad3b kernel.h: handle pointers to arrays better in container_of()
date:   10 days ago
config: ia64-allmodconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 6.2.0
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout c7acec713d14c6ce8a20154f9dfda258d6bcad3b
        # save the attached .config to linux build tree
        make.cross ARCH=ia64

All errors (new ones prefixed by >>):

   In file included from drivers/clocksource/timer-of.c:25:0:
   drivers/clocksource/timer-of.h:35:28: error: field 'clkevt' has incomplete type
 struct clock_event_device clkevt;
   ^~
   In file included from include/linux/err.h:4:0,
                    from include/linux/clk.h:15,
                    from drivers/clocksource/timer-of.c:18:
   drivers/clocksource/timer-of.h: In function 'to_timer_of':
>> include/linux/kernel.h:860:32: error: dereferencing pointer to incomplete type 'struct clock_event_device'
 BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
   ^~
   include/linux/compiler.h:517:19: note: in definition of macro '__compiletime_assert'
  bool __cond = !(condition);\
  ^
   include/linux/compiler.h:537:2: note: in expansion of macro '_compiletime_assert'
 _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
 ^~~
   include/linux/build_bug.h:46:37: note: in expansion of macro 'compiletime_assert'
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
^~
   include/linux/kernel.h:860:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
 BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
 ^~~~
   include/linux/kernel.h:860:20: note: in expansion of macro '__same_type'
 BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
   ^~~
   drivers/clocksource/timer-of.h:44:9: note: in expansion of macro 'container_of'
 return container_of(clkevt, struct timer_of, clkevt);
^~~~
--
   In file included from drivers//clocksource/timer-of.c:25:0:
   drivers//clocksource/timer-of.h:35:28: error: field 'clkevt' has incomplete type
 struct clock_event_device clkevt;
   ^~
   In file included from include/linux/err.h:4:0,
                    from include/linux/clk.h:15,
                    from drivers//clocksource/timer-of.c:18:
   drivers//clocksource/timer-of.h: In function 'to_timer_of':
>> include/linux/kernel.h:860:32: error: dereferencing pointer to incomplete type 'struct clock_event_device'
 BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
   ^~
   include/linux/compiler.h:517:19: note: in definition of macro '__compiletime_assert'
  bool __cond = !(condition);\
  ^
   include/linux/compiler.h:537:2: note: in expansion of macro '_compiletime_assert'
 _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
 ^~~
   include/linux/build_bug.h:46:37: note: in expansion of macro 'compiletime_assert'
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
^~
   include/linux/kernel.h:860:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
 BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
 ^~~~
   include/linux/kernel.h:860:20: note: in expansion of macro '__same_type'
 BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
   ^~~
   drivers//clocksource/timer-of.h:44:9: note: in expansion of macro 'container_of'
 return container_of(clkevt, struct timer_of, clkevt);
^~~~

vim +860 include/linux/kernel.h

   843  
   844  
   845  /*
   846   * swap - swap value of @a and @b
   847   */
   848  #define swap(a, b) \
   849  do { typeof(a) __tmp = (a); (a) = (b); (b) = __tmp; } while (0)
   850  
   851  /**
   852   * container_of - cast a member of a structure out to the containing structure
   853   * @ptr:    the pointer to the member.
   854   * @type:   the type of the container struct this is embedded in.
   855   * @member: the name of the member within the struct.
   856   *
   857   */
   858  #define container_of(ptr, type, member) ({                      \
   859          void *__mptr = (void *)(ptr);                           \
 > 860
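
Both errors in the log above are what a C compiler reports when only a forward declaration of a struct is visible: the struct can neither be embedded as a field nor dereferenced by the type check inside container_of(). The following userspace sketch shows the pattern compiling once the full definition is in scope. It is an illustration under assumptions, not the kernel code: same_type() is a stand-in for the kernel's __same_type(), the container_of() here is simplified, and the struct bodies are hypothetical. It relies on GCC/Clang extensions (statement expressions, __typeof__).

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's __same_type(). */
#define same_type(a, b) \
	__builtin_types_compatible_p(__typeof__(a), __typeof__(b))

/* Simplified container_of() that keeps the member-type check. */
#define container_of(ptr, type, member) ({				\
	void *__mptr = (void *)(ptr);					\
	_Static_assert(same_type(*(ptr), ((type *)0)->member) ||	\
		       same_type(*(ptr), void),				\
		       "pointer type mismatch in container_of()");	\
	((type *)((char *)__mptr - offsetof(type, member))); })

/*
 * Hypothetical body. With only "struct clock_event_device;" visible,
 * both the field below and the *(ptr) in the check fail to compile.
 */
struct clock_event_device {
	int irq;
};

struct timer_of {
	unsigned int flags;
	struct clock_event_device clkevt;	/* needs the complete type */
};

static inline struct timer_of *to_timer_of(struct clock_event_device *clkevt)
{
	assert(clkevt != NULL);
	return container_of(clkevt, struct timer_of, clkevt);
}
```

Given a struct timer_of t, to_timer_of(&t.clkevt) returns &t.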

Re: [PATCH] documentation: Fix two-CPU control-dependency example

2017-07-22 Thread Paul E. McKenney

On Sat, Jul 22, 2017 at 08:38:57AM +0900, Akira Yokosawa wrote:
> On 2017/07/20 16:07:14 -0700, Paul E. McKenney wrote:
> > On Fri, Jul 21, 2017 at 07:52:03AM +0900, Akira Yokosawa wrote:
> >> On 2017/07/20 14:42:34 -0700, Paul E. McKenney wrote:
> [...]
> >>> For the compilers I know about at the present time, yes.
> >>
> >> So if I respin the patch with the extern, would you still feel reluctant?
> > 
> > Yes, because I am not seeing how this change helps.  What is this telling
> > the reader that the original did not, and how does it help the reader
> > generate good concurrent code?
> 
> Well, what bothers me in the ">" version of two-CPU example is the
> explanation in memory-barriers.txt that follows:
> 
> > These two examples are the LB and WWC litmus tests from this paper:
> > http://www.cl.cam.ac.uk/users/pes20/ppc-supplemental/test6.pdf and this
> > site: https://www.cl.cam.ac.uk/~pes20/ppcmem/index.html.
> 
> I'm wondering if calling the ">" version as an "LB" litmus test is correct.
> Because it always results in "r1 == 0 && r2 == 0", 100%.

As it should, because nothing can become non-zero unless something was
already non-zero.  It is possible to create similarly single-outcome
tests with address dependencies.

But yes, converting to ">=" makes the stores unconditional, and thus
allows more outcomes.  Perhaps we need both?  Another approach is to
write a second value in an "else" statement, keeping the original ">".
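
One possible reading of the "else" idea, as a userspace sketch. The READ_ONCE()/WRITE_ONCE() macros here are simplified stand-ins for the kernel's, the value 2 stored in the else branch is an arbitrary choice, and run_cpu0_then_cpu1() is a hypothetical helper that exercises just one sequential order:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel macros (GCC/Clang __typeof__). */
#define READ_ONCE(var)		(*(volatile __typeof__(var) *)&(var))
#define WRITE_ONCE(var, val)	(*(volatile __typeof__(var) *)&(var) = (val))

int x, y;

/* CPU 0: keeps the ">" test but stores on both branches. */
int cpu0(void)
{
	int r1 = READ_ONCE(x);

	if (r1 > 0)
		WRITE_ONCE(y, 1);
	else
		WRITE_ONCE(y, 2);	/* arbitrary second value */
	return r1;
}

/* CPU 1: mirror image of CPU 0. */
int cpu1(void)
{
	int r2 = READ_ONCE(y);

	if (r2 > 0)
		WRITE_ONCE(x, 1);
	else
		WRITE_ONCE(x, 2);	/* arbitrary second value */
	return r2;
}

/* Run CPU 0 to completion, then CPU 1: one of many possible orders. */
void run_cpu0_then_cpu1(int *r1, int *r2)
{
	assert(r1 != NULL && r2 != NULL);
	x = 0;
	y = 0;
	*r1 = cpu0();
	*r2 = cpu1();
}
```

In this order CPU 0 reads 0 and stores y = 2, so CPU 1 reads 2 and stores x = 1. Some store now always happens, so the all-zero result is no longer the only outcome.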

I agree that some of these examples can be a bit hard on one's intuition,
but the plain fact is that when it comes to memory models, one's intuition
is not always one's friend.

> An LB litmus test with full memory barriers would be:
> 
>   CPU 0                     CPU 1
>   =======================   =======================
>   r1 = READ_ONCE(x);        r2 = READ_ONCE(y);
>   smp_mb();                 smp_mb();
>   WRITE_ONCE(y, 1);         WRITE_ONCE(x, 1);
> 
>   assert(!(r1 == 1 && r2 == 1));
> 
> and this will result in either of
> 
> r1 == 0 && r2 == 0
> r1 == 0 && r2 == 1
> r1 == 1 && r2 == 0
> 
> but never "r1 == 1 && r2 == 1".

Agreed, because unlike the control-dependency example, the WRITE_ONCE()s
happen unconditionally.
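
A minimal sketch of the outcome sets discussed above, assuming every execution is a sequentially consistent interleaving of the two CPUs' steps. That assumption matches the smp_mb() version; it says nothing about what weaker orderings or compiler optimizations might do to the guarded versions. The names enum guard, store_runs(), and lb_outcomes() are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Which condition guards the store on each CPU. */
enum guard { GUARD_NONE, GUARD_GT_ZERO, GUARD_GE_ZERO };

static bool store_runs(enum guard g, int r)
{
	switch (g) {
	case GUARD_GT_ZERO:
		return r > 0;
	case GUARD_GE_ZERO:
		return r >= 0;
	default:
		return true;	/* GUARD_NONE: unconditional store */
	}
}

/*
 * Enumerate every interleaving of "load, then maybe store" on two CPUs.
 * Returns a bitmask with bit (2 * r1 + r2) set for each outcome seen.
 */
unsigned int lb_outcomes(enum guard g)
{
	unsigned int seen = 0;

	/* Bit i of sched picks the CPU that runs step i. */
	for (unsigned int sched = 0; sched < 16; sched++) {
		int x = 0, y = 0, r1 = 0, r2 = 0;
		int pc0 = 0, pc1 = 0;
		unsigned int ones = (sched & 1) + ((sched >> 1) & 1) +
				    ((sched >> 2) & 1) + ((sched >> 3) & 1);

		/* Keep only schedules giving each CPU exactly two steps. */
		if (ones != 2)
			continue;

		for (int i = 0; i < 4; i++) {
			if ((sched >> i) & 1) {		/* CPU 1 */
				if (pc1++ == 0)
					r2 = y;
				else if (store_runs(g, r2))
					x = 1;
			} else {			/* CPU 0 */
				if (pc0++ == 0)
					r1 = x;
				else if (store_runs(g, r1))
					y = 1;
			}
		}
		assert(pc0 == 2 && pc1 == 2);	/* both CPUs ran to completion */
		seen |= 1u << (2 * r1 + r2);
	}
	return seen;
}
```

Under these assumptions, lb_outcomes(GUARD_NONE) and lb_outcomes(GUARD_GE_ZERO) report the three outcomes listed above and never r1 == 1 && r2 == 1, while lb_outcomes(GUARD_GT_ZERO) reports only r1 == 0 && r2 == 0.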

> The difference in the behavior distracts me in reading this part
> of memory-barriers.txt.

Then it likely needs more explanation.

> Your priority seemed to be in reducing the chance of the "if" statement
> to be optimized away.  So I suggested to use "extern" as a compromise.

If the various tools accept the "extern", this might not be a bad thing
to do.

But what this really means is that I need to take another tilt at
the "volatile" windmill in the committee.

> Another way would be to express the ">=" version in a pseudo-asm form.
> 
>   CPU 0                     CPU 1
>   =======================   =======================
>   r1 = LOAD x               r2 = LOAD y
>   if (r1 >= 0)              if (r2 >= 0)
>     STORE y = 1               STORE x = 1
> 
>   assert(!(r1 == 1 && r2 == 1));
> 
> This should eliminate any concern of compiler optimization.
> In this final part of CONTROL DEPENDENCIES section, separating the
> problem of optimization and transitivity would clarify the point
> (at least for me).

The problem is that people really do use C-language control dependencies
in the Linux kernel, so we need to describe them.  Maybe someday it
will be necessary to convert them to asm, but I am hoping that we can
avoid that.

> Thoughts?

My hope is that the memory model can help here, but that will in any
case take time.

Thanx, Paul

> Regards, Akira
> 
> > 
> > Thanx, Paul
> > 
> >>  Regards, Akira
> >>
> >>>
> >>> The underlying problem is that the standard says almost nothing about what
> >>> volatile does.  I usually argue that it was intended to be used for MMIO,
> >>> so any optimization that would prevent a device driver from working must
> >>> be prohibited on volatiles.  A lot of people really don't like volatile,
> >>> and would like to eliminate it, and these people also aren't very happy
> >>> about any attempt to better define volatile.
> >>>
> >>> Keeps things entertaining.  ;-)
> >>>
> >>>   Thanx, Paul
> >>>
>

Re: [PATCH v6] cpufreq: schedutil: Make iowait boost more energy efficient

2017-07-22 Thread Joel Fernandes

On Sat, Jul 22, 2017 at 2:44 PM, Rafael J. Wysocki  wrote:
> On Saturday, July 22, 2017 12:47:53 AM Joel Fernandes wrote:
>> Currently the iowait_boost feature in schedutil makes the frequency go to max
>> on iowait wakeups.  This feature was added to handle a case that Peter
>> described where the throughput of operations involving continuous I/O
>> requests [1] is reduced due to running at a lower frequency; however, the
>> lower throughput itself causes utilization to be low and hence the frequency
>> to be low, so it gets "stuck".
>>
>> Instead of going to max, it's also possible to achieve the same effect by
>> ramping up to max if there are repeated in_iowait wakeups happening. This
>> patch is an attempt to do that. We start from a lower frequency (policy->min)
>> and double the boost for every consecutive iowait update until we reach the
>> maximum iowait boost frequency (iowait_boost_max).
>>
>> I ran a synthetic test (continuous O_DIRECT writes in a loop) on an x86
>> machine with intel_pstate in passive mode using schedutil. In this test the
>> iowait_boost value ramped from 800MHz to 4GHz in 60ms. The patch achieves the
>> same improved throughput as the existing behavior.
>>
>> Also while at it, make iowait_boost and iowait_boost_max unsigned int since
>> their unit is kHz and this is consistent with struct cpufreq_policy.
>> [1] https://patchwork.kernel.org/patch/9735885/
>>
>> Cc: Srinivas Pandruvada 
>> Cc: Len Brown 
>> Cc: Rafael J. Wysocki 
>> Cc: Viresh Kumar 
>> Cc: Ingo Molnar 
>> Cc: Peter Zijlstra 
>> Suggested-by: Peter Zijlstra 
>> Suggested-by: Viresh Kumar 
>> Signed-off-by: Joel Fernandes 
>> ---
>> Viresh, made slight modifications to the last approach we agreed on using,
>> but nothing we didn't already discuss. I also dropped the RFC tag since I
>> think this is increasingly now becoming final (or has become final if no one
>> else has any other objection).
>>
>>  kernel/sched/cpufreq_schedutil.c | 37 +++--
>>  1 file changed, 31 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> index 622eed1b7658..0c0b6c8c15fc 100644
>> --- a/kernel/sched/cpufreq_schedutil.c
>> +++ b/kernel/sched/cpufreq_schedutil.c
>> @@ -53,6 +53,7 @@ struct sugov_cpu {
>>   struct update_util_data update_util;
>>   struct sugov_policy *sg_policy;
>>
>> + bool iowait_boost_pending;
>>   unsigned long iowait_boost;
>>   unsigned long iowait_boost_max;
>>   u64 last_update;
>> @@ -172,30 +173,53 @@ static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time,
>>  unsigned int flags)
>>  {
>>   if (flags & SCHED_CPUFREQ_IOWAIT) {
>> - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max;
>> + if (sg_cpu->iowait_boost_pending)
>> + return;
>> +
>> + sg_cpu->iowait_boost_pending = true;
>> +
>> + if (sg_cpu->iowait_boost) {
>> + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost << 1,
>> +sg_cpu->iowait_boost_max);
>
> I would do
>
>         sg_cpu->iowait_boost <<= 1;
>         if (sg_cpu->iowait_boost > sg_cpu->iowait_boost_max)
>                 sg_cpu->iowait_boost = sg_cpu->iowait_boost_max;
>
> as that's easier to read.
>
> The rest of the patch is fine by me.

Done, and I have resent the patches. I also added one more patch to change
iowait_boost and iowait_boost_max to unsigned int.

thanks,

-Joel

[PATCH v7 2/2] cpufreq: schedutil: Use unsigned int for iowait boost

2017-07-22 Thread Joel Fernandes

Make iowait_boost and iowait_boost_max unsigned int since their unit is kHz
and this is consistent with struct cpufreq_policy. Also change the local
variables in sugov_iowait_boost() to match.

Cc: Srinivas Pandruvada 
Cc: Len Brown 
Cc: Rafael J. Wysocki 
Cc: Viresh Kumar 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Signed-off-by: Joel Fernandes 
---
 kernel/sched/cpufreq_schedutil.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 570ab6e779e6..7650784eb857 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -54,8 +54,8 @@ struct sugov_cpu {
struct sugov_policy *sg_policy;
 
bool iowait_boost_pending;
-   unsigned long iowait_boost;
-   unsigned long iowait_boost_max;
+   unsigned int iowait_boost;
+   unsigned int iowait_boost_max;
u64 last_update;
 
/* The fields below are only needed when sharing a policy. */
@@ -199,7 +199,7 @@ static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time,
 static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util,
   unsigned long *max)
 {
-   unsigned long boost_util, boost_max;
+   unsigned int boost_util, boost_max;
 
if (!sg_cpu->iowait_boost)
return;
-- 
2.14.0.rc0.284.gd933b75aa4-goog

[PATCH v7 1/2] cpufreq: schedutil: Make iowait boost more energy efficient

2017-07-22 Thread Joel Fernandes

Currently the iowait_boost feature in schedutil makes the frequency go to max
on iowait wakeups.  This feature was added to handle a case that Peter
described where the throughput of operations involving continuous I/O requests
[1] is reduced due to running at a lower frequency; however, the lower
throughput itself causes utilization to be low and hence the frequency to be
low, so it gets "stuck".

Instead of going to max, it's also possible to achieve the same effect by
ramping up to max if there are repeated in_iowait wakeups happening. This patch
is an attempt to do that. We start from a lower frequency (policy->min)
and double the boost for every consecutive iowait update until we reach the
maximum iowait boost frequency (iowait_boost_max).
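
The ramp described above can be modeled in userspace roughly as follows. This is a sketch, not the kernel code: struct boost_state, iowait_wakeup(), and wakeups_to_max() are hypothetical names, policy_min stands in for sg_policy->policy->min, and the counting loop assumes each boost is consumed by a frequency update before the next wakeup arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-CPU boost state; all values in kHz. */
struct boost_state {
	bool iowait_boost_pending;
	unsigned long iowait_boost;
	unsigned long iowait_boost_max;
	unsigned long policy_min;	/* stands in for policy->min */
};

/* Models the SCHED_CPUFREQ_IOWAIT branch of sugov_set_iowait_boost(). */
void iowait_wakeup(struct boost_state *s)
{
	assert(s != NULL && s->policy_min > 0);

	if (s->iowait_boost_pending)
		return;		/* a boost is already waiting to be applied */

	s->iowait_boost_pending = true;

	if (s->iowait_boost) {
		s->iowait_boost <<= 1;
		if (s->iowait_boost > s->iowait_boost_max)
			s->iowait_boost = s->iowait_boost_max;
	} else {
		s->iowait_boost = s->policy_min;
	}
}

/* Count the wakeups needed to go from no boost to iowait_boost_max. */
unsigned int wakeups_to_max(unsigned long min_khz, unsigned long max_khz)
{
	struct boost_state s = { false, 0, max_khz, min_khz };
	unsigned int n = 0;

	while (s.iowait_boost < s.iowait_boost_max) {
		iowait_wakeup(&s);
		s.iowait_boost_pending = false;	/* as if an update consumed it */
		n++;
	}
	return n;
}
```

With min = 800000 kHz and max = 4000000 kHz, the boost goes 800000, 1600000, 3200000, and is then clamped to 4000000 on the fourth wakeup.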

I ran a synthetic test (continuous O_DIRECT writes in a loop) on an x86 machine
with intel_pstate in passive mode using schedutil. In this test the iowait_boost
value ramped from 800MHz to 4GHz in 60ms. The patch achieves the same improved
throughput as the existing behavior.

[1] https://patchwork.kernel.org/patch/9735885/

Cc: Srinivas Pandruvada 
Cc: Len Brown 
Cc: Rafael J. Wysocki 
Cc: Viresh Kumar 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Suggested-by: Peter Zijlstra 
Suggested-by: Viresh Kumar 
Signed-off-by: Joel Fernandes 
---
 kernel/sched/cpufreq_schedutil.c | 38 --
 1 file changed, 32 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 622eed1b7658..570ab6e779e6 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -53,6 +53,7 @@ struct sugov_cpu {
struct update_util_data update_util;
struct sugov_policy *sg_policy;
 
+   bool iowait_boost_pending;
unsigned long iowait_boost;
unsigned long iowait_boost_max;
u64 last_update;
@@ -172,30 +173,54 @@ static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time,
   unsigned int flags)
 {
if (flags & SCHED_CPUFREQ_IOWAIT) {
-   sg_cpu->iowait_boost = sg_cpu->iowait_boost_max;
+   if (sg_cpu->iowait_boost_pending)
+   return;
+
+   sg_cpu->iowait_boost_pending = true;
+
+   if (sg_cpu->iowait_boost) {
+   sg_cpu->iowait_boost <<= 1;
+   if (sg_cpu->iowait_boost > sg_cpu->iowait_boost_max)
+   sg_cpu->iowait_boost = sg_cpu->iowait_boost_max;
+   } else {
+   sg_cpu->iowait_boost = sg_cpu->sg_policy->policy->min;
+   }
} else if (sg_cpu->iowait_boost) {
s64 delta_ns = time - sg_cpu->last_update;
 
/* Clear iowait_boost if the CPU apprears to have been idle. */
-   if (delta_ns > TICK_NSEC)
+   if (delta_ns > TICK_NSEC) {
sg_cpu->iowait_boost = 0;
+   sg_cpu->iowait_boost_pending = false;
+   }
}
 }
 
 static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util,
   unsigned long *max)
 {
-   unsigned long boost_util = sg_cpu->iowait_boost;
-   unsigned long boost_max = sg_cpu->iowait_boost_max;
+   unsigned long boost_util, boost_max;
 
-   if (!boost_util)
+   if (!sg_cpu->iowait_boost)
return;
 
+   if (sg_cpu->iowait_boost_pending) {
+   sg_cpu->iowait_boost_pending = false;
+   } else {
+   sg_cpu->iowait_boost >>= 1;
+   if (sg_cpu->iowait_boost < sg_cpu->sg_policy->policy->min) {
+   sg_cpu->iowait_boost = 0;
+   return;
+   }
+   }
+
+   boost_util = sg_cpu->iowait_boost;
+   boost_max = sg_cpu->iowait_boost_max;
+
if (*util * boost_max < *max * boost_util) {
*util = boost_util;
*max = boost_max;
}
-   sg_cpu->iowait_boost >>= 1;
 }
 
 #ifdef CONFIG_NO_HZ_COMMON
@@ -267,6 +292,7 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu, u64 time)
delta_ns = time - j_sg_cpu->last_update;
if (delta_ns > TICK_NSEC) {
j_sg_cpu->iowait_boost = 0;
+   j_sg_cpu->iowait_boost_pending = false;
continue;
}
if (j_sg_cpu->flags & SCHED_CPUFREQ_RT_DL)
-- 
2.14.0.rc0.284.gd933b75aa4-goog
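
A userspace sketch of the decay logic this patch gives sugov_iowait_boost(), leaving out the util/max comparison. struct boost_state and iowait_boost_apply() are hypothetical names, and policy_min stands in for sg_policy->policy->min.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-CPU boost state; all values in kHz. */
struct boost_state {
	bool iowait_boost_pending;
	unsigned long iowait_boost;
	unsigned long iowait_boost_max;
	unsigned long policy_min;	/* stands in for policy->min */
};

/* Returns the boost (kHz) to consider for this update, 0 for none. */
unsigned long iowait_boost_apply(struct boost_state *s)
{
	assert(s != NULL);

	if (!s->iowait_boost)
		return 0;

	if (s->iowait_boost_pending) {
		/* Fresh boost from a wakeup: use it once at full value. */
		s->iowait_boost_pending = false;
	} else {
		/* No new wakeup since the last update: halve the boost. */
		s->iowait_boost >>= 1;
		if (s->iowait_boost < s->policy_min) {
			s->iowait_boost = 0;
			return 0;
		}
	}
	return s->iowait_boost;
}
```

Starting from a pending boost of 4000000 kHz with policy_min = 800000 kHz, successive updates without further wakeups yield 4000000, 2000000, 1000000, and then 0, because 500000 is below the minimum.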

[PATCH v7 1/2] cpufreq: schedutil: Make iowait boost more energy efficient

2017-07-22 Thread Joel Fernandes

Currently the iowait_boost feature in schedutil makes the frequency go to max
on iowait wakeups.  This feature was added to handle a case that Peter
described where the throughput of operations involving continuous I/O requests
[1] is reduced due to running at a lower frequency, however the lower
throughput itself causes utilization to be low and hence causing frequency to
be low hence its "stuck".

Instead of going to max, its also possible to achieve the same effect by
ramping up to max if there are repeated in_iowait wakeups happening. This patch
is an attempt to do that. We start from a lower frequency (policy->min)
and double the boost for every consecutive iowait update until we reach the
maximum iowait boost frequency (iowait_boost_max).

I ran a synthetic test (continuous O_DIRECT writes in a loop) on an x86 machine
with intel_pstate in passive mode using schedutil. In this test the iowait_boost
value ramped from 800MHz to 4GHz in 60ms. The patch achieves the desired 
improved
throughput as the existing behavior.

[1] https://patchwork.kernel.org/patch/9735885/

Cc: Srinivas Pandruvada 
Cc: Len Brown 
Cc: Rafael J. Wysocki 
Cc: Viresh Kumar 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Suggested-by: Peter Zijlstra 
Suggested-by: Viresh Kumar 
Signed-off-by: Joel Fernandes 
---
 kernel/sched/cpufreq_schedutil.c | 38 --
 1 file changed, 32 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 622eed1b7658..570ab6e779e6 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -53,6 +53,7 @@ struct sugov_cpu {
 	struct update_util_data update_util;
 	struct sugov_policy *sg_policy;
 
+	bool iowait_boost_pending;
 	unsigned long iowait_boost;
 	unsigned long iowait_boost_max;
 	u64 last_update;
@@ -172,30 +173,54 @@ static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time,
 				   unsigned int flags)
 {
 	if (flags & SCHED_CPUFREQ_IOWAIT) {
-		sg_cpu->iowait_boost = sg_cpu->iowait_boost_max;
+		if (sg_cpu->iowait_boost_pending)
+			return;
+
+		sg_cpu->iowait_boost_pending = true;
+
+		if (sg_cpu->iowait_boost) {
+			sg_cpu->iowait_boost <<= 1;
+			if (sg_cpu->iowait_boost > sg_cpu->iowait_boost_max)
+				sg_cpu->iowait_boost = sg_cpu->iowait_boost_max;
+		} else {
+			sg_cpu->iowait_boost = sg_cpu->sg_policy->policy->min;
+		}
 	} else if (sg_cpu->iowait_boost) {
 		s64 delta_ns = time - sg_cpu->last_update;
 
 		/* Clear iowait_boost if the CPU apprears to have been idle. */
-		if (delta_ns > TICK_NSEC)
+		if (delta_ns > TICK_NSEC) {
 			sg_cpu->iowait_boost = 0;
+			sg_cpu->iowait_boost_pending = false;
+		}
 	}
 }
 
 static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util,
 			       unsigned long *max)
 {
-	unsigned long boost_util = sg_cpu->iowait_boost;
-	unsigned long boost_max = sg_cpu->iowait_boost_max;
+	unsigned long boost_util, boost_max;
 
-	if (!boost_util)
+	if (!sg_cpu->iowait_boost)
 		return;
 
+	if (sg_cpu->iowait_boost_pending) {
+		sg_cpu->iowait_boost_pending = false;
+	} else {
+		sg_cpu->iowait_boost >>= 1;
+		if (sg_cpu->iowait_boost < sg_cpu->sg_policy->policy->min) {
+			sg_cpu->iowait_boost = 0;
+			return;
+		}
+	}
+
+	boost_util = sg_cpu->iowait_boost;
+	boost_max = sg_cpu->iowait_boost_max;
+
 	if (*util * boost_max < *max * boost_util) {
 		*util = boost_util;
 		*max = boost_max;
 	}
-	sg_cpu->iowait_boost >>= 1;
 }
 
 #ifdef CONFIG_NO_HZ_COMMON
@@ -267,6 +292,7 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu, u64 time)
 		delta_ns = time - j_sg_cpu->last_update;
 		if (delta_ns > TICK_NSEC) {
 			j_sg_cpu->iowait_boost = 0;
+			j_sg_cpu->iowait_boost_pending = false;
 			continue;
 		}
 		if (j_sg_cpu->flags & SCHED_CPUFREQ_RT_DL)
-- 
2.14.0.rc0.284.gd933b75aa4-goog
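
The ramp-up and decay in the patch above can be tried out in user space. The following is only a minimal sketch, assuming a hypothetical struct boost_state that stands in for the touched sugov_cpu fields (boost_min plays the role of sg_policy->policy->min); it is not the kernel code and leaves out the TICK_NSEC idle reset.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the fields the patch touches in struct sugov_cpu. */
struct boost_state {
	bool pending;            /* mirrors iowait_boost_pending */
	unsigned long boost;     /* mirrors iowait_boost */
	unsigned long boost_min; /* stands in for sg_policy->policy->min */
	unsigned long boost_max; /* mirrors iowait_boost_max */
};

/* Models the SCHED_CPUFREQ_IOWAIT branch of sugov_set_iowait_boost(). */
static void boost_on_iowait(struct boost_state *s)
{
	assert(s->boost_min > 0 && s->boost_min <= s->boost_max);

	if (s->pending)
		return; /* at most one step up per frequency update */
	s->pending = true;

	if (s->boost) {
		s->boost <<= 1; /* double, then clamp to the maximum */
		if (s->boost > s->boost_max)
			s->boost = s->boost_max;
	} else {
		s->boost = s->boost_min; /* first request starts at the minimum */
	}
}

/* Models the decay in sugov_iowait_boost(); returns the boost to apply. */
static unsigned long boost_on_update(struct boost_state *s)
{
	if (!s->boost)
		return 0;

	if (s->pending) {
		s->pending = false; /* consume the request, no decay this time */
	} else {
		s->boost >>= 1; /* no new request: halve */
		if (s->boost < s->boost_min) {
			s->boost = 0;
			return 0;
		}
	}
	return s->boost;
}
```

With boost_min = 100 and boost_max = 350, three iowait wakeups that each get a frequency update in between give 100, 200 and 350 (400 clamped); two further updates without a wakeup give 175 and then 0.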

Re: af_packet: use after free in prb_retire_rx_blk_timer_expired

2017-07-22 Thread Ding Tianhong



On 2017/7/23 3:02, Cong Wang wrote:
> Hello,
> 
> On Sat, Jul 22, 2017 at 2:55 AM, liujian (CE)  wrote:
>> I also hit this issue with trinity test:
>>
>> The call trace:
>>   [exception RIP: prb_retire_rx_blk_timer_expired+70]
>> RIP: 81633be6  RSP: 8801bec03dc0  RFLAGS: 00010246
>> RAX:   RBX: 8801b49d0948  RCX: 
>> RDX: 8801b31057a0  RSI: a56b6b6b6b6b6b6b  RDI: 8801b49d09ec
>> RBP: 8801bec03dd8   R8: 0001   R9: 83e1bf80
>> R10: 0002  R11: 0005  R12: 8801b49d09ec
>> R13: 0100  R14: 81633ba0  R15: 8801b49d0948
>> ORIG_RAX:   CS: 0010  SS: 0018
>>  #7 [8801bec03de0] call_timer_fn at 8108cb76
>>  #8 [8801bec03e18] run_timer_softirq at 8108f87c
>>  #9 [8801bec03e90] __do_softirq at 8108629f
>> #10 [8801bec03f00] call_softirq at 8166a01c
>> #11 [8801bec03f18] do_softirq at 810172ad
>> #12 [8801bec03f30] irq_exit at 81086655
>> #13 [8801bec03f48] msa_irq_exit at 810b1ab3
>> #14 [8801bec03f88] smp_apic_timer_interrupt at 8166aeae
>> #15 [8801bec03fb0] apic_timer_interrupt at 816692dd
>> ---  ---
>>
>> And from vmcore, I can see the pointer GET_CURR_PBLOCK_DESC_FROM_CORE(pkc); 
>> is a56b6b6b6b6b6b6b
>>
> 
> Does the following quick fix help?
> 
> 
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 008bb34ee324..09ec1640e5f7 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -4264,6 +4264,7 @@ static int packet_set_ring(struct sock *sk,
> union tpacket_req_u *req_u,
> /* Block transmit is not supported yet */
> if (!tx_ring) {
> init_prb_bdqc(po, rb, pg_vec, req_u);
> +   pg_vec = NULL;
> } else {
> struct tpacket_req3 *req3 = &req_u->req3;
> 

Hi, Cong:

Thanks for your quick solution, but I still have some doubts about it.
It looks like it fixes the problem in the
packet_setsockopt->packet_set_ring path, but in the packet_release path
it might fail to release the real pg_vec for the TPACKET_V3 ring and so
cause a memory leak. Maybe I am missing something here; it would be nice
to hear your feedback. :)

What about fixing it this way:
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4335,9 +4335,13 @@ static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u,
 		/* Because we don't support block-based V3 on tx-ring */
 		if (!tx_ring)
 			prb_shutdown_retire_blk_timer(po, rb_queue);
+
+		if (pg_vec)
+			free_pg_vec(pg_vec, order, req->tp_block_nr);
+
 	}
 
-	if (pg_vec)
+	if (pg_vec && (po->tp_version < TPACKET_V3))
 		free_pg_vec(pg_vec, order, req->tp_block_nr);
 out:
 	release_sock(sk);


Regards
Ding

> .
>
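
The question in this exchange is who frees the vector once it has been handed over. The general pattern can be sketched as follows; this is only an illustration with hypothetical names (struct ring, ring_setup(), ring_release(), a live counter), not the af_packet code. The local pointer is cleared after the handover so the common cleanup skips it, and the release path must then free it, otherwise it leaks.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical ring that may take ownership of a buffer vector. */
struct ring {
	void *vec;
};

static int live; /* allocations not yet freed, to make leaks visible */

static void *vec_alloc(void)
{
	void *v = malloc(64);

	if (v)
		live++;
	return v;
}

static void vec_free(void *v)
{
	if (v) {
		free(v);
		live--;
	}
}

/* Sets up a ring; on success the ring owns the vector. */
static int ring_setup(struct ring *r, int fail)
{
	void *vec = vec_alloc();
	int err = -1;

	if (!vec)
		return err;
	if (!fail) {
		r->vec = vec; /* the ring owns the vector from here on */
		vec = NULL;   /* so the cleanup below must not free it */
		err = 0;
	}
	vec_free(vec); /* frees only a vector nobody took over */
	return err;
}

/* The release path has to free what setup handed over. */
static void ring_release(struct ring *r)
{
	vec_free(r->vec);
	r->vec = NULL;
}
```

If ring_release() did not free r->vec, the counter would stay at 1 after release, which is the kind of leak the reply above is worried about.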

drivers/clocksource/timer-of.h:35:28: error: field 'clkevt' has incomplete type

2017-07-22 Thread kbuild test robot

Hi Daniel,

FYI, the error/warning still remains.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   96080f697786e0a30006fcbcc5b53f350fcb3e9f
commit: dc11bae78529526605c5c45c369c9512fd012093 clocksource/drivers: Add 
timer-of common init routine
date:   6 weeks ago
config: ia64-allmodconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 6.2.0
reproduce:
wget 
https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout dc11bae78529526605c5c45c369c9512fd012093
# save the attached .config to linux build tree
make.cross ARCH=ia64 

All errors (new ones prefixed by >>):

   In file included from drivers/clocksource/timer-of.c:25:0:
>> drivers/clocksource/timer-of.h:35:28: error: field 'clkevt' has incomplete 
>> type
 struct clock_event_device clkevt;
   ^~
   In file included from include/linux/clk.h:16:0,
from drivers/clocksource/timer-of.c:18:
   drivers/clocksource/timer-of.h: In function 'to_timer_of':
   include/linux/kernel.h:854:48: error: initialization from incompatible 
pointer type [-Werror=incompatible-pointer-types]
 const typeof( ((type *)0)->member ) *__mptr = (ptr); \
   ^
   drivers/clocksource/timer-of.h:44:9: note: in expansion of macro 
'container_of'
 return container_of(clkevt, struct timer_of, clkevt);
^~~~
   drivers/clocksource/timer-of.c: In function 'timer_irq_init':
   drivers/clocksource/timer-of.c:63:8: error: dereferencing pointer to 
incomplete type 'struct clock_event_device'
 clkevt->irq = of_irq->irq;
   ^~
   cc1: some warnings being treated as errors

vim +/clkevt +35 drivers/clocksource/timer-of.h

32  
33  struct timer_of {
34  unsigned int flags;
  > 35  struct clock_event_device clkevt;
36  struct of_timer_base of_base;
37  struct of_timer_irq  of_irq;
38  struct of_timer_clk  of_clk;
39  void *private_data;
40  };
41  

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip

Re: [PATCH] oom_reaper: close race without using oom_lock

2017-07-22 Thread Tetsuo Handa

Tetsuo Handa wrote:
> Log is at http://I-love.SAKURA.ne.jp/tmp/serial-20170722.txt.xz .

Oops, I forgot to remove mmput_async() in Patch2. Below is the updated result.
The situation (i.e. we can't tell without Patch1 whether we raced with
MMF_OOM_SKIP) is the same, though.

Patch1:

 include/linux/oom.h |  4 
 mm/internal.h   |  4 
 mm/oom_kill.c   | 28 +++-
 mm/page_alloc.c | 10 +++---
 4 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/include/linux/oom.h b/include/linux/oom.h
index 8a266e2..1b0bbb6 100644
--- a/include/linux/oom.h
+++ b/include/linux/oom.h
@@ -11,6 +11,7 @@
 struct notifier_block;
 struct mem_cgroup;
 struct task_struct;
+struct alloc_context;
 
 /*
  * Details of the page allocation that triggered the oom killer that are used 
to
@@ -39,6 +40,9 @@ struct oom_control {
unsigned long totalpages;
struct task_struct *chosen;
unsigned long chosen_points;
+
+   const struct alloc_context *alloc_context;
+   unsigned int alloc_flags;
 };
 
 extern struct mutex oom_lock;
diff --git a/mm/internal.h b/mm/internal.h
index 24d88f0..95a08b5 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -522,4 +522,8 @@ static inline bool is_migrate_highatomic_page(struct page 
*page)
return get_pageblock_migratetype(page) == MIGRATE_HIGHATOMIC;
 }
 
+struct page *get_page_from_freelist(gfp_t gfp_mask, unsigned int order,
+   int alloc_flags,
+   const struct alloc_context *ac);
+
 #endif /* __MM_INTERNAL_H */
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 9e8b4f0..fb7b2c8 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -288,6 +288,9 @@ static enum oom_constraint constrained_alloc(struct 
oom_control *oc)
return CONSTRAINT_NONE;
 }
 
+static unsigned int mmf_oom_skip_raced;
+static unsigned int mmf_oom_skip_not_raced;
+
 static int oom_evaluate_task(struct task_struct *task, void *arg)
 {
struct oom_control *oc = arg;
@@ -303,8 +306,21 @@ static int oom_evaluate_task(struct task_struct *task, 
void *arg)
 * any memory is quite low.
 */
if (!is_sysrq_oom(oc) && tsk_is_oom_victim(task)) {
-   if (test_bit(MMF_OOM_SKIP, &task->signal->oom_mm->flags))
+   if (test_bit(MMF_OOM_SKIP, &task->signal->oom_mm->flags)) {
+   const struct alloc_context *ac = oc->alloc_context;
+
+   if (ac) {
+   struct page *page = get_page_from_freelist
+   (oc->gfp_mask, oc->order,
+oc->alloc_flags, ac);
+   if (page) {
+   __free_pages(page, oc->order);
+   mmf_oom_skip_raced++;
+   } else
+   mmf_oom_skip_not_raced++;
+   }
goto next;
+   }
goto abort;
}
 
@@ -1059,6 +1075,16 @@ bool out_of_memory(struct oom_control *oc)
 */
schedule_timeout_killable(1);
}
+   {
+   static unsigned long last;
+   unsigned long now = jiffies;
+
+   if (!last || time_after(now, last + 5 * HZ)) {
+   last = now;
+   pr_info("MMF_OOM_SKIP: raced=%u not_raced=%u\n",
+   mmf_oom_skip_raced, mmf_oom_skip_not_raced);
+   }
+   }
return !!oc->chosen;
 }
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 80e4adb..4cf2861 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3054,7 +3054,7 @@ static bool zone_allows_reclaim(struct zone *local_zone, 
struct zone *zone)
  * get_page_from_freelist goes through the zonelist trying to allocate
  * a page.
  */
-static struct page *
+struct page *
 get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
const struct alloc_context *ac)
 {
@@ -3245,7 +3245,8 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, 
const char *fmt, ...)
 
 static inline struct page *
 __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
-   const struct alloc_context *ac, unsigned long *did_some_progress)
+ unsigned int alloc_flags, const struct alloc_context *ac,
+ unsigned long *did_some_progress)
 {
struct oom_control oc = {
.zonelist = ac->zonelist,
@@ -3253,6 +3254,8 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, 
const char *fmt, ...)
.memcg = NULL,
.gfp_mask = gfp_mask,
.order = order,
+   .alloc_context = ac,
+   .alloc_flags = alloc_flags,

Re: [PATCH 4/5] PCI: mediatek: Add new generation controller support

2017-07-22 Thread kbuild test robot

Hi Ryder,

[auto build test WARNING on pci/next]
[also build test WARNING on v4.13-rc1 next-20170721]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/honghui-zhang-mediatek-com/PCI-MediaTek-Add-support-for-new-generation-host-controller/20170723-040107
base:   https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next
config: arm-allmodconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget 
https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=arm 

All warnings (new ones prefixed by >>):

   drivers/pci/host/pcie-mediatek.c: In function 'mtk_pcie_startup_ports_v2':
>> drivers/pci/host/pcie-mediatek.c:86:36: warning: right shift count >= width 
>> of type [-Wshift-count-overflow]
#define AHB2PCIE_BASEH(base) (base >> 32)
   ^
>> drivers/pci/host/pcie-mediatek.c:440:9: note: in expansion of macro 
>> 'AHB2PCIE_BASEH'
  val = AHB2PCIE_BASEH(mem->start);
^~
   drivers/pci/host/pcie-mediatek.c: In function 'mtk_pcie_parse_ports':
>> drivers/pci/host/pcie-mediatek.c:789:2: warning: this 'if' clause does not 
>> guard... [-Wmisleading-indentation]
 if (pcie->soc->setup_irq)
 ^~
   drivers/pci/host/pcie-mediatek.c:791:3: note: ...this statement, but the 
latter is misleadingly indented as if it is guarded by the 'if'
  if (err)
  ^~

vim +86 drivers/pci/host/pcie-mediatek.c

74  
75  /* PCIe V2 per-port registers */
76  #define PCIE_INT_MASK           0x420
77  #define INTX_MASK               GENMASK(19, 16)
78  #define INTX_SHIFT              16
79  #define INTX_NUM                4
80  #define PCIE_INT_STATUS         0x424
81  #define AHB2PCIE_BASE0_L        0x438
82  #define AHB2PCIE_BASE0_H        0x43c
83  #define PCIE2AXI_WIN            0x448
84  #define WIN_ENABLE              BIT(7)
85  #define AHB2PCIE_BASEL(base)    (base & GENMASK(31, 0))
  > 86  #define AHB2PCIE_BASEH(base)    (base >> 32)
87  #define BASE_SIZE(sz)           (sz & GENMASK(4, 0))
88  #define PCIE2AXI_SIZE           0x
89  
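
The shift warning appears because on this 32-bit ARM config the type of mem->start is only 32 bits wide, so shifting it by 32 is undefined in C. One way to avoid it, shown here only as a sketch with hypothetical helper names (base_lo(), base_hi()) rather than the driver's macros, is to widen the value to 64 bits before splitting it. The kernel also provides lower_32_bits() and upper_32_bits() for this purpose.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not the driver's AHB2PCIE_BASEL/BASEH macros. */

/* Low 32 bits of a bus address. */
static uint32_t base_lo(uint64_t base)
{
	return (uint32_t)(base & 0xffffffffu);
}

/*
 * High 32 bits of a bus address. The parameter is 64 bits wide, so a
 * 32-bit argument is widened first and the shift count stays below the
 * width of the shifted type.
 */
static uint32_t base_hi(uint64_t base)
{
	return (uint32_t)(base >> 32);
}
```

A 32-bit address passed to base_hi() simply yields 0, which is what the register write for the upper half would want in that case.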

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip

Re: [PATCH] EDAC: remove unnecessary static in edac_fake_inject_write()

2017-07-22 Thread Gustavo A. R. Silva


Hi Julia, Borislav,

On 07/22/2017 11:22 AM, Gustavo A. R. Silva wrote:
> Hi all,
>
> On 07/22/2017 01:36 AM, Borislav Petkov wrote:
>> On Fri, Jul 21, 2017 at 10:08:12PM +0200, Julia Lawall wrote:
>>> Someone pointed out that the rule is probably not OK when the address of
>>> the static variable is taken, because then it is likely being used as
>>> permanent storage.
>>
>> Makes sense to me.
>>
>>> An improved rule is:
>>
>> Do you think it is worth having it in scripts/coccinelle/ ?
>>
>> I don't think Gustavo would mind putting it there :)
>
> Absolutely, I'd be glad to help out. :)


I've been working on this issue today and, in my opinion, this script is 
even better:


@bad exists@
position p;
identifier x;
expression e;
type T;
@@

static T x@p;
... when != x = e
x = <+...x...+>

@worse1 exists@
position p;
identifier x;
type T;
@@

static T x@p;
...
return &x;

@worse2 exists@
position p;
identifier x;
type T;
@@

static T *x@p;
...
return x;

@@
identifier x;
expression e;
type T;
position p != {bad.p,worse1.p,worse2.p};
@@

-static
  T x@p;
  ... when != x
  when strict
?x = e;

It ignores all the cases in which the address of the static variable is 
returned to the caller function.
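
As an illustration of the two situations the rules are meant to tell apart, consider this small C sketch. The functions next_counter() and scaled() are hypothetical and not taken from the kernel. In the first the address of the static variable escapes to the caller, so the static must stay; in the second the variable is always assigned before it is read, so the static only costs permanent storage.

```c
#include <assert.h>

/*
 * The address of the static variable is returned, so the storage has to
 * outlive the call. Dropping 'static' here would return a dangling
 * pointer; this is the worse1 case that must be left alone.
 */
static int *next_counter(void)
{
	static int counter;

	counter++;
	return &counter;
}

/*
 * The static variable is overwritten before every use and its address
 * never leaves the function, so 'static' is unnecessary. This is the
 * kind of declaration the last rule is intended to rewrite.
 */
static int scaled(int v)
{
	static int tmp;

	tmp = v * 2;
	return tmp + 1;
}
```

Removing static from tmp does not change what scaled() returns, while removing it from counter would break every caller of next_counter().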


Also, there are some cases in which the maintainer can argue something 
like the following:


https://lkml.org/lkml/2017/7/19/1381

but that depends on the particular conditions in which the code is 
intended to be executed.


What do you think?

Thank you
--
Gustavo A. R. Silva

signal not interrupting futex

2017-07-22 Thread Michael Day

We have hit an apparent kernel bug where a signal is not interrupting a 
futex, leading to a deadlock in our code. Here is the relevant strace 
output just before it blocks (complete strace log is attached):


14069 set_robust_list(0x7f7b3e7ee9e0, 24 <unfinished ...>
14061 futex(0x7f7b46721fd8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
14069 <... set_robust_list resumed> )   = 0
14069 futex(0x7f7b46721fd8, FUTEX_WAKE_PRIVATE, 1) = 1
14061 <... futex resumed> ) = 0
14061 futex(0x1585ea0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
14069 tgkill(14061, 14061, SIGPWR)  = 0
14069 futex(0x1586280, FUTEX_WAIT_PRIVATE, 0, NULL

Thread '69 sends SIGPWR to thread '61, but it is never delivered and we 
have not been able to figure out why.


Background information: this deadlock is experienced by our customer 
running Prince on CentOS 7. The bug happens every time on their system, 
but we have not been able to reproduce it on ours yet. They have tried 
two different kernel versions:


3.10.0-327.28.2.el7.x86_64
3.10.0-514.26.2.el7.x86_64

Over the past two years we have heard of similar deadlock issues from
other customers, always on CentOS and typically involving PHP, although
these are of course very popular systems.


This issue appears to be unrelated to the earlier futex bug affecting 
Haswell processors, but could there be another bug along these lines 
affecting futexes or signal delivery?


What can we do to help debug this issue?
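
For comparison, the baseline behaviour we expect can be checked with a small stand-alone test. This is only a sketch and not a reproduction of the customer's deadlock: the helper futex_wait_interrupted() is a hypothetical name, it uses SIGALRM from a timer instead of SIGPWR from another thread, and it assumes a Linux system where SYS_futex is available. With a handler installed without SA_RESTART and the signal unblocked, the blocked FUTEX_WAIT should fail with EINTR. A signal that is blocked in the target thread stays pending and does not interrupt the wait, so the SigBlk and SigPnd lines in /proc/<pid>/status of the stuck process may be worth checking.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal;

static void on_signal(int sig)
{
	(void)sig;
	got_signal = 1;
}

/*
 * Blocks in FUTEX_WAIT on a word that nobody wakes and arranges for
 * SIGALRM to arrive after delay_ms. Returns the errno of the futex call,
 * or 0 if the wait returned successfully.
 */
static int futex_wait_interrupted(int delay_ms)
{
	static uint32_t word;             /* stays 0, so the wait really blocks */
	struct timespec limit = { 2, 0 }; /* safety net: give up after 2 s */
	struct itimerval it;
	struct sigaction sa;
	sigset_t set;
	long rc;

	assert(delay_ms > 0 && delay_ms < 1000);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_signal;        /* no SA_RESTART on purpose */
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, NULL) != 0)
		return errno;

	/* A blocked signal would stay pending instead of interrupting. */
	sigemptyset(&set);
	sigaddset(&set, SIGALRM);
	if (sigprocmask(SIG_UNBLOCK, &set, NULL) != 0)
		return errno;

	memset(&it, 0, sizeof(it));
	it.it_value.tv_usec = delay_ms * 1000;
	if (setitimer(ITIMER_REAL, &it, NULL) != 0)
		return errno;

	rc = syscall(SYS_futex, &word, FUTEX_WAIT_PRIVATE, 0, &limit, NULL, 0);
	return rc == -1 ? errno : 0;
}
```

If the call returned ETIMEDOUT instead of EINTR, the signal was not delivered to the waiting thread within the two-second limit.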

Best regards,

Michael

--
Prince: Print with CSS!
http://www.princexml.com
14061 execve("/usr/lib/prince/bin/prince", ["/usr/lib/prince/bin/prince", "-i", 
"html", "--no-xinclude", "--license-file=/usr/lib/prince/l"..., 
"--no-compress", "--structured-log=normal", "tmp/tmphtmlinvoice.html"], [/* 43 
vars */]) = 0
14061 brk(0)= 0x1c5b000
14061 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
= 0x7f7b46881000
14061 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
14061 open("/opt/mercury-2016-02-18/lib/mercury/lib/tls/x86_64/libxml2.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
14061 stat("/opt/mercury-2016-02-18/lib/mercury/lib/tls/x86_64", 
0x7ffc5a6fab30) = -1 ENOENT (No such file or directory)
14061 open("/opt/mercury-2016-02-18/lib/mercury/lib/tls/libxml2.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
14061 stat("/opt/mercury-2016-02-18/lib/mercury/lib/tls", 0x7ffc5a6fab30) = -1 
ENOENT (No such file or directory)
14061 open("/opt/mercury-2016-02-18/lib/mercury/lib/x86_64/libxml2.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
14061 stat("/opt/mercury-2016-02-18/lib/mercury/lib/x86_64", 0x7ffc5a6fab30) = 
-1 ENOENT (No such file or directory)
14061 open("/opt/mercury-2016-02-18/lib/mercury/lib/libxml2.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
14061 stat("/opt/mercury-2016-02-18/lib/mercury/lib", 0x7ffc5a6fab30) = -1 
ENOENT (No such file or directory)
14061 
open("/opt/mercury-2016-02-18/lib/mercury/lib/hlc.par.gc/tls/x86_64/libxml2.so.2",
 O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
14061 stat("/opt/mercury-2016-02-18/lib/mercury/lib/hlc.par.gc/tls/x86_64", 
0x7ffc5a6fab30) = -1 ENOENT (No such file or directory)
14061 
open("/opt/mercury-2016-02-18/lib/mercury/lib/hlc.par.gc/tls/libxml2.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
14061 stat("/opt/mercury-2016-02-18/lib/mercury/lib/hlc.par.gc/tls", 
0x7ffc5a6fab30) = -1 ENOENT (No such file or directory)
14061 
open("/opt/mercury-2016-02-18/lib/mercury/lib/hlc.par.gc/x86_64/libxml2.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
14061 stat("/opt/mercury-2016-02-18/lib/mercury/lib/hlc.par.gc/x86_64", 
0x7ffc5a6fab30) = -1 ENOENT (No such file or directory)
14061 open("/opt/mercury-2016-02-18/lib/mercury/lib/hlc.par.gc/libxml2.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
14061 stat("/opt/mercury-2016-02-18/lib/mercury/lib/hlc.par.gc", 
0x7ffc5a6fab30) = -1 ENOENT (No such file or directory)
14061 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
14061 fstat(3, {st_mode=S_IFREG|0644, st_size=49060, ...}) = 0
14061 mmap(NULL, 49060, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7b46875000
14061 close(3)  = 0
14061 open("/lib64/libxml2.so.2", O_RDONLY|O_CLOEXEC) = 3
14061 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 
\351\2\0\0\0\0\0"..., 832) = 832
14061 fstat(3, {st_mode=S_IFREG|0755, st_size=1509376, ...}) = 0
14061 mmap(NULL, 3575896, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) 
= 0x7f7b462f8000
14061 mprotect(0x7f7b46457000, 2093056, PROT_NONE) = 0
14061 mmap(0x7f7b46656000, 40960, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x15e000) = 0x7f7b46656000
14061 mmap(0x7f7b4666, 4184, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f7b4666
14061 close(3)  = 0
14061 open("/lib64/libpthread.so.0",

 O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
14061 stat("/opt/mercury-2016-02-18/lib/mercury/lib/hlc.par.gc/tls/x86_64", 
0x7ffc5a6fab30) = -1 ENOENT (No such file or directory)
14061 
open("/opt/mercury-2016-02-18/lib/mercury/lib/hlc.par.gc/tls/libxml2.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
14061 stat("/opt/mercury-2016-02-18/lib/mercury/lib/hlc.par.gc/tls", 
0x7ffc5a6fab30) = -1 ENOENT (No such file or directory)
14061 
open("/opt/mercury-2016-02-18/lib/mercury/lib/hlc.par.gc/x86_64/libxml2.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
14061 stat("/opt/mercury-2016-02-18/lib/mercury/lib/hlc.par.gc/x86_64", 
0x7ffc5a6fab30) = -1 ENOENT (No such file or directory)
14061 open("/opt/mercury-2016-02-18/lib/mercury/lib/hlc.par.gc/libxml2.so.2", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
14061 stat("/opt/mercury-2016-02-18/lib/mercury/lib/hlc.par.gc", 
0x7ffc5a6fab30) = -1 ENOENT (No such file or directory)
14061 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
14061 fstat(3, {st_mode=S_IFREG|0644, st_size=49060, ...}) = 0
14061 mmap(NULL, 49060, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7b46875000
14061 close(3)  = 0
14061 open("/lib64/libxml2.so.2", O_RDONLY|O_CLOEXEC) = 3
14061 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 
\351\2\0\0\0\0\0"..., 832) = 832
14061 fstat(3, {st_mode=S_IFREG|0755, st_size=1509376, ...}) = 0
14061 mmap(NULL, 3575896, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) 
= 0x7f7b462f8000
14061 mprotect(0x7f7b46457000, 2093056, PROT_NONE) = 0
14061 mmap(0x7f7b46656000, 40960, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x15e000) = 0x7f7b46656000
14061 mmap(0x7f7b4666, 4184, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f7b4666
14061 close(3)  = 0
14061 open("/lib64/libpthread.so.0",

Re: [PATCH v12 5/8] virtio-balloon: VIRTIO_BALLOON_F_SG

2017-07-22 Thread Michael S. Tsirkin

On Fri, Jul 14, 2017 at 03:12:43PM +0800, Wei Wang wrote:
> On 07/14/2017 04:19 AM, Michael S. Tsirkin wrote:
> > On Thu, Jul 13, 2017 at 03:42:35PM +0800, Wei Wang wrote:
> > > On 07/12/2017 09:56 PM, Michael S. Tsirkin wrote:
> > > > So the way I see it, there are several issues:
> > > > 
> > > > - internal wait - forces multiple APIs like kick/kick_sync
> > > > note how kick_sync can fail but your code never checks return code
> > > > - need to re-write the last descriptor - might not work
> > > > for alternative layouts which always expose descriptors
> > > > immediately
> > > Probably it wasn't clear. Please let me explain the two functions here:
> > > 
> > > 1) virtqueue_add_chain_desc(vq, head_id, prev_id,..):
> > > grabs a desc from the vq and inserts it to the chain tail (which is 
> > > indexed
> > > by
> > > prev_id, probably better to call it tail_id). Then, the new added desc
> > > becomes
> > > the tail (i.e. the last desc). The _F_NEXT flag is cleared for each desc
> > > when it's
> > > added to the chain, and set when another desc comes to follow later.
> > And this only works if there are multiple rings like
> > avail + descriptor ring.
> > It won't work e.g. with the proposed new layout where
> > writing out a descriptor exposes it immediately.
> 
> I think it can support the 1.1 proposal, too. But before getting
> into that, I think we first need to deep dive into the implementation
> and usage of _first/next/last. The usage would need to lock the vq
> from the first to the end (otherwise, the returned info about the number
> of available desc in the vq, i.e. num_free, would be invalid):
> 
> lock(vq);
> add_first();
> add_next();
> add_last();
> unlock(vq);
> 
> However, I think the case isn't this simple, since we need to check more
> things
> after each add_xx() step. For example, if only one entry is available at the
> time
> we start to use the vq, that is, num_free is 0 after add_first(), we
> wouldn't be
> able to add_next and add_last. So, it would work like this:
> 
> start:
> ...get free page block..
> lock(vq)
> retry:
> ret = add_first(..,&num_free,);
> if(ret == -ENOSPC) {
> goto retry;
> } else if (!num_free) {
> add_chain_head();
> unlock(vq);
> kick & wait;
> goto start;
> }
> next_one:
> ...get free page block..
> add_next(..,&num_free,);
> if (!num_free) {
> add_chain_head();
> unlock(vq);
> kick & wait;
> goto start;
> } if (num_free == 1) {
> ...get free page block..
> add_last(..);
> unlock(vq);
> kick & wait;
> goto start;
> } else {
> goto next_one;
> }
> 
> The above seems unnecessary to me to have three different APIs.
> That's the reason to combine them into one virtqueue_add_chain_desc().
> 
> -- or, do you have a different thought about using the three APIs?
> 
> 
> Implementation Reference:
> 
> struct desc_iterator {
> unsigned int head;
> unsigned int tail;
> };
> 
> add_first(*vq, *desc_iterator, *num_free, ..)
> {
> if (vq->vq.num_free < 1)
> return -ENOSPC;
> get_desc(&desc_id);
> desc[desc_id].flag &= ~_F_NEXT;
> desc_iterator->head = desc_id;
> desc_iterator->tail = desc_iterator->head;
> *num_free = vq->vq.num_free;
> }
> 
> add_next(vq, desc_iterator, *num_free,..)
> {
> get_desc(&desc_id);
> desc[desc_id].flag &= ~_F_NEXT;
> desc[desc_iterator->tail].next = desc_id;
> desc[desc_iterator->tail].flag |= _F_NEXT;
> desc_iterator->tail = desc_id;
> *num_free = vq->vq.num_free;
> }
> 
> add_last(vq, desc_iterator,..)
> {
> get_desc(&desc_id);
> desc[desc_id].flag &= ~_F_NEXT;
> desc[desc_iterator->tail].next = desc_id;
> desc_iterator->tail = desc_id;
> 
> add_chain_head(); // put the desc_iterator->head to the ring
> }
> 
> 
> Best,
> Wei

OK I thought this over. While we might need these new APIs in
the future, I think that at the moment, there's a way to implement
this feature that is significantly simpler. Just add each s/g
as a separate input buffer.

This needs zero new APIs.

I know that follow-up patches need to add a header in front
so you might be thinking: how am I going to add this
header? The answer is quite simple - add it as a separate
out header.

Host will be able to distinguish between header and pages
by looking at the direction, and - should we want to add
IN data to header - additionally size (<4K => header).

We will be able to look at extended APIs separately down
the road.

-- 
MST
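
The host-side rule suggested above (tell header and pages apart by direction,
and by size if the header ever carries IN data) could look roughly like the
sketch below. It is only an illustration: struct buf_desc, enum buf_kind and
classify_buffer() are hypothetical names, not part of virtio or QEMU, and the
4096-byte threshold is just the "<4K => header" rule of thumb from this mail.

```c
#include <stdbool.h>
#include <stddef.h>

#define HEADER_SIZE_LIMIT 4096u   /* "<4K => header" */

enum buf_kind {
    BUF_HEADER_OUT,   /* driver-to-device header */
    BUF_HEADER_IN,    /* small device-writable part of the header */
    BUF_PAGES         /* a scatter/gather page block */
};

/* Hypothetical view of one buffer as the host pops it from the queue. */
struct buf_desc {
    bool   device_writable;   /* true for an "in" buffer */
    size_t len;               /* length in bytes */
};

/* Classify a buffer using only its direction and size. */
enum buf_kind classify_buffer(const struct buf_desc *b)
{
    if (!b->device_writable)
        return BUF_HEADER_OUT;
    return b->len < HEADER_SIZE_LIMIT ? BUF_HEADER_IN : BUF_PAGES;
}
```

With this rule each page block is an ordinary input buffer, so no chain
manipulation API is needed on the driver side.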

[PATCH] HID: rmi: Make sure the HID device is opened on resume

2017-07-22 Thread Lyude

So it looks like suspend/resume has actually always been broken on
hid-rmi. The fact it worked was a rather silly coincidence that was
relying on the HID device to already be opened upon resume. This means
that so long as anything was reading the /dev/input/eventX node for
an RMI device, it would suspend and resume correctly. However, if
nothing happened to be keeping the HID device awake it would shut off,
then the RMI driver would get confused on resume when it stopped
responding and explode.

So, call hid_hw_open() in rmi_post_resume() so we make sure that the
device is alive before we try talking to it.

This fixes RMI device suspend/resume over HID.

Signed-off-by: Lyude 
Cc: Andrew Duggan 
Cc: sta...@vger.kernel.org
---
 drivers/hid/hid-rmi.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/hid/hid-rmi.c b/drivers/hid/hid-rmi.c
index 5b40c2614599..e7d124f9a27f 100644
--- a/drivers/hid/hid-rmi.c
+++ b/drivers/hid/hid-rmi.c
@@ -431,22 +431,29 @@ static int rmi_post_resume(struct hid_device *hdev)
 {
struct rmi_data *data = hid_get_drvdata(hdev);
struct rmi_device *rmi_dev = data->xport.rmi_dev;
-   int ret;
+   int ret = 0;
 
if (!(data->device_flags & RMI_DEVICE))
return 0;
 
-   ret = rmi_reset_attn_mode(hdev);
+   /* Make sure the HID device is ready to receive events */
+   ret = hid_hw_open(hdev);
if (ret)
return ret;
 
+   ret = rmi_reset_attn_mode(hdev);
+   if (ret)
+   goto out;
+
ret = rmi_driver_resume(rmi_dev, false);
if (ret) {
hid_warn(hdev, "Failed to resume device: %d\n", ret);
-   return ret;
+   goto out;
}
 
-   return 0;
+out:
+   hid_hw_close(hdev);
+   return ret;
 }
 #endif /* CONFIG_PM */
 
-- 
2.13.3
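
The control flow of the patched rmi_post_resume() can be tried outside the
kernel with a small mock, sketched below. Everything prefixed fake_ is
invented for illustration and stands in for hid_hw_open(), hid_hw_close(),
rmi_reset_attn_mode() and rmi_driver_resume(); only the open, work, "goto out",
close shape is meant to match the patch.

```c
#include <errno.h>

/* Mock device: counts opens and can be told to fail at a given step. */
struct fake_hid_device {
    int open_count;
    int fail_open;
    int fail_reset;
    int fail_resume;
};

int fake_hw_open(struct fake_hid_device *d)
{
    if (d->fail_open)
        return -EIO;
    d->open_count++;
    return 0;
}

void fake_hw_close(struct fake_hid_device *d)
{
    d->open_count--;
}

int fake_reset_attn_mode(struct fake_hid_device *d)
{
    return d->fail_reset ? -EIO : 0;
}

int fake_driver_resume(struct fake_hid_device *d)
{
    return d->fail_resume ? -EIO : 0;
}

/* Same shape as the patched resume path. */
int fake_post_resume(struct fake_hid_device *d)
{
    int ret;

    /* Make sure the device is open before talking to it. */
    ret = fake_hw_open(d);
    if (ret)
        return ret;   /* nothing was opened, so nothing to close */

    ret = fake_reset_attn_mode(d);
    if (ret)
        goto out;

    ret = fake_driver_resume(d);

out:
    /* Drop the reference taken above on success and on failure alike. */
    fake_hw_close(d);
    return ret;
}
```

The point of the single exit label is that every path that got past the open
also performs the matching close, so the open count stays balanced.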

Re: [PATCH] KVM: nVMX: Fix exception injection

2017-07-22 Thread Jim Mattson

I think the ancillary data for #DB and #PF should be added to
kvm_queued_exception and plumbed through to where it's needed. Vector
number and error code are not sufficient to describe a #DB or #PF.

On Sat, Jul 22, 2017 at 5:29 PM, Wanpeng Li  wrote:
> 2017-07-22 22:25 GMT+08:00 Jim Mattson :
>> On Fri, Jul 21, 2017 at 1:39 AM, Wanpeng Li  wrote:
>>> Hi Jim,
>>> 2017-07-21 3:16 GMT+08:00 Jim Mattson :
>>>> On Wed, Jul 19, 2017 at 7:31 PM, Wanpeng Li  wrote:
>>>>> Hi Jim,
>>>>> 2017-07-19 2:47 GMT+08:00 Jim Mattson :
>>>>>> Why do we expect the VM_EXIT_INTR_INFO and EXIT_QUALIFICATION fields
>>>>>> of the VMCS to have the correct values for the injected exception?
>>>>>
>>>>> Good point, I think we should synthesize VM_EXIT_INTR_INFO and
>>>>> EXIT_QUALIFICATION manually, I will post a patch for it. Btw, how
>>>>> about setting EXIT_QUALIFICATION to vcpu->arch.cr2 for the page fault
>>>>> exception and 0 for other exceptions?
>>>>
>>>> From the SDM, section 27.1:
>>>>
>>>> If an event causes a VM exit directly, it does not update
>>>
>>> I mentioned this in the patch description:
>>>
>>>> However, there is no guarantee the exit reason is exception currently,
>>>> when there is an external interrupt occurred on host, maybe a time
>>>> interrupt for host which should not be injected to guest, and somewhere
>>>> queues an exception, then the function nested_vmx_check_exception() will
>>>> be called and the vmexit emulation codes will try to emulate the
>>>> "Acknowledge interrupt on exit" behavior, the warning is triggered.
>>>
>>> If you think the scenario is correct, then it should be an event
>>> causes a VM exit indirectly. So if both the scenario which I mentioned
>>> and "This function
>>> assumes it is called with the exit reason in vmcs02 being a #PF
>>> exception" can happen, then maybe we should figure out how to fix both
>>> scenarios suitable.
>>
>> In the situation you describe, the #PF causes a synthesized VM-exit
>> from L2 to L1 directly, not indirectly. From the SDM:
>>
>>An exception causes a VM exit directly if the bit corresponding to
>> that exception is set in the exception bitmap.
>>
>> Hence, CR2 should not be set yet.
>
> Any idea how to synthesize exit qualification for page fault and debug
> exception?
>
> Regards,
> Wanpeng Li
>
>>
>>>
>>>> architectural state as it would have if it had it not caused the VM
>>>> exit:
>>>>   - A debug exception does not update DR6, DR7.GD, or
>>>> IA32_DEBUGCTL.LBR. (Information about the nature of the debug
>>>> exception is saved in the exit qualification field.)
>>>>   - A page fault does not update CR2. (The linear address causing
>>>> the page fault is saved in the exit-qualification field.)
>>>>
>>>> This means that vcpu->arch.cr2 should not be set at this point for a
>>>> #PF injection (and vcpu->arch.dr6 should not be set at this point for
>>>> a #DB injection). For all other exceptions, yes, the exit
>>>> qualification should be cleared.
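
One way to picture the suggestion at the top of this mail (carry the ancillary
data with the queued exception, then derive the exit qualification from it) is
sketched below. This is not KVM code: struct queued_exception_sketch, its
has_payload/payload fields and exit_qualification_for() are assumptions made
for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

#define DB_VECTOR 1
#define PF_VECTOR 14

/* Hypothetical queued exception carrying more than vector + error code. */
struct queued_exception_sketch {
    uint8_t  nr;               /* exception vector */
    bool     has_error_code;
    uint32_t error_code;
    bool     has_payload;
    uint64_t payload;          /* #PF: faulting linear address,
                                  #DB: information about the debug exception */
};

/*
 * Value to report as exit qualification when the exception causes a
 * VM exit directly: the payload for #PF and #DB, zero for everything else.
 * CR2 and DR6 are deliberately not consulted, since they have not been
 * updated at this point.
 */
uint64_t exit_qualification_for(const struct queued_exception_sketch *ex)
{
    if ((ex->nr == PF_VECTOR || ex->nr == DB_VECTOR) && ex->has_payload)
        return ex->payload;
    return 0;
}
```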

[PATCH] KVM: nVMX: consult PFER_MASK and PFER_MATCH before nested vmexit if inject #PF

2017-07-22 Thread Wanpeng Li

From: Wanpeng Li 

When generating a #PF VM-exit, check the equality:
(PFEC & PFEC_MASK) == PFEC_MATCH
If it holds, bit 14 of the exception bitmap decides whether the #PF
causes a VM-exit. If it does not hold, the inverse of bit 14 is used.

Reported-by: Jim Mattson 
Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
---
 arch/x86/kvm/vmx.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 29fd8af..8a213f2 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2429,6 +2429,8 @@ static void skip_emulated_instruction(struct kvm_vcpu 
*vcpu)
vmx_set_interrupt_shadow(vcpu, 0);
 }
 
+static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12, u16 
error_code);
+
 /*
  * KVM wants to inject page-faults which it got to the guest. This function
  * checks whether in a nested guest, we need to inject them to L1 or L2.
@@ -2442,6 +2444,10 @@ static int nested_vmx_check_exception(struct kvm_vcpu 
*vcpu)
(nr == PF_VECTOR && vcpu->arch.exception.nested_apf)))
return 0;
 
+   if (nr == PF_VECTOR && !vcpu->arch.exception.nested_apf &&
+   !nested_vmx_is_page_fault_vmexit(vmcs12, 
vcpu->arch.exception.error_code))
+   return 0;
+
if (vcpu->arch.exception.nested_apf) {
vmcs_write32(VM_EXIT_INTR_ERROR_CODE, 
vcpu->arch.exception.error_code);
nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
-- 
2.7.4
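
The rule stated in the commit message can be written as a small standalone
function, sketched below. It is not the body of
nested_vmx_is_page_fault_vmexit(); page_fault_causes_vmexit() and its plain
integer parameters are stand-ins chosen for illustration, with the bitmap,
mask and match values passed in directly instead of being read from vmcs12.

```c
#include <stdbool.h>
#include <stdint.h>

#define PF_VECTOR 14

/*
 * Decide whether a page fault with the given error code (PFEC) causes a
 * VM exit: if (PFEC & mask) == match, bit 14 of the exception bitmap is
 * used as is; otherwise its inverse is used.
 */
bool page_fault_causes_vmexit(uint32_t exception_bitmap,
                              uint32_t pfec_mask, uint32_t pfec_match,
                              uint32_t error_code)
{
    bool bit14 = (exception_bitmap >> PF_VECTOR) & 1u;
    bool mismatch = (error_code & pfec_mask) != pfec_match;

    /* XOR flips bit 14 exactly when the masked error code does not match. */
    return bit14 ^ mismatch;
}
```

With mask and match both zero the comparison always succeeds, so bit 14 alone
decides, which is the behaviour of a hypervisor that ignores the two fields.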

Re: [PATCH 4/4] perf stat: Use group read for event groups

2017-07-22 Thread Namhyung Kim

On Fri, Jul 21, 2017 at 02:12:12PM +0200, Jiri Olsa wrote:
> Make perf stat use  group read if there  are groups
> defined. The group read will get the values for all
> member of groups within a single syscall instead of
> calling read syscall for every event.
> 
> We can see considerable less amount of kernel cycles
> spent on single group read, than reading each event
> separately, like for following perf stat command:
> 
>   # perf stat -e {cycles,instructions} -I 10 -a sleep 1
> 
> Monitored with "perf stat -r 5 -e '{cycles:u,cycles:k}'"
> 
> Before:
> 
> 24,325,676  cycles:u
>297,040,775  cycles:k
> 
>1.038554134 seconds time elapsed
> 
> After:
> 25,034,418  cycles:u
>158,256,395  cycles:k
> 
>1.036864497 seconds time elapsed
> 
> The perf_evsel__open fallback changes contributed by Andi Kleen.
> 
> Link: http://lkml.kernel.org/n/tip-b6g8qarwvptr81cqdtfst...@git.kernel.org
> Signed-off-by: Jiri Olsa 
> ---
>  tools/perf/builtin-stat.c | 30 +++---
>  tools/perf/util/counts.h  |  1 +
>  tools/perf/util/evsel.c   | 10 ++
>  3 files changed, 38 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 48ac53b199fc..866da7aa54bf 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -213,10 +213,20 @@ static void perf_stat__reset_stats(void)
>  static int create_perf_stat_counter(struct perf_evsel *evsel)
>  {
>   struct perf_event_attr *attr = &evsel->attr;
> + struct perf_evsel *leader = evsel->leader;
>  
> - if (stat_config.scale)
> + if (stat_config.scale) {
>   attr->read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
>   PERF_FORMAT_TOTAL_TIME_RUNNING;
> + }
> +
> + /*
> +  * The event is part of non trivial group, let's enable
> +  * the group read (for leader) and ID retrieval for all
> +  * members.
> +  */
> + if (leader->nr_members > 1)
> + attr->read_format |= PERF_FORMAT_ID|PERF_FORMAT_GROUP;

I just wonder whether the ID is really necessary.  Don't the values come
in the same order that we traverse with for_each_group_member()?

Thanks,
Namhyung


>  
>   attr->inherit = !no_inherit;
>  
> @@ -333,13 +343,21 @@ static int read_counter(struct perf_evsel *counter)
>   struct perf_counts_values *count;
>  
>   count = perf_counts(counter->counts, cpu, thread);
> - if (perf_evsel__read(counter, cpu, thread, count)) {
> +
> + /*
> +  * The leader's group read loads data into its group 
> members
> +  * (via perf_evsel__read_counter) and sets threir 
> count->loaded.
> +  */
> + if (!count->loaded &&
> + perf_evsel__read_counter(counter, cpu, thread)) {
>   counter->counts->scaled = -1;
>   perf_counts(counter->counts, cpu, thread)->ena 
> = 0;
>   perf_counts(counter->counts, cpu, thread)->run 
> = 0;
>   return -1;
>   }
>  
> + count->loaded = false;
> +
>   if (STAT_RECORD) {
>   if (perf_evsel__write_stat_event(counter, cpu, 
> thread, count)) {
>   pr_err("failed to write stat event\n");
> @@ -559,6 +577,11 @@ static int store_counter_ids(struct perf_evsel *counter)
>   return __store_counter_ids(counter, cpus, threads);
>  }
>  
> +static bool perf_evsel__should_store_id(struct perf_evsel *counter)
> +{
> + return STAT_RECORD || counter->attr.read_format & PERF_FORMAT_ID;
> +}
> +
>  static int __run_perf_stat(int argc, const char **argv)
>  {
>   int interval = stat_config.interval;
> @@ -631,7 +654,8 @@ static int __run_perf_stat(int argc, const char **argv)
>   if (l > unit_width)
>   unit_width = l;
>  
> - if (STAT_RECORD && store_counter_ids(counter))
> + if (perf_evsel__should_store_id(counter) &&
> + store_counter_ids(counter))
>   return -1;
>   }
>  
> diff --git a/tools/perf/util/counts.h b/tools/perf/util/counts.h
> index 34d8baaf558a..cb45a6aecf9d 100644
> --- a/tools/perf/util/counts.h
> +++ b/tools/perf/util/counts.h
> @@ -12,6 +12,7 @@ struct perf_counts_values {
>   };
>   u64 values[3];
>   };
> + boolloaded;
>  };
>  
>  struct perf_counts {
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 89aecf3a35c7..3735c9e0080d 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -49,6 +49,7 @@ static struct {
>   bool clockid_wrong;
>   bool lbr_flags;
>   bool write_backward;
> + bool group_read;
>  }
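
As background for the ID question above, the buffer returned by a group read
has the layout documented in perf_event_open(2): a count, the optional
enabled/running times, then one value (plus an ID if requested) per member.
The sketch below decodes such a buffer. decode_group_read(), struct
group_entry and the SKETCH_FORMAT_* constants are made up for this example;
the constants only mirror the values of the PERF_FORMAT_* bits so that the
code builds without kernel headers, and newer read_format bits are ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Same bit values as enum perf_event_read_format. */
enum {
    SKETCH_FORMAT_TOTAL_TIME_ENABLED = 1u << 0,
    SKETCH_FORMAT_TOTAL_TIME_RUNNING = 1u << 1,
    SKETCH_FORMAT_ID                 = 1u << 2,
    SKETCH_FORMAT_GROUP              = 1u << 3,
};

struct group_entry {
    uint64_t value;
    uint64_t id;      /* 0 when the ID bit was not requested */
};

/*
 * Decode a group read buffer of len u64 words.
 * Returns the number of entries written to out, or -1 if the buffer
 * is too short, out is too small, or the group bit is not set.
 */
int decode_group_read(const uint64_t *buf, size_t len, uint64_t read_format,
                      struct group_entry *out, size_t max_out,
                      uint64_t *enabled, uint64_t *running)
{
    int has_id = (read_format & SKETCH_FORMAT_ID) != 0;
    size_t per_entry = has_id ? 2 : 1;
    size_t pos = 0;
    uint64_t nr;

    if (!(read_format & SKETCH_FORMAT_GROUP) || len < 1)
        return -1;

    nr = buf[pos++];                 /* number of events in the group */
    *enabled = 0;
    *running = 0;

    if (read_format & SKETCH_FORMAT_TOTAL_TIME_ENABLED) {
        if (pos >= len)
            return -1;
        *enabled = buf[pos++];
    }
    if (read_format & SKETCH_FORMAT_TOTAL_TIME_RUNNING) {
        if (pos >= len)
            return -1;
        *running = buf[pos++];
    }

    if (nr > max_out || nr > (len - pos) / per_entry)
        return -1;

    for (uint64_t i = 0; i < nr; i++) {
        out[i].value = buf[pos++];
        out[i].id = has_id ? buf[pos++] : 0;
    }
    return (int)nr;
}
```

The entries come back in a fixed order within one read, so a caller could
match them to group members by position; carrying the ID makes the match
explicit instead of positional.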

Re: [PATCH 4/4] perf stat: Use group read for event groups

2017-07-22 Thread Namhyung Kim

On Fri, Jul 21, 2017 at 02:12:12PM +0200, Jiri Olsa wrote:
> Make perf stat use  group read if there  are groups
> defined. The group read will get the values for all
> member of groups within a single syscall instead of
> calling read syscall for every event.
> 
> We can see considerable less amount of kernel cycles
> spent on single group read, than reading each event
> separately, like for following perf stat command:
> 
>   # perf stat -e {cycles,instructions} -I 10 -a sleep 1
> 
> Monitored with "perf stat -r 5 -e '{cycles:u,cycles:k}'"
> 
> Before:
> 
> 24,325,676  cycles:u
>297,040,775  cycles:k
> 
>1.038554134 seconds time elapsed
> 
> After:
> 25,034,418  cycles:u
>158,256,395  cycles:k
> 
>1.036864497 seconds time elapsed
> 
> The perf_evsel__open fallback changes contributed by Andi Kleen.
> 
> Link: http://lkml.kernel.org/n/tip-b6g8qarwvptr81cqdtfst...@git.kernel.org
> Signed-off-by: Jiri Olsa 
> ---
>  tools/perf/builtin-stat.c | 30 +++---
>  tools/perf/util/counts.h  |  1 +
>  tools/perf/util/evsel.c   | 10 ++
>  3 files changed, 38 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 48ac53b199fc..866da7aa54bf 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -213,10 +213,20 @@ static void perf_stat__reset_stats(void)
>  static int create_perf_stat_counter(struct perf_evsel *evsel)
>  {
>   struct perf_event_attr *attr = >attr;
> + struct perf_evsel *leader = evsel->leader;
>  
> - if (stat_config.scale)
> + if (stat_config.scale) {
>   attr->read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
>   PERF_FORMAT_TOTAL_TIME_RUNNING;
> + }
> +
> + /*
> +  * The event is part of non trivial group, let's enable
> +  * the group read (for leader) and ID retrieval for all
> +  * members.
> +  */
> + if (leader->nr_members > 1)
> + attr->read_format |= PERF_FORMAT_ID|PERF_FORMAT_GROUP;

I just wonder ID is really necessary.  Doesn't it have same order we
can traverse with the for_each_group_member()?

Thanks,
Namhyung


>  
>   attr->inherit = !no_inherit;
>  
> @@ -333,13 +343,21 @@ static int read_counter(struct perf_evsel *counter)
>   struct perf_counts_values *count;
>  
>   count = perf_counts(counter->counts, cpu, thread);
> - if (perf_evsel__read(counter, cpu, thread, count)) {
> +
> + /*
> +  * The leader's group read loads data into its group members
> +  * (via perf_evsel__read_counter) and sets their count->loaded.
> +  */
> + if (!count->loaded &&
> + perf_evsel__read_counter(counter, cpu, thread)) {
>   counter->counts->scaled = -1;
>   perf_counts(counter->counts, cpu, thread)->ena = 0;
>   perf_counts(counter->counts, cpu, thread)->run = 0;
>   return -1;
>   }
>  
> + count->loaded = false;
> +
>   if (STAT_RECORD) {
>   if (perf_evsel__write_stat_event(counter, cpu, thread, count)) {
>   pr_err("failed to write stat event\n");
> @@ -559,6 +577,11 @@ static int store_counter_ids(struct perf_evsel *counter)
>   return __store_counter_ids(counter, cpus, threads);
>  }
>  
> +static bool perf_evsel__should_store_id(struct perf_evsel *counter)
> +{
> + return STAT_RECORD || counter->attr.read_format & PERF_FORMAT_ID;
> +}
> +
>  static int __run_perf_stat(int argc, const char **argv)
>  {
>   int interval = stat_config.interval;
> @@ -631,7 +654,8 @@ static int __run_perf_stat(int argc, const char **argv)
>   if (l > unit_width)
>   unit_width = l;
>  
> - if (STAT_RECORD && store_counter_ids(counter))
> + if (perf_evsel__should_store_id(counter) &&
> + store_counter_ids(counter))
>   return -1;
>   }
>  
> diff --git a/tools/perf/util/counts.h b/tools/perf/util/counts.h
> index 34d8baaf558a..cb45a6aecf9d 100644
> --- a/tools/perf/util/counts.h
> +++ b/tools/perf/util/counts.h
> @@ -12,6 +12,7 @@ struct perf_counts_values {
>   };
>   u64 values[3];
>   };
> + bool loaded;
>  };
>  
>  struct perf_counts {
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 89aecf3a35c7..3735c9e0080d 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -49,6 +49,7 @@ static struct {
>   bool clockid_wrong;
>   bool lbr_flags;
>   bool write_backward;
> + bool group_read;
>  } perf_missing_features;
>
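
On the PERF_FORMAT_ID question above: with PERF_FORMAT_GROUP set, one read() on the group leader returns a single buffer covering every member, laid out as described in perf_event_open(2). The following is only a rough sketch, not the perf tool code. The helper name group_read_lookup and the IDs used with it are made up for illustration, and it assumes both time fields are present (as when stat_config.scale is set in the patch). It matches an entry by its ID instead of relying on position.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Buffer layout for a group read with
 * PERF_FORMAT_GROUP | PERF_FORMAT_ID |
 * PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING:
 *
 *   buf[0]        nr            number of events in the group
 *   buf[1]        time_enabled
 *   buf[2]        time_running
 *   buf[3 + 2*i]  value of event i
 *   buf[4 + 2*i]  id of event i
 */
#define GROUP_READ_HDR_WORDS 3

/*
 * Hypothetical helper: find the count that belongs to event `id`.
 * Returns 0 and stores the count in *value, or -1 if the buffer is
 * too short for the advertised number of entries or the id is absent.
 */
static int group_read_lookup(const uint64_t *buf, size_t nwords,
			     uint64_t id, uint64_t *value)
{
	uint64_t nr;
	size_t i;

	assert(buf != NULL && value != NULL);

	if (nwords < GROUP_READ_HDR_WORDS)
		return -1;

	nr = buf[0];
	if (nr > (nwords - GROUP_READ_HDR_WORDS) / 2)
		return -1;

	for (i = 0; i < nr; i++) {
		const uint64_t *entry = &buf[GROUP_READ_HDR_WORDS + 2 * i];

		if (entry[1] == id) {
			*value = entry[0];
			return 0;
		}
	}
	return -1;
}
```

Matching on the ID costs one extra u64 per event, but the reader then does not depend on the kernel's ordering of group members matching the tool's own list order.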

Re: [PATCH] oom_reaper: close race without using oom_lock

2017-07-22 Thread Tetsuo Handa

ertions(+), 2 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 3ef14f0..9cc6634 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -306,7 +306,7 @@ static int oom_evaluate_task(struct task_struct *task, void *arg)
 * any memory is quite low.
 */
if (!is_sysrq_oom(oc) && tsk_is_oom_victim(task)) {
-   if (test_bit(MMF_OOM_SKIP, &task->signal->oom_mm->flags)) {
+   if (task->signal->oom_mm->async_put_work.func) {
const struct alloc_context *ac = oc->alloc_context;
 
if (ac) {
@@ -321,6 +321,8 @@ static int oom_evaluate_task(struct task_struct *task, void *arg)
}
goto next;
}
+   if (test_bit(MMF_OOM_SKIP, &task->signal->oom_mm->flags))
+   task->signal->oom_mm->async_put_work.func = (void *) 1;
goto abort;
}
 
@@ -652,8 +654,10 @@ static void mark_oom_victim(struct task_struct *tsk)
return;
 
/* oom_mm is bound to the signal struct life time. */
-   if (!cmpxchg(&tsk->signal->oom_mm, NULL, mm))
+   if (!cmpxchg(&tsk->signal->oom_mm, NULL, mm)) {
mmgrab(tsk->signal->oom_mm);
+   tsk->signal->oom_mm->async_put_work.func = NULL;
+   }
 
/*
 * Make sure that the task is woken up from uninterruptible sleep


Patch4:

 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4cf2861..3e0e7da 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3265,7 +3265,7 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
 * Acquire the oom lock.  If that fails, somebody else is
 * making progress for us.
 */
-   if (!mutex_trylock(&oom_lock)) {
+   if (mutex_lock_killable(&oom_lock)) {
*did_some_progress = 1;
schedule_timeout_uninterruptible(1);
return NULL;


Memory stressor is shown below.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <poll.h>

int main(int argc, char *argv[])
{
	static char buffer[4096] = { };
	char *buf = NULL;
	unsigned long size;
	unsigned long i;

	/* Fork 1024 children that make themselves preferred OOM victims. */
	for (i = 0; i < 1024; i++) {
		if (fork() == 0) {
			int fd = open("/proc/self/oom_score_adj", O_WRONLY);
			write(fd, "1000", 4);
			close(fd);
			sleep(1);
			if (!i)
				pause();
			snprintf(buffer, sizeof(buffer), "/tmp/file.%u",
				 getpid());
			fd = open(buffer, O_WRONLY | O_CREAT | O_APPEND, 0600);
			/* Keep appending and syncing until a write fails. */
			while (write(fd, buffer, sizeof(buffer)) ==
			       sizeof(buffer)) {
				poll(NULL, 0, 10);
				fsync(fd);
			}
			_exit(0);
		}
	}
	/* Grow the buffer by doubling until realloc() fails. */
	for (size = 1048576; size < 512UL * (1 << 30); size <<= 1) {
		char *cp = realloc(buf, size);
		if (!cp) {
			size >>= 1;
			break;
		}
		buf = cp;
	}
	sleep(2);
	/* Will cause OOM due to overcommit */
	for (i = 0; i < size; i += 4096)
		buf[i] = 0;
	pause();
	return 0;
}


Log is at http://I-love.SAKURA.ne.jp/tmp/serial-20170722.txt.xz .

# grep MMF_OOM_SKIP serial-20170722.txt | sed -e 's/=/ /g' | awk ' { if ($5 + $7) printf("%10u %10u %10f\n", $5, $7, ($5*100/($5+$7))); else printf("-\n"); }'

- # Patch1
         0         10       0.00
         0         25       0.00
        16        178   8.247423
        16        591   2.635914
        51       1476   3.339882
        51       1517   3.252551
        51       1559   3.167702
        51       1602   3.085299
        51       1646   3.005303
        51       1832   2.708444
        51       1931   2.573158
        51       2141   2.326642
       172       2950   5.509289
       172       4890   3.397866
       471       7916   5.615834
       471       8255   5.397662
       471       8717   5.126252
       471       8954   4.997347
       471       9435   4.754694
       471      10060   4.472510
       471      10840   4.164088
       471      10973   4.115694
       471      12475   3.638189
       471      14318   3.184800
       471      14762   3.091971
       471      16122   2.838546
   471  16

Re: [PATCH v2 07/10] ARM: dts: sun8i: a83t: Add MMC controller device nodes

2017-07-22 Thread kbuild test robot

Hi Chen-Yu,

[auto build test ERROR on next-20170719]
[cannot apply to clk/clk-next robh/for-next linus/master v4.13-rc1 v4.12 v4.12-rc7 v4.13-rc1]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Chen-Yu-Tsai/ARM-sun8i-a83t-Add-support-for-MMC-controllers/20170723-054406
config: arm-at91_dt_defconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=arm

All errors (new ones prefixed by >>):

>> Error: arch/arm/boot/dts/sun8i-a83t.dtsi:187.19-20 syntax error
>> FATAL ERROR: Unable to parse input tree

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip

Re: [PATCH] KVM: nVMX: Fix exception injection

2017-07-22 Thread Wanpeng Li

2017-07-22 22:25 GMT+08:00 Jim Mattson :
> On Fri, Jul 21, 2017 at 1:39 AM, Wanpeng Li  wrote:
>> Hi Jim,
>> 2017-07-21 3:16 GMT+08:00 Jim Mattson :
>>> On Wed, Jul 19, 2017 at 7:31 PM, Wanpeng Li  wrote:
>>>> Hi Jim,
>>>> 2017-07-19 2:47 GMT+08:00 Jim Mattson :
>>>>> Why do we expect the VM_EXIT_INTR_INFO and EXIT_QUALIFICATION fields
>>>>> of the VMCS to have the correct values for the injected exception?
>>>>
>>>> Good point, I think we should synthesize VM_EXIT_INTR_INFO and
>>>> EXIT_QUALIFICATION manually, I will post a patch for it. Btw, how
>>>> about setting EXIT_QUALIFICATION to vcpu->arch.cr2 for the page fault
>>>> exception and 0 for other exceptions?
>>>
>>> From the SDM, section 27.1:
>>>
>>> If an event causes a VM exit directly, it does not update
>>
>> I mentioned this in the patch description:
>>
>>> However, there is no guarantee the exit reason is exception currently, when 
>>> there is an external interrupt occurred on host, maybe a time interrupt for 
>>> host which should not be injected to guest, and somewhere queues an 
>>> exception, then the function nested_vmx_check_exception() will be called 
>>> and the vmexit emulation codes will try to emulate the "Acknowledge 
>>> interrupt on exit" behavior, the warning is triggered.
>>
>> If you think the scenario is correct, then it should be an event
>> causes a VM exit indirectly. So if both the scenario which I mentioned
>> and "This function
>> assumes it is called with the exit reason in vmcs02 being a #PF
>> exception" can happen, then maybe we should figure out how to fix both
>> scenarios suitable.
>
> In the situation you describe, the #PF causes a synthesized VM-exit
> from L2 to L1 directly, not indirectly. From the SDM:
>
>    An exception causes a VM exit directly if the bit corresponding to
>    that exception is set in the exception bitmap.
>
> Hence, CR2 should not be set yet.

Any idea how to synthesize the exit qualification for page-fault and
debug exceptions?

Regards,
Wanpeng Li

>
>>
>>> architectural state as it would have if it had it not caused the VM
>>> exit:
>>>   - A debug exception does not update DR6, DR7.GD, or
>>> IA32_DEBUGCTL.LBR. (Information about the nature of the debug
>>> exception is saved in the exit qualification field.)
>>>   - A page fault does not update CR2. (The linear address causing
>>> the page fault is saved in the exit-qualification field.)
>>>
>>> This means that vcpu->arch.cr2 should not be set at this point for a
>>> #PF injection (and vcpu->arch.dr6 should not be set at this point for
>>> a #DB injection). For all other exceptions, yes, the exit
>>> qualification should be cleared.
>>>

undefined reference to `_GLOBAL_OFFSET_TABLE_'

2017-07-22 Thread kbuild test robot

Hi Nicholas,

FYI, the error/warning still remains.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   4b162c530d9c101381500e586fedb1340595a6ff
commit: 799c43415442414b1032580c47684cb709dfed6d kbuild: thin archives make default for all archs
date:   3 weeks ago
config: microblaze-allnoconfig (attached as .config)
compiler: microblaze-linux-gcc (GCC) 6.2.0
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout 799c43415442414b1032580c47684cb709dfed6d
        # save the attached .config to linux build tree
        make.cross ARCH=microblaze

All errors (new ones prefixed by >>):

   mm/slub.o: In function `__slab_free.isra.13':
>> (.text+0x1038): undefined reference to `_GLOBAL_OFFSET_TABLE_'
   scripts/link-vmlinux.sh: line 93: 56533 Segmentation fault  ${LD} ${LDFLAGS} ${LDFLAGS_vmlinux} -o ${2} -T ${lds} ${objects}

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip

RE: [PATCH V3 net-next 2/8] net: hns3: Add support of the HNAE3 framework

2017-07-22 Thread Salil Mehta

Hi Stephen,

> -Original Message-
> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Monday, June 19, 2017 5:59 PM
> To: Salil Mehta
> Cc: da...@davemloft.net; Zhuangyuzeng (Yisen); huangdaode; lipeng (Y);
> mehta.salil@gmail.com; net...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH V3 net-next 2/8] net: hns3: Add support of the
> HNAE3 framework
> 
> On Sat, 17 Jun 2017 18:24:25 +0100
> Salil Mehta  wrote:
> 
> > +
> > +/* This struct defines the operation on the handle.
> > + *
> > + * init_ae_dev(): (mandatory)
> > + *   Get PF configure from pci_dev and initialize PF hardware
> > + * uninit_ae_dev()
> > + *   Disable PF device and release PF resource
> > + * register_client
> > + *   Register client to ae_dev
> > + * unregister_client()
> > + *   Unregister client from ae_dev
> > + * start()
> > + *   Enable the hardware
> > + * stop()
> > + *   Disable the hardware
> > + * get_status()
> > + *   Get the carrier state of the back channel of the handle, 1 for ok, 0 for
> > + *   non-ok
> > + * get_ksettings_an_result()
> > + *   Get negotiation status,speed and duplex
> > + * update_speed_duplex_h()
> > + *   Update hardware speed and duplex
> > + * get_media_type()
> > + *   Get media type of MAC
> > + * adjust_link()
> > + *   Adjust link status
> > + * set_loopback()
> > + *   Set loopback
> > + * set_promisc_mode
> > + *   Set promisc mode
> > + * set_mtu()
> > + *   set mtu
> > + * get_pauseparam()
> > + *   get tx and rx of pause frame use
> > + * set_pauseparam()
> > + *   set tx and rx of pause frame use
> > + * set_autoneg()
> > + *   set auto autonegotiation of pause frame use
> > + * get_autoneg()
> > + *   get auto autonegotiation of pause frame use
> > + * get_coalesce_usecs()
> > + *   get usecs to delay a TX interrupt after a packet is sent
> > + * get_rx_max_coalesced_frames()
> > + *   get Maximum number of packets to be sent before a TX interrupt.
> > + * set_coalesce_usecs()
> > + *   set usecs to delay a TX interrupt after a packet is sent
> > + * set_coalesce_frames()
> > + *   set Maximum number of packets to be sent before a TX interrupt.
> > + * get_mac_addr()
> > + *   get mac address
> > + * set_mac_addr()
> > + *   set mac address
> > + * add_uc_addr
> > + *   Add unicast addr to mac table
> > + * rm_uc_addr
> > + *   Remove unicast addr from mac table
> > + * set_mc_addr()
> > + *   Set multicast address
> > + * add_mc_addr
> > + *   Add multicast address to mac table
> > + * rm_mc_addr
> > + *   Remove multicast address from mac table
> > + * update_stats()
> > + *   Update Old network device statistics
> > + * get_ethtool_stats()
> > + *   Get ethtool network device statistics
> > + * get_strings()
> > + *   Get a set of strings that describe the requested objects
> > + * get_sset_count()
> > + *   Get number of strings that @get_strings will write
> > + * update_led_status()
> > + *   Update the led status
> > + * set_led_id()
> > + *   Set led id
> > + * get_regs()
> > + *   Get regs dump
> > + * get_regs_len()
> > + *   Get the len of the regs dump
> > + * get_rss_key_size()
> > + *   Get rss key size
> > + * get_rss_indir_size()
> > + *   Get rss indirection table size
> > + * get_rss()
> > + *   Get rss table
> > + * set_rss()
> > + *   Set rss table
> > + * get_tc_size()
> > + *   Get tc size of handle
> > + * get_vector()
> > + *   Get vector number and vector information
> > + * map_ring_to_vector()
> > + *   Map rings to vector
> > + * unmap_ring_from_vector()
> > + *   Unmap rings from vector
> > + * add_tunnel_udp()
> > + *   Add tunnel information to hardware
> > + * del_tunnel_udp()
> > + *   Delete tunnel information from hardware
> > + * reset_queue()
> > + *   Reset queue
> > + * get_fw_version()
> > + *   Get firmware version
> > + * get_mdix_mode()
> > + *   Get media type of phy
> > + * set_vlan_filter()
> > + *   Set vlan filter config of Ports
> > + * set_vf_vlan_filter()
> > + *   Set vlan filter config of vf
> > + */
> > +struct hnae3_ae_ops {
> > +   int (*init_ae_dev)(struct hnae3_ae_dev *ae_dev);
> > +   void (*uninit_ae_dev)(struct hnae3_ae_dev *ae_dev);
> > +
> > +   int (*register_client)(struct hnae3_client *client,
> > +  struct hnae3_ae_dev *ae_dev);
> > +   void (*unregister_client)(struct hnae3_client *client,
> > + struct hnae3_ae_dev *ae_dev);
> > +   int (*start)(struct hnae3_handle *handle);
> > +   void (*stop)(struct hnae3_handle *handle);
> > +   int (*get_status)(struct hnae3_handle *handle);
> > +   void (*get_ksettings_an_result)(struct hnae3_handle *handle,
> > +   u8 *auto_neg, u32 *speed, u8 *duplex);
> > +
> > +   int (*update_speed_duplex_h)(struct hnae3_handle *handle);
> > +   int (*cfg_mac_speed_dup_h)(struct hnae3_handle *handle, int speed,
> > +  u8 duplex);
> > +
> > +   void (*get_media_type)(struct hnae3_handle *handle, u8 *media_type);

RE: [PATCH V2 net-next 1/8] net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC

2017-07-22 Thread Salil Mehta

Hi Stephen,

> -Original Message-
> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Monday, June 19, 2017 4:48 PM
> To: Salil Mehta
> Cc: da...@davemloft.net; Zhuangyuzeng (Yisen); huangdaode; lipeng (Y);
> mehta.salil@gmail.com; net...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH V2 net-next 1/8] net: hns3: Add support of HNS3
> Ethernet Driver for hip08 SoC
> 
> On Wed, 14 Jun 2017 00:10:28 +0100
> Salil Mehta  wrote:
> 
> > +hns3_nic_get_stats64(struct net_device *ndev, struct rtnl_link_stats64 *stats)
> > +{
> > +   struct hns3_nic_priv *priv = netdev_priv(ndev);
> > +   int queue_num = priv->ae_handle->kinfo.num_tqps;
> > +   u64 tx_bytes = 0;
> > +   u64 rx_bytes = 0;
> > +   u64 tx_pkts = 0;
> > +   u64 rx_pkts = 0;
> > +   int idx = 0;
> unnecessary initialization
> 
> > +
> > +   for (idx = 0; idx < queue_num; idx++) {
> > +   tx_bytes += priv->ring_data[idx].ring->stats.tx_bytes;
> > +   tx_pkts += priv->ring_data[idx].ring->stats.tx_pkts;
> > +   rx_bytes +=
> > +   priv->ring_data[idx + queue_num].ring->stats.rx_bytes;
> > +   rx_pkts += priv->ring_data[idx + queue_num].ring->stats.rx_pkts;
> > +   }
> > +
> 
> Since rx_bytes and the other statistics are 64-bit values, you need to
> use something to ensure that updates to these values are atomic on
> 32-bit platforms.  The most common way to handle this is with the
> u64_stats_sync mechanism, which is a nop on 64-bit architectures and
> uses a seqcount to do updates on 32-bit CPUs.
Sure, good point. This has been changed in the V4 patch.

Thanks for the guidance.
Salil

> 
>
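
To illustrate the seqcount idea Stephen describes, here is a minimal userspace sketch using C11 atomics. It is not the kernel's u64_stats_sync API. The names ring_stats, ring_stats_add and ring_stats_read are hypothetical, a single writer is assumed, and the 64-bit counter is deliberately stored as two 32-bit halves to mimic what a 32-bit CPU has to do.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-ring byte counter, split as on a 32-bit CPU. */
struct ring_stats {
	atomic_uint seq;	/* even: stable, odd: update in progress */
	_Atomic uint32_t lo;	/* low half of the 64-bit counter */
	_Atomic uint32_t hi;	/* high half of the 64-bit counter */
};

/* Writer side (single writer assumed): add n bytes to the counter. */
static void ring_stats_add(struct ring_stats *s, uint64_t n)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);
	uint64_t v;

	assert((seq & 1) == 0);	/* no other update may be in flight */

	/* Mark the update as started before touching the halves. */
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	v = ((uint64_t)atomic_load_explicit(&s->hi, memory_order_relaxed) << 32) |
	    atomic_load_explicit(&s->lo, memory_order_relaxed);
	v += n;
	atomic_store_explicit(&s->lo, (uint32_t)v, memory_order_relaxed);
	atomic_store_explicit(&s->hi, (uint32_t)(v >> 32), memory_order_relaxed);

	/* Publish: the sequence number is even again. */
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: retry until both halves come from the same update. */
static uint64_t ring_stats_read(struct ring_stats *s)
{
	unsigned int begin, end;
	uint32_t lo, hi;

	do {
		begin = atomic_load_explicit(&s->seq, memory_order_acquire);
		lo = atomic_load_explicit(&s->lo, memory_order_relaxed);
		hi = atomic_load_explicit(&s->hi, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&s->seq, memory_order_relaxed);
	} while ((begin & 1) || begin != end);

	return ((uint64_t)hi << 32) | lo;
}
```

The reader never blocks the writer; it only retries when it overlapped an update, which is why this pattern suits hot-path statistics.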

RE: [PATCH V3 net-next 6/8] net: hns3: Add MDIO support to HNS3 Ethernet driver for hip08 SoC

2017-07-22 Thread Salil Mehta

Hi Andrew,

> -----Original Message-----
> From: Andrew Lunn [mailto:and...@lunn.ch]
> Sent: Monday, June 19, 2017 4:53 AM
> To: Salil Mehta
> Cc: da...@davemloft.net; Zhuangyuzeng (Yisen); huangdaode; lipeng (Y);
> mehta.salil@gmail.com; net...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH V3 net-next 6/8] net: hns3: Add MDIO support to
> HNS3 Ethernet driver for hip08 SoC
> 
> On Sat, Jun 17, 2017 at 06:24:29PM +0100, Salil Mehta wrote:
> > This patch adds the support of MDIO bus interface for HNS3 driver.
> > Code provides various interfaces to start and stop the PHY layer
> > and to read and write the MDIO bus or PHY.
> >
> > Signed-off-by: Daode Huang 
> > Signed-off-by: lipeng 
> > Signed-off-by: Salil Mehta 
> > Signed-off-by: Yisen Zhuang 
> > ---
> > Patch V3: Addressed Below comments:
> >  1. Florian Fainelli: https://lkml.org/lkml/2017/6/13/963
> >  2. Andrew Lunn: https://lkml.org/lkml/2017/6/13/1039
> 
> It is normal to say what you actually changed.
> 
> > Patch V2: Addressed below comments:
> >  1. Florian Fainelli: https://lkml.org/lkml/2017/6/10/130
> >  2. Andrew Lunn: https://lkml.org/lkml/2017/6/10/168
> > Patch V1: Initial Submit
> > ---
> >  .../ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c| 249
> +
> >  1 file changed, 249 insertions(+)
> >  create mode 100644
> drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
> >
> > diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
> b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
> > new file mode 100644
> > index 000..5b21c50
> > --- /dev/null
> > +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
> > @@ -0,0 +1,249 @@
> > +/*
> > + * Copyright (c) 2016~2017 Hisilicon Limited.
> > + *
> > + * This program is free software; you can redistribute it and/or
> modify
> > + * it under the terms of the GNU General Public License as published
> by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + */
> > +
> > +#include 
> > +#include 
> > +
> > +#include "hclge_cmd.h"
> > +#include "hclge_main.h"
> > +
> > +enum hclge_mdio_c22_op_seq {
> > +   HCLGE_MDIO_C22_WRITE = 1,
> > +   HCLGE_MDIO_C22_READ = 2
> > +};
> > +
> > +#define HCLGE_MDIO_CTRL_START_BIT   BIT(0)
> > +#define HCLGE_MDIO_CTRL_ST_MSK  GENMASK(2, 1)
> > +#define HCLGE_MDIO_CTRL_ST_LSH  1
> > +#define HCLGE_MDIO_IS_C22(c22)  (((c22) << HCLGE_MDIO_CTRL_ST_LSH) &
> \
> > +   HCLGE_MDIO_CTRL_ST_MSK)
> > +
> > +#define HCLGE_MDIO_CTRL_OP_MSK  GENMASK(4, 3)
> > +#define HCLGE_MDIO_CTRL_OP_LSH  3
> > +#define HCLGE_MDIO_CTRL_OP(access) \
> > +   (((access) << HCLGE_MDIO_CTRL_OP_LSH) & HCLGE_MDIO_CTRL_OP_MSK)
> > +#define HCLGE_MDIO_CTRL_PRTAD_MSK   GENMASK(4, 0)
> > +#define HCLGE_MDIO_CTRL_DEVAD_MSK   GENMASK(4, 0)
> 
> This all seems overly complex. How about
> 
> #define HCLGE_MDIO_CTRL_START_BIT   BIT(0)
> #define HCLGE_MDIO_C22BIT(1)
> #define HCLGE_MDIO_WRITE  (1 << 3)
> #define HCLGE_MDIO_READ   (2 << 3)
> #define HCLGE_MDIO_C22_WRITE  (HCLGE_MDIO_CTRL_START_BIT |
> HCLGE_MDIO_C22 | HCLGE_MDIO_WRITE)
> #define HCLGE_MDIO_C22_READ   (HCLGE_MDIO_CTRL_START_BIT |
> HCLGE_MDIO_C22 | HCLGE_MDIO_READ)
> #define HCLGE_MDIO_C45_WRITE  (HCLGE_MDIO_CTRL_START_BIT |
> HCLGE_MDIO_WRITE)
> #define HCLGE_MDIO_C45_READ   (HCLGE_MDIO_CTRL_START_BIT |
> HCLGE_MDIO_READ)
> 
> #define HCLGE_MDIO_STATUS_ERROR   BIT(0)
> 
> Keep it simple, don't have more defines than what you need.
Sure, changed in V4 Patch, Thanks!

Salil
> 
> > +static int hclge_mdio_write(struct mii_bus *bus, int phy_id, int
> regnum,
> > +   u16 data)
> > +{
> > +   struct hclge_dev *hdev = (struct hclge_dev *)bus->priv;
> > +   struct hclge_mdio_cfg_cmd *mdio_cmd;
> > +   enum hclge_cmd_status status;
> > +   struct hclge_desc desc;
> > +   u8 devad;
> > +
> > +   if (!bus)
> > +   return -EINVAL;
> > +
> > +   devad = ((regnum >> 16) & 0x1f);
> 
> So you have changed this to only support C22. Which means devad is not
> needed, since that is c45 only.
Thanks for catching this. Removed it from the MDIO file.

Best regards
Salil
> 
> > +
> > +   dev_dbg(>dev, "phy id=%d, devad=%d\n", phy_id, devad);
> > +
> > +   hclge_cmd_setup_basic_desc(, HCLGE_OPC_MDIO_CONFIG, false);
> > +
> > +   mdio_cmd = (struct hclge_mdio_cfg_cmd *)desc.data;
> > +
> > +   mdio_cmd->prtad = phy_id & HCLGE_MDIO_CTRL_PRTAD_MSK;
> > +   mdio_cmd->data_wr = cpu_to_le16(data);
> > +   mdio_cmd->devad = devad & HCLGE_MDIO_CTRL_DEVAD_MSK;
> > +
> > +   /* Write reg and data */
> > +   mdio_cmd->ctrl_bit = HCLGE_MDIO_IS_C22(1);
> 
> Passing the parameter is now pointless if you are only doing C22.
Sure, changed.

> 
> > +   mdio_cmd->ctrl_bit |= HCLGE_MDIO_CTRL_OP(HCLGE_MDIO_C22_WRITE);
> > +   mdio_cmd->ctrl_bit |= HCLGE_MDIO_CTRL_START_BIT;
> 
> Given

RE: [PATCH V3 net-next 1/8] net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC

2017-07-22 Thread Salil Mehta

Hi Bo Yu,

> -----Original Message-----
> From: Bo Yu [mailto:tsu.y...@gmail.com]
> Sent: Monday, June 19, 2017 1:57 AM
> To: Salil Mehta
> Cc: da...@davemloft.net; Zhuangyuzeng (Yisen); huangdaode; lipeng (Y);
> mehta.salil@gmail.com; net...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH V3 net-next 1/8] net: hns3: Add support of HNS3
> Ethernet Driver for hip08 SoC
> 
> Hi,
> On Sat, Jun 17, 2017 at 06:24:24PM +0100, Salil Mehta wrote:
> >+struct notifier_block notifier_block;
> >+/* Vxlan/Geneve information */
> >+struct hns3_udp_tunnel udp_tnl[HNS3_UDP_TNL_MAX];
> >+};
> >+
> >+/* the distance between [begin, end) in a ring buffer
> >+ * note: there is a unuse slot between the begin and the end
> >+ */
> >+static inline int ring_dist(struct hns3_enet_ring *ring, int begin,
> int end)
> >+{
> >+return (end - begin + ring->desc_num) % ring->desc_num;
> >+}
> >+
> >+static inline int ring_space(struct hns3_enet_ring *ring)
> >+{
> >+return ring->desc_num -
> >+ring_dist(ring, ring->next_to_clean, ring->next_to_use) -
> 1;
> >+}
> >+
> >+static inline int is_ring_empty(struct hns3_enet_ring *ring)
> >+{
> >+return ring->next_to_use == ring->next_to_clean;
> >+}
> >+
> >+static inline void hns3_write_reg(void __iomem *base, u32 reg, u32
> value)
> >+{
> >+u8 __iomem *reg_addr = READ_ONCE(base);
> >+
> >+writel(value, reg_addr + reg);
> >+}
> >+
> >+#define hns3_write_dev(a, reg, value) \
> >+hns3_write_reg((a)->io_base, (reg), (value))
> >+
> >+#define hnae_queue_xmit(tqp, buf_num) writel_relaxed(buf_num, \
> >+(tqp)->io_base + HNS3_RING_TX_RING_TAIL_REG)
> >+
> >+#define ring_to_dev(ring) (&(ring)->tqp->handle->pdev->dev)
> >+
> >+#define ring_to_dma_dir(ring) (HNAE3_IS_TX_RING(ring) ? \
> >+DMA_TO_DEVICE : DMA_FROM_DEVICE)
> >+
> >+#define tx_ring_data(priv, idx) ((priv)->ring_data[idx])
> >+
> >+#define hnae_buf_size(_ring) ((_ring)->buf_size)
> >+#define hnae_page_order(_ring) (get_order(hnae_buf_size(_ring)))
> >+#define hnae_page_size(_ring) (PAGE_SIZE << hnae_page_order(_ring))
> >+
> >+/* iterator for handling rings in ring group */
> >+#define hns3_for_each_ring(pos, head) \
> >+for (pos = (head).ring; pos != NULL; pos = pos->next)
> 
> Only a pos? Comparison to NULL could be written "pos", as noticed by
> checkpatch.
Fixed in patch V4. Thanks!

Salil
> 
> 
> >+
> >+void hns3_ethtool_set_ops(struct net_device *ndev);
> >+
> >+int hns3_nic_net_xmit_hw(
> >+struct net_device *ndev,
> >+struct sk_buff *skb,
> >+struct hns3_nic_ring_data *ring_data);
> >+int hns3_clean_tx_ring(struct hns3_enet_ring *ring, int budget);
> >+int hns3_clean_rx_ring_ex(
> >+struct hns3_enet_ring *ring,
> >+struct sk_buff **skb_ex,
> >+int budget);
> >+#endif
> >--
> >2.7.4
> >
> >

RE: [PATCH V3 net-next 5/8] net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver

2017-07-22 Thread Salil Mehta

Hi Richard,

> -----Original Message-----
> From: Richard Cochran [mailto:richardcoch...@gmail.com]
> Sent: Sunday, June 18, 2017 5:45 PM
> To: Salil Mehta
> Cc: da...@davemloft.net; Zhuangyuzeng (Yisen); huangdaode; lipeng (Y);
> mehta.salil@gmail.com; net...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH V3 net-next 5/8] net: hns3: Add support of TX
> Scheduler & Shaper to HNS3 driver
> 
> On Sat, Jun 17, 2017 at 06:24:28PM +0100, Salil Mehta wrote:
> > +
> > +int hclge_tm_schd_init(struct hclge_dev *hdev);
> > +int hclge_tm_setup_tc(struct hclge_dev *hdev);
> 
> The definition of this function DNE.
Sorry, I did not get what DNE means. Does Not Exist?
If yes, then I can see the definitions of both functions.

Best regards
Salil
> 
> > +int hclge_pause_setup_hw(struct hclge_dev *hdev);
> > +
> > +#endif
> > --
> > 2.7.4
> 
> Thanks,
> Richard

RE: [PATCH V3 net-next 2/8] net: hns3: Add support of the HNAE3 framework

2017-07-22 Thread Salil Mehta

Hi Bo Yu,

> -----Original Message-----
> From: Bo Yu [mailto:tsu.y...@gmail.com]
> Sent: Monday, June 19, 2017 1:40 AM
> To: Salil Mehta
> Cc: da...@davemloft.net; Zhuangyuzeng (Yisen); huangdaode; lipeng (Y);
> mehta.salil@gmail.com; net...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH V3 net-next 2/8] net: hns3: Add support of the
> HNAE3 framework
> 
> Hi,
> On Sat, Jun 17, 2017 at 06:24:25PM +0100, Salil Mehta wrote:
> >+ *   Unregister client from ae_dev
> >+ * start()
> >+ *   Enable the hardware
> >+ * stop()
> >+ *   Disable the hardware
> >+ * get_status()
> >+ *   Get the carrier state of the back channel of the handle, 1 for
> ok, 0 for
> >+ *   non-ok
> >+ * get_ksettings_an_result()
> >+ *   Get negotiation status,speed and duplex
> >+ * update_speed_duplex_h()
> >+ *   Update hardware speed and duplex
> >+ * get_media_type()
> >+ *   Get media type of MAC
> >+ * adjust_link()
> >+ *   Adjust link status
> >+ * set_loopback()
> >+ *   Set loopback
> >+ * set_promisc_mode
> >+ *   Set promisc mode
> >+ * set_mtu()
> >+ *   set mtu
> >+ * get_pauseparam()
> >+ *   get tx and rx of pause frame use
> >+ * set_pauseparam()
> >+ *   set tx and rx of pause frame use
> >+ * set_autoneg()
> >+ *   set auto autonegotiation of pause frame use
> >+ * get_autoneg()
> >+ *   get auto autonegotiation of pause frame use
> >+ * get_coalesce_usecs()
> >+ *   get usecs to delay a TX interrupt after a packet is sent
> >+ * get_rx_max_coalesced_frames()
> >+ *   get Maximum number of packets to be sent before a TX interrupt.
> >+ * set_coalesce_usecs()
> >+ *   set usecs to delay a TX interrupt after a packet is sent
> >+ * set_coalesce_frames()
> >+ *   set Maximum number of packets to be sent before a TX interrupt.
> >+ * get_mac_addr()
> >+ *   get mac address
> >+ * set_mac_addr()
> >+ *   set mac address
> >+ * add_uc_addr
> >+ *   Add unicast addr to mac table
> >+ * rm_uc_addr
> >+ *   Remove unicast addr from mac table
> >+ * set_mc_addr()
> >+ *   Set multicast address
> >+ * add_mc_addr
> >+ *   Add multicast address to mac table
> >+ * rm_mc_addr
> >+ *   Remove multicast address from mac table
> >+ * update_stats()
> >+ *   Update Old network device statistics
> >+ * get_ethtool_stats()
> >+ *   Get ethtool network device statistics
> >+ * get_strings()
> >+ *   Get a set of strings that describe the requested objects
> >+ * get_sset_count()
> >+ *   Get number of strings that @get_strings will write
> >+ * update_led_status()
> >+ *   Update the led status
> >+ * set_led_id()
> >+ *   Set led id
> >+ * get_regs()
> >+ *   Get regs dump
> >+ * get_regs_len()
> >+ *   Get the len of the regs dump
> >+ * get_rss_key_size()
> >+ *   Get rss key size
> >+ * get_rss_indir_size()
> >+ *   Get rss indirection table size
> >+ * get_rss()
> >+ *   Get rss table
> >+ * set_rss()
> >+ *   Set rss table
> >+ * get_tc_size()
> >+ *   Get tc size of handle
> >+ * get_vector()
> >+ *   Get vector number and vector infomation
> 
> Just another spelling: information
> 
> Checkpatch will report it also.
Fixed it. As far as I know, checkpatch.pl depends on its dictionary
to be able to catch such mistakes. Have you prepared your own?

Thanks
Salil
> 
> >+ * map_ring_to_vector()
> >+ *   Map rings to vector
> >+ * unmap_ring_from_vector()
> >+ *   Unmap rings from vector
> >+ * add_tunnel_udp()
> >+ *   Add tunnel information to hardware
> >+ * del_tunnel_udp()
> >+ *   Delete tunnel information from hardware
> >+ * reset_queue()
> >+ *   Reset queue
> >+ * get_fw_version()
> >+ *   Get firmware version
> >+ * get_mdix_mode()
> >+ *   Get media typr of phy
> >+ * set_vlan_filter()
> >+ *   Set vlan filter config of Ports
> >+ * set_vf_vlan_filter()
> >+ *   Set vlan filter config of vf
> >+ */
> >+struct hnae3_ae_ops {
> >+int (*init_ae_dev)(struct hnae3_ae_dev *ae_dev);
> >+void (*uninit_ae_dev)(struct hnae3_ae_dev *ae_dev);
> >+
> >+int (*register_client)(struct hnae3_client *client,
> >+   struct hnae3_ae_dev *ae_dev);
> >+void (*unregister_client)(struct hnae3_client *client,
> >+  struct hnae3_ae_dev *ae_dev);
> >+int (*start)(struct hnae3_handle *handle);
> >+void (*stop)(struct hnae3_handle *handle);
> >+int (*get_status)(struct hnae3_handle *handle);
> >+void (*get_ksettings_an_result)(struct hnae3_handle *handle,
> >+u8 *auto_neg, u32 *speed, u8 *duplex);
> >+
> >+int (*update_speed_duplex_h)(struct hnae3_handle *handle);
> >+int (*cfg_mac_speed_dup_h)(struct hnae3_handle *handle, int
> speed,
> >+   u8 duplex);
> >+
> >+void (*get_media_type)(struct hnae3_handle *handle, u8
> *media_type);
> >+void (*adjust_link)(struct hnae3_handle *handle, int speed, int
> duplex);
> >+int (*set_loopback)(struct hnae3_handle *handle,
> >+enum hnae3_loop loop_mode, bool en);
> >+
> >+

RE: [PATCH V3 net-next 1/8] net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC

2017-07-22 Thread Salil Mehta

Hi Bo Yu,

> -----Original Message-----
> From: Bo Yu [mailto:tsu.y...@gmail.com]
> Sent: Monday, June 19, 2017 1:18 AM
> To: Salil Mehta
> Cc: da...@davemloft.net; Zhuangyuzeng (Yisen); huangdaode; lipeng (Y);
> mehta.salil@gmail.com; net...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH V3 net-next 1/8] net: hns3: Add support of HNS3
> Ethernet Driver for hip08 SoC
> 
> Hi,
> On Sat, Jun 17, 2017 at 06:24:24PM +0100, Salil Mehta wrote:
> >+static int hns3_fill_desc(struct hns3_enet_ring *ring, void *priv,
> >+  int size, dma_addr_t dma, int frag_end,
> >+  enum hns_desc_type type)
> >+{
> >+struct hns3_desc_cb *desc_cb = >desc_cb[ring->next_to_use];
> >+struct hns3_desc *desc = >desc[ring->next_to_use];
> >+u32 ol_type_vlan_len_msec = 0;
> >+u16 bdtp_fe_sc_vld_ra_ri = 0;
> >+u32 type_cs_vlan_tso = 0;
> >+struct sk_buff *skb;
> >+u32 paylen = 0;
> >+u16 mss = 0;
> >+__be16 protocol;
> >+u8 ol4_proto;
> >+u8 il4_proto;
> >+int ret;
> >+
> >+/* The txbd's baseinfo of DESC_TYPE_PAGE & DESC_TYPE_SKB */
> >+desc_cb->priv = priv;
> >+desc_cb->length = size;
> >+desc_cb->dma = dma;
> >+desc_cb->type = type;
> >+
> >+/* now, fill the descriptor */
> >+desc->addr = cpu_to_le64(dma);
> >+desc->tx.send_size = cpu_to_le16((u16)size);
> >+hns3_set_txbd_baseinfo(_fe_sc_vld_ra_ri, frag_end);
> >+desc->tx.bdtp_fe_sc_vld_ra_ri =
> cpu_to_le16(bdtp_fe_sc_vld_ra_ri);
> >+
> >+if (type == DESC_TYPE_SKB) {
> >+skb = (struct sk_buff *)priv;
> >+paylen = cpu_to_le16(skb->len);
> >+
> >+if (skb->ip_summed == CHECKSUM_PARTIAL) {
> >+skb_reset_mac_len(skb);
> >+protocol = skb->protocol;
> >+
> >+/* vlan packe t*/
> 
> Just a spelling fix: /* vlan packet */
Fixed in V4 patch. Thanks!

Salil
> 
> >+if (protocol == htons(ETH_P_8021Q)) {
> >+protocol = vlan_get_protocol(skb);
> >+skb->protocol = protocol;
> >+}
> >+hns3_get_l4_protocol(skb, _proto, _proto);
> >+hns3_set_l2l3l4_len(skb, ol4_proto, il4_proto,
> >+_cs_vlan_tso,
> >+_type_vlan_len_msec);
> >+ret = hns3_set_l3l4_type_csum(skb, ol4_proto,
> il4_proto,
> >+  _cs_vlan_tso,
> >+  _type_vlan_len_msec);
> >+if (ret)
> >+return ret;
> >+
> >+ret = hns3_set_tso(skb, , ,
> >+   _cs_vlan_tso);
> >+if (ret)
> >+return ret;
> >+}
> >+
> >+/* Set txbd */
> >+desc->tx.ol_type_vlan_len_msec =
> >+cpu_to_le32(ol_type_vlan_len_msec);
> >+desc->tx.type_cs_vlan_tso_len =
> >+cpu_to_le32(type_cs_vlan_tso);
> >+desc->tx.paylen = cpu_to_le16(paylen);
> >+desc->tx.mss = cpu_to_le16(mss);
> >+}
> >+
> >+/* move ring pointer to next.*/
> >+ring_ptr_move_fw(ring, next_to_use);
> >+
> >+return 0;
> >+}
> >+
> >+static int hns3_fill_desc_tso(struct hns3_enet_ring *ring, void
> *priv,
> >+  int size, dma_addr_t dma, int frag_end,
> >+  enum hns_desc_type type)
> >+{
> >+int frag_buf_num;
> >+int sizeoflast;
> >+int ret, k;
> >+
> >+frag_buf_num = (size + HNS3_MAX_BD_SIZE - 1) / HNS3_MAX_BD_SIZE;
> >+sizeoflast = size % HNS3_MAX_BD_SIZE;
> >+sizeoflast = sizeoflast ? sizeoflast : HNS3_MAX_BD_SIZE;
> >+
> >+/* When the frag size is bigger than hardware, split this frag */
> >+for (k = 0; k < frag_buf_num; k++) {
> >+ret = hns3_fill_desc(ring, priv,
> >+ (k == frag_buf_num - 1) ?
> >+sizeoflast : HNS3_MAX_BD_SIZE,
> >+dma + HNS3_MAX_BD_SIZE * k,
> >+frag_end && (k == frag_buf_num - 1) ? 1 : 0,
> >+(type == DESC_TYPE_SKB && !k) ?
> >+DESC_TYPE_SKB : DESC_TYPE_PAGE);
> >+if (ret)
> >+return ret;
> >+}
> >+
> >+return 0;
> >+}
> >+
> >+static int hns3_nic_maybe_stop_tso(struct sk_buff **out_skb, int
> *bnum,
> >+   struct hns3_enet_ring *ring)
> >+{
> >+struct sk_buff *skb = *out_skb;
> >+struct skb_frag_struct *frag;
> >+int bdnum_for_frag;
> >+int frag_num;
> >+int buf_num;
> >+int size;
> >+int i;
> >+
> >+size = skb_headlen(skb);
> >+buf_num = (size + HNS3_MAX_BD_SIZE -

RE: [PATCH V3 net-next 1/8] net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC

2017-07-22 Thread Salil Mehta

Hi Bo Yu,

> -Original Message-
> From: Bo Yu [mailto:tsu.y...@gmail.com]
> Sent: Monday, June 19, 2017 1:18 AM
> To: Salil Mehta
> Cc: da...@davemloft.net; Zhuangyuzeng (Yisen); huangdaode; lipeng (Y);
> mehta.salil@gmail.com; net...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH V3 net-next 1/8] net: hns3: Add support of HNS3
> Ethernet Driver for hip08 SoC
> 
> Hi,
> On Sat, Jun 17, 2017 at 06:24:24PM +0100, Salil Mehta wrote:
> >+static int hns3_fill_desc(struct hns3_enet_ring *ring, void *priv,
> >+  int size, dma_addr_t dma, int frag_end,
> >+  enum hns_desc_type type)
> >+{
> >+struct hns3_desc_cb *desc_cb = >desc_cb[ring->next_to_use];
> >+struct hns3_desc *desc = >desc[ring->next_to_use];
> >+u32 ol_type_vlan_len_msec = 0;
> >+u16 bdtp_fe_sc_vld_ra_ri = 0;
> >+u32 type_cs_vlan_tso = 0;
> >+struct sk_buff *skb;
> >+u32 paylen = 0;
> >+u16 mss = 0;
> >+__be16 protocol;
> >+u8 ol4_proto;
> >+u8 il4_proto;
> >+int ret;
> >+
> >+/* The txbd's baseinfo of DESC_TYPE_PAGE & DESC_TYPE_SKB */
> >+desc_cb->priv = priv;
> >+desc_cb->length = size;
> >+desc_cb->dma = dma;
> >+desc_cb->type = type;
> >+
> >+/* now, fill the descriptor */
> >+desc->addr = cpu_to_le64(dma);
> >+desc->tx.send_size = cpu_to_le16((u16)size);
> >+hns3_set_txbd_baseinfo(_fe_sc_vld_ra_ri, frag_end);
> >+desc->tx.bdtp_fe_sc_vld_ra_ri =
> cpu_to_le16(bdtp_fe_sc_vld_ra_ri);
> >+
> >+if (type == DESC_TYPE_SKB) {
> >+skb = (struct sk_buff *)priv;
> >+paylen = cpu_to_le16(skb->len);
> >+
> >+if (skb->ip_summed == CHECKSUM_PARTIAL) {
> >+skb_reset_mac_len(skb);
> >+protocol = skb->protocol;
> >+
> >+/* vlan packe t*/
> 
> Just a spealling:   /* vlan packet */
Fixed in V4 patch. Thanks!

Salil
> 
> >+if (protocol == htons(ETH_P_8021Q)) {
> >+protocol = vlan_get_protocol(skb);
> >+skb->protocol = protocol;
> >+}
> >+hns3_get_l4_protocol(skb, _proto, _proto);
> >+hns3_set_l2l3l4_len(skb, ol4_proto, il4_proto,
> >+_cs_vlan_tso,
> >+_type_vlan_len_msec);
> >+ret = hns3_set_l3l4_type_csum(skb, ol4_proto,
> il4_proto,
> >+  _cs_vlan_tso,
> >+  _type_vlan_len_msec);
> >+if (ret)
> >+return ret;
> >+
> >+ret = hns3_set_tso(skb, , ,
> >+   _cs_vlan_tso);
> >+if (ret)
> >+return ret;
> >+}
> >+
> >+/* Set txbd */
> >+desc->tx.ol_type_vlan_len_msec =
> >+cpu_to_le32(ol_type_vlan_len_msec);
> >+desc->tx.type_cs_vlan_tso_len =
> >+cpu_to_le32(type_cs_vlan_tso);
> >+desc->tx.paylen = cpu_to_le16(paylen);
> >+desc->tx.mss = cpu_to_le16(mss);
> >+}
> >+
> >+/* move ring pointer to next.*/
> >+ring_ptr_move_fw(ring, next_to_use);
> >+
> >+return 0;
> >+}
> >+
> >+static int hns3_fill_desc_tso(struct hns3_enet_ring *ring, void
> *priv,
> >+  int size, dma_addr_t dma, int frag_end,
> >+  enum hns_desc_type type)
> >+{
> >+int frag_buf_num;
> >+int sizeoflast;
> >+int ret, k;
> >+
> >+frag_buf_num = (size + HNS3_MAX_BD_SIZE - 1) / HNS3_MAX_BD_SIZE;
> >+sizeoflast = size % HNS3_MAX_BD_SIZE;
> >+sizeoflast = sizeoflast ? sizeoflast : HNS3_MAX_BD_SIZE;
> >+
> >+/* When the frag size is bigger than hardware, split this frag */
> >+for (k = 0; k < frag_buf_num; k++) {
> >+ret = hns3_fill_desc(ring, priv,
> >+ (k == frag_buf_num - 1) ?
> >+sizeoflast : HNS3_MAX_BD_SIZE,
> >+dma + HNS3_MAX_BD_SIZE * k,
> >+frag_end && (k == frag_buf_num - 1) ? 1 : 0,
> >+(type == DESC_TYPE_SKB && !k) ?
> >+DESC_TYPE_SKB : DESC_TYPE_PAGE);
> >+if (ret)
> >+return ret;
> >+}
> >+
> >+return 0;
> >+}
> >+
> >+static int hns3_nic_maybe_stop_tso(struct sk_buff **out_skb, int *bnum,
> >+   struct hns3_enet_ring *ring)
> >+{
> >+struct sk_buff *skb = *out_skb;
> >+struct skb_frag_struct *frag;
> >+int bdnum_for_frag;
> >+int frag_num;
> >+int buf_num;
> >+int size;
> >+int i;
> >+
> >+size = skb_headlen(skb);
> >+buf_num = (size + HNS3_MAX_BD_SIZE -
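
The split done by hns3_fill_desc_tso() in the quoted patch is a ceiling division followed by a special case for the last chunk. A minimal userspace sketch of that arithmetic follows. MAX_BD_SIZE, bd_count() and bd_chunk_len() are hypothetical names, and 65535 is an assumed limit chosen only for illustration; the real HNS3_MAX_BD_SIZE is defined elsewhere in the driver and is not shown in this thread.

```c
#include <assert.h>

/* Assumed per-descriptor limit; stands in for HNS3_MAX_BD_SIZE. */
#define MAX_BD_SIZE 65535

/* Number of descriptors needed for `size` bytes (ceiling division). */
static int bd_count(int size)
{
	return (size + MAX_BD_SIZE - 1) / MAX_BD_SIZE;
}

/* Length of the k-th chunk (0-based) when `size` bytes are split. */
static int bd_chunk_len(int size, int k)
{
	int n = bd_count(size);
	int last = size % MAX_BD_SIZE;

	assert(k >= 0 && k < n);

	/* An exact multiple leaves a full-sized last chunk, not an empty one. */
	if (!last)
		last = MAX_BD_SIZE;

	return (k == n - 1) ? last : MAX_BD_SIZE;
}
```

Every chunk except the last is full-sized, so only the final descriptor needs the remainder computed, which is what the sizeoflast variable in the patch appears to hold.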

[PATCH] get_maintainer.pl: Prepare for separate MAINTAINERS files

2017-07-22 Thread Joe Perches

Allow for MAINTAINERS to become a directory and if it is,
read all the files in the directory for maintained sections.

Miscellanea:

o Create a read_maintainer_file subroutine from the existing code
o Test only the existence of MAINTAINERS, not whether it's a file

Signed-off-by: Joe Perches 
---
 scripts/get_maintainer.pl | 68 ---
 1 file changed, 41 insertions(+), 27 deletions(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index 3bd5f4f30235..d71a2994b147 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -308,35 +308,49 @@ if (!top_of_kernel_tree($lk_path)) {
 my @typevalue = ();
 my %keyword_hash;
 
-open (my $maint, '<', "${lk_path}MAINTAINERS")
-or die "$P: Can't open MAINTAINERS: $!\n";
-while (<$maint>) {
-my $line = $_;
-
-if ($line =~ m/^([A-Z]):\s*(.*)/) {
-   my $type = $1;
-   my $value = $2;
-
-   ##Filename pattern matching
-   if ($type eq "F" || $type eq "X") {
-   $value =~ s@\.@\\\.@g;   ##Convert . to \.
-   $value =~ s/\*/\.\*/g;   ##Convert * to .*
-   $value =~ s/\?/\./g; ##Convert ? to .
-   ##if pattern is a directory and it lacks a trailing slash, add one
-   if ((-d $value)) {
-   $value =~ s@([^/])$@$1/@;
-   }
-   } elsif ($type eq "K") {
-   $keyword_hash{@typevalue} = $value;
-   }
-   push(@typevalue, "$type:$value");
-} elsif (!/^(\s)*$/) {
-   $line =~ s/\n$//g;
-   push(@typevalue, $line);
+if (-f "${lk_path}MAINTAINERS") {
+read_maintainer_file("${lk_path}MAINTAINERS");
+} elsif (-d "${lk_path}MAINTAINERS") {
+opendir(DIR, "${lk_path}MAINTAINERS") or die $!;
+my @mfiles = grep { !/^\./ && -f "${lk_path}MAINTAINERS/$_" } readdir(DIR);
+closedir(DIR);
+foreach my $file (@mfiles) {
+   read_maintainer_file("${lk_path}MAINTAINERS/$file");
 }
 }
-close($maint);
 
+sub read_maintainer_file {
+my ($file) = @_;
+
+open (my $maint, '<', "$file")
+   or die "$P: Can't open MAINTAINERS file '$file': $!\n";
+while (<$maint>) {
+   my $line = $_;
+
+   if ($line =~ m/^([A-Z]):\s*(.*)/) {
+   my $type = $1;
+   my $value = $2;
+
+   ##Filename pattern matching
+   if ($type eq "F" || $type eq "X") {
+   $value =~ s@\.@\\\.@g;   ##Convert . to \.
+   $value =~ s/\*/\.\*/g;   ##Convert * to .*
+   $value =~ s/\?/\./g; ##Convert ? to .
+   ##if pattern is a directory and it lacks a trailing slash, add one
+   if ((-d $value)) {
+   $value =~ s@([^/])$@$1/@;
+   }
+   } elsif ($type eq "K") {
+   $keyword_hash{@typevalue} = $value;
+   }
+   push(@typevalue, "$type:$value");
+   } elsif (!/^(\s)*$/) {
+   $line =~ s/\n$//g;
+   push(@typevalue, $line);
+   }
+}
+close($maint);
+}
 
 #
 # Read mail address map
@@ -873,7 +887,7 @@ sub top_of_kernel_tree {
 if (   (-f "${lk_path}COPYING")
&& (-f "${lk_path}CREDITS")
&& (-f "${lk_path}Kbuild")
-   && (-f "${lk_path}MAINTAINERS")
+   && (-e "${lk_path}MAINTAINERS")
&& (-f "${lk_path}Makefile")
&& (-f "${lk_path}README")
&& (-d "${lk_path}Documentation")
-- 
2.10.0.rc2.1.g053435c
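
The directory case described in the commit message amounts to listing the entries of MAINTAINERS/ and keeping only plain files. The sketch below shows that filtering step in C rather than Perl, purely as an illustration. keep_entry() and count_section_files() are hypothetical names, and skipping names that start with '.' is an assumption about the intent, not something the commit message states.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical predicate: skip ".", ".." and other hidden entries. */
static int keep_entry(const char *name)
{
	return name[0] != '\0' && name[0] != '.';
}

/*
 * Hypothetical helper: count the regular, non-hidden files directly
 * inside `dir`. Returns -1 if the directory cannot be opened.
 */
static int count_section_files(const char *dir)
{
	struct dirent *ent;
	struct stat st;
	char path[4096];
	int count = 0;
	DIR *d;

	assert(dir != NULL);

	d = opendir(dir);
	if (!d)
		return -1;

	while ((ent = readdir(d)) != NULL) {
		if (!keep_entry(ent->d_name))
			continue;
		snprintf(path, sizeof(path), "%s/%s", dir, ent->d_name);
		/* Like Perl's -f, stat() follows symlinks. */
		if (stat(path, &st) == 0 && S_ISREG(st.st_mode))
			count++;
	}
	closedir(d);

	return count;
}
```

Subdirectories are ignored here, which matches a flat read of the directory; a recursive layout would need a different walk.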

RE: [PATCH V3 net-next 2/8] net: hns3: Add support of the HNAE3 framework

2017-07-22 Thread Salil Mehta

Hi Andrew,
> -Original Message-
> From: Andrew Lunn [mailto:and...@lunn.ch]
> Sent: Sunday, June 18, 2017 4:02 PM
> To: Salil Mehta
> Cc: da...@davemloft.net; Zhuangyuzeng (Yisen); huangdaode; lipeng (Y);
> mehta.salil@gmail.com; net...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH V3 net-next 2/8] net: hns3: Add support of the
> HNAE3 framework
> 
> > +static int __init hnae3_init(void)
> > +{
> > +   return 0;
> > +}
> > +
> > +static void __exit hnae3_exit(void)
> > +{
> > +}
> > +
> > +module_init(hnae3_init);
> > +module_exit(hnae3_exit);
> 
> I think init and exit functions are optional. Since yours don't do
> anything useful, please try without them.
Yes, you were right. Removed in V4 patch. 

Thanks
Salil
> 
>Andrew

RE: [PATCH V2 net-next 2/8] net: hns3: Add support of the HNAE3 framework

2017-07-22 Thread Salil Mehta



> -Original Message-
> From: Andrew Lunn [mailto:and...@lunn.ch]
> Sent: Sunday, June 18, 2017 3:53 PM
> To: Salil Mehta
> Cc: da...@davemloft.net; Zhuangyuzeng (Yisen); huangdaode; lipeng (Y);
> mehta.salil@gmail.com; net...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH V2 net-next 2/8] net: hns3: Add support of the
> HNAE3 framework
> 
> On Wed, Jun 14, 2017 at 12:10:29AM +0100, Salil Mehta wrote:
> > This patch adds the support of the HNAE3 (Hisilicon Network
> > Acceleration Engine 3) framework support to the HNS3 driver.
> >
> > Framework facilitates clients like ENET(HNS3 Ethernet Driver), RoCE
> > and user-space Ethernet drivers (like ODP etc.) to register with
> HNAE3
> > devices and their associated operations.
> 
> checkpatch throws out one warning in this file
> 
> CHECK: Comparison to NULL could be written "pos"
> #3508: FILE:
> drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.h:572:
> +  for (pos = (head).ring; pos != NULL; pos = pos->next)
> 
> This one seems valid. Not all checkpatch warnings are.
Agreed and fixed in V4 patch.

Thanks
Salil
> 
>  Andrew
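
For reference, the form checkpatch suggests simply drops the explicit comparison, since a pointer used as a loop condition is already tested against NULL. A small userspace sketch, with a hypothetical struct ring_node standing in for the driver's ring type and ring_count() as an illustrative helper:

```c
#include <stddef.h>

/* Hypothetical singly linked node; not the driver's real ring structure. */
struct ring_node {
	int id;
	struct ring_node *next;
};

/* Count the nodes reachable from `head`. */
static int ring_count(const struct ring_node *head)
{
	const struct ring_node *pos;
	int n = 0;

	/* "pos" alone as the condition is equivalent to "pos != NULL". */
	for (pos = head; pos; pos = pos->next)
		n++;

	return n;
}
```

Both spellings compile to the same test; the shorter one is the style checkpatch prefers.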

RE: [PATCH V3 net-next 2/8] net: hns3: Add support of the HNAE3 framework

2017-07-22 Thread Salil Mehta

Hi Andrew,

> -Original Message-
> From: Andrew Lunn [mailto:and...@lunn.ch]
> Sent: Saturday, June 17, 2017 8:46 PM
> To: Salil Mehta
> Cc: da...@davemloft.net; Zhuangyuzeng (Yisen); huangdaode; lipeng (Y);
> mehta.salil@gmail.com; net...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH V3 net-next 2/8] net: hns3: Add support of the
> HNAE3 framework
> 
> > +static void hnae3_list_add(spinlock_t *lock, struct list_head *node,
> > +  struct list_head *head)
> > +{
> > +   unsigned long flags;
> > +
> > +   spin_lock_irqsave(lock, flags);
> > +   list_add_tail(node, head);
> > +   spin_unlock_irqrestore(lock, flags);
> > +}
> > +
> > +static void hnae3_list_del(spinlock_t *lock, struct list_head *node)
> > +{
> > +   unsigned long flags;
> > +
> > +   spin_lock_irqsave(lock, flags);
> > +   list_del(node);
> > +   spin_unlock_irqrestore(lock, flags);
> > +}
> > +
> 
> > +int hnae3_register_client(struct hnae3_client *client)
> > +{
> > +   struct hnae3_client *client_tmp;
> > +   struct hnae3_ae_dev *ae_dev;
> > +   int ret;
> > +
> > +   /* One system should only have one client for every type */
> > +   list_for_each_entry(client_tmp, _client_list, node) {
> > +   if (client_tmp->type == client->type)
> > +   return 0;
> > +   }
> > +
> > +   hnae3_list_add(_list_client_lock, >node,
> > +  _client_list);
> 
> Please could you explain your locking scheme. I don't get it.
> 
>Thanks
>   Andrew
Locking scheme has been fixed in the V4 patch. Please review it.

Thanks
Salil
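
One reading of the question above, and it is only an assumption since the review does not spell it out, is that the quoted hnae3_register_client() walks the client list without holding the lock that hnae3_list_add() takes, so the duplicate check and the insert are not atomic. The userspace sketch below shows the check and the insert under a single lock. It is not the V4 code, which does not appear in this thread; struct client, client_list, client_lock and register_client() are hypothetical names, a pthread mutex stands in for the kernel spinlock, and returning -EEXIST on a duplicate is an illustrative choice (the quoted code returns 0).

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical client record; not the driver's struct hnae3_client. */
struct client {
	int type;
	struct client *next;
};

static struct client *client_list;
static pthread_mutex_t client_lock = PTHREAD_MUTEX_INITIALIZER;

/* Register `c` unless a client of the same type is already listed. */
static int register_client(struct client *c)
{
	struct client *pos;
	int ret = 0;

	pthread_mutex_lock(&client_lock);

	/* The duplicate check runs under the same lock as the insert. */
	for (pos = client_list; pos; pos = pos->next) {
		if (pos->type == c->type) {
			ret = -EEXIST;
			goto out;
		}
	}

	c->next = client_list;
	client_list = c;
out:
	pthread_mutex_unlock(&client_lock);
	return ret;
}
```

Holding the lock across both steps means two callers registering the same type cannot both pass the check.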

RE: [PATCH V3 net-next 1/8] net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC

2017-07-22 Thread Salil Mehta

Hi Andrew

> -Original Message-
> From: Andrew Lunn [mailto:and...@lunn.ch]
> Sent: Saturday, June 17, 2017 6:54 PM
> To: Salil Mehta
> Cc: da...@davemloft.net; Zhuangyuzeng (Yisen); huangdaode; lipeng (Y);
> mehta.salil@gmail.com; net...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH V3 net-next 1/8] net: hns3: Add support of HNS3
> Ethernet Driver for hip08 SoC
> 
> > +static int hns3_nic_net_up(struct net_device *ndev)
> > +{
> > +   struct hns3_nic_priv *priv = netdev_priv(ndev);
> > +   struct hnae3_handle *h = priv->ae_handle;
> > +   int i, j;
> > +   int ret;
> > +
> > +   ret = hns3_nic_init_irq(priv);
> > +   if (ret != 0) {
> 
>   if (ret)
> 
> No need to compare with zero.
Sure, changed in V4 patch. 

Thanks
Salil
> 
> > +   netdev_err(ndev, "hns init irq failed! ret=%d\n", ret);
> > +   return ret;
> 
> > +static int hns3_nic_net_open(struct net_device *ndev)
> > +{
> > +   struct hns3_nic_priv *priv = netdev_priv(ndev);
> > +   struct hnae3_handle *h = priv->ae_handle;
> > +   int ret;
> > +
> > +   netif_carrier_off(ndev);
> > +
> > +   ret = netif_set_real_num_tx_queues(ndev, h->kinfo.num_tqps);
> > +   if (ret < 0) {
> > +   netdev_err(ndev, "netif_set_real_num_tx_queues fail,
> ret=%d!\n",
> > +  ret);
> > +   return ret;
> > +   }
> 
> In general, functions return 0 for success, and something else for an
> error. So there is no need to do a comparison. Please remove all
> comparisons, unless it is really needed. It also makes the code look
> consistent. At the moment you sometimes have < 0, sometimes != 0, and
> sometimes no comparison at all.
Acknowledged, scanned and have changed in V4 patch. Please have a look.

Thanks
Salil
> 
> Andrew
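
The convention described above, 0 for success and a nonzero error code otherwise, lets every call site use the same bare test. A minimal userspace sketch, where init_irq() and nic_up() are hypothetical stand-ins rather than the driver's functions:

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical setup step: 0 on success, negative errno on failure. */
static int init_irq(int should_fail)
{
	return should_fail ? -EIO : 0;
}

/* Bring the device up, propagating the first error unchanged. */
static int nic_up(int should_fail)
{
	int ret;

	ret = init_irq(should_fail);
	if (ret) {	/* no "!= 0" or "< 0" needed */
		fprintf(stderr, "init irq failed! ret=%d\n", ret);
		return ret;
	}

	return 0;
}
```

An explicit comparison is still appropriate for calls that can return a meaningful positive value, which is the "unless it is really needed" case.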

Re: [PATCH] perf tool sort: Use default sort if evlist is empty

2017-07-22 Thread Namhyung Kim

On Fri, Jul 21, 2017 at 01:02:50PM -0700, David Carrillo-Cisneros wrote:
> On Fri, Jul 21, 2017 at 12:44 AM, Jiri Olsa  wrote:
> > On Thu, Jul 20, 2017 at 10:11:57PM -0700, David Carrillo-Cisneros wrote:
> >> Fixes bug noted by Jiri in https://lkml.org/lkml/2017/6/13/755 and caused
> >> by commit d49dadea7862 ("perf tools: Make 'trace' or 'trace_fields' sort
> >>key default for tracepoint events")
> >> not taking into account that evlist is empty in pipe-mode.
> >>
> >> Before this commit, pipe mode will only show bogus "100.00%  N/A" instead
> >> of correct output as follows:
> >>
> >>   $ perf record -o - sleep 1 | perf report -i -
> >>   # To display the perf.data header info, please use --header/--header-only options.
> >>   #
> >>   [ perf record: Woken up 1 times to write data ]
> >>   [ perf record: Captured and wrote 0.000 MB - ]
> >>   #
> >>   # Total Lost Samples: 0
> >>   #
> >>   # Samples: 8  of event 'cycles:ppH'
> >>   # Event count (approx.): 145658
> >>   #
> >>   # Overhead  Trace output
> >>   # ........  ............
> >>   #
> >>      100.00%  N/A
> >>
> >> Correct output, after patch:
> >>
> >>   $ perf record -o - sleep 1 | perf report -i -
> >>   # To display the perf.data header info, please use --header/--header-only options.
> >>   #
> >>   [ perf record: Woken up 1 times to write data ]
> >>   [ perf record: Captured and wrote 0.000 MB - ]
> >>   #
> >>   # Total Lost Samples: 0
> >>   #
> >>   # Samples: 8  of event 'cycles:ppH'
> >>   # Event count (approx.): 191331
> >>   #
> >>   # Overhead  Command  Shared Object      Symbol
> >>   # ........  .......  .................  .....................
> >>   #
> >>       81.63%  sleep    libc-2.19.so       [.] _exit
> >>       13.58%  sleep    ld-2.19.so         [.] do_lookup_x
> >>        2.34%  sleep    [kernel.kallsyms]  [k] context_switch
> >>        2.34%  sleep    libc-2.19.so       [.] __GI___libc_nanosleep
> >>        0.11%  perf     [kernel.kallsyms]  [k] __intel_pmu_enable_a
> >>
> >
> > I wonder if we could reinit the sort order once we know what
> > events we have in pipe, and recognize the tracepoint output
> > properly:
> 
> I see this as hard to do since, at any given point while traversing the
> pipe's content, the best we can do is guess that we've seen all event
> types. Then we'd need to fall back and redo the output whenever a new
> sample refutes our last guess.

After reading the feature event, you would know the number of events, no?

Thanks,
Namhyung


> 
> >
> > [root@krava perf]# ./perf record -e 'sched:sched_switch' sleep 1 | ./perf report
> > # To display the perf.data header info, please use --header/--header-only options.
> >
> > SNIP
> >
> > #
> > # Overhead  Command  Shared Object      Symbol
> > # ........  .......  .................  ..............
> > #
> >    100.00%  sleep    [kernel.kallsyms]  [k] __schedule
> >
> >
> > also I've got another crash for (added -a option for above example):
> >
> > [root@krava perf]# ./perf record -e 'sched:sched_switch' -a sleep 1 | ./perf report
> > # To display the perf.data header info, please use --header/--header-only options.
> > #
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 0.000 MB (null) ]
> > Segmentation fault (core dumped)
> >
> > catchsegv got:
> > /home/jolsa/kernel/linux-perf/tools/perf/util/ordered-events.c:85(free_dup_event)[0x51a6a5]
> > ./perf(ordered_events__free+0x5c)[0x51b0b7]
> > /home/jolsa/kernel/linux-perf/tools/perf/util/session.c:1751(__perf_session__process_pipe_events)[0x518abb]
> > ./perf(perf_session__process_events+0x91)[0x5190f0]
> > /home/jolsa/kernel/linux-perf/tools/perf/builtin-report.c:598(__cmd_report)[0x443a91]
> > ./perf(cmd_report+0x169b)[0x4455a3]
> > /home/jolsa/kernel/linux-perf/tools/perf/perf.c:296(run_builtin)[0x4be1b0]
> > /home/jolsa/kernel/linux-perf/tools/perf/perf.c:348(handle_internal_command)[0x4be41d]
> > /home/jolsa/kernel/linux-perf/tools/perf/perf.c:395(run_argv)[0x4be56f]
> > ./perf(main+0x2d6)[0x4be949]
> > /lib64/libc.so.6(__libc_start_main+0xf1)[0x7f3de8a10401]
> > ./perf(_start+0x2a)[0x42831a]
> >
> > looks like some mem corruption.. will try to follow up
> > on this later if nobody beats me to it ;-)
> 
> Cannot reproduce it in acme's perf/core building the tool with
>   make NO_LIBPYTHON=1 LDFLAGS=-static
> 
> If you have a file with the perf record output causing perf report's
> crash, I'd like to take a look.
> 
> Thanks,
> David

Re: [PATCH v3 3/9] perf annotate: Fix wrong --show-total-period option showing number of samples

2017-07-22 Thread Namhyung Kim

Hi Arnaldo and Taeung,

(+ Andi)

On Fri, Jul 21, 2017 at 11:47:48AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jul 20, 2017 at 06:36:55AM +0900, Taeung Song escreveu:
> > +++ b/tools/perf/builtin-annotate.c
> > @@ -177,14 +177,12 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
> >  */
> > process_branch_stack(sample->branch_stack, al, sample);
> >  
> > -   sample->period = 1;
> > sample->weight = 1;
> > -
> > he = hists__add_entry(hists, al, NULL, NULL, NULL, sample, true);
> > if (he == NULL)
> > return -ENOMEM;
> 
> I split the hunk above into a separate patch, as a fix. Namhyung, can
> you take a look at why we need to unconditionally overwrite what is in
> sample->weight as well?
> 
> Looks fishy as it may come with a value from the kernel, parsed in
> perf_evsel__parse_sample(), when PERF_SAMPLE_WEIGHT is in
> perf_event_attr->sample_type.
> 
> Is it that the hists code needs a sane value when PERF_SAMPLE_WEIGHT
> isn't requested in sample_type?

It was Andi who originally added that code (05484298cbfe).  IIUC the
weight is only meaningful for some CPUs with Intel TSX and he used a
dummy value.

AFAIK the hists code doesn't care about it unless the weight sort key is
used (for report).  As it's not used by the annotate code, I think it'd be
better to leave it as is (like period).

Thanks,
Namhyung


> 
> The resulting cset is below.
> 
> - Arnaldo
> 
> commit a935e8cd8d5d4b7936c4b4cf27c2d0e87d1a6a66
> Author: Taeung Song 
> Date:   Fri Jul 21 11:38:48 2017 -0300
> 
> perf annotate: Do not overwrite sample->period
> 
> In fixing the --show-total-period option it was noticed that the value
> of sample->period was being overwritten, fix it.
> 
> Cc: Jiri Olsa 
> Cc: Milian Wolff 
> Cc: Namhyung Kim 
> Fixes: fd36f3dd7933 ("perf hist: Pass struct sample to __hists__add_entry()")
> [ split from a larger patch, added the Fixes tag ]
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
> index 96fe1a88c1e5..7e33278eff67 100644
> --- a/tools/perf/builtin-annotate.c
> +++ b/tools/perf/builtin-annotate.c
> @@ -177,7 +177,6 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
>*/
>   process_branch_stack(sample->branch_stack, al, sample);
>  
> - sample->period = 1;
>   sample->weight = 1;
>  
>   he = hists__add_entry(hists, al, NULL, NULL, NULL, sample, true);

[PATCH V4 net-next 3/8] net: hns3: Add HNS3 IMP(Integrated Mgmt Proc) Cmd Interface Support

2017-07-22 Thread Salil Mehta

This patch adds the support of IMP (Integrated Management Processor)
command interface to the HNS3 driver.

Each PF/VF supports a CQP (Command Queue Pair) ring interface.
Each CQP consists of a send queue (CSQ) and a receive queue (CRQ).
There are various commands a PF/VF may support, like for Flow Table
manipulation, Device management, Packet buffer allocation, Forwarding,
VLANs config, Tunneling/Overlays etc.

This patch contains code to initialize the command queue, manage the
command queue descriptors and Rx/Tx protocol with the command processor
in the form of various commands/results and acknowledgements.

Signed-off-by: Daode Huang 
Signed-off-by: lipeng 
Signed-off-by: Salil Mehta 
Signed-off-by: Yisen Zhuang 
---
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c | 347 ++
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h | 742 +
 2 files changed, 1089 insertions(+)
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c 
b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
new file mode 100644
index ..ec20ec4a5939
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
@@ -0,0 +1,347 @@
+/*
+ * Copyright (c) 2016~2017 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "hclge_cmd.h"
+#include "hnae3.h"
+#include "hclge_main.h"
+
+#define hclge_is_csq(ring) ((ring)->flag & HCLGE_TYPE_CSQ)
+#define hclge_ring_to_dma_dir(ring) (hclge_is_csq(ring) ? \
+   DMA_TO_DEVICE : DMA_FROM_DEVICE)
+#define cmq_ring_to_dev(ring)   (&(ring)->dev->pdev->dev)
+
+static int hclge_ring_space(struct hclge_cmq_ring *ring)
+{
+   int ntu = ring->next_to_use;
+   int ntc = ring->next_to_clean;
+   int used = (ntu - ntc + ring->desc_num) % ring->desc_num;
+
+   return ring->desc_num - used - 1;
+}
+
+static int hclge_alloc_cmd_desc(struct hclge_cmq_ring *ring)
+{
+   int size  = ring->desc_num * sizeof(struct hclge_desc);
+
+   ring->desc = kzalloc(size, GFP_KERNEL);
+   if (!ring->desc)
+   return -ENOMEM;
+
+   ring->desc_dma_addr = dma_map_single(cmq_ring_to_dev(ring), ring->desc,
+size, DMA_BIDIRECTIONAL);
+   if (dma_mapping_error(cmq_ring_to_dev(ring), ring->desc_dma_addr)) {
+   ring->desc_dma_addr = 0;
+   kfree(ring->desc);
+   ring->desc = NULL;
+   return -ENOMEM;
+   }
+
+   return 0;
+}
+
+static void hclge_free_cmd_desc(struct hclge_cmq_ring *ring)
+{
+   dma_unmap_single(cmq_ring_to_dev(ring), ring->desc_dma_addr,
+ring->desc_num * sizeof(ring->desc[0]),
+DMA_BIDIRECTIONAL);
+
+   ring->desc_dma_addr = 0;
+   kfree(ring->desc);
+   ring->desc = NULL;
+}
+
+static int hclge_init_cmd_queue(struct hclge_dev *hdev, int ring_type)
+{
+   struct hclge_hw *hw = >hw;
+   struct hclge_cmq_ring *ring =
+   (ring_type == HCLGE_TYPE_CSQ) ? >cmq.csq : >cmq.crq;
+   int ret;
+
+   ring->flag = ring_type;
+   ring->dev = hdev;
+
+   ret = hclge_alloc_cmd_desc(ring);
+   if (ret) {
+   dev_err(>pdev->dev, "descriptor %s alloc error %d\n",
+   (ring_type == HCLGE_TYPE_CSQ) ? "CSQ" : "CRQ", ret);
+   return ret;
+   }
+
+   ring->next_to_clean = 0;
+   ring->next_to_use = 0;
+
+   return 0;
+}
+
+void hclge_cmd_reuse_desc(struct hclge_desc *desc, bool is_read)
+{
+   desc->flag = cpu_to_le16(HCLGE_CMD_FLAG_NO_INTR | HCLGE_CMD_FLAG_IN);
+   if (is_read)
+   desc->flag |= cpu_to_le16(HCLGE_CMD_FLAG_WR);
+   else
+   desc->flag &= cpu_to_le16(~HCLGE_CMD_FLAG_WR);
+}
+
+void hclge_cmd_setup_basic_desc(struct hclge_desc *desc,
+   enum hclge_opcode_type opcode, bool is_read)
+{
+   memset((void *)desc, 0, sizeof(struct hclge_desc));
+   desc->opcode = cpu_to_le16(opcode);
+   desc->flag = cpu_to_le16(HCLGE_CMD_FLAG_NO_INTR | HCLGE_CMD_FLAG_IN);
+
+   if (is_read)
+   desc->flag |= cpu_to_le16(HCLGE_CMD_FLAG_WR);
+   else
+   desc->flag &= cpu_to_le16(~HCLGE_CMD_FLAG_WR);
+}
+
+static void hclge_cmd_config_regs(struct hclge_cmq_ring *ring)
+{
+   dma_addr_t dma = ring->desc_dma_addr;
+   struct hclge_dev *hdev = ring->dev;
+   struct hclge_hw *hw = >hw;
+
+   if (ring->flag ==

Re: [PATCH v3 4/5] ACPI / boot: Not all platform require acpi_reduced_hw_init()

2017-07-22 Thread Rafael J. Wysocki

On Sun, Jul 23, 2017 at 12:13 AM, Andy Shevchenko
 wrote:
> On Sun, Jul 23, 2017 at 1:02 AM, Rafael J. Wysocki  wrote:
>> On Saturday, July 22, 2017 04:53:52 AM Andy Shevchenko wrote:
>>> On Sat, Jul 22, 2017 at 1:25 AM, Rafael J. Wysocki  
>>> wrote:
>>> > On Tuesday, July 18, 2017 06:04:19 PM Andy Shevchenko wrote:
>
>>> > I'd rather do it at the time when acpi_reduced_hw_init() actually needs 
>>> > to be
>>> > overridden by at least one platform.
>>>
>>> Do you mean as folded into some other patch or just as a preparatory
>>> patch in some future series?
>
>> Any of the above would work for me.
>
> Logically it should be a separate change as I can see (I have already
> locally prepared patches for that one platform).
>
> Thanks for review.
>
> P.S. Are you going to apply first 3 then from this series?

Yes, I am.

[PATCH V4 net-next 3/8] net: hns3: Add HNS3 IMP(Integrated Mgmt Proc) Cmd Interface Support

2017-07-22 Thread Salil Mehta

This patch adds the support of IMP (Integrated Management Processor)
command interface to the HNS3 driver.

Each PF/VF has support of CQP(Command Queue Pair) ring interface.
Each CQP consists of a send queue (CSQ) and a receive queue (CRQ).
There are various commands a PF/VF may support, like for Flow Table
manipulation, Device management, Packet buffer allocation, Forwarding,
VLANs config, Tunneling/Overlays etc.

This patch contains code to initialize the command queue, manage the
command queue descriptors and Rx/Tx protocol with the command processor
in the form of various commands/results and acknowledgements.

Signed-off-by: Daode Huang 
Signed-off-by: lipeng 
Signed-off-by: Salil Mehta 
Signed-off-by: Yisen Zhuang 
---
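For readers following the CSQ/CRQ ring handling described above, here is a
minimal user-space sketch of the free-slot calculation that
hclge_ring_space() in this patch performs. The struct and function names
(demo_ring, demo_ring_space) are hypothetical stand-ins, not the driver's
own types; only the arithmetic is taken from the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the driver's ring; only the fields used here. */
struct demo_ring {
	int desc_num;      /* total number of descriptor slots */
	int next_to_use;   /* producer index */
	int next_to_clean; /* consumer index */
};

/*
 * Number of free slots. One slot is always left unused so that a full
 * ring can be told apart from an empty one (both would otherwise have
 * next_to_use == next_to_clean).
 */
static int demo_ring_space(const struct demo_ring *ring)
{
	int used;

	assert(ring->desc_num > 0);
	used = (ring->next_to_use - ring->next_to_clean + ring->desc_num) %
	       ring->desc_num;

	return ring->desc_num - used - 1;
}
```

Adding desc_num before taking the modulus keeps the intermediate value
non-negative when the producer index has wrapped around and is smaller
than the consumer index.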
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c | 347 ++
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h | 742 +
 2 files changed, 1089 insertions(+)
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
new file mode 100644
index ..ec20ec4a5939
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
@@ -0,0 +1,347 @@
+/*
+ * Copyright (c) 2016~2017 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "hclge_cmd.h"
+#include "hnae3.h"
+#include "hclge_main.h"
+
+#define hclge_is_csq(ring) ((ring)->flag & HCLGE_TYPE_CSQ)
+#define hclge_ring_to_dma_dir(ring) (hclge_is_csq(ring) ? \
+   DMA_TO_DEVICE : DMA_FROM_DEVICE)
+#define cmq_ring_to_dev(ring)   (&(ring)->dev->pdev->dev)
+
+static int hclge_ring_space(struct hclge_cmq_ring *ring)
+{
+   int ntu = ring->next_to_use;
+   int ntc = ring->next_to_clean;
+   int used = (ntu - ntc + ring->desc_num) % ring->desc_num;
+
+   return ring->desc_num - used - 1;
+}
+
+static int hclge_alloc_cmd_desc(struct hclge_cmq_ring *ring)
+{
+   int size  = ring->desc_num * sizeof(struct hclge_desc);
+
+   ring->desc = kzalloc(size, GFP_KERNEL);
+   if (!ring->desc)
+   return -ENOMEM;
+
+   ring->desc_dma_addr = dma_map_single(cmq_ring_to_dev(ring), ring->desc,
+size, DMA_BIDIRECTIONAL);
+   if (dma_mapping_error(cmq_ring_to_dev(ring), ring->desc_dma_addr)) {
+   ring->desc_dma_addr = 0;
+   kfree(ring->desc);
+   ring->desc = NULL;
+   return -ENOMEM;
+   }
+
+   return 0;
+}
+
+static void hclge_free_cmd_desc(struct hclge_cmq_ring *ring)
+{
+   dma_unmap_single(cmq_ring_to_dev(ring), ring->desc_dma_addr,
+ring->desc_num * sizeof(ring->desc[0]),
+DMA_BIDIRECTIONAL);
+
+   ring->desc_dma_addr = 0;
+   kfree(ring->desc);
+   ring->desc = NULL;
+}
+
+static int hclge_init_cmd_queue(struct hclge_dev *hdev, int ring_type)
+{
+   struct hclge_hw *hw = &hdev->hw;
+   struct hclge_cmq_ring *ring =
+   (ring_type == HCLGE_TYPE_CSQ) ? &hw->cmq.csq : &hw->cmq.crq;
+   int ret;
+
+   ring->flag = ring_type;
+   ring->dev = hdev;
+
+   ret = hclge_alloc_cmd_desc(ring);
+   if (ret) {
+   dev_err(&hdev->pdev->dev, "descriptor %s alloc error %d\n",
+   (ring_type == HCLGE_TYPE_CSQ) ? "CSQ" : "CRQ", ret);
+   return ret;
+   }
+
+   ring->next_to_clean = 0;
+   ring->next_to_use = 0;
+
+   return 0;
+}
+
+void hclge_cmd_reuse_desc(struct hclge_desc *desc, bool is_read)
+{
+   desc->flag = cpu_to_le16(HCLGE_CMD_FLAG_NO_INTR | HCLGE_CMD_FLAG_IN);
+   if (is_read)
+   desc->flag |= cpu_to_le16(HCLGE_CMD_FLAG_WR);
+   else
+   desc->flag &= cpu_to_le16(~HCLGE_CMD_FLAG_WR);
+}
+
+void hclge_cmd_setup_basic_desc(struct hclge_desc *desc,
+   enum hclge_opcode_type opcode, bool is_read)
+{
+   memset((void *)desc, 0, sizeof(struct hclge_desc));
+   desc->opcode = cpu_to_le16(opcode);
+   desc->flag = cpu_to_le16(HCLGE_CMD_FLAG_NO_INTR | HCLGE_CMD_FLAG_IN);
+
+   if (is_read)
+   desc->flag |= cpu_to_le16(HCLGE_CMD_FLAG_WR);
+   else
+   desc->flag &= cpu_to_le16(~HCLGE_CMD_FLAG_WR);
+}
+
+static void hclge_cmd_config_regs(struct hclge_cmq_ring *ring)
+{
+   dma_addr_t dma = ring->desc_dma_addr;
+   struct hclge_dev *hdev = ring->dev;
+   struct hclge_hw *hw = &hdev->hw;
+
+   if (ring->flag == HCLGE_TYPE_CSQ) {
+   hclge_write_dev(hw, HCLGE_NIC_CSQ_BASEADDR_L_REG,
+

[PATCH V4 net-next 2/8] net: hns3: Add support of the HNAE3 framework

2017-07-22 Thread Salil Mehta

This patch adds support for the HNAE3 (Hisilicon Network
Acceleration Engine 3) framework to the HNS3 driver.

The framework lets clients such as ENET (the HNS3 Ethernet driver), RoCE
and user-space Ethernet drivers (like ODP etc.) register with HNAE3
devices and their associated operations.

Signed-off-by: Daode Huang 
Signed-off-by: lipeng 
Signed-off-by: Salil Mehta 
Signed-off-by: Yisen Zhuang 
---
Patch V4: Addressed following comments
  1. Andrew Lunn:
 https://lkml.org/lkml/2017/6/17/233
 https://lkml.org/lkml/2017/6/18/105
  2. Bo Yu:
 https://lkml.org/lkml/2017/6/18/112
  3. Stephen Hemminger:
 https://lkml.org/lkml/2017/6/19/778
Patch V3: Addressed below comments
  1. Andrew Lunn:
 https://lkml.org/lkml/2017/6/13/1025
Patch V2: No change
Patch V1: Initial Submit
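
As a rough illustration of the client/device pairing that the framework
enforces at registration time, the sketch below restates the rule from
hnae3_client_match() in this patch as a stand-alone function. The demo_*
enum and function names are hypothetical; the pairing itself (KNIC devices
accept KNIC and RoCE clients, UNIC devices accept only UNIC clients) is
what the patch implements.

```c
#include <stdbool.h>

/* Hypothetical mirrors of the client and device types used in the patch. */
enum demo_client_type { DEMO_CLIENT_KNIC, DEMO_CLIENT_UNIC, DEMO_CLIENT_ROCE };
enum demo_dev_type { DEMO_DEV_KNIC, DEMO_DEV_UNIC };

/* Return true when a client of type c may be instantiated on device type d. */
static bool demo_client_match(enum demo_client_type c, enum demo_dev_type d)
{
	switch (d) {
	case DEMO_DEV_KNIC:
		/* kernel NIC devices serve both the Ethernet and RoCE clients */
		return c == DEMO_CLIENT_KNIC || c == DEMO_CLIENT_ROCE;
	case DEMO_DEV_UNIC:
		/* user-space NIC devices serve only the user-space client */
		return c == DEMO_CLIENT_UNIC;
	}

	return false;
}
```

In the patch a match alone is not sufficient: the device must also have
its HNAE3_DEV_INITED_B flag set before the client is instantiated on it.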
---
 drivers/net/ethernet/hisilicon/hns3/hnae3.c | 319 
 drivers/net/ethernet/hisilicon/hns3/hnae3.h | 449 
 2 files changed, 768 insertions(+)
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hnae3.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hnae3.h

diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.c 
b/drivers/net/ethernet/hisilicon/hns3/hnae3.c
new file mode 100644
index ..7a11aaff0a23
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.c
@@ -0,0 +1,319 @@
+/*
+ * Copyright (c) 2016-2017 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+
+#include "hnae3.h"
+
+static LIST_HEAD(hnae3_ae_algo_list);
+static LIST_HEAD(hnae3_client_list);
+static LIST_HEAD(hnae3_ae_dev_list);
+
+/* we are keeping things simple and using single lock for all the
+ * list. This is a non-critical code so other updations, if happen
+ * in parallel, can wait.
+ */
+static DEFINE_MUTEX(hnae3_common_lock);
+
+static bool hnae3_client_match(enum hnae3_client_type client_type,
+  enum hnae3_dev_type dev_type)
+{
+   if (dev_type == HNAE3_DEV_KNIC) {
+   switch (client_type) {
+   case HNAE3_CLIENT_KNIC:
+   case HNAE3_CLIENT_ROCE:
+   return true;
+   default:
+   return false;
+   }
+   } else if (dev_type == HNAE3_DEV_UNIC) {
+   switch (client_type) {
+   case HNAE3_CLIENT_UNIC:
+   return true;
+   default:
+   return false;
+   }
+   } else {
+   return false;
+   }
+}
+
+static int hnae3_match_n_instantiate(struct hnae3_client *client,
+struct hnae3_ae_dev *ae_dev,
+bool is_reg, bool *matched)
+{
+   int ret;
+
+   *matched = false;
+
+   /* check if this client matches the type of ae_dev */
+   if (!(hnae3_client_match(client->type, ae_dev->dev_type) &&
+ hnae_get_bit(ae_dev->flag, HNAE3_DEV_INITED_B))) {
+   return 0;
+   }
+   /* there is a match of client and dev */
+   *matched = true;
+
+   if (!(ae_dev->ops && ae_dev->ops->init_client_instance &&
+ ae_dev->ops->uninit_client_instance)) {
+   dev_err(&ae_dev->pdev->dev,
+   "ae_dev or client init/uninit ops are null\n");
+   return -EOPNOTSUPP;
+   }
+
+   /* now, (un-)instantiate client by calling lower layer */
+   if (is_reg) {
+   ret = ae_dev->ops->init_client_instance(client, ae_dev);
+   if (ret)
+   dev_err(&ae_dev->pdev->dev,
+   "fail to instantiate client\n");
+   return ret;
+   }
+
+   ae_dev->ops->uninit_client_instance(client, ae_dev);
+   return 0;
+}
+
+int hnae3_register_client(struct hnae3_client *client)
+{
+   struct hnae3_client *client_tmp;
+   struct hnae3_ae_dev *ae_dev;
+   bool matched;
+   int ret = 0;
+
+   mutex_lock(&hnae3_common_lock);
+   /* one system should only have one client for every type */
+   list_for_each_entry(client_tmp, &hnae3_client_list, node) {
+   if (client_tmp->type == client->type)
+   goto exit;
+   }
+
+   list_add_tail(&client->node, &hnae3_client_list);
+
+   /* initialize the client on every matched port */
+   list_for_each_entry(ae_dev, &hnae3_ae_dev_list, node) {
+   /* if the client could not be initialized on current port, for
+* any error reasons, move on to next available port
+*/
+   ret = hnae3_match_n_instantiate(client, ae_dev, true, &matched);
+   if (ret)
+   dev_err(&ae_dev->pdev->dev,
+

Re: [PATCH v3 4/5] ACPI / boot: Not all platform require acpi_reduced_hw_init()

2017-07-22 Thread Andy Shevchenko

On Sun, Jul 23, 2017 at 1:02 AM, Rafael J. Wysocki  wrote:
> On Saturday, July 22, 2017 04:53:52 AM Andy Shevchenko wrote:
>> On Sat, Jul 22, 2017 at 1:25 AM, Rafael J. Wysocki  
>> wrote:
>> > On Tuesday, July 18, 2017 06:04:19 PM Andy Shevchenko wrote:

>> > I'd rather do it at the time when acpi_reduced_hw_init() actually needs to 
>> > be
>> > overridden by at least one platform.
>>
>> Do you mean as folded into some other patch or just as a preparatory
>> patch in some future series?

> Any of the above would work for me.

Logically it should be a separate change as I can see (I have already
locally prepared patches for that one platform).

Thanks for review.

P.S. Are you going to apply first 3 then from this series?

-- 
With Best Regards,
Andy Shevchenko

[PATCH V4 net-next 6/8] net: hns3: Add MDIO support to HNS3 Ethernet driver for hip08 SoC

2017-07-22 Thread Salil Mehta

This patch adds MDIO bus interface support to the HNS3 driver.
The code provides interfaces to start and stop the PHY layer
and to read and write the MDIO bus or PHY.

Signed-off-by: Daode Huang 
Signed-off-by: lipeng 
Signed-off-by: Salil Mehta 
Signed-off-by: Yisen Zhuang 
---
Patch V4: Addressed following comments:
 1. Andrew Lunn:
https://lkml.org/lkml/2017/6/17/208
Patch V3: Addressed Below comments:
 1. Florian Fainelli:
https://lkml.org/lkml/2017/6/13/963
 2. Andrew Lunn:
https://lkml.org/lkml/2017/6/13/1039
Patch V2: Addressed below comments:
 1. Florian Fainelli:
https://lkml.org/lkml/2017/6/10/130
 2. Andrew Lunn:
https://lkml.org/lkml/2017/6/10/168
Patch V1: Initial Submit
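
To make the control-byte layout used by the MDIO read/write commands
easier to follow, here is a small stand-alone sketch that packs the
START bit, the ST field and the OP field the same way the patch does
with hnae_set_bit()/hnae_set_field(). The bit positions and opcode
values come from the defines in this patch; the demo_* names and the
mask-then-or helper are assumptions for illustration, since the real
helpers live in hnae3.h and are not shown in this message.

```c
#include <stdint.h>

/* Bit layout copied from the HCLGE_MDIO_CTRL_* defines in the patch. */
#define DEMO_CTRL_START_B 0
#define DEMO_CTRL_ST_S    1
#define DEMO_CTRL_ST_M    (0x3 << DEMO_CTRL_ST_S)
#define DEMO_CTRL_OP_S    3
#define DEMO_CTRL_OP_M    (0x3 << DEMO_CTRL_OP_S)

enum demo_c22_op { DEMO_C22_WRITE = 1, DEMO_C22_READ = 2 };

/* Assumed helper: clear the masked field, then store val shifted into it. */
static uint8_t demo_set_field(uint8_t origin, unsigned int mask,
			      unsigned int shift, unsigned int val)
{
	origin &= (uint8_t)~mask;
	origin |= (uint8_t)((val << shift) & mask);
	return origin;
}

/* Build the control byte for a Clause 22 read or write request. */
static uint8_t demo_mdio_ctrl(enum demo_c22_op op)
{
	uint8_t ctrl = 0;

	ctrl = demo_set_field(ctrl, 1u << DEMO_CTRL_START_B,
			      DEMO_CTRL_START_B, 1);
	ctrl = demo_set_field(ctrl, DEMO_CTRL_ST_M, DEMO_CTRL_ST_S, 1);
	ctrl = demo_set_field(ctrl, DEMO_CTRL_OP_M, DEMO_CTRL_OP_S,
			      (unsigned int)op);
	return ctrl;
}
```

With these definitions a write request yields 0x0b (START | ST=1 | OP=1)
and a read request yields 0x13 (START | ST=1 | OP=2).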
---
 .../ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c| 230 +
 1 file changed, 230 insertions(+)
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
new file mode 100644
index ..6036a97f7de5
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
@@ -0,0 +1,230 @@
+/*
+ * Copyright (c) 2016~2017 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+#include 
+
+#include "hclge_cmd.h"
+#include "hclge_main.h"
+
+enum hclge_mdio_c22_op_seq {
+   HCLGE_MDIO_C22_WRITE = 1,
+   HCLGE_MDIO_C22_READ = 2
+};
+
+#define HCLGE_MDIO_CTRL_START_B   0
+#define HCLGE_MDIO_CTRL_ST_S   1
+#define HCLGE_MDIO_CTRL_ST_M   (0x3 << HCLGE_MDIO_CTRL_ST_S)
+#define HCLGE_MDIO_CTRL_OP_S   3
+#define HCLGE_MDIO_CTRL_OP_M   (0x3 << HCLGE_MDIO_CTRL_OP_S)
+
+#define HCLGE_MDIO_PHYID_S 0
+#define HCLGE_MDIO_PHYID_M (0x1f << HCLGE_MDIO_PHYID_S)
+
+#define HCLGE_MDIO_PHYREG_S   0
+#define HCLGE_MDIO_PHYREG_M   (0x1f << HCLGE_MDIO_PHYREG_S)
+
+#define HCLGE_MDIO_STA_B   0
+
+struct hclge_mdio_cfg_cmd {
+   u8 ctrl_bit;
+   u8 phyid;
+   u8 phyad;
+   u8 rsvd;
+   __le16 reserve;
+   __le16 data_wr;
+   __le16 data_rd;
+   __le16 sta;
+};
+
+static int hclge_mdio_write(struct mii_bus *bus, int phyid, int regnum,
+   u16 data)
+{
+   struct hclge_dev *hdev = (struct hclge_dev *)bus->priv;
+   struct hclge_mdio_cfg_cmd *mdio_cmd;
+   enum hclge_cmd_status status;
+   struct hclge_desc desc;
+
+   if (!bus)
+   return -EINVAL;
+
+   hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_MDIO_CONFIG, false);
+
+   mdio_cmd = (struct hclge_mdio_cfg_cmd *)desc.data;
+
+   hnae_set_field(mdio_cmd->phyid, HCLGE_MDIO_PHYID_M,
+  HCLGE_MDIO_PHYID_S, phyid);
+   hnae_set_field(mdio_cmd->phyad, HCLGE_MDIO_PHYREG_M,
+  HCLGE_MDIO_PHYREG_S, regnum);
+
+   hnae_set_bit(mdio_cmd->ctrl_bit, HCLGE_MDIO_CTRL_START_B, 1);
+   hnae_set_field(mdio_cmd->ctrl_bit, HCLGE_MDIO_CTRL_ST_M,
+  HCLGE_MDIO_CTRL_ST_S, 1);
+   hnae_set_field(mdio_cmd->ctrl_bit, HCLGE_MDIO_CTRL_OP_M,
+  HCLGE_MDIO_CTRL_OP_S, HCLGE_MDIO_C22_WRITE);
+
+   mdio_cmd->data_wr = cpu_to_le16(data);
+
+   status = hclge_cmd_send(&hdev->hw, &desc, 1);
+   if (status) {
+   dev_err(&hdev->pdev->dev,
+   "mdio write fail when sending cmd, status is %d.\n",
+   status);
+   return -EIO;
+   }
+
+   return 0;
+}
+
+static int hclge_mdio_read(struct mii_bus *bus, int phyid, int regnum)
+{
+   struct hclge_dev *hdev = (struct hclge_dev *)bus->priv;
+   struct hclge_mdio_cfg_cmd *mdio_cmd;
+   enum hclge_cmd_status status;
+   struct hclge_desc desc;
+
+   if (!bus)
+   return -EINVAL;
+
+   hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_MDIO_CONFIG, true);
+
+   mdio_cmd = (struct hclge_mdio_cfg_cmd *)desc.data;
+
+   hnae_set_field(mdio_cmd->phyid, HCLGE_MDIO_PHYID_M,
+  HCLGE_MDIO_PHYID_S, phyid);
+   hnae_set_field(mdio_cmd->phyad, HCLGE_MDIO_PHYREG_M,
+  HCLGE_MDIO_PHYREG_S, regnum);
+
+   hnae_set_bit(mdio_cmd->ctrl_bit, HCLGE_MDIO_CTRL_START_B, 1);
+   hnae_set_field(mdio_cmd->ctrl_bit, HCLGE_MDIO_CTRL_ST_M,
+  HCLGE_MDIO_CTRL_ST_S, 1);
+   hnae_set_field(mdio_cmd->ctrl_bit, HCLGE_MDIO_CTRL_OP_M,
+  HCLGE_MDIO_CTRL_OP_S, HCLGE_MDIO_C22_READ);
+
+   /* Read out phy data */
+   status = hclge_cmd_send(&hdev->hw, &desc, 1);
+   if (status) {
+   dev_err(&hdev->pdev->dev,
+   "mdio read fail when get data, status is %d.\n",
+

[PATCH V4 net-next 4/8] net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support

2017-07-22 Thread Salil Mehta

This patch adds support for the Hisilicon Network Subsystem Acceleration
Engine and the common operations to access it. This layer provides access
to the hardware configuration and hardware statistics. It is also
responsible for triggering the initialization of the PHY layer through
the MDIO layer below it.

Signed-off-by: Daode Huang 
Signed-off-by: lipeng 
Signed-off-by: Salil Mehta 
Signed-off-by: Yisen Zhuang 
---
Patch V4:
 1. removed register_client/unregister_client wrapper functions
 2. fixed naming inconsistencies: renamed the variable phy_dev to phydev
in some places
Patch V1: Initial Submit
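
The statistics tables in this patch pair each counter name with the
offset of its field inside a stats structure, so one generic loop can
read every counter. Below is a minimal user-space sketch of that
pattern. All demo_* names are hypothetical, and the reader uses memcpy()
rather than the pointer cast found in HCLGE_STATS_READ(); that is a
deliberate substitution to keep the example free of aliasing concerns,
not what the driver does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stats block; the driver's structures are much larger. */
struct demo_stats {
	uint64_t rx_pkts;
	uint64_t tx_pkts;
};

/* One table entry: the name reported to the user and the field offset. */
struct demo_stats_str {
	const char *name;
	size_t offset;
};

#define DEMO_FIELD_OFF(f) (offsetof(struct demo_stats, f))

static const struct demo_stats_str demo_strings[] = {
	{ "rx_pkts", DEMO_FIELD_OFF(rx_pkts) },
	{ "tx_pkts", DEMO_FIELD_OFF(tx_pkts) },
};

/* Read the 64-bit counter located offset bytes into the stats block p. */
static uint64_t demo_stats_read(const void *p, size_t offset)
{
	uint64_t v;

	memcpy(&v, (const uint8_t *)p + offset, sizeof(v));
	return v;
}
```

A caller would walk demo_strings and call demo_stats_read() with each
entry's offset, which is how a name/offset table avoids one hand-written
accessor per counter.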
---
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c| 4240 
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h|  494 +++
 2 files changed, 4734 insertions(+)
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
new file mode 100644
index ..fb28511ad4a1
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -0,0 +1,4240 @@
+/*
+ * Copyright (c) 2016-2017 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "hclge_cmd.h"
+#include "hclge_main.h"
+#include "hclge_tm.h"
+#include "hnae3.h"
+
+#define HCLGE_NAME "hclge"
+#define HCLGE_STATS_READ(p, offset) (*((u64 *)((u8 *)(p) + (offset))))
+#define HCLGE_MAC_STATS_FIELD_OFF(f) (offsetof(struct hclge_mac_stats, f))
+#define HCLGE_64BIT_STATS_FIELD_OFF(f) (offsetof(struct hclge_64_bit_stats, f))
+#define HCLGE_32BIT_STATS_FIELD_OFF(f) (offsetof(struct hclge_32_bit_stats, f))
+
+static int hclge_rss_init_hw(struct hclge_dev *hdev);
+static int hclge_set_mta_filter_mode(struct hclge_dev *hdev,
+enum hclge_mta_dmac_sel_type mta_mac_sel,
+bool enable);
+static int hclge_init_vlan_config(struct hclge_dev *hdev);
+
+struct hnae3_ae_algo ae_algo;
+
+static const struct pci_device_id ae_algo_pci_tbl[] = {
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_GE), 0},
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_25GE), 0},
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_25GE_RDMA), 0},
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_25GE_RDMA_MACSEC), 0},
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_50GE_RDMA), 0},
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_50GE_RDMA_MACSEC), 0},
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_100G_RDMA_MACSEC), 0},
+   /* Required last entry */
+   {0, }
+};
+
+static const struct pci_device_id roce_pci_tbl[] = {
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_25GE_RDMA), 0},
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_25GE_RDMA_MACSEC), 0},
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_50GE_RDMA), 0},
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_50GE_RDMA_MACSEC), 0},
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_100G_RDMA_MACSEC), 0},
+   /* Required last entry */
+   {0, }
+};
+
+static const char hns3_nic_test_strs[][ETH_GSTRING_LEN] = {
+   "MacLoopback test",
+   "Serdes Loopback test",
+   "PhyLoopback test"
+};
+
+static const struct hclge_comm_stats_str g_all_64bit_stats_string[] = {
+   {"igu_rx_oversize_pkt",
+   HCLGE_64BIT_STATS_FIELD_OFF(igu_rx_oversize_pkt)},
+   {"igu_rx_undersize_pkt",
+   HCLGE_64BIT_STATS_FIELD_OFF(igu_rx_undersize_pkt)},
+   {"igu_rx_out_all_pkt",
+   HCLGE_64BIT_STATS_FIELD_OFF(igu_rx_out_all_pkt)},
+   {"igu_rx_uni_pkt",
+   HCLGE_64BIT_STATS_FIELD_OFF(igu_rx_uni_pkt)},
+   {"igu_rx_multi_pkt",
+   HCLGE_64BIT_STATS_FIELD_OFF(igu_rx_multi_pkt)},
+   {"igu_rx_broad_pkt",
+   HCLGE_64BIT_STATS_FIELD_OFF(igu_rx_broad_pkt)},
+   {"egu_tx_out_all_pkt",
+   HCLGE_64BIT_STATS_FIELD_OFF(egu_tx_out_all_pkt)},
+   {"egu_tx_uni_pkt",
+   HCLGE_64BIT_STATS_FIELD_OFF(egu_tx_uni_pkt)},
+   {"egu_tx_multi_pkt",
+   HCLGE_64BIT_STATS_FIELD_OFF(egu_tx_multi_pkt)},
+   {"egu_tx_broad_pkt",
+   HCLGE_64BIT_STATS_FIELD_OFF(egu_tx_broad_pkt)},
+   {"ssu_ppp_mac_key_num",
+   HCLGE_64BIT_STATS_FIELD_OFF(ssu_ppp_mac_key_num)},
+   {"ssu_ppp_host_key_num",
+   HCLGE_64BIT_STATS_FIELD_OFF(ssu_ppp_host_key_num)},
+   {"ppp_ssu_mac_rlt_num",
+
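
The table above pairs each statistic's name with the byte offset of its field, and HCLGE_STATS_READ() turns a base pointer plus such an offset back into a 64-bit value. A minimal user-space sketch of that pattern follows. The struct, the table, and every demo_* name are hypothetical stand-ins for illustration, not part of the patch, and uint64_t/uint8_t replace the kernel's u64/u8.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct hclge_64_bit_stats. */
struct demo_stats {
	uint64_t rx_pkt;
	uint64_t tx_pkt;
	uint64_t rx_err;
};

/* Name/offset pair, the same shape as the entries in the table above. */
struct demo_stats_str {
	const char *desc;
	size_t offset;
};

#define DEMO_STATS_FIELD_OFF(f) (offsetof(struct demo_stats, f))

/* Read the 64-bit counter located 'offset' bytes into the object at p. */
#define DEMO_STATS_READ(p, offset) \
	(*((const uint64_t *)((const uint8_t *)(p) + (offset))))

static const struct demo_stats_str demo_table[] = {
	{"rx_pkt", DEMO_STATS_FIELD_OFF(rx_pkt)},
	{"tx_pkt", DEMO_STATS_FIELD_OFF(tx_pkt)},
	{"rx_err", DEMO_STATS_FIELD_OFF(rx_err)},
};

#define DEMO_TABLE_LEN (sizeof(demo_table) / sizeof(demo_table[0]))

/* Copy every counter named in demo_table into out[], in table order. */
static void demo_get_stats(const struct demo_stats *s, uint64_t *out)
{
	size_t i;

	for (i = 0; i < DEMO_TABLE_LEN; i++)
		out[i] = DEMO_STATS_READ(s, demo_table[i].offset);
}
```

With this layout, one loop can export any number of counters, and adding a statistic only requires a new table entry.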

[PATCH V4 net-next 7/8] net: hns3: Add Ethtool support to HNS3 driver

2017-07-22 Thread Salil Mehta

This patch adds Ethtool interface support to the HNS3 Ethernet
driver. Commands to read the statistics, configure the offloads,
run the loopback selftest, etc. are supported.

Signed-off-by: Daode Huang 
Signed-off-by: lipeng 
Signed-off-by: Salil Mehta 
Signed-off-by: Yisen Zhuang 
---
Patch V4: Addressed below comments
 1. Andrew Lunn
    Removed PHY loopback support for now
Patch V3: Addressed below comments
 1. Stephen Hemminger
https://lkml.org/lkml/2017/6/13/974
 2. Andrew Lunn
https://lkml.org/lkml/2017/6/13/1037
Patch V2: No change
Patch V1: Initial Submit
---
 .../ethernet/hisilicon/hns3/hns3pf/hns3_ethtool.c  | 543 +
 1 file changed, 543 insertions(+)
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_ethtool.c

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_ethtool.c
new file mode 100644
index ..82b0d4d829f8
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_ethtool.c
@@ -0,0 +1,543 @@
+/*
+ * Copyright (c) 2016~2017 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+#include "hns3_enet.h"
+
+struct hns3_stats {
+   char stats_string[ETH_GSTRING_LEN];
+   int stats_size;
+   int stats_offset;
+};
+
+/* netdev related stats */
+#define HNS3_NETDEV_STAT(_string, _member) \
+   { _string,  \
+ FIELD_SIZEOF(struct rtnl_link_stats64, _member),  \
+ offsetof(struct rtnl_link_stats64, _member),  \
+   }
+
+static const struct hns3_stats hns3_netdev_stats[] = {
+   /* misc. Rx/Tx statistics */
+   HNS3_NETDEV_STAT("rx_packets", rx_packets),
+   HNS3_NETDEV_STAT("tx_packets", tx_packets),
+   HNS3_NETDEV_STAT("rx_bytes", rx_bytes),
+   HNS3_NETDEV_STAT("tx_bytes", tx_bytes),
+   HNS3_NETDEV_STAT("rx_errors", rx_errors),
+   HNS3_NETDEV_STAT("tx_errors", tx_errors),
+   HNS3_NETDEV_STAT("rx_dropped", rx_dropped),
+   HNS3_NETDEV_STAT("tx_dropped", tx_dropped),
+   HNS3_NETDEV_STAT("multicast", multicast),
+   HNS3_NETDEV_STAT("collisions", collisions),
+
+   /* detailed Rx errors */
+   HNS3_NETDEV_STAT("rx_length_errors", rx_length_errors),
+   HNS3_NETDEV_STAT("rx_over_errors", rx_over_errors),
+   HNS3_NETDEV_STAT("rx_crc_errors", rx_crc_errors),
+   HNS3_NETDEV_STAT("rx_frame_errors", rx_frame_errors),
+   HNS3_NETDEV_STAT("rx_fifo_errors", rx_fifo_errors),
+   HNS3_NETDEV_STAT("rx_missed_errors", rx_missed_errors),
+
+   /* detailed Tx errors */
+   HNS3_NETDEV_STAT("tx_aborted_errors", tx_aborted_errors),
+   HNS3_NETDEV_STAT("tx_carrier_errors", tx_carrier_errors),
+   HNS3_NETDEV_STAT("tx_fifo_errors", tx_fifo_errors),
+   HNS3_NETDEV_STAT("tx_heartbeat_errors", tx_heartbeat_errors),
+   HNS3_NETDEV_STAT("tx_window_errors", tx_window_errors),
+
+   /* for cslip etc */
+   HNS3_NETDEV_STAT("rx_compressed", rx_compressed),
+   HNS3_NETDEV_STAT("tx_compressed", tx_compressed),
+};
+
+#define HNS3_NETDEV_STATS_COUNT ARRAY_SIZE(hns3_netdev_stats)
+
+/* tqp related stats */
+#define HNS3_TQP_STAT(_string, _member)\
+   { _string,  \
+ FIELD_SIZEOF(struct ring_stats, _member), \
+ offsetof(struct hns3_enet_ring, stats),   \
+   }
+
+static const struct hns3_stats hns3_txq_stats[] = {
+   /* Tx per-queue statistics */
+   HNS3_TQP_STAT("tx_io_err_cnt", io_err_cnt),
+   HNS3_TQP_STAT("tx_sw_err_cnt", sw_err_cnt),
+   HNS3_TQP_STAT("tx_seg_pkt_cnt", seg_pkt_cnt),
+   HNS3_TQP_STAT("tx_pkts", tx_pkts),
+   HNS3_TQP_STAT("tx_bytes", tx_bytes),
+   HNS3_TQP_STAT("tx_err_cnt", tx_err_cnt),
+   HNS3_TQP_STAT("tx_restart_queue", restart_queue),
+   HNS3_TQP_STAT("tx_busy", tx_busy),
+};
+
+#define HNS3_TXQ_STATS_COUNT ARRAY_SIZE(hns3_txq_stats)
+
+static const struct hns3_stats hns3_rxq_stats[] = {
+   /* Rx per-queue statistics */
+   HNS3_TQP_STAT("rx_io_err_cnt", io_err_cnt),
+   HNS3_TQP_STAT("rx_sw_err_cnt", sw_err_cnt),
+   HNS3_TQP_STAT("rx_seg_pkt_cnt", seg_pkt_cnt),
+   HNS3_TQP_STAT("rx_pkts", rx_pkts),
+   HNS3_TQP_STAT("rx_bytes", rx_bytes),
+   HNS3_TQP_STAT("rx_err_cnt", rx_err_cnt),
+   HNS3_TQP_STAT("rx_reuse_pg_cnt", reuse_pg_cnt),
+   HNS3_TQP_STAT("rx_err_pkt_len", err_pkt_len),
+   HNS3_TQP_STAT("rx_non_vld_descs", non_vld_descs),
+   HNS3_TQP_STAT("rx_err_bd_num", err_bd_num),
+
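
The hns3_stats entries above record both the size and the offset of each field, so a generic reader can handle counters of different widths. The sketch below shows one way such a descriptor might be consumed in user space. The struct layout, the demo_* names, and the widening to uint64_t are assumptions made for illustration and are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stats block mixing 32-bit and 64-bit counters. */
struct demo_link_stats {
	uint64_t rx_packets;
	uint32_t rx_errors;
	uint64_t tx_packets;
};

/* Descriptor: display name, field size and field offset. */
struct demo_stat_desc {
	const char *name;
	size_t size;
	size_t offset;
};

/* Same idea as the kernel's FIELD_SIZEOF(): size of a member without an object. */
#define DEMO_FIELD_SIZEOF(t, f) (sizeof(((t *)0)->f))

#define DEMO_STAT(_string, _member)					\
	{ _string,							\
	  DEMO_FIELD_SIZEOF(struct demo_link_stats, _member),		\
	  offsetof(struct demo_link_stats, _member),			\
	}

static const struct demo_stat_desc demo_descs[] = {
	DEMO_STAT("rx_packets", rx_packets),
	DEMO_STAT("rx_errors", rx_errors),
	DEMO_STAT("tx_packets", tx_packets),
};

/* Read one counter and widen it to 64 bits; unknown sizes read as 0. */
static uint64_t demo_read_stat(const void *base, const struct demo_stat_desc *d)
{
	const unsigned char *p = (const unsigned char *)base + d->offset;

	if (d->size == sizeof(uint64_t)) {
		uint64_t v;

		memcpy(&v, p, sizeof(v));
		return v;
	}
	if (d->size == sizeof(uint32_t)) {
		uint32_t v;

		memcpy(&v, p, sizeof(v));
		return v;
	}
	return 0;
}
```

Using memcpy() for the load keeps the reader independent of the field's alignment, at the cost of one branch per supported width.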

[PATCH V4 net-next 5/8] net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver

2017-07-22 Thread Salil Mehta

This patch adds the support of the Scheduling and Shaping
functionalities during the transmit leg. This also adds the
support of Pause at MAC level. (Pause at per-priority level
shall be added later along with the DCB feature).

Hardware as such consists of two types of configuration of 6 level
schedulers. Algorithms vary according to the level and type
of scheduler being used. Current patch is used to initialize
the mapping, algorithms (like SP, DWRR etc) and shaper (CIR, PIR etc)
being used.

Signed-off-by: Daode Huang 
Signed-off-by: lipeng 
Signed-off-by: Salil Mehta 
Signed-off-by: Yisen Zhuang 
---
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c  | 1018 
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.h  |  108 +++
 2 files changed, 1126 insertions(+)
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.h

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
new file mode 100644
index ..2b66a0e63aec
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
@@ -0,0 +1,1018 @@
+/*
+ * Copyright (c) 2016~2017 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+
+#include "hclge_cmd.h"
+#include "hclge_main.h"
+#include "hclge_tm.h"
+
+enum hclge_shaper_level {
+   HCLGE_SHAPER_LVL_PRI= 0,
+   HCLGE_SHAPER_LVL_PG = 1,
+   HCLGE_SHAPER_LVL_PORT   = 2,
+   HCLGE_SHAPER_LVL_QSET   = 3,
+   HCLGE_SHAPER_LVL_CNT= 4,
+   HCLGE_SHAPER_LVL_VF = 0,
+   HCLGE_SHAPER_LVL_PF = 1,
+};
+
+#define HCLGE_SHAPER_BS_U_DEF  1
+#define HCLGE_SHAPER_BS_S_DEF  4
+
+#define HCLGE_ETHER_MAX_RATE   10
+
+/* hclge_shaper_para_calc: calculate ir parameter for the shaper
+ * @ir: Rate to be config, its unit is Mbps
+ * @shaper_level: the shaper level. eg: port, pg, priority, queueset
+ * @ir_b: IR_B parameter of IR shaper
+ * @ir_u: IR_U parameter of IR shaper
+ * @ir_s: IR_S parameter of IR shaper
+ *
+ * the formula:
+ *
+ *              IR_b * (2 ^ IR_u) * 8
+ * IR(Mbps) = -------------------------  *  CLOCK(1000Mbps)
+ *              Tick * (2 ^ IR_s)
+ *
+ * @return: 0: calculate successful, negative: fail
+ */
+static int hclge_shaper_para_calc(u32 ir, u8 shaper_level,
+				  u8 *ir_b, u8 *ir_u, u8 *ir_s)
+{
+	const u16 tick_array[HCLGE_SHAPER_LVL_CNT] = {
+		6 * 256,	/* Priority level */
+		6 * 32,		/* Priority group level */
+		6 * 8,		/* Port level */
+		6 * 256		/* Qset level */
+	};
+	u8 ir_u_calc = 0, ir_s_calc = 0;
+	u32 ir_calc;
+	u32 tick;
+
+	/* Calc tick */
+	if (shaper_level >= HCLGE_SHAPER_LVL_CNT)
+		return -ENOMEM;
+
+	tick = tick_array[shaper_level];
+
+	/**
+	 * Calc the speed if ir_b = 126, ir_u = 0 and ir_s = 0
+	 * the formula is changed to:
+	 *             126 * 1 * 8
+	 * ir_calc = --------------- * 1000
+	 *              tick * 1
+	 */
+	ir_calc = (1008000 + (tick >> 1) - 1) / tick;
+
+	if (ir_calc == ir) {
+		*ir_b = 126;
+		*ir_u = 0;
+		*ir_s = 0;
+
+		return 0;
+	} else if (ir_calc > ir) {
+		/* Increasing the denominator to select ir_s value */
+		while (ir_calc > ir) {
+			ir_s_calc++;
+			ir_calc = 1008000 / (tick * (1 << ir_s_calc));
+		}
+
+		if (ir_calc == ir)
+			*ir_b = 126;
+		else
+			*ir_b = (ir * tick * (1 << ir_s_calc) + 4000) / 8000;
+	} else {
+		/* Increasing the numerator to select ir_u value */
+		u32 numerator;
+
+		while (ir_calc < ir) {
+			ir_u_calc++;
+			numerator = 1008000 * (1 << ir_u_calc);
+			ir_calc = (numerator + (tick >> 1)) / tick;
+		}
+
+		if (ir_calc == ir) {
+			*ir_b = 126;
+		} else {
+			u32 denominator = (8000 * (1 << --ir_u_calc));
+			*ir_b = (ir * tick + (denominator >> 1)) / denominator;
+		}
+	}
+
+	*ir_u = ir_u_calc;
+	*ir_s = ir_s_calc;
+
+	return 0;
+}
+
+static int hclge_mac_pause_en_cfg(struct hclge_dev *hdev, bool tx, bool rx)
+{
+   struct hclge_desc desc;
+
+
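
The shaper formula in the comment of hclge_shaper_para_calc() can be checked in the forward direction: given IR_b, IR_u, IR_s and the tick of a level, compute the resulting rate. The helper below is a hypothetical user-space sketch of that forward calculation only. Its name, the 64-bit intermediate, and the truncating division are assumptions, and it is not part of the patch.

```c
#include <stdint.h>

/*
 * Forward form of the shaper formula:
 *
 *   IR(Mbps) = (ir_b * 2^ir_u * 8) / (tick * 2^ir_s) * 1000
 *
 * The multiplication by 1000 is done before the division so that the
 * integer result is truncated only once. Returns 0 for arguments the
 * sketch does not handle (zero tick, shift counts of 32 or more).
 */
static uint64_t demo_shaper_rate_mbps(uint8_t ir_b, uint8_t ir_u,
				      uint8_t ir_s, uint32_t tick)
{
	uint64_t numerator;
	uint64_t denominator;

	if (tick == 0 || ir_u >= 32 || ir_s >= 32)
		return 0;

	numerator = (uint64_t)ir_b * ((uint64_t)1 << ir_u) * 8 * 1000;
	denominator = (uint64_t)tick * ((uint64_t)1 << ir_s);

	return numerator / denominator;
}
```

For the port level the tick is 6 * 8 = 48, so ir_b = 126 with ir_u = ir_s = 0 gives 1008000 / 48 = 21000 Mbps. Following the patch's search for a 10000 Mbps target at that level yields ir_s = 2 and ir_b = 240, and the forward formula confirms it: 240 * 8 * 1000 / (48 * 4) = 10000.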

[PATCH V4 net-next 8/8] net: hns3: Add HNS3 driver to kernel build framework & MAINTAINERS

2017-07-22 Thread Salil Mehta

This patch updates the MAINTAINERS file with the HNS3 Ethernet driver
maintainers' names and other details. It also introduces the new
Makefiles required to build the HNS3 Ethernet driver and updates
the existing Kconfig file in the hisilicon folder.

Signed-off-by: Salil Mehta 
---
Patch V3: Addressed below errors:
 1. Intel kbuild: https://lkml.org/lkml/2017/6/14/313
 2. Intel Kbuild: https://lkml.org/lkml/2017/6/14/636
Patch V2: No change
Patch V1: Initial Submit
---
 MAINTAINERS|  8 +++
 drivers/net/ethernet/hisilicon/Kconfig | 27 ++
 drivers/net/ethernet/hisilicon/Makefile|  1 +
 drivers/net/ethernet/hisilicon/hns3/Makefile   |  7 ++
 .../net/ethernet/hisilicon/hns3/hns3pf/Makefile| 11 +
 5 files changed, 54 insertions(+)
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/Makefile
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile

diff --git a/MAINTAINERS b/MAINTAINERS
index 297e610c9163..a22d5b86c2b7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6197,6 +6197,14 @@ S:   Maintained
 F: drivers/net/ethernet/hisilicon/
 F: Documentation/devicetree/bindings/net/hisilicon*.txt
 
+HISILICON NETWORK SUBSYSTEM 3 DRIVER (HNS3)
+M: Yisen Zhuang 
+M: Salil Mehta 
+L: net...@vger.kernel.org
+W: http://www.hisilicon.com
+S: Maintained
+F: drivers/net/ethernet/hisilicon/hns3/
+
 HISILICON ROCE DRIVER
 M: Lijun Ou 
 M: Wei Hu(Xavier) 
diff --git a/drivers/net/ethernet/hisilicon/Kconfig b/drivers/net/ethernet/hisilicon/Kconfig
index d11287e11371..9f8ea283c531 100644
--- a/drivers/net/ethernet/hisilicon/Kconfig
+++ b/drivers/net/ethernet/hisilicon/Kconfig
@@ -76,4 +76,31 @@ config HNS_ENET
  This selects the general ethernet driver for HNS.  This module make
  use of any HNS AE driver, such as HNS_DSAF
 
+config HNS3
+   tristate "Hisilicon Network Subsystem Support HNS3 (Framework)"
+depends on PCI
+   ---help---
+ This selects the framework support for Hisilicon Network Subsystem 3.
+ This layer facilitates clients like ENET, RoCE and user-space ethernet
+ drivers (like ODP) to register with HNAE devices and their associated
+ operations.
+
+config HNS3_HCLGE
+   tristate "Hisilicon HNS3 HCLGE Acceleration Engine & Compatibility Layer Support"
+depends on PCI_MSI
+   select HNS3
+   ---help---
+ This selects the HNS3_HCLGE network acceleration engine & its hardware
+ compatibility layer. The engine would be used in Hisilicon hip08 family of
+ SoCs and further upcoming SoCs.
+
+config HNS3_ENET
+   tristate "Hisilicon HNS3 Ethernet Device Support"
+depends on 64BIT && PCI
+   select HNS3
+   ---help---
+ This selects the Ethernet Driver for Hisilicon Network Subsystem 3 for hip08
+ family of SoCs. This module depends upon HNAE3 driver to access the HNAE3
+ devices and their associated operations.
+
 endif # NET_VENDOR_HISILICON
diff --git a/drivers/net/ethernet/hisilicon/Makefile b/drivers/net/ethernet/hisilicon/Makefile
index 8661695024dc..3828c435c18f 100644
--- a/drivers/net/ethernet/hisilicon/Makefile
+++ b/drivers/net/ethernet/hisilicon/Makefile
@@ -6,4 +6,5 @@ obj-$(CONFIG_HIX5HD2_GMAC) += hix5hd2_gmac.o
 obj-$(CONFIG_HIP04_ETH) += hip04_eth.o
 obj-$(CONFIG_HNS_MDIO) += hns_mdio.o
 obj-$(CONFIG_HNS) += hns/
+obj-$(CONFIG_HNS3) += hns3/
 obj-$(CONFIG_HISI_FEMAC) += hisi_femac.o
diff --git a/drivers/net/ethernet/hisilicon/hns3/Makefile b/drivers/net/ethernet/hisilicon/hns3/Makefile
new file mode 100644
index ..5e53735b2d4e
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns3/Makefile
@@ -0,0 +1,7 @@
+#
+# Makefile for the HISILICON network device drivers.
+#
+
+obj-$(CONFIG_HNS3) += hns3pf/
+
+obj-$(CONFIG_HNS3) +=hnae3.o
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile b/drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile
new file mode 100644
index ..c0a92b5690a9
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile
@@ -0,0 +1,11 @@
+#
+# Makefile for the HISILICON network device drivers.
+#
+
+ccflags-y := -Idrivers/net/ethernet/hisilicon/hns3
+
+obj-$(CONFIG_HNS3_HCLGE) += hclge.o
+hclge-objs =hclge_main.o hclge_cmd.o hclge_mdio.o hclge_tm.o
+
+obj-$(CONFIG_HNS3_ENET) += hns3.o
+hns3-objs = hns3_enet.o hns3_ethtool.o
-- 
2.11.0

[PATCH V4 net-next 8/8] net: hns3: Add HNS3 driver to kernel build framework & MAINTAINERS

2017-07-22 Thread Salil Mehta

This patch updates the MAINTAINERS file with HNS3 Ethernet driver
maintainers names and other details. This also introduces the new
Makefiles required to build the HNS3 Ethernet driver and updates
the existing Kconfig file in the hisilicon folder.

Signed-off-by: Salil Mehta 
---
Patch V3: Addressed below errors:
 1. Intel kbuild: https://lkml.org/lkml/2017/6/14/313
 2. Intel Kbuild: https://lkml.org/lkml/2017/6/14/636
Patch V2: No change
Patch V1: Initial Submit
---
 MAINTAINERS|  8 +++
 drivers/net/ethernet/hisilicon/Kconfig | 27 ++
 drivers/net/ethernet/hisilicon/Makefile|  1 +
 drivers/net/ethernet/hisilicon/hns3/Makefile   |  7 ++
 .../net/ethernet/hisilicon/hns3/hns3pf/Makefile| 11 +
 5 files changed, 54 insertions(+)
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/Makefile
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile

diff --git a/MAINTAINERS b/MAINTAINERS
index 297e610c9163..a22d5b86c2b7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6197,6 +6197,14 @@ S:   Maintained
 F: drivers/net/ethernet/hisilicon/
 F: Documentation/devicetree/bindings/net/hisilicon*.txt
 
+HISILICON NETWORK SUBSYSTEM 3 DRIVER (HNS3)
+M: Yisen Zhuang 
+M: Salil Mehta 
+L: net...@vger.kernel.org
+W: http://www.hisilicon.com
+S: Maintained
+F: drivers/net/ethernet/hisilicon/hns3/
+
 HISILICON ROCE DRIVER
 M: Lijun Ou 
 M: Wei Hu(Xavier) 
diff --git a/drivers/net/ethernet/hisilicon/Kconfig 
b/drivers/net/ethernet/hisilicon/Kconfig
index d11287e11371..9f8ea283c531 100644
--- a/drivers/net/ethernet/hisilicon/Kconfig
+++ b/drivers/net/ethernet/hisilicon/Kconfig
@@ -76,4 +76,31 @@ config HNS_ENET
  This selects the general ethernet driver for HNS.  This module make
  use of any HNS AE driver, such as HNS_DSAF
 
+config HNS3
+   tristate "Hisilicon Network Subsystem Support HNS3 (Framework)"
+depends on PCI
+   ---help---
+ This selects the framework support for Hisilicon Network Subsystem 3.
+ This layer facilitates clients like ENET, RoCE and user-space ethernet
+ drivers(like ODP)to register with HNAE devices and their associated
+ operations.
+
+config HNS3_HCLGE
+   tristate "Hisilicon HNS3 HCLGE Acceleration Engine & Compatibility 
Layer Support"
+depends on PCI_MSI
+   select HNS3
+   ---help---
+ This selects the HNS3_HCLGE network acceleration engine & its hardware
+ compatibility layer. The engine would be used in Hisilicon hip08 
family of
+ SoCs and further upcoming SoCs.
+
+config HNS3_ENET
+   tristate "Hisilicon HNS3 Ethernet Device Support"
+depends on 64BIT && PCI
+   select HNS3
+   ---help---
+ This selects the Ethernet Driver for Hisilicon Network Subsystem 3 
for hip08
+ family of SoCs. This module depends upon HNAE3 driver to access the 
HNAE3
+ devices and their associated operations.
+
 endif # NET_VENDOR_HISILICON
diff --git a/drivers/net/ethernet/hisilicon/Makefile 
b/drivers/net/ethernet/hisilicon/Makefile
index 8661695024dc..3828c435c18f 100644
--- a/drivers/net/ethernet/hisilicon/Makefile
+++ b/drivers/net/ethernet/hisilicon/Makefile
@@ -6,4 +6,5 @@ obj-$(CONFIG_HIX5HD2_GMAC) += hix5hd2_gmac.o
 obj-$(CONFIG_HIP04_ETH) += hip04_eth.o
 obj-$(CONFIG_HNS_MDIO) += hns_mdio.o
 obj-$(CONFIG_HNS) += hns/
+obj-$(CONFIG_HNS3) += hns3/
 obj-$(CONFIG_HISI_FEMAC) += hisi_femac.o
diff --git a/drivers/net/ethernet/hisilicon/hns3/Makefile b/drivers/net/ethernet/hisilicon/hns3/Makefile
new file mode 100644
index 000000000000..5e53735b2d4e
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns3/Makefile
@@ -0,0 +1,7 @@
+#
+# Makefile for the HISILICON network device drivers.
+#
+
+obj-$(CONFIG_HNS3) += hns3pf/
+
+obj-$(CONFIG_HNS3) +=hnae3.o
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile b/drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile
new file mode 100644
index 000000000000..c0a92b5690a9
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile
@@ -0,0 +1,11 @@
+#
+# Makefile for the HISILICON network device drivers.
+#
+
+ccflags-y := -Idrivers/net/ethernet/hisilicon/hns3
+
+obj-$(CONFIG_HNS3_HCLGE) += hclge.o
+hclge-objs =hclge_main.o hclge_cmd.o hclge_mdio.o hclge_tm.o
+
+obj-$(CONFIG_HNS3_ENET) += hns3.o
+hns3-objs = hns3_enet.o hns3_ethtool.o
-- 
2.11.0

Re: [PATCH v3 4/5] ACPI / boot: Not all platform require acpi_reduced_hw_init()

2017-07-22 Thread Rafael J. Wysocki

On Saturday, July 22, 2017 04:53:52 AM Andy Shevchenko wrote:
> On Sat, Jul 22, 2017 at 1:25 AM, Rafael J. Wysocki  wrote:
> > On Tuesday, July 18, 2017 06:04:19 PM Andy Shevchenko wrote:
> >> Some platforms might take care of legacy devices on their own.
> >> Let's allow them to do that by exporting a weak function.
> >>
> >> Signed-off-by: Andy Shevchenko 
> >
> > I'd rather do it at the time when acpi_reduced_hw_init() actually needs to be
> > overridden by at least one platform.
> 
> Do you mean as folded into some other patch or just as a preparatory
> patch in some future series?
> 
> 

Any of the above would work for me.

Thanks,
Rafael

[PATCH V4 net-next 1/8] net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC

2017-07-22 Thread Salil Mehta

This patch adds support for the Hisilicon Network Subsystem 3 (HNS3)
Ethernet driver for the hip08 family of SoCs.

This driver includes basic Rx/Tx functionality. It also includes
the client registration code for the HNAE3 (Hisilicon Network
Acceleration Engine 3) framework.

This work provides initial support for the hip08 SoC; features and
enhancements will be added incrementally.

Signed-off-by: Daode Huang 
Signed-off-by: lipeng 
Signed-off-by: Salil Mehta 
Signed-off-by: Yisen Zhuang 
---
Patch V4: addressed comments by:
  1. Andrew Lunn:
 https://lkml.org/lkml/2017/6/17/222
 https://lkml.org/lkml/2017/6/17/232
  2. Bo Yu:
 https://lkml.org/lkml/2017/6/18/110
 https://lkml.org/lkml/2017/6/18/115
Patch V3: Addressed below comments:
  1. Stephen Hemminger:
 https://lkml.org/lkml/2017/6/13/972
  2. Yuval Mintz:
 https://lkml.org/lkml/2017/6/14/151
Patch V2: Addressed below comments:
  1. Kbuild:
 https://lkml.org/lkml/2017/6/11/73
  2. Yuval Mintz:
 https://lkml.org/lkml/2017/6/10/78
Patch V1: Initial Submit
---
 .../net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.c | 2894 
 .../net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.h |  598 
 2 files changed, 3492 insertions(+)
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.h

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.c
new file mode 100644
index 000000000000..6e0e2967db42
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.c
@@ -0,0 +1,2894 @@
+/*
+ * Copyright (c) 2016~2017 Hisilicon Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "hnae3.h"
+#include "hns3_enet.h"
+
+const char hns3_driver_name[] = "hns3";
+static const char hns3_driver_string[] =
+   "Hisilicon Ethernet Network Driver for Hi162x Family";
+static const char hns3_copyright[] = "Copyright (c) 2017 Huawei Corporation.";
+static struct hnae3_client client;
+
+/* hns3_pci_tbl - PCI Device ID Table
+ *
+ * Last entry must be all 0s
+ *
+ * { Vendor ID, Device ID, SubVendor ID, SubDevice ID,
+ *   Class, Class Mask, private data (not used) }
+ */
+static const struct pci_device_id hns3_pci_tbl[] = {
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_GE), 0},
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_25GE), 0},
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_25GE_RDMA), 0},
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_25GE_RDMA_MACSEC), 0},
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_50GE_RDMA), 0},
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_50GE_RDMA_MACSEC), 0},
+   {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_100G_RDMA_MACSEC), 0},
+   /* required last entry */
+   {0, }
+};
+MODULE_DEVICE_TABLE(pci, hns3_pci_tbl);
+
+static irqreturn_t hns3_irq_handle(int irq, void *dev)
+{
+   struct hns3_enet_tqp_vector *tqp_vector = dev;
+
+   napi_schedule(&tqp_vector->napi);
+
+   return IRQ_HANDLED;
+}
+
+static int hns3_nic_init_irq(struct hns3_nic_priv *priv)
+{
+   struct pci_dev *pdev = priv->ae_handle->pdev;
+   struct hns3_enet_tqp_vector *tqp_vectors;
+   int txrx_int_idx = 0;
+   int rx_int_idx = 0;
+   int tx_int_idx = 0;
+   int ret;
+   int i;
+
+   for (i = 0; i < priv->vector_num; i++) {
+   tqp_vectors = &priv->tqp_vector[i];
+
+   if (tqp_vectors->irq_init_flag == HNS3_VECTOR_INITED)
+   continue;
+
+   if (tqp_vectors->tx_group.ring && tqp_vectors->rx_group.ring) {
+   snprintf(tqp_vectors->name, HNAE3_INT_NAME_LEN - 1,
+"%s-%s-%d", priv->netdev->name, "TxRx",
+txrx_int_idx++);
+   txrx_int_idx++;
+   } else if (tqp_vectors->rx_group.ring) {
+   snprintf(tqp_vectors->name, HNAE3_INT_NAME_LEN - 1,
+"%s-%s-%d", priv->netdev->name, "Rx",
+rx_int_idx++);
+   } else if (tqp_vectors->tx_group.ring) {
+   snprintf(tqp_vectors->name, HNAE3_INT_NAME_LEN - 1,
+"%s-%s-%d", priv->netdev->name, "Tx",
+tx_int_idx++);
+   } else {
+   /* Skip this unused q_vector */
+   continue;
+   }
+
+   tqp_vectors->name[HNAE3_INT_NAME_LEN - 1] = '\0';
+
+   ret = devm_request_irq(&pdev->dev, tqp_vectors->vector_irq,
+
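
The interrupt naming done in the hns3_nic_init_irq() excerpt above can be sketched in user space as follows. This is only an illustration, not driver code: struct vec, name_vectors() and INT_NAME_LEN are hypothetical stand-ins, and the sketch relies on snprintf() always NUL-terminating when it is given the full buffer size.

```c
#include <stddef.h>
#include <stdio.h>

#define INT_NAME_LEN 32 /* hypothetical stand-in for HNAE3_INT_NAME_LEN */

/* Hypothetical, reduced view of a vector: ring presence and its IRQ name. */
struct vec {
	int has_tx;
	int has_rx;
	char name[INT_NAME_LEN];
};

/*
 * Name every used vector "<dev>-<kind>-<index>", keeping one counter per
 * kind. Vectors without any ring are skipped. Returns how many were named.
 */
static int name_vectors(struct vec *v, size_t n, const char *dev)
{
	int txrx = 0, rx = 0, tx = 0, named = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (v[i].has_tx && v[i].has_rx)
			snprintf(v[i].name, sizeof(v[i].name), "%s-%s-%d",
				 dev, "TxRx", txrx++);
		else if (v[i].has_rx)
			snprintf(v[i].name, sizeof(v[i].name), "%s-%s-%d",
				 dev, "Rx", rx++);
		else if (v[i].has_tx)
			snprintf(v[i].name, sizeof(v[i].name), "%s-%s-%d",
				 dev, "Tx", tx++);
		else
			continue; /* unused vector: leave the name untouched */
		named++;
	}
	return named;
}
```

With four vectors on a device called eth0 (Tx+Rx, Rx only, Tx only, no rings), the first three would be named eth0-TxRx-0, eth0-Rx-0 and eth0-Tx-0, and the last one would be skipped. Note that the excerpt advances txrx_int_idx twice per TxRx vector (once inside the snprintf() call and once on the following line), whereas this sketch increments each counter once.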

[PATCH V4 net-next 0/8] Hisilicon Network Subsystem 3 Ethernet Driver

2017-07-22 Thread Salil Mehta

This patch set adds support for the HNS3 (Hisilicon Network Subsystem 3)
Ethernet driver for the hip08 family of SoCs and upcoming SoCs.

Hisilicon's new hip08 SoCs have integrated Ethernet based on PCI Express,
hence the need for a new driver alongside the previous HNS driver, which is
already part of the Linux mainline. This new driver is NOT backward
compatible with HNS.

The current driver is meant to control the Physical Function (PF). Support for
the Virtual Function will follow as a separate driver once this base PF
driver has been accepted. This driver is under ongoing development, and the
HNS3 Ethernet driver will be incrementally enhanced with new features.

High Level Architecture:

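    [ Ethtool ]
       ^  |
       |  |
  [Ethernet Client]  [ODP/UIO Client] . . . [ RoCE Client ]
          |                                        |
   [ HNAE Device ]                                 |
          |                                        |
 --------------------------------------------      |
          |                                        |
 [ HNAE3 Framework (Register/unregister) ]         |
          |                                        |
 --------------------------------------------      |
          |                                        |
   [ HCLGE Layer]                                  |
   _______|_________________________               |
   |      |                        |               |
[ MDIO ][ Scheduler/Shaper ]  [ Debugfs* ]         |
   |      |                        |               |
   |______|________________________|               |
          |                                        |
 [ IMP command Interface ]                         |
 --------------------------------------------      |
        HIP08  H A R D W A R E  *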


The current patch set broadly adds support for the following PF functionality:
 1. Basic Rx and Tx functionality
 2. TSO support
 3. Ethtool support
 4. * Debugfs support (this patch has been dropped for now)
 5. HNAE framework and hardware compatibility layer
 6. Scheduler and Shaper support in transmit function
 7. MDIO support
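
The register/unregister role of the HNAE3 framework layer in the architecture above can be sketched in plain user-space C as follows. This is a minimal illustration of the pattern only, not the hnae3 interface: every name here (struct framework, struct client, fw_register_client(), fw_unregister_client(), fw_attach_device(), count_init()) is hypothetical, and the fixed-size table is an assumption made to keep the example short.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CLIENTS 4 /* hypothetical limit, chosen for the sketch */

enum client_type { CLIENT_ENET, CLIENT_ROCE, CLIENT_UIO };

/* Hypothetical client: a name, a type and a per-device init callback. */
struct client {
	const char *name;
	enum client_type type;
	int (*init_instance)(void *handle);
};

/* Hypothetical framework state: a small table of registered clients. */
struct framework {
	struct client *clients[MAX_CLIENTS];
	size_t nr_clients;
};

/* Add a client; fails on bad input, duplicate name or a full table. */
static int fw_register_client(struct framework *fw, struct client *c)
{
	size_t i;

	if (!c || !c->name || fw->nr_clients == MAX_CLIENTS)
		return -1;
	for (i = 0; i < fw->nr_clients; i++)
		if (strcmp(fw->clients[i]->name, c->name) == 0)
			return -1;
	fw->clients[fw->nr_clients++] = c;
	return 0;
}

/* Remove a client by moving the last entry into its slot. */
static int fw_unregister_client(struct framework *fw, const struct client *c)
{
	size_t i;

	for (i = 0; i < fw->nr_clients; i++) {
		if (fw->clients[i] == c) {
			fw->clients[i] = fw->clients[--fw->nr_clients];
			fw->clients[fw->nr_clients] = NULL;
			return 0;
		}
	}
	return -1;
}

/* Offer a device handle to every client; returns how many accepted it. */
static int fw_attach_device(struct framework *fw, void *handle)
{
	int accepted = 0;
	size_t i;

	for (i = 0; i < fw->nr_clients; i++)
		if (fw->clients[i]->init_instance &&
		    fw->clients[i]->init_instance(handle) == 0)
			accepted++;
	return accepted;
}

/* Example callback: counts invocations through an int passed as handle. */
static int count_init(void *handle)
{
	(*(int *)handle)++;
	return 0;
}
```

In this sketch an Ethernet client and a RoCE client would each register once, and a newly attached device would then be offered to both through their init_instance callbacks. Keeping the clients behind one small registry is what lets the lower layer stay unaware of which upper-layer clients are present.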

Change Log:
V3->V4: Addressed the comments below:
* Andrew Lunn: Various comments on MDIO, ethtool, ENET driver, etc.
* Stephen Hemminger: Change access and update of 64-bit statistics
* Bo Yu: Some spelling mistakes and checkpatch.pl errors
V2->V3: Addressed comments:
* Yuval Mintz: Removal of redundant userprio-to-tc code
* Stephen Hemminger: Ethtool & interrupt enable
* Andrew Lunn: On C45/C22 PHY support, HNAE, ethtool
* Florian Fainelli: C45/C22 and phy_connect/attach
* Intel kbuild errors
V1->V2: Addressed some comments by kbuild, Yuval Mintz, Andrew Lunn &
Florian Fainelli in the following patches:
* Add support of HNS3 Ethernet Driver for hip08 SoC
* Add MDIO support to HNS3 Ethernet driver for hip08 SoC
* Add support of debugfs interface to HNS3 driver

Salil Mehta (8):
  net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC
  net: hns3: Add support of the HNAE3 framework
  net: hns3: Add HNS3 IMP(Integrated Mgmt Proc) Cmd Interface Support
  net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support
  net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver
  net: hns3: Add MDIO support to HNS3 Ethernet driver for hip08 SoC
  net: hns3: Add Ethtool support to HNS3 driver
  net: hns3: Add HNS3 driver to kernel build framework & MAINTAINERS

 MAINTAINERS                                        |    8 +
 drivers/net/ethernet/hisilicon/Kconfig             |   27 +
 drivers/net/ethernet/hisilicon/Makefile            |    1 +
 drivers/net/ethernet/hisilicon/hns3/Makefile       |    7 +
 drivers/net/ethernet/hisilicon/hns3/hnae3.c        |  319 ++
 drivers/net/ethernet/hisilicon/hns3/hnae3.h        |  449 +++
 .../net/ethernet/hisilicon/hns3/hns3pf/Makefile    |   11 +
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c |  347 ++
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h |  742
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    | 4240
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h    |  494 +++
 .../ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c    |  230 ++
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c  | 1018 +
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.h  |  108 +
 .../net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.c | 2894 +
 .../net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.h |  598 +++
 .../ethernet/hisilicon/hns3/hns3pf/hns3_ethtool.c  |  543 +++
 17 files changed, 12036 insertions(+)
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/Makefile
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hnae3.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hnae3.h
 create mode 100644
