date:20161205

Re: [PATCH v8 1/3] perf annotate: Show raw form for jump instruction with indirect target

2016-12-05 Thread Arnaldo Carvalho de Melo

Em Mon, Dec 05, 2016 at 05:21:42PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Dec 05, 2016 at 09:26:45PM +0530, Ravi Bangoria escreveu:
> > For jump instructions that does not include target address as direct
> > operand, show the original disassembled line for them. This is needed
> > for certain powerpc jump instructions that use target address in a
> > register (such as bctr, btar, ...).
> 
> Found it, .__bpf_prog_run, that is present in that perf.data file you
> sent me, has it, will use it in my committer notes for this patch.

So, I've added these committer notes while testing it, will continue
processing your patches later/tomorrow, thanks!

Committer notes:

Testing it using a perf.data file and vmlinux for powerpc64,
cross-annotating it on a x86_64 workstation:

Before:

  .__bpf_prog_run  vmlinux.powerpc
 │stdr10,512(r9)  ▒
 │lbzr9,0(r31)▒
 │rldicr r9,r9,3,60   ▒
 │ldxr9,r30,r9▒
 │mtctr  r9   ▒
  100.00 │  ↓ bctr   3fe01510 ▒
 │lwar10,4(r31)   ▒
 │lwzr9,0(r31)▒
  
  Invalid jump offset: 3fe01510

After:

  .__bpf_prog_run  vmlinux.powerpc
 │stdr10,512(r9)  ▒
 │lbzr9,0(r31)▒
 │rldicr r9,r9,3,60   ▒
 │ldxr9,r30,r9▒
 │mtctr  r9   ▒
  100.00 │  ↓ bctr▒
 │lwar10,4(r31)   ▒
 │lwzr9,0(r31)▒
  
  Invalid jump offset: 3fe01510

This, in turn, uncovers another problem with jumps without operands, the
ENTER/-> operation, to jump to the target, still continues using the bogus
target :-)

BTW, this was the file used for the above tests:

  [acme@jouet ravi_bangoria]$ perf report --header-only -i 
perf.data.f22vm.powerdev
  # 
  # captured on: Thu Nov 24 12:40:38 2016
  # hostname : pdev-f22-qemu
  # os release : 4.4.10-200.fc22.ppc64
  # perf version : 4.9.rc1.g6298ce
  # arch : ppc64
  # nrcpus online : 48
  # nrcpus avail : 48
  # cpudesc : POWER7 (architected), altivec supported
  # cpuid : 74,513
  # total memory : 4158976 kB
  # cmdline : /home/ravi/Workspace/linux/tools/perf/perf record -a
  # event : name = cycles:ppp, , size = 112, { sample_period, sample_freq } 
= 4000, sample_type = IP|TID|TIME|CPU|PERIOD, disabled = 1, inherit = 1, mmap = 
1, c
  # HEADER_CPU_TOPOLOGY info available, use -I to display
  # HEADER_NUMA_TOPOLOGY info available, use -I to display
  # pmu mappings: cpu = 4, software = 1, tracepoint = 2, breakpoint = 5
  # missing features: HEADER_TRACING_DATA HEADER_BRANCH_STACK 
HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT HEADER_CACHE
  # 
  #
  [acme@jouet ravi_bangoria]$

Suggested-by: Michael Ellerman 
Signed-off-by: Ravi Bangoria 
Tested-by: Arnaldo Carvalho de Melo 
 
> - Arnaldo
> 
> > 
> > Before:
> >  ld r12,32088(r12)
> >  mtctr  r12
> >   v  bctr   ca2c
> >  stdr2,24(r1)
> >  addis  r12,r2,-1
> > 
> > After:
> >  ld r12,32088(r12)
> >  mtctr  r12
> >   v  bctr
> >  stdr2,24(r1)
> >  addis  r12,r2,-1
> > 
> > Suggested-by: Michael Ellerman 
> > Signed-off-by: Ravi Bangoria 
> > ---
> > Changes in v8:
> >   - v7: https://lkml.org/lkml/2016/9/21/436
> >   - Rebase to acme/perf/core
> >   - No logical changes. (Cross arch annotate patches are in. This patch
> > is for hardening annotate for powerpc.)
> > 
> >  tools/perf/util/annotate.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> > index 4012b1d..ea7e0de 100644
> > --- a/tools/perf/util/annotate.c
> > +++ b/tools/perf/util/annotate.c
> > @@ -237,6 +237,9 @@ static int jump__parse(struct arch *arch 
> > __maybe_unused, struct ins_operands *op
> >  static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
> >struct ins_operands *ops)
> >  {
> > +   if (!ops->target.addr)
> > +   return ins__raw_scnprintf(ins, bf, size, ops);
> > +
> > return scnprintf(bf, size, "%-6.6s %" PRIx64, ins->name, 
> > ops->target.offset);
> >  }
> >  
> > -- 
> > 2.4.11

Re: [PATCH v8 1/3] perf annotate: Show raw form for jump instruction with indirect target

2016-12-05 Thread Arnaldo Carvalho de Melo

Em Mon, Dec 05, 2016 at 05:21:42PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Dec 05, 2016 at 09:26:45PM +0530, Ravi Bangoria escreveu:
> > For jump instructions that does not include target address as direct
> > operand, show the original disassembled line for them. This is needed
> > for certain powerpc jump instructions that use target address in a
> > register (such as bctr, btar, ...).
> 
> Found it, .__bpf_prog_run, that is present in that perf.data file you
> sent me, has it, will use it in my committer notes for this patch.

So, I've added these committer notes while testing it, will continue
processing your patches later/tomorrow, thanks!

Committer notes:

Testing it using a perf.data file and vmlinux for powerpc64,
cross-annotating it on a x86_64 workstation:

Before:

  .__bpf_prog_run  vmlinux.powerpc
 │stdr10,512(r9)  ▒
 │lbzr9,0(r31)▒
 │rldicr r9,r9,3,60   ▒
 │ldxr9,r30,r9▒
 │mtctr  r9   ▒
  100.00 │  ↓ bctr   3fe01510 ▒
 │lwar10,4(r31)   ▒
 │lwzr9,0(r31)▒
  
  Invalid jump offset: 3fe01510

After:

  .__bpf_prog_run  vmlinux.powerpc
 │stdr10,512(r9)  ▒
 │lbzr9,0(r31)▒
 │rldicr r9,r9,3,60   ▒
 │ldxr9,r30,r9▒
 │mtctr  r9   ▒
  100.00 │  ↓ bctr▒
 │lwar10,4(r31)   ▒
 │lwzr9,0(r31)▒
  
  Invalid jump offset: 3fe01510

This, in turn, uncovers another problem with jumps without operands, the
ENTER/-> operation, to jump to the target, still continues using the bogus
target :-)

BTW, this was the file used for the above tests:

  [acme@jouet ravi_bangoria]$ perf report --header-only -i 
perf.data.f22vm.powerdev
  # 
  # captured on: Thu Nov 24 12:40:38 2016
  # hostname : pdev-f22-qemu
  # os release : 4.4.10-200.fc22.ppc64
  # perf version : 4.9.rc1.g6298ce
  # arch : ppc64
  # nrcpus online : 48
  # nrcpus avail : 48
  # cpudesc : POWER7 (architected), altivec supported
  # cpuid : 74,513
  # total memory : 4158976 kB
  # cmdline : /home/ravi/Workspace/linux/tools/perf/perf record -a
  # event : name = cycles:ppp, , size = 112, { sample_period, sample_freq } 
= 4000, sample_type = IP|TID|TIME|CPU|PERIOD, disabled = 1, inherit = 1, mmap = 
1, c
  # HEADER_CPU_TOPOLOGY info available, use -I to display
  # HEADER_NUMA_TOPOLOGY info available, use -I to display
  # pmu mappings: cpu = 4, software = 1, tracepoint = 2, breakpoint = 5
  # missing features: HEADER_TRACING_DATA HEADER_BRANCH_STACK 
HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT HEADER_CACHE
  # 
  #
  [acme@jouet ravi_bangoria]$

Suggested-by: Michael Ellerman 
Signed-off-by: Ravi Bangoria 
Tested-by: Arnaldo Carvalho de Melo 
 
> - Arnaldo
> 
> > 
> > Before:
> >  ld r12,32088(r12)
> >  mtctr  r12
> >   v  bctr   ca2c
> >  stdr2,24(r1)
> >  addis  r12,r2,-1
> > 
> > After:
> >  ld r12,32088(r12)
> >  mtctr  r12
> >   v  bctr
> >  stdr2,24(r1)
> >  addis  r12,r2,-1
> > 
> > Suggested-by: Michael Ellerman 
> > Signed-off-by: Ravi Bangoria 
> > ---
> > Changes in v8:
> >   - v7: https://lkml.org/lkml/2016/9/21/436
> >   - Rebase to acme/perf/core
> >   - No logical changes. (Cross arch annotate patches are in. This patch
> > is for hardening annotate for powerpc.)
> > 
> >  tools/perf/util/annotate.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> > index 4012b1d..ea7e0de 100644
> > --- a/tools/perf/util/annotate.c
> > +++ b/tools/perf/util/annotate.c
> > @@ -237,6 +237,9 @@ static int jump__parse(struct arch *arch 
> > __maybe_unused, struct ins_operands *op
> >  static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
> >struct ins_operands *ops)
> >  {
> > +   if (!ops->target.addr)
> > +   return ins__raw_scnprintf(ins, bf, size, ops);
> > +
> > return scnprintf(bf, size, "%-6.6s %" PRIx64, ins->name, 
> > ops->target.offset);
> >  }
> >  
> > -- 
> > 2.4.11

[PATCH] drm/amdgpu: don't add files at control minor debugfs directory

2016-12-05 Thread Nicolai Stange

Since commit 8a357d10043c ("drm: Nerf DRM_CONTROL nodes"), a
struct drm_device's ->control member is always NULL.

In the case of CONFIG_DEBUG_FS=y, amdgpu_debugfs_add_files() accesses
->control->debugfs_root though. This results in a NULL pointer
dereference.

Fix this by omitting the drm_debugfs_create_files() call for the
control minor debugfs directory which is now non-existent anyway.

Fixes: 8a357d10043c ("drm: Nerf DRM_CONTROL nodes")
Signed-off-by: Nicolai Stange 
---
Applicable to next-20161202. Compile-only tested.

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index deee2db..0cb3e82 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2493,9 +2493,6 @@ int amdgpu_debugfs_add_files(struct amdgpu_device *adev,
adev->debugfs_count = i;
 #if defined(CONFIG_DEBUG_FS)
drm_debugfs_create_files(files, nfiles,
-adev->ddev->control->debugfs_root,
-adev->ddev->control);
-   drm_debugfs_create_files(files, nfiles,
 adev->ddev->primary->debugfs_root,
 adev->ddev->primary);
 #endif
@@ -2510,9 +2507,6 @@ static void amdgpu_debugfs_remove_files(struct 
amdgpu_device *adev)
for (i = 0; i < adev->debugfs_count; i++) {
drm_debugfs_remove_files(adev->debugfs[i].files,
 adev->debugfs[i].num_files,
-adev->ddev->control);
-   drm_debugfs_remove_files(adev->debugfs[i].files,
-adev->debugfs[i].num_files,
 adev->ddev->primary);
}
 #endif
-- 
2.10.2

[PATCH] drm/amdgpu: don't add files at control minor debugfs directory

2016-12-05 Thread Nicolai Stange

Since commit 8a357d10043c ("drm: Nerf DRM_CONTROL nodes"), a
struct drm_device's ->control member is always NULL.

In the case of CONFIG_DEBUG_FS=y, amdgpu_debugfs_add_files() accesses
->control->debugfs_root though. This results in a NULL pointer
dereference.

Fix this by omitting the drm_debugfs_create_files() call for the
control minor debugfs directory which is now non-existent anyway.

Fixes: 8a357d10043c ("drm: Nerf DRM_CONTROL nodes")
Signed-off-by: Nicolai Stange 
---
Applicable to next-20161202. Compile-only tested.

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index deee2db..0cb3e82 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2493,9 +2493,6 @@ int amdgpu_debugfs_add_files(struct amdgpu_device *adev,
adev->debugfs_count = i;
 #if defined(CONFIG_DEBUG_FS)
drm_debugfs_create_files(files, nfiles,
-adev->ddev->control->debugfs_root,
-adev->ddev->control);
-   drm_debugfs_create_files(files, nfiles,
 adev->ddev->primary->debugfs_root,
 adev->ddev->primary);
 #endif
@@ -2510,9 +2507,6 @@ static void amdgpu_debugfs_remove_files(struct 
amdgpu_device *adev)
for (i = 0; i < adev->debugfs_count; i++) {
drm_debugfs_remove_files(adev->debugfs[i].files,
 adev->debugfs[i].num_files,
-adev->ddev->control);
-   drm_debugfs_remove_files(adev->debugfs[i].files,
-adev->debugfs[i].num_files,
 adev->ddev->primary);
}
 #endif
-- 
2.10.2

Re: [PATCH] objtool: fix bytes check of lea's rex_prefix

2016-12-05 Thread Josh Poimboeuf

On Mon, Dec 05, 2016 at 11:55:51AM +0100, Jiri Slaby wrote:
> For the "lea %(rsp), %rbp" case, we check if there is a rex_prefix. But
> we check "bytes" which is insn_byte_t[4] in rex_prefix (insn_field
> structure). Therefore, the check is always true.
> 
> Instead, check nbytes which is the right one.
> 
> Signed-off-by: Jiri Slaby 
> Cc: Josh Poimboeuf 

Acked-by: Josh Poimboeuf

Re: [PATCH] objtool: fix bytes check of lea's rex_prefix

2016-12-05 Thread Josh Poimboeuf

On Mon, Dec 05, 2016 at 11:55:51AM +0100, Jiri Slaby wrote:
> For the "lea %(rsp), %rbp" case, we check if there is a rex_prefix. But
> we check "bytes" which is insn_byte_t[4] in rex_prefix (insn_field
> structure). Therefore, the check is always true.
> 
> Instead, check nbytes which is the right one.
> 
> Signed-off-by: Jiri Slaby 
> Cc: Josh Poimboeuf 

Acked-by: Josh Poimboeuf

Re: [PATCH v8 1/3] perf annotate: Show raw form for jump instruction with indirect target

2016-12-05 Thread Arnaldo Carvalho de Melo

Em Mon, Dec 05, 2016 at 09:26:45PM +0530, Ravi Bangoria escreveu:
> For jump instructions that does not include target address as direct
> operand, show the original disassembled line for them. This is needed
> for certain powerpc jump instructions that use target address in a
> register (such as bctr, btar, ...).

Found it, .__bpf_prog_run, that is present in that perf.data file you
sent me, has it, will use it in my committer notes for this patch.

- Arnaldo

> 
> Before:
>  ld r12,32088(r12)
>  mtctr  r12
>   v  bctr   ca2c
>  stdr2,24(r1)
>  addis  r12,r2,-1
> 
> After:
>  ld r12,32088(r12)
>  mtctr  r12
>   v  bctr
>  stdr2,24(r1)
>  addis  r12,r2,-1
> 
> Suggested-by: Michael Ellerman 
> Signed-off-by: Ravi Bangoria 
> ---
> Changes in v8:
>   - v7: https://lkml.org/lkml/2016/9/21/436
>   - Rebase to acme/perf/core
>   - No logical changes. (Cross arch annotate patches are in. This patch
> is for hardening annotate for powerpc.)
> 
>  tools/perf/util/annotate.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index 4012b1d..ea7e0de 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -237,6 +237,9 @@ static int jump__parse(struct arch *arch __maybe_unused, 
> struct ins_operands *op
>  static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
>  struct ins_operands *ops)
>  {
> + if (!ops->target.addr)
> + return ins__raw_scnprintf(ins, bf, size, ops);
> +
>   return scnprintf(bf, size, "%-6.6s %" PRIx64, ins->name, 
> ops->target.offset);
>  }
>  
> -- 
> 2.4.11

Re: [PATCH v8 1/3] perf annotate: Show raw form for jump instruction with indirect target

2016-12-05 Thread Arnaldo Carvalho de Melo

Em Mon, Dec 05, 2016 at 09:26:45PM +0530, Ravi Bangoria escreveu:
> For jump instructions that does not include target address as direct
> operand, show the original disassembled line for them. This is needed
> for certain powerpc jump instructions that use target address in a
> register (such as bctr, btar, ...).

Found it, .__bpf_prog_run, that is present in that perf.data file you
sent me, has it, will use it in my committer notes for this patch.

- Arnaldo

> 
> Before:
>  ld r12,32088(r12)
>  mtctr  r12
>   v  bctr   ca2c
>  stdr2,24(r1)
>  addis  r12,r2,-1
> 
> After:
>  ld r12,32088(r12)
>  mtctr  r12
>   v  bctr
>  stdr2,24(r1)
>  addis  r12,r2,-1
> 
> Suggested-by: Michael Ellerman 
> Signed-off-by: Ravi Bangoria 
> ---
> Changes in v8:
>   - v7: https://lkml.org/lkml/2016/9/21/436
>   - Rebase to acme/perf/core
>   - No logical changes. (Cross arch annotate patches are in. This patch
> is for hardening annotate for powerpc.)
> 
>  tools/perf/util/annotate.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index 4012b1d..ea7e0de 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -237,6 +237,9 @@ static int jump__parse(struct arch *arch __maybe_unused, 
> struct ins_operands *op
>  static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
>  struct ins_operands *ops)
>  {
> + if (!ops->target.addr)
> + return ins__raw_scnprintf(ins, bf, size, ops);
> +
>   return scnprintf(bf, size, "%-6.6s %" PRIx64, ins->name, 
> ops->target.offset);
>  }
>  
> -- 
> 2.4.11

Re: Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Christoph Hellwig

On Mon, Dec 05, 2016 at 12:46:14PM -0700, Jason Gunthorpe wrote:
> In any event the allocator still needs to track which regions are in
> use and be able to hook 'free' from userspace. That does suggest it
> should be integrated into the nvme driver and not a bolt on driver..

Two totally different use cases:

 - a card that exposes directly byte addressable storage as a PCI-e
   bar.  Thin of it as a nvdimm on a PCI-e card.  That's the iopmem
   case.
 - the NVMe CMB which exposes a byte addressable indirection buffer for
   I/O, but does not actually provide byte addressable persistent
   storage.  This is something that needs to be added to the NVMe driver
   (and the block layer for the abstraction probably).

[PATCH v4 00/13] net: ethernet: ti: cpts: update and fixes

2016-12-05 Thread Grygorii Strashko

It is preparation series intended to clean up and optimize TI CPTS driver to
facilitate further integration with other TI's SoCs like Keystone 2.

Changes in v4:
- fixed build error in patch
  "net: ethernet: ti: cpts: clean up event list if event pool is empty"
- rebased on top of net-next
 
Changes in v3:
- patches reordered: fixes and small updates moved first
- added comments in code about cpts->cc_mult
- conversation range (maxsec) limited to 10sec

Changes in v2:
- patch "net: ethernet: ti: cpts: rework initialization/deinitialization"
  was split on 4 patches
- applied comments from Richard Cochran
- dropped patch
  "net: ethernet: ti: cpts: add return value to tx and rx timestamp funcitons"
- new patches added:
  "net: ethernet: ti: cpts: drop excessive writes to CTRL and INT_EN regs"
  and "clocksource: export the clocks_calc_mult_shift to use by timestamp code"

Links on prev versions:
v3: https://www.spinics.net/lists/devicetree/msg153474.html
v2: http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1282034.html
v1: http://www.spinics.net/lists/linux-omap/msg131925.html

Grygorii Strashko (11):
  net: ethernet: ti: cpts: switch to readl/writel_relaxed()
  net: ethernet: ti: allow cpts to be built separately
  net: ethernet: ti: cpsw: minimize direct access to struct cpts
  net: ethernet: ti: cpts: fix unbalanced clk api usage in 
cpts_register/unregister
  net: ethernet: ti: cpts: fix registration order
  net: ethernet: ti: cpts: disable cpts when unregistered
  net: ethernet: ti: cpts: drop excessive writes to CTRL and INT_EN regs
  net: ethernet: ti: cpts: rework initialization/deinitialization
  net: ethernet: ti: cpts: move dt props parsing to cpts driver
  net: ethernet: ti: cpts: calc mult and shift from refclk freq
  net: ethernet: ti: cpts: fix overflow check period

Murali Karicheri (1):
  clocksource: export the clocks_calc_mult_shift to use by timestamp code

WingMan Kwok (1):
  net: ethernet: ti: cpts: clean up event list if event pool is empty

 Documentation/devicetree/bindings/net/cpsw.txt |   8 +-
 drivers/net/ethernet/ti/Kconfig|   2 +-
 drivers/net/ethernet/ti/Makefile   |   3 +-
 drivers/net/ethernet/ti/cpsw.c |  84 -
 drivers/net/ethernet/ti/cpsw.h |   2 -
 drivers/net/ethernet/ti/cpts.c | 239 +++--
 drivers/net/ethernet/ti/cpts.h |  80 -
 kernel/time/clocksource.c  |   1 +
 8 files changed, 304 insertions(+), 115 deletions(-)

-- 
2.10.1

Re: Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Christoph Hellwig

On Mon, Dec 05, 2016 at 12:46:14PM -0700, Jason Gunthorpe wrote:
> In any event the allocator still needs to track which regions are in
> use and be able to hook 'free' from userspace. That does suggest it
> should be integrated into the nvme driver and not a bolt on driver..

Two totally different use cases:

 - a card that exposes directly byte addressable storage as a PCI-e
   bar.  Thin of it as a nvdimm on a PCI-e card.  That's the iopmem
   case.
 - the NVMe CMB which exposes a byte addressable indirection buffer for
   I/O, but does not actually provide byte addressable persistent
   storage.  This is something that needs to be added to the NVMe driver
   (and the block layer for the abstraction probably).

[PATCH v4 00/13] net: ethernet: ti: cpts: update and fixes

2016-12-05 Thread Grygorii Strashko

It is preparation series intended to clean up and optimize TI CPTS driver to
facilitate further integration with other TI's SoCs like Keystone 2.

Changes in v4:
- fixed build error in patch
  "net: ethernet: ti: cpts: clean up event list if event pool is empty"
- rebased on top of net-next
 
Changes in v3:
- patches reordered: fixes and small updates moved first
- added comments in code about cpts->cc_mult
- conversation range (maxsec) limited to 10sec

Changes in v2:
- patch "net: ethernet: ti: cpts: rework initialization/deinitialization"
  was split on 4 patches
- applied comments from Richard Cochran
- dropped patch
  "net: ethernet: ti: cpts: add return value to tx and rx timestamp funcitons"
- new patches added:
  "net: ethernet: ti: cpts: drop excessive writes to CTRL and INT_EN regs"
  and "clocksource: export the clocks_calc_mult_shift to use by timestamp code"

Links on prev versions:
v3: https://www.spinics.net/lists/devicetree/msg153474.html
v2: http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1282034.html
v1: http://www.spinics.net/lists/linux-omap/msg131925.html

Grygorii Strashko (11):
  net: ethernet: ti: cpts: switch to readl/writel_relaxed()
  net: ethernet: ti: allow cpts to be built separately
  net: ethernet: ti: cpsw: minimize direct access to struct cpts
  net: ethernet: ti: cpts: fix unbalanced clk api usage in 
cpts_register/unregister
  net: ethernet: ti: cpts: fix registration order
  net: ethernet: ti: cpts: disable cpts when unregistered
  net: ethernet: ti: cpts: drop excessive writes to CTRL and INT_EN regs
  net: ethernet: ti: cpts: rework initialization/deinitialization
  net: ethernet: ti: cpts: move dt props parsing to cpts driver
  net: ethernet: ti: cpts: calc mult and shift from refclk freq
  net: ethernet: ti: cpts: fix overflow check period

Murali Karicheri (1):
  clocksource: export the clocks_calc_mult_shift to use by timestamp code

WingMan Kwok (1):
  net: ethernet: ti: cpts: clean up event list if event pool is empty

 Documentation/devicetree/bindings/net/cpsw.txt |   8 +-
 drivers/net/ethernet/ti/Kconfig|   2 +-
 drivers/net/ethernet/ti/Makefile   |   3 +-
 drivers/net/ethernet/ti/cpsw.c |  84 -
 drivers/net/ethernet/ti/cpsw.h |   2 -
 drivers/net/ethernet/ti/cpts.c | 239 +++--
 drivers/net/ethernet/ti/cpts.h |  80 -
 kernel/time/clocksource.c  |   1 +
 8 files changed, 304 insertions(+), 115 deletions(-)

-- 
2.10.1

[PATCH v4 13/13] net: ethernet: ti: cpts: fix overflow check period

2016-12-05 Thread Grygorii Strashko

The CPTS drivers uses 8sec period for overflow checking with
assumption that CPTS retclk will not exceed 500MHz. But that's not
true on some TI platforms (Kesytone 2). As result, it is possible that
CPTS counter will overflow more than once between two readings.

Hence, fix it by selecting overflow check period dynamically as
max_sec_before_overflow/2, where
 max_sec_before_overflow = max_counter_val / rftclk_freq.

Cc: John Stultz 
Cc: Thomas Gleixner 
Signed-off-by: Grygorii Strashko 
Acked-by: Richard Cochran 
---
 drivers/net/ethernet/ti/cpts.c | 10 +++---
 drivers/net/ethernet/ti/cpts.h |  4 +---
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 361d13a..a60d837 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -245,7 +245,7 @@ static void cpts_overflow_check(struct work_struct *work)
 
cpts_ptp_gettime(>info, );
pr_debug("cpts overflow check at %lld.%09lu\n", ts.tv_sec, ts.tv_nsec);
-   schedule_delayed_work(>overflow_work, CPTS_OVERFLOW_PERIOD);
+   schedule_delayed_work(>overflow_work, cpts->ov_check_period);
 }
 
 static int cpts_match(struct sk_buff *skb, unsigned int ptp_class,
@@ -382,8 +382,7 @@ int cpts_register(struct cpts *cpts)
}
cpts->phc_index = ptp_clock_index(cpts->clock);
 
-   schedule_delayed_work(>overflow_work, CPTS_OVERFLOW_PERIOD);
-
+   schedule_delayed_work(>overflow_work, cpts->ov_check_period);
return 0;
 
 err_ptp:
@@ -427,6 +426,11 @@ static void cpts_calc_mult_shift(struct cpts *cpts)
if (maxsec > 10)
maxsec = 10;
 
+   /* Calc overflow check period (maxsec / 2) */
+   cpts->ov_check_period = (HZ * maxsec) / 2;
+   dev_info(cpts->dev, "cpts: overflow check period %lu (jiffies)\n",
+cpts->ov_check_period);
+
if (cpts->cc_mult || cpts->cc.shift)
return;
 
diff --git a/drivers/net/ethernet/ti/cpts.h b/drivers/net/ethernet/ti/cpts.h
index 5da23af..c96eca2 100644
--- a/drivers/net/ethernet/ti/cpts.h
+++ b/drivers/net/ethernet/ti/cpts.h
@@ -97,9 +97,6 @@ enum {
CPTS_EV_TX,   /* Ethernet Transmit Event */
 };
 
-/* This covers any input clock up to about 500 MHz. */
-#define CPTS_OVERFLOW_PERIOD (HZ * 8)
-
 #define CPTS_FIFO_DEPTH 16
 #define CPTS_MAX_EVENTS 32
 
@@ -127,6 +124,7 @@ struct cpts {
struct list_head events;
struct list_head pool;
struct cpts_event pool_data[CPTS_MAX_EVENTS];
+   unsigned long ov_check_period;
 };
 
 void cpts_rx_timestamp(struct cpts *cpts, struct sk_buff *skb);
-- 
2.10.1

Re: bio linked list corruption.

2016-12-05 Thread Vegard Nossum

On 5 December 2016 at 20:11, Vegard Nossum  wrote:
> On 5 December 2016 at 18:55, Linus Torvalds
>  wrote:
>> On Mon, Dec 5, 2016 at 9:09 AM, Vegard Nossum  
>> wrote:
>> Since you apparently can recreate this fairly easily, how about trying
>> this stupid patch?
>>
>> NOTE! This is entirely untested. I may have screwed this up entirely.
>> You get the idea, though - just remove the wait queue head from the
>> list - the list entries stay around, but nothing points to the stack
>> entry (that we're going to free) any more.
>>
>> And add the warning to see if this actually ever triggers (and because
>> I'd like to see the callchain when it does, to see if it's another
>> waitqueue somewhere or what..)
>
> [ cut here ]
> WARNING: CPU: 22 PID: 14012 at mm/shmem.c:2668 shmem_fallocate+0x9a7/0xac0
> Kernel panic - not syncing: panic_on_warn set ...

So I noticed that panic_on_warn just after sending the email and I've
been waiting for it it to trigger again.

The warning has triggered twice more without panic_on_warn set and I
haven't seen any crash yet.


Vegard

[PATCH v4 13/13] net: ethernet: ti: cpts: fix overflow check period

2016-12-05 Thread Grygorii Strashko

The CPTS drivers uses 8sec period for overflow checking with
assumption that CPTS retclk will not exceed 500MHz. But that's not
true on some TI platforms (Kesytone 2). As result, it is possible that
CPTS counter will overflow more than once between two readings.

Hence, fix it by selecting overflow check period dynamically as
max_sec_before_overflow/2, where
 max_sec_before_overflow = max_counter_val / rftclk_freq.

Cc: John Stultz 
Cc: Thomas Gleixner 
Signed-off-by: Grygorii Strashko 
Acked-by: Richard Cochran 
---
 drivers/net/ethernet/ti/cpts.c | 10 +++---
 drivers/net/ethernet/ti/cpts.h |  4 +---
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 361d13a..a60d837 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -245,7 +245,7 @@ static void cpts_overflow_check(struct work_struct *work)
 
cpts_ptp_gettime(>info, );
pr_debug("cpts overflow check at %lld.%09lu\n", ts.tv_sec, ts.tv_nsec);
-   schedule_delayed_work(>overflow_work, CPTS_OVERFLOW_PERIOD);
+   schedule_delayed_work(>overflow_work, cpts->ov_check_period);
 }
 
 static int cpts_match(struct sk_buff *skb, unsigned int ptp_class,
@@ -382,8 +382,7 @@ int cpts_register(struct cpts *cpts)
}
cpts->phc_index = ptp_clock_index(cpts->clock);
 
-   schedule_delayed_work(>overflow_work, CPTS_OVERFLOW_PERIOD);
-
+   schedule_delayed_work(>overflow_work, cpts->ov_check_period);
return 0;
 
 err_ptp:
@@ -427,6 +426,11 @@ static void cpts_calc_mult_shift(struct cpts *cpts)
if (maxsec > 10)
maxsec = 10;
 
+   /* Calc overflow check period (maxsec / 2) */
+   cpts->ov_check_period = (HZ * maxsec) / 2;
+   dev_info(cpts->dev, "cpts: overflow check period %lu (jiffies)\n",
+cpts->ov_check_period);
+
if (cpts->cc_mult || cpts->cc.shift)
return;
 
diff --git a/drivers/net/ethernet/ti/cpts.h b/drivers/net/ethernet/ti/cpts.h
index 5da23af..c96eca2 100644
--- a/drivers/net/ethernet/ti/cpts.h
+++ b/drivers/net/ethernet/ti/cpts.h
@@ -97,9 +97,6 @@ enum {
CPTS_EV_TX,   /* Ethernet Transmit Event */
 };
 
-/* This covers any input clock up to about 500 MHz. */
-#define CPTS_OVERFLOW_PERIOD (HZ * 8)
-
 #define CPTS_FIFO_DEPTH 16
 #define CPTS_MAX_EVENTS 32
 
@@ -127,6 +124,7 @@ struct cpts {
struct list_head events;
struct list_head pool;
struct cpts_event pool_data[CPTS_MAX_EVENTS];
+   unsigned long ov_check_period;
 };
 
 void cpts_rx_timestamp(struct cpts *cpts, struct sk_buff *skb);
-- 
2.10.1

Re: bio linked list corruption.

2016-12-05 Thread Vegard Nossum

On 5 December 2016 at 20:11, Vegard Nossum  wrote:
> On 5 December 2016 at 18:55, Linus Torvalds
>  wrote:
>> On Mon, Dec 5, 2016 at 9:09 AM, Vegard Nossum  
>> wrote:
>> Since you apparently can recreate this fairly easily, how about trying
>> this stupid patch?
>>
>> NOTE! This is entirely untested. I may have screwed this up entirely.
>> You get the idea, though - just remove the wait queue head from the
>> list - the list entries stay around, but nothing points to the stack
>> entry (that we're going to free) any more.
>>
>> And add the warning to see if this actually ever triggers (and because
>> I'd like to see the callchain when it does, to see if it's another
>> waitqueue somewhere or what..)
>
> [ cut here ]
> WARNING: CPU: 22 PID: 14012 at mm/shmem.c:2668 shmem_fallocate+0x9a7/0xac0
> Kernel panic - not syncing: panic_on_warn set ...

So I noticed that panic_on_warn just after sending the email and I've
been waiting for it it to trigger again.

The warning has triggered twice more without panic_on_warn set and I
haven't seen any crash yet.


Vegard

Re: [mm PATCH 0/3] Page fragment updates

2016-12-05 Thread Andrew Morton

On Mon, 5 Dec 2016 09:01:12 -0800 Alexander Duyck  
wrote:

> On Tue, Nov 29, 2016 at 10:23 AM, Alexander Duyck
>  wrote:
> > This patch series takes care of a few cleanups for the page fragments API.
> >
> > ...
> 
> It's been about a week since I submitted this series.  Just wanted to
> check in and see if anyone had any feedback or if this is good to be
> accepted for 4.10-rc1 with the rest of the set?

Looks good to me.  I have it all queued for post-4.9 processing.

Re: [mm PATCH 0/3] Page fragment updates

2016-12-05 Thread Andrew Morton

On Mon, 5 Dec 2016 09:01:12 -0800 Alexander Duyck  
wrote:

> On Tue, Nov 29, 2016 at 10:23 AM, Alexander Duyck
>  wrote:
> > This patch series takes care of a few cleanups for the page fragments API.
> >
> > ...
> 
> It's been about a week since I submitted this series.  Just wanted to
> check in and see if anyone had any feedback or if this is good to be
> accepted for 4.10-rc1 with the rest of the set?

Looks good to me.  I have it all queued for post-4.9 processing.

Re: [PATCH V3 net-next] net: hns: Fix to conditionally convey RX checksum flag to stack

2016-12-05 Thread kbuild test robot

Hi Salil,

[auto build test WARNING on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Salil-Mehta/net-hns-Fix-to-conditionally-convey-RX-checksum-flag-to-stack/20161206-022948
config: i386-allmodconfig (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

All warnings (new ones prefixed by >>):

   drivers/net/ethernet/hisilicon/hns/hns_enet.c: In function 
'hns_nic_rx_poll_one':
>> drivers/net/ethernet/hisilicon/hns/hns_enet.c:606:5: warning: 'l3id' may be 
>> used uninitialized in this function [-Wmaybe-uninitialized]
 if ((l3id != HNS_RX_FLAG_L3ID_IPV4) && (l3id != HNS_RX_FLAG_L3ID_IPV6))
^
   drivers/net/ethernet/hisilicon/hns/hns_enet.c:573:6: note: 'l3id' was 
declared here
 u32 l3id;
 ^~~~
>> drivers/net/ethernet/hisilicon/hns/hns_enet.c:618:37: warning: 'l4id' may be 
>> used uninitialized in this function [-Wmaybe-uninitialized]
 if ((l4id != HNS_RX_FLAG_L4ID_TCP) &&
 ~~~^~
 (l4id != HNS_RX_FLAG_L4ID_UDP) &&
 ~~  
   drivers/net/ethernet/hisilicon/hns/hns_enet.c:574:6: note: 'l4id' was 
declared here
 u32 l4id;
 ^~~~

vim +/l3id +606 drivers/net/ethernet/hisilicon/hns/hns_enet.c

   600   * checksum or any other L3/L4 error, we will not (cannot) 
convey
   601   * checksum status for such cases to upper stack and will not 
maintain
   602   * the RX L3/L4 checksum counters as well.
   603   */
   604  
   605  /*  check L3 protocol for which checksum is supported */
 > 606  if ((l3id != HNS_RX_FLAG_L3ID_IPV4) && (l3id != 
 > HNS_RX_FLAG_L3ID_IPV6))
   607  return;
   608  
   609  /* check for any(not just checksum)flagged L3 protocol errors */
   610  if (unlikely(hnae_get_bit(flag, HNS_RXD_L3E_B)))
   611  return;
   612  
   613  /* we do not support checksum of fragmented packets */
   614  if (unlikely(hnae_get_bit(flag, HNS_RXD_FRAG_B)))
   615  return;
   616  
   617  /*  check L4 protocol for which checksum is supported */
 > 618  if ((l4id != HNS_RX_FLAG_L4ID_TCP) &&
   619  (l4id != HNS_RX_FLAG_L4ID_UDP) &&
   620  (l4id != HNS_RX_FLAG_L4ID_SCTP))
   621  return;

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip

Re: [PATCH V3 net-next] net: hns: Fix to conditionally convey RX checksum flag to stack

2016-12-05 Thread kbuild test robot

Hi Salil,

[auto build test WARNING on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Salil-Mehta/net-hns-Fix-to-conditionally-convey-RX-checksum-flag-to-stack/20161206-022948
config: i386-allmodconfig (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

All warnings (new ones prefixed by >>):

   drivers/net/ethernet/hisilicon/hns/hns_enet.c: In function 
'hns_nic_rx_poll_one':
>> drivers/net/ethernet/hisilicon/hns/hns_enet.c:606:5: warning: 'l3id' may be 
>> used uninitialized in this function [-Wmaybe-uninitialized]
 if ((l3id != HNS_RX_FLAG_L3ID_IPV4) && (l3id != HNS_RX_FLAG_L3ID_IPV6))
^
   drivers/net/ethernet/hisilicon/hns/hns_enet.c:573:6: note: 'l3id' was 
declared here
 u32 l3id;
 ^~~~
>> drivers/net/ethernet/hisilicon/hns/hns_enet.c:618:37: warning: 'l4id' may be 
>> used uninitialized in this function [-Wmaybe-uninitialized]
 if ((l4id != HNS_RX_FLAG_L4ID_TCP) &&
 ~~~^~
 (l4id != HNS_RX_FLAG_L4ID_UDP) &&
 ~~  
   drivers/net/ethernet/hisilicon/hns/hns_enet.c:574:6: note: 'l4id' was 
declared here
 u32 l4id;
 ^~~~

vim +/l3id +606 drivers/net/ethernet/hisilicon/hns/hns_enet.c

   600   * checksum or any other L3/L4 error, we will not (cannot) 
convey
   601   * checksum status for such cases to upper stack and will not 
maintain
   602   * the RX L3/L4 checksum counters as well.
   603   */
   604  
   605  /*  check L3 protocol for which checksum is supported */
 > 606  if ((l3id != HNS_RX_FLAG_L3ID_IPV4) && (l3id != 
 > HNS_RX_FLAG_L3ID_IPV6))
   607  return;
   608  
   609  /* check for any(not just checksum)flagged L3 protocol errors */
   610  if (unlikely(hnae_get_bit(flag, HNS_RXD_L3E_B)))
   611  return;
   612  
   613  /* we do not support checksum of fragmented packets */
   614  if (unlikely(hnae_get_bit(flag, HNS_RXD_FRAG_B)))
   615  return;
   616  
   617  /*  check L4 protocol for which checksum is supported */
 > 618  if ((l4id != HNS_RX_FLAG_L4ID_TCP) &&
   619  (l4id != HNS_RX_FLAG_L4ID_UDP) &&
   620  (l4id != HNS_RX_FLAG_L4ID_SCTP))
   621  return;

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip

[PATCH v4 02/13] net: ethernet: ti: allow cpts to be built separately

2016-12-05 Thread Grygorii Strashko

TI CPTS IP is used as part of TI OMAP CPSW driver, but it's also
present as part of NETCP on TI Keystone 2 SoCs. So, It's required
to enable build of CPTS for both this drivers and this can be
achieved by allowing CPTS to be built separately.

Hence, allow cpts to be built separately and convert it to be
a module as both CPSW and NETCP drives can be built as modules.

Signed-off-by: Grygorii Strashko 
---
 drivers/net/ethernet/ti/Kconfig  |  2 +-
 drivers/net/ethernet/ti/Makefile |  3 ++-
 drivers/net/ethernet/ti/cpsw.c   | 22 +-
 drivers/net/ethernet/ti/cpts.c   | 16 
 drivers/net/ethernet/ti/cpts.h   | 18 ++
 5 files changed, 42 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig
index 9904d74..ff7f518 100644
--- a/drivers/net/ethernet/ti/Kconfig
+++ b/drivers/net/ethernet/ti/Kconfig
@@ -74,7 +74,7 @@ config TI_CPSW
  will be called cpsw.
 
 config TI_CPTS
-   bool "TI Common Platform Time Sync (CPTS) Support"
+   tristate "TI Common Platform Time Sync (CPTS) Support"
depends on TI_CPSW
select PTP_1588_CLOCK
---help---
diff --git a/drivers/net/ethernet/ti/Makefile b/drivers/net/ethernet/ti/Makefile
index d420d94..1e7c10b 100644
--- a/drivers/net/ethernet/ti/Makefile
+++ b/drivers/net/ethernet/ti/Makefile
@@ -12,8 +12,9 @@ obj-$(CONFIG_TI_DAVINCI_MDIO) += davinci_mdio.o
 obj-$(CONFIG_TI_DAVINCI_CPDMA) += davinci_cpdma.o
 obj-$(CONFIG_TI_CPSW_PHY_SEL) += cpsw-phy-sel.o
 obj-$(CONFIG_TI_CPSW_ALE) += cpsw_ale.o
+obj-$(CONFIG_TI_CPTS) += cpts.o
 obj-$(CONFIG_TI_CPSW) += ti_cpsw.o
-ti_cpsw-y := cpsw.o cpts.o
+ti_cpsw-y := cpsw.o
 
 obj-$(CONFIG_TI_KEYSTONE_NETCP) += keystone_netcp.o
 keystone_netcp-y := netcp_core.o
diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 3f96c57..323174d 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1594,7 +1594,7 @@ static netdev_tx_t cpsw_ndo_start_xmit(struct sk_buff 
*skb,
return NETDEV_TX_BUSY;
 }
 
-#ifdef CONFIG_TI_CPTS
+#if IS_ENABLED(CONFIG_TI_CPTS)
 
 static void cpsw_hwtstamp_v1(struct cpsw_common *cpsw)
 {
@@ -1742,7 +1742,16 @@ static int cpsw_hwtstamp_get(struct net_device *dev, 
struct ifreq *ifr)
 
return copy_to_user(ifr->ifr_data, , sizeof(cfg)) ? -EFAULT : 0;
 }
+#else
+static int cpsw_hwtstamp_get(struct net_device *dev, struct ifreq *ifr)
+{
+   return -EOPNOTSUPP;
+}
 
+static int cpsw_hwtstamp_set(struct net_device *dev, struct ifreq *ifr)
+{
+   return -EOPNOTSUPP;
+}
 #endif /*CONFIG_TI_CPTS*/
 
 static int cpsw_ndo_ioctl(struct net_device *dev, struct ifreq *req, int cmd)
@@ -1755,12 +1764,10 @@ static int cpsw_ndo_ioctl(struct net_device *dev, 
struct ifreq *req, int cmd)
return -EINVAL;
 
switch (cmd) {
-#ifdef CONFIG_TI_CPTS
case SIOCSHWTSTAMP:
return cpsw_hwtstamp_set(dev, req);
case SIOCGHWTSTAMP:
return cpsw_hwtstamp_get(dev, req);
-#endif
}
 
if (!cpsw->slaves[slave_no].phy)
@@ -2100,10 +2107,10 @@ static void cpsw_set_msglevel(struct net_device *ndev, 
u32 value)
priv->msg_enable = value;
 }
 
+#if IS_ENABLED(CONFIG_TI_CPTS)
 static int cpsw_get_ts_info(struct net_device *ndev,
struct ethtool_ts_info *info)
 {
-#ifdef CONFIG_TI_CPTS
struct cpsw_common *cpsw = ndev_to_cpsw(ndev);
 
info->so_timestamping =
@@ -2120,7 +2127,12 @@ static int cpsw_get_ts_info(struct net_device *ndev,
info->rx_filters =
(1 << HWTSTAMP_FILTER_NONE) |
(1 << HWTSTAMP_FILTER_PTP_V2_EVENT);
+   return 0;
+}
 #else
+static int cpsw_get_ts_info(struct net_device *ndev,
+   struct ethtool_ts_info *info)
+{
info->so_timestamping =
SOF_TIMESTAMPING_TX_SOFTWARE |
SOF_TIMESTAMPING_RX_SOFTWARE |
@@ -2128,9 +2140,9 @@ static int cpsw_get_ts_info(struct net_device *ndev,
info->phc_index = -1;
info->tx_types = 0;
info->rx_filters = 0;
-#endif
return 0;
 }
+#endif
 
 static int cpsw_get_link_ksettings(struct net_device *ndev,
   struct ethtool_link_ksettings *ecmd)
diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index a42c449..8cb0369 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -31,8 +31,6 @@
 
 #include "cpts.h"
 
-#ifdef CONFIG_TI_CPTS
-
 #define cpts_read32(c, r)  readl_relaxed(>reg->r)
 #define cpts_write32(c, v, r)  writel_relaxed(v, >reg->r)
 
@@ -334,6 +332,7 @@ void cpts_rx_timestamp(struct cpts *cpts, struct sk_buff 
*skb)
memset(ssh, 0, sizeof(*ssh));
ssh->hwtstamp = ns_to_ktime(ns);
 }
+EXPORT_SYMBOL_GPL(cpts_rx_timestamp);
 
 void cpts_tx_timestamp(struct cpts *cpts, struct sk_buff *skb)
 {
@@ -349,13 +348,11 @@

[PATCH v4 05/13] net: ethernet: ti: cpts: fix registration order

2016-12-05 Thread Grygorii Strashko

The ptp clock registered before spinlock, which is protecting it, and
before timecounter and cyclecounter initialization in cpts_register().

So, ensure that ptp clock is registered the last, after everything
else is done.

Signed-off-by: Grygorii Strashko 
Acked-by: Richard Cochran 
---
 drivers/net/ethernet/ti/cpts.c | 24 ++--
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 61198f1..3dda6d5 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -356,15 +356,8 @@ int cpts_register(struct device *dev, struct cpts *cpts,
  u32 mult, u32 shift)
 {
int err, i;
-   unsigned long flags;
 
cpts->info = cpts_info;
-   cpts->clock = ptp_clock_register(>info, dev);
-   if (IS_ERR(cpts->clock)) {
-   err = PTR_ERR(cpts->clock);
-   cpts->clock = NULL;
-   return err;
-   }
spin_lock_init(>lock);
 
cpts->cc.read = cpts_systim_read;
@@ -382,15 +375,26 @@ int cpts_register(struct device *dev, struct cpts *cpts,
cpts_write32(cpts, CPTS_EN, control);
cpts_write32(cpts, TS_PEND_EN, int_enable);
 
-   spin_lock_irqsave(>lock, flags);
timecounter_init(>tc, >cc, ktime_to_ns(ktime_get_real()));
-   spin_unlock_irqrestore(>lock, flags);
 
INIT_DELAYED_WORK(>overflow_work, cpts_overflow_check);
-   schedule_delayed_work(>overflow_work, CPTS_OVERFLOW_PERIOD);
 
+   cpts->clock = ptp_clock_register(>info, dev);
+   if (IS_ERR(cpts->clock)) {
+   err = PTR_ERR(cpts->clock);
+   cpts->clock = NULL;
+   goto err_ptp;
+   }
cpts->phc_index = ptp_clock_index(cpts->clock);
+
+   schedule_delayed_work(>overflow_work, CPTS_OVERFLOW_PERIOD);
+
return 0;
+
+err_ptp:
+   if (cpts->refclk)
+   cpts_clk_release(cpts);
+   return err;
 }
 EXPORT_SYMBOL_GPL(cpts_register);
 
-- 
2.10.1

[PATCH v4 10/13] net: ethernet: ti: cpts: move dt props parsing to cpts driver

2016-12-05 Thread Grygorii Strashko

Move DT properties parsing into CPTS driver to simplify CPSW
code and CPTS driver porting on other SoC in the future
(like Keystone 2) - with this change it will not be required
to add the same DT parsing code in Keystone 2 NETCP driver.

Signed-off-by: Grygorii Strashko 
---
 drivers/net/ethernet/ti/cpsw.c | 16 +---
 drivers/net/ethernet/ti/cpsw.h |  2 --
 drivers/net/ethernet/ti/cpts.c | 32 +---
 drivers/net/ethernet/ti/cpts.h |  5 +++--
 4 files changed, 33 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index deb008a..259c717 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -2524,18 +2524,6 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
}
data->active_slave = prop;
 
-   if (of_property_read_u32(node, "cpts_clock_mult", )) {
-   dev_err(>dev, "Missing cpts_clock_mult property in the 
DT.\n");
-   return -EINVAL;
-   }
-   data->cpts_clock_mult = prop;
-
-   if (of_property_read_u32(node, "cpts_clock_shift", )) {
-   dev_err(>dev, "Missing cpts_clock_shift property in the 
DT.\n");
-   return -EINVAL;
-   }
-   data->cpts_clock_shift = prop;
-
data->slave_data = devm_kzalloc(>dev, data->slaves
* sizeof(struct cpsw_slave_data),
GFP_KERNEL);
@@ -2990,9 +2978,7 @@ static int cpsw_probe(struct platform_device *pdev)
goto clean_dma_ret;
}
 
-   cpsw->cpts = cpts_create(cpsw->dev, cpts_regs,
-cpsw->data.cpts_clock_mult,
-cpsw->data.cpts_clock_shift);
+   cpsw->cpts = cpts_create(cpsw->dev, cpts_regs, cpsw->dev->of_node);
if (IS_ERR(cpsw->cpts)) {
ret = PTR_ERR(cpsw->cpts);
goto clean_ale_ret;
diff --git a/drivers/net/ethernet/ti/cpsw.h b/drivers/net/ethernet/ti/cpsw.h
index 16b54c6..6c3037a 100644
--- a/drivers/net/ethernet/ti/cpsw.h
+++ b/drivers/net/ethernet/ti/cpsw.h
@@ -31,8 +31,6 @@ struct cpsw_platform_data {
u32 channels;   /* number of cpdma channels (symmetric) */
u32 slaves; /* number of slave cpgmac ports */
u32 active_slave; /* time stamping, ethtool and SIOCGMIIPHY slave */
-   u32 cpts_clock_mult;  /* convert input clock ticks to nanoseconds */
-   u32 cpts_clock_shift; /* convert input clock ticks to nanoseconds */
u32 ale_entries;/* ale table size */
u32 bd_ram_size;  /*buffer descriptor ram size */
u32 mac_control;/* Mac control register */
diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 9356803..59c09a4 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -409,10 +409,34 @@ void cpts_unregister(struct cpts *cpts)
 }
 EXPORT_SYMBOL_GPL(cpts_unregister);
 
+static int cpts_of_parse(struct cpts *cpts, struct device_node *node)
+{
+   int ret = -EINVAL;
+   u32 prop;
+
+   if (of_property_read_u32(node, "cpts_clock_mult", ))
+   goto  of_error;
+   /* save cc.mult original value as it can be modified
+* by cpts_ptp_adjfreq().
+*/
+   cpts->cc_mult = prop;
+
+   if (of_property_read_u32(node, "cpts_clock_shift", ))
+   goto  of_error;
+   cpts->cc.shift = prop;
+
+   return 0;
+
+of_error:
+   dev_err(cpts->dev, "CPTS: Missing property in the DT.\n");
+   return ret;
+}
+
 struct cpts *cpts_create(struct device *dev, void __iomem *regs,
-u32 mult, u32 shift)
+struct device_node *node)
 {
struct cpts *cpts;
+   int ret;
 
cpts = devm_kzalloc(dev, sizeof(*cpts), GFP_KERNEL);
if (!cpts)
@@ -423,6 +447,10 @@ struct cpts *cpts_create(struct device *dev, void __iomem 
*regs,
spin_lock_init(>lock);
INIT_DELAYED_WORK(>overflow_work, cpts_overflow_check);
 
+   ret = cpts_of_parse(cpts, node);
+   if (ret)
+   return ERR_PTR(ret);
+
cpts->refclk = devm_clk_get(dev, "cpts");
if (IS_ERR(cpts->refclk)) {
dev_err(dev, "Failed to get cpts refclk\n");
@@ -433,8 +461,6 @@ struct cpts *cpts_create(struct device *dev, void __iomem 
*regs,
 
cpts->cc.read = cpts_systim_read;
cpts->cc.mask = CLOCKSOURCE_MASK(32);
-   cpts->cc.shift = shift;
-   cpts->cc_mult = mult;
cpts->info = cpts_info;
 
return cpts;
diff --git a/drivers/net/ethernet/ti/cpts.h b/drivers/net/ethernet/ti/cpts.h
index e7d857c..5da23af 100644
--- a/drivers/net/ethernet/ti/cpts.h
+++ b/drivers/net/ethernet/ti/cpts.h
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -133,7 +134,7 @@ void

[PATCH v4 02/13] net: ethernet: ti: allow cpts to be built separately

2016-12-05 Thread Grygorii Strashko

TI CPTS IP is used as part of TI OMAP CPSW driver, but it's also
present as part of NETCP on TI Keystone 2 SoCs. So, It's required
to enable build of CPTS for both this drivers and this can be
achieved by allowing CPTS to be built separately.

Hence, allow cpts to be built separately and convert it to be
a module as both CPSW and NETCP drives can be built as modules.

Signed-off-by: Grygorii Strashko 
---
 drivers/net/ethernet/ti/Kconfig  |  2 +-
 drivers/net/ethernet/ti/Makefile |  3 ++-
 drivers/net/ethernet/ti/cpsw.c   | 22 +-
 drivers/net/ethernet/ti/cpts.c   | 16 
 drivers/net/ethernet/ti/cpts.h   | 18 ++
 5 files changed, 42 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig
index 9904d74..ff7f518 100644
--- a/drivers/net/ethernet/ti/Kconfig
+++ b/drivers/net/ethernet/ti/Kconfig
@@ -74,7 +74,7 @@ config TI_CPSW
  will be called cpsw.
 
 config TI_CPTS
-   bool "TI Common Platform Time Sync (CPTS) Support"
+   tristate "TI Common Platform Time Sync (CPTS) Support"
depends on TI_CPSW
select PTP_1588_CLOCK
---help---
diff --git a/drivers/net/ethernet/ti/Makefile b/drivers/net/ethernet/ti/Makefile
index d420d94..1e7c10b 100644
--- a/drivers/net/ethernet/ti/Makefile
+++ b/drivers/net/ethernet/ti/Makefile
@@ -12,8 +12,9 @@ obj-$(CONFIG_TI_DAVINCI_MDIO) += davinci_mdio.o
 obj-$(CONFIG_TI_DAVINCI_CPDMA) += davinci_cpdma.o
 obj-$(CONFIG_TI_CPSW_PHY_SEL) += cpsw-phy-sel.o
 obj-$(CONFIG_TI_CPSW_ALE) += cpsw_ale.o
+obj-$(CONFIG_TI_CPTS) += cpts.o
 obj-$(CONFIG_TI_CPSW) += ti_cpsw.o
-ti_cpsw-y := cpsw.o cpts.o
+ti_cpsw-y := cpsw.o
 
 obj-$(CONFIG_TI_KEYSTONE_NETCP) += keystone_netcp.o
 keystone_netcp-y := netcp_core.o
diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 3f96c57..323174d 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1594,7 +1594,7 @@ static netdev_tx_t cpsw_ndo_start_xmit(struct sk_buff 
*skb,
return NETDEV_TX_BUSY;
 }
 
-#ifdef CONFIG_TI_CPTS
+#if IS_ENABLED(CONFIG_TI_CPTS)
 
 static void cpsw_hwtstamp_v1(struct cpsw_common *cpsw)
 {
@@ -1742,7 +1742,16 @@ static int cpsw_hwtstamp_get(struct net_device *dev, 
struct ifreq *ifr)
 
return copy_to_user(ifr->ifr_data, , sizeof(cfg)) ? -EFAULT : 0;
 }
+#else
+static int cpsw_hwtstamp_get(struct net_device *dev, struct ifreq *ifr)
+{
+   return -EOPNOTSUPP;
+}
 
+static int cpsw_hwtstamp_set(struct net_device *dev, struct ifreq *ifr)
+{
+   return -EOPNOTSUPP;
+}
 #endif /*CONFIG_TI_CPTS*/
 
 static int cpsw_ndo_ioctl(struct net_device *dev, struct ifreq *req, int cmd)
@@ -1755,12 +1764,10 @@ static int cpsw_ndo_ioctl(struct net_device *dev, 
struct ifreq *req, int cmd)
return -EINVAL;
 
switch (cmd) {
-#ifdef CONFIG_TI_CPTS
case SIOCSHWTSTAMP:
return cpsw_hwtstamp_set(dev, req);
case SIOCGHWTSTAMP:
return cpsw_hwtstamp_get(dev, req);
-#endif
}
 
if (!cpsw->slaves[slave_no].phy)
@@ -2100,10 +2107,10 @@ static void cpsw_set_msglevel(struct net_device *ndev, 
u32 value)
priv->msg_enable = value;
 }
 
+#if IS_ENABLED(CONFIG_TI_CPTS)
 static int cpsw_get_ts_info(struct net_device *ndev,
struct ethtool_ts_info *info)
 {
-#ifdef CONFIG_TI_CPTS
struct cpsw_common *cpsw = ndev_to_cpsw(ndev);
 
info->so_timestamping =
@@ -2120,7 +2127,12 @@ static int cpsw_get_ts_info(struct net_device *ndev,
info->rx_filters =
(1 << HWTSTAMP_FILTER_NONE) |
(1 << HWTSTAMP_FILTER_PTP_V2_EVENT);
+   return 0;
+}
 #else
+static int cpsw_get_ts_info(struct net_device *ndev,
+   struct ethtool_ts_info *info)
+{
info->so_timestamping =
SOF_TIMESTAMPING_TX_SOFTWARE |
SOF_TIMESTAMPING_RX_SOFTWARE |
@@ -2128,9 +2140,9 @@ static int cpsw_get_ts_info(struct net_device *ndev,
info->phc_index = -1;
info->tx_types = 0;
info->rx_filters = 0;
-#endif
return 0;
 }
+#endif
 
 static int cpsw_get_link_ksettings(struct net_device *ndev,
   struct ethtool_link_ksettings *ecmd)
diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index a42c449..8cb0369 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -31,8 +31,6 @@
 
 #include "cpts.h"
 
-#ifdef CONFIG_TI_CPTS
-
 #define cpts_read32(c, r)  readl_relaxed(>reg->r)
 #define cpts_write32(c, v, r)  writel_relaxed(v, >reg->r)
 
@@ -334,6 +332,7 @@ void cpts_rx_timestamp(struct cpts *cpts, struct sk_buff 
*skb)
memset(ssh, 0, sizeof(*ssh));
ssh->hwtstamp = ns_to_ktime(ns);
 }
+EXPORT_SYMBOL_GPL(cpts_rx_timestamp);
 
 void cpts_tx_timestamp(struct cpts *cpts, struct sk_buff *skb)
 {
@@ -349,13 +348,11 @@ void

[PATCH v4 05/13] net: ethernet: ti: cpts: fix registration order

2016-12-05 Thread Grygorii Strashko

The ptp clock registered before spinlock, which is protecting it, and
before timecounter and cyclecounter initialization in cpts_register().

So, ensure that ptp clock is registered the last, after everything
else is done.

Signed-off-by: Grygorii Strashko 
Acked-by: Richard Cochran 
---
 drivers/net/ethernet/ti/cpts.c | 24 ++--
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 61198f1..3dda6d5 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -356,15 +356,8 @@ int cpts_register(struct device *dev, struct cpts *cpts,
  u32 mult, u32 shift)
 {
int err, i;
-   unsigned long flags;
 
cpts->info = cpts_info;
-   cpts->clock = ptp_clock_register(>info, dev);
-   if (IS_ERR(cpts->clock)) {
-   err = PTR_ERR(cpts->clock);
-   cpts->clock = NULL;
-   return err;
-   }
spin_lock_init(>lock);
 
cpts->cc.read = cpts_systim_read;
@@ -382,15 +375,26 @@ int cpts_register(struct device *dev, struct cpts *cpts,
cpts_write32(cpts, CPTS_EN, control);
cpts_write32(cpts, TS_PEND_EN, int_enable);
 
-   spin_lock_irqsave(>lock, flags);
timecounter_init(>tc, >cc, ktime_to_ns(ktime_get_real()));
-   spin_unlock_irqrestore(>lock, flags);
 
INIT_DELAYED_WORK(>overflow_work, cpts_overflow_check);
-   schedule_delayed_work(>overflow_work, CPTS_OVERFLOW_PERIOD);
 
+   cpts->clock = ptp_clock_register(>info, dev);
+   if (IS_ERR(cpts->clock)) {
+   err = PTR_ERR(cpts->clock);
+   cpts->clock = NULL;
+   goto err_ptp;
+   }
cpts->phc_index = ptp_clock_index(cpts->clock);
+
+   schedule_delayed_work(>overflow_work, CPTS_OVERFLOW_PERIOD);
+
return 0;
+
+err_ptp:
+   if (cpts->refclk)
+   cpts_clk_release(cpts);
+   return err;
 }
 EXPORT_SYMBOL_GPL(cpts_register);
 
-- 
2.10.1

[PATCH v4 10/13] net: ethernet: ti: cpts: move dt props parsing to cpts driver

2016-12-05 Thread Grygorii Strashko

Move DT properties parsing into CPTS driver to simplify CPSW
code and CPTS driver porting on other SoC in the future
(like Keystone 2) - with this change it will not be required
to add the same DT parsing code in Keystone 2 NETCP driver.

Signed-off-by: Grygorii Strashko 
---
 drivers/net/ethernet/ti/cpsw.c | 16 +---
 drivers/net/ethernet/ti/cpsw.h |  2 --
 drivers/net/ethernet/ti/cpts.c | 32 +---
 drivers/net/ethernet/ti/cpts.h |  5 +++--
 4 files changed, 33 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index deb008a..259c717 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -2524,18 +2524,6 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
}
data->active_slave = prop;
 
-   if (of_property_read_u32(node, "cpts_clock_mult", )) {
-   dev_err(>dev, "Missing cpts_clock_mult property in the 
DT.\n");
-   return -EINVAL;
-   }
-   data->cpts_clock_mult = prop;
-
-   if (of_property_read_u32(node, "cpts_clock_shift", )) {
-   dev_err(>dev, "Missing cpts_clock_shift property in the 
DT.\n");
-   return -EINVAL;
-   }
-   data->cpts_clock_shift = prop;
-
data->slave_data = devm_kzalloc(>dev, data->slaves
* sizeof(struct cpsw_slave_data),
GFP_KERNEL);
@@ -2990,9 +2978,7 @@ static int cpsw_probe(struct platform_device *pdev)
goto clean_dma_ret;
}
 
-   cpsw->cpts = cpts_create(cpsw->dev, cpts_regs,
-cpsw->data.cpts_clock_mult,
-cpsw->data.cpts_clock_shift);
+   cpsw->cpts = cpts_create(cpsw->dev, cpts_regs, cpsw->dev->of_node);
if (IS_ERR(cpsw->cpts)) {
ret = PTR_ERR(cpsw->cpts);
goto clean_ale_ret;
diff --git a/drivers/net/ethernet/ti/cpsw.h b/drivers/net/ethernet/ti/cpsw.h
index 16b54c6..6c3037a 100644
--- a/drivers/net/ethernet/ti/cpsw.h
+++ b/drivers/net/ethernet/ti/cpsw.h
@@ -31,8 +31,6 @@ struct cpsw_platform_data {
u32 channels;   /* number of cpdma channels (symmetric) */
u32 slaves; /* number of slave cpgmac ports */
u32 active_slave; /* time stamping, ethtool and SIOCGMIIPHY slave */
-   u32 cpts_clock_mult;  /* convert input clock ticks to nanoseconds */
-   u32 cpts_clock_shift; /* convert input clock ticks to nanoseconds */
u32 ale_entries;/* ale table size */
u32 bd_ram_size;  /*buffer descriptor ram size */
u32 mac_control;/* Mac control register */
diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 9356803..59c09a4 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -409,10 +409,34 @@ void cpts_unregister(struct cpts *cpts)
 }
 EXPORT_SYMBOL_GPL(cpts_unregister);
 
+static int cpts_of_parse(struct cpts *cpts, struct device_node *node)
+{
+   int ret = -EINVAL;
+   u32 prop;
+
+   if (of_property_read_u32(node, "cpts_clock_mult", ))
+   goto  of_error;
+   /* save cc.mult original value as it can be modified
+* by cpts_ptp_adjfreq().
+*/
+   cpts->cc_mult = prop;
+
+   if (of_property_read_u32(node, "cpts_clock_shift", ))
+   goto  of_error;
+   cpts->cc.shift = prop;
+
+   return 0;
+
+of_error:
+   dev_err(cpts->dev, "CPTS: Missing property in the DT.\n");
+   return ret;
+}
+
 struct cpts *cpts_create(struct device *dev, void __iomem *regs,
-u32 mult, u32 shift)
+struct device_node *node)
 {
struct cpts *cpts;
+   int ret;
 
cpts = devm_kzalloc(dev, sizeof(*cpts), GFP_KERNEL);
if (!cpts)
@@ -423,6 +447,10 @@ struct cpts *cpts_create(struct device *dev, void __iomem 
*regs,
spin_lock_init(>lock);
INIT_DELAYED_WORK(>overflow_work, cpts_overflow_check);
 
+   ret = cpts_of_parse(cpts, node);
+   if (ret)
+   return ERR_PTR(ret);
+
cpts->refclk = devm_clk_get(dev, "cpts");
if (IS_ERR(cpts->refclk)) {
dev_err(dev, "Failed to get cpts refclk\n");
@@ -433,8 +461,6 @@ struct cpts *cpts_create(struct device *dev, void __iomem 
*regs,
 
cpts->cc.read = cpts_systim_read;
cpts->cc.mask = CLOCKSOURCE_MASK(32);
-   cpts->cc.shift = shift;
-   cpts->cc_mult = mult;
cpts->info = cpts_info;
 
return cpts;
diff --git a/drivers/net/ethernet/ti/cpts.h b/drivers/net/ethernet/ti/cpts.h
index e7d857c..5da23af 100644
--- a/drivers/net/ethernet/ti/cpts.h
+++ b/drivers/net/ethernet/ti/cpts.h
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -133,7 +134,7 @@ void cpts_tx_timestamp(struct cpts

[PATCH v4 06/13] net: ethernet: ti: cpts: disable cpts when unregistered

2016-12-05 Thread Grygorii Strashko

The cpts now is left enabled after unregistration.
Hence, disable it in cpts_unregister().

Signed-off-by: Grygorii Strashko 
Acked-by: Richard Cochran 
---
 drivers/net/ethernet/ti/cpts.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 3dda6d5..d3c1ac5 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -404,6 +404,10 @@ void cpts_unregister(struct cpts *cpts)
ptp_clock_unregister(cpts->clock);
cancel_delayed_work_sync(>overflow_work);
}
+
+   cpts_write32(cpts, 0, int_enable);
+   cpts_write32(cpts, 0, control);
+
if (cpts->refclk)
cpts_clk_release(cpts);
 }
-- 
2.10.1

Re: bio linked list corruption.

2016-12-05 Thread Linus Torvalds

On Mon, Dec 5, 2016 at 11:11 AM, Vegard Nossum  wrote:
>
> [ cut here ]
> WARNING: CPU: 22 PID: 14012 at mm/shmem.c:2668 shmem_fallocate+0x9a7/0xac0

Ok, good. So that's confirmed as the cause of this problem.

And the call chain that I wanted is obviously completely
uninteresting, because it's call cghain on the other side (the page
fault side) that would show the nested wake queue behavior. I was just
being stupid about it.

I wonder if we have any other places where we just blithely assume
that "wake_up_all()" will actually empty the whole wait queue. It's
_usually_ true, but as noted, nested waiting does happen.

Anyway, can you try this patch instead? It should actually cause the
wake_up_all() to always remove all entries, and thus the WARN_ON()
should no longer happen (and I removed the "list_del()" hackery).

   Linus
diff --git a/mm/shmem.c b/mm/shmem.c
index 166ebf5d2bce..17beb44e9f4f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1848,6 +1848,19 @@ alloc_nohuge:page = 
shmem_alloc_and_acct_page(gfp, info, sbinfo,
return error;
 }

+/*
+ * This is like autoremove_wake_function, but it removes the wait queue
+ * entry unconditionally - even if something else had already woken the
+ * target.
+ */
+static int synchronous_wake_function(wait_queue_t *wait, unsigned mode, int 
sync, void *key)
+{
+   int ret = default_wake_function(wait, mode, sync, key);
+   list_del_init(>task_list);
+   return ret;
+}
+
+
 static int shmem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
struct inode *inode = file_inode(vma->vm_file);
@@ -1883,7 +1896,7 @@ static int shmem_fault(struct vm_area_struct *vma, struct 
vm_fault *vmf)
vmf->pgoff >= shmem_falloc->start &&
vmf->pgoff < shmem_falloc->next) {
wait_queue_head_t *shmem_falloc_waitq;
-   DEFINE_WAIT(shmem_fault_wait);
+   DEFINE_WAIT_FUNC(shmem_fault_wait, 
synchronous_wake_function);

ret = VM_FAULT_NOPAGE;
if ((vmf->flags & FAULT_FLAG_ALLOW_RETRY) &&
@@ -2665,6 +2678,7 @@ static long shmem_fallocate(struct file *file, int mode, 
loff_t offset,
spin_lock(>i_lock);
inode->i_private = NULL;
wake_up_all(_falloc_waitq);
+   WARN_ON_ONCE(!list_empty(_falloc_waitq.task_list));
spin_unlock(>i_lock);
error = 0;
goto out;

[PATCH v4 03/13] net: ethernet: ti: cpsw: minimize direct access to struct cpts

2016-12-05 Thread Grygorii Strashko

This will provide more flexibility in changing CPTS internals and also
required for further changes.

Signed-off-by: Grygorii Strashko 
---
 drivers/net/ethernet/ti/cpsw.c | 28 +++-
 drivers/net/ethernet/ti/cpts.h | 39 +++
 2 files changed, 54 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 323174d..ec05e20 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1562,7 +1562,7 @@ static netdev_tx_t cpsw_ndo_start_xmit(struct sk_buff 
*skb,
}
 
if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP &&
-   cpsw->cpts->tx_enable)
+   cpts_is_tx_enabled(cpsw->cpts))
skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
 
skb_tx_timestamp(skb);
@@ -1601,7 +1601,8 @@ static void cpsw_hwtstamp_v1(struct cpsw_common *cpsw)
struct cpsw_slave *slave = >slaves[cpsw->data.active_slave];
u32 ts_en, seq_id;
 
-   if (!cpsw->cpts->tx_enable && !cpsw->cpts->rx_enable) {
+   if (!cpts_is_tx_enabled(cpsw->cpts) &&
+   !cpts_is_rx_enabled(cpsw->cpts)) {
slave_write(slave, 0, CPSW1_TS_CTL);
return;
}
@@ -1609,10 +1610,10 @@ static void cpsw_hwtstamp_v1(struct cpsw_common *cpsw)
seq_id = (30 << CPSW_V1_SEQ_ID_OFS_SHIFT) | ETH_P_1588;
ts_en = EVENT_MSG_BITS << CPSW_V1_MSG_TYPE_OFS;
 
-   if (cpsw->cpts->tx_enable)
+   if (cpts_is_tx_enabled(cpsw->cpts))
ts_en |= CPSW_V1_TS_TX_EN;
 
-   if (cpsw->cpts->rx_enable)
+   if (cpts_is_rx_enabled(cpsw->cpts))
ts_en |= CPSW_V1_TS_RX_EN;
 
slave_write(slave, ts_en, CPSW1_TS_CTL);
@@ -1635,20 +1636,20 @@ static void cpsw_hwtstamp_v2(struct cpsw_priv *priv)
case CPSW_VERSION_2:
ctrl &= ~CTRL_V2_ALL_TS_MASK;
 
-   if (cpsw->cpts->tx_enable)
+   if (cpts_is_tx_enabled(cpsw->cpts))
ctrl |= CTRL_V2_TX_TS_BITS;
 
-   if (cpsw->cpts->rx_enable)
+   if (cpts_is_rx_enabled(cpsw->cpts))
ctrl |= CTRL_V2_RX_TS_BITS;
break;
case CPSW_VERSION_3:
default:
ctrl &= ~CTRL_V3_ALL_TS_MASK;
 
-   if (cpsw->cpts->tx_enable)
+   if (cpts_is_tx_enabled(cpsw->cpts))
ctrl |= CTRL_V3_TX_TS_BITS;
 
-   if (cpsw->cpts->rx_enable)
+   if (cpts_is_rx_enabled(cpsw->cpts))
ctrl |= CTRL_V3_RX_TS_BITS;
break;
}
@@ -1684,7 +1685,7 @@ static int cpsw_hwtstamp_set(struct net_device *dev, 
struct ifreq *ifr)
 
switch (cfg.rx_filter) {
case HWTSTAMP_FILTER_NONE:
-   cpts->rx_enable = 0;
+   cpts_rx_enable(cpts, 0);
break;
case HWTSTAMP_FILTER_ALL:
case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
@@ -1700,14 +1701,14 @@ static int cpsw_hwtstamp_set(struct net_device *dev, 
struct ifreq *ifr)
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
-   cpts->rx_enable = 1;
+   cpts_rx_enable(cpts, 1);
cfg.rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT;
break;
default:
return -ERANGE;
}
 
-   cpts->tx_enable = cfg.tx_type == HWTSTAMP_TX_ON;
+   cpts_tx_enable(cpts, cfg.tx_type == HWTSTAMP_TX_ON);
 
switch (cpsw->version) {
case CPSW_VERSION_1:
@@ -1736,8 +1737,9 @@ static int cpsw_hwtstamp_get(struct net_device *dev, 
struct ifreq *ifr)
return -EOPNOTSUPP;
 
cfg.flags = 0;
-   cfg.tx_type = cpts->tx_enable ? HWTSTAMP_TX_ON : HWTSTAMP_TX_OFF;
-   cfg.rx_filter = (cpts->rx_enable ?
+   cfg.tx_type = cpts_is_tx_enabled(cpts) ?
+ HWTSTAMP_TX_ON : HWTSTAMP_TX_OFF;
+   cfg.rx_filter = (cpts_is_rx_enabled(cpts) ?
 HWTSTAMP_FILTER_PTP_V2_EVENT : HWTSTAMP_FILTER_NONE);
 
return copy_to_user(ifr->ifr_data, , sizeof(cfg)) ? -EFAULT : 0;
diff --git a/drivers/net/ethernet/ti/cpts.h b/drivers/net/ethernet/ti/cpts.h
index 416ba2c..29a1e80c 100644
--- a/drivers/net/ethernet/ti/cpts.h
+++ b/drivers/net/ethernet/ti/cpts.h
@@ -132,6 +132,27 @@ void cpts_rx_timestamp(struct cpts *cpts, struct sk_buff 
*skb);
 void cpts_tx_timestamp(struct cpts *cpts, struct sk_buff *skb);
 int cpts_register(struct device *dev, struct cpts *cpts, u32 mult, u32 shift);
 void cpts_unregister(struct cpts *cpts);
+
+static inline void cpts_rx_enable(struct cpts *cpts, int enable)
+{
+   cpts->rx_enable = enable;
+}
+
+static inline bool cpts_is_rx_enabled(struct cpts *cpts)
+{
+   return !!cpts->rx_enable;
+}
+
+static inline void cpts_tx_enable(struct cpts *cpts, int

[PATCH v4 11/13] clocksource: export the clocks_calc_mult_shift to use by timestamp code

2016-12-05 Thread Grygorii Strashko

From: Murali Karicheri 

The CPSW CPTS driver is capable of doing timestamping on tx/rx packets and
requires to know mult and shift factors for timestamp conversion from raw
value to nanoseconds (ptp clock). Now these mult and shift factors are
calculated manually and provided through DT, which makes very hard to
support of a lot number of platforms, especially if CPTS refclk is not the
same for some kind of boards and depends on efuse settings (Keystone 2
platforms). Hence, export clocks_calc_mult_shift() to allow drivers like
CPSW CPTS (and other ptp drivesr) to benefit from automaitc calculation of
mult and shift factors.

Cc: John Stultz 
Signed-off-by: Murali Karicheri 
Signed-off-by: Grygorii Strashko 
Acked-by: Thomas Gleixner 
---
 kernel/time/clocksource.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 7e4fad7..150242c 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -89,6 +89,7 @@ clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 
to, u32 maxsec)
*mult = tmp;
*shift = sft;
 }
+EXPORT_SYMBOL_GPL(clocks_calc_mult_shift);
 
 /*[Clocksource internal variables]-
  * curr_clocksource:
-- 
2.10.1

[PATCH v4 06/13] net: ethernet: ti: cpts: disable cpts when unregistered

2016-12-05 Thread Grygorii Strashko

The cpts now is left enabled after unregistration.
Hence, disable it in cpts_unregister().

Signed-off-by: Grygorii Strashko 
Acked-by: Richard Cochran 
---
 drivers/net/ethernet/ti/cpts.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 3dda6d5..d3c1ac5 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -404,6 +404,10 @@ void cpts_unregister(struct cpts *cpts)
ptp_clock_unregister(cpts->clock);
cancel_delayed_work_sync(>overflow_work);
}
+
+   cpts_write32(cpts, 0, int_enable);
+   cpts_write32(cpts, 0, control);
+
if (cpts->refclk)
cpts_clk_release(cpts);
 }
-- 
2.10.1

Re: bio linked list corruption.

2016-12-05 Thread Linus Torvalds

On Mon, Dec 5, 2016 at 11:11 AM, Vegard Nossum  wrote:
>
> [ cut here ]
> WARNING: CPU: 22 PID: 14012 at mm/shmem.c:2668 shmem_fallocate+0x9a7/0xac0

Ok, good. So that's confirmed as the cause of this problem.

And the call chain that I wanted is obviously completely
uninteresting, because it's call cghain on the other side (the page
fault side) that would show the nested wake queue behavior. I was just
being stupid about it.

I wonder if we have any other places where we just blithely assume
that "wake_up_all()" will actually empty the whole wait queue. It's
_usually_ true, but as noted, nested waiting does happen.

Anyway, can you try this patch instead? It should actually cause the
wake_up_all() to always remove all entries, and thus the WARN_ON()
should no longer happen (and I removed the "list_del()" hackery).

   Linus
diff --git a/mm/shmem.c b/mm/shmem.c
index 166ebf5d2bce..17beb44e9f4f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1848,6 +1848,19 @@ alloc_nohuge:page = 
shmem_alloc_and_acct_page(gfp, info, sbinfo,
return error;
 }

+/*
+ * This is like autoremove_wake_function, but it removes the wait queue
+ * entry unconditionally - even if something else had already woken the
+ * target.
+ */
+static int synchronous_wake_function(wait_queue_t *wait, unsigned mode, int 
sync, void *key)
+{
+   int ret = default_wake_function(wait, mode, sync, key);
+   list_del_init(>task_list);
+   return ret;
+}
+
+
 static int shmem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 {
struct inode *inode = file_inode(vma->vm_file);
@@ -1883,7 +1896,7 @@ static int shmem_fault(struct vm_area_struct *vma, struct 
vm_fault *vmf)
vmf->pgoff >= shmem_falloc->start &&
vmf->pgoff < shmem_falloc->next) {
wait_queue_head_t *shmem_falloc_waitq;
-   DEFINE_WAIT(shmem_fault_wait);
+   DEFINE_WAIT_FUNC(shmem_fault_wait, 
synchronous_wake_function);

ret = VM_FAULT_NOPAGE;
if ((vmf->flags & FAULT_FLAG_ALLOW_RETRY) &&
@@ -2665,6 +2678,7 @@ static long shmem_fallocate(struct file *file, int mode, 
loff_t offset,
spin_lock(>i_lock);
inode->i_private = NULL;
wake_up_all(_falloc_waitq);
+   WARN_ON_ONCE(!list_empty(_falloc_waitq.task_list));
spin_unlock(>i_lock);
error = 0;
goto out;

[PATCH v4 03/13] net: ethernet: ti: cpsw: minimize direct access to struct cpts

2016-12-05 Thread Grygorii Strashko

This will provide more flexibility in changing CPTS internals and also
required for further changes.

Signed-off-by: Grygorii Strashko 
---
 drivers/net/ethernet/ti/cpsw.c | 28 +++-
 drivers/net/ethernet/ti/cpts.h | 39 +++
 2 files changed, 54 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 323174d..ec05e20 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1562,7 +1562,7 @@ static netdev_tx_t cpsw_ndo_start_xmit(struct sk_buff 
*skb,
}
 
if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP &&
-   cpsw->cpts->tx_enable)
+   cpts_is_tx_enabled(cpsw->cpts))
skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
 
skb_tx_timestamp(skb);
@@ -1601,7 +1601,8 @@ static void cpsw_hwtstamp_v1(struct cpsw_common *cpsw)
struct cpsw_slave *slave = >slaves[cpsw->data.active_slave];
u32 ts_en, seq_id;
 
-   if (!cpsw->cpts->tx_enable && !cpsw->cpts->rx_enable) {
+   if (!cpts_is_tx_enabled(cpsw->cpts) &&
+   !cpts_is_rx_enabled(cpsw->cpts)) {
slave_write(slave, 0, CPSW1_TS_CTL);
return;
}
@@ -1609,10 +1610,10 @@ static void cpsw_hwtstamp_v1(struct cpsw_common *cpsw)
seq_id = (30 << CPSW_V1_SEQ_ID_OFS_SHIFT) | ETH_P_1588;
ts_en = EVENT_MSG_BITS << CPSW_V1_MSG_TYPE_OFS;
 
-   if (cpsw->cpts->tx_enable)
+   if (cpts_is_tx_enabled(cpsw->cpts))
ts_en |= CPSW_V1_TS_TX_EN;
 
-   if (cpsw->cpts->rx_enable)
+   if (cpts_is_rx_enabled(cpsw->cpts))
ts_en |= CPSW_V1_TS_RX_EN;
 
slave_write(slave, ts_en, CPSW1_TS_CTL);
@@ -1635,20 +1636,20 @@ static void cpsw_hwtstamp_v2(struct cpsw_priv *priv)
case CPSW_VERSION_2:
ctrl &= ~CTRL_V2_ALL_TS_MASK;
 
-   if (cpsw->cpts->tx_enable)
+   if (cpts_is_tx_enabled(cpsw->cpts))
ctrl |= CTRL_V2_TX_TS_BITS;
 
-   if (cpsw->cpts->rx_enable)
+   if (cpts_is_rx_enabled(cpsw->cpts))
ctrl |= CTRL_V2_RX_TS_BITS;
break;
case CPSW_VERSION_3:
default:
ctrl &= ~CTRL_V3_ALL_TS_MASK;
 
-   if (cpsw->cpts->tx_enable)
+   if (cpts_is_tx_enabled(cpsw->cpts))
ctrl |= CTRL_V3_TX_TS_BITS;
 
-   if (cpsw->cpts->rx_enable)
+   if (cpts_is_rx_enabled(cpsw->cpts))
ctrl |= CTRL_V3_RX_TS_BITS;
break;
}
@@ -1684,7 +1685,7 @@ static int cpsw_hwtstamp_set(struct net_device *dev, 
struct ifreq *ifr)
 
switch (cfg.rx_filter) {
case HWTSTAMP_FILTER_NONE:
-   cpts->rx_enable = 0;
+   cpts_rx_enable(cpts, 0);
break;
case HWTSTAMP_FILTER_ALL:
case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
@@ -1700,14 +1701,14 @@ static int cpsw_hwtstamp_set(struct net_device *dev, 
struct ifreq *ifr)
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
-   cpts->rx_enable = 1;
+   cpts_rx_enable(cpts, 1);
cfg.rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT;
break;
default:
return -ERANGE;
}
 
-   cpts->tx_enable = cfg.tx_type == HWTSTAMP_TX_ON;
+   cpts_tx_enable(cpts, cfg.tx_type == HWTSTAMP_TX_ON);
 
switch (cpsw->version) {
case CPSW_VERSION_1:
@@ -1736,8 +1737,9 @@ static int cpsw_hwtstamp_get(struct net_device *dev, 
struct ifreq *ifr)
return -EOPNOTSUPP;
 
cfg.flags = 0;
-   cfg.tx_type = cpts->tx_enable ? HWTSTAMP_TX_ON : HWTSTAMP_TX_OFF;
-   cfg.rx_filter = (cpts->rx_enable ?
+   cfg.tx_type = cpts_is_tx_enabled(cpts) ?
+ HWTSTAMP_TX_ON : HWTSTAMP_TX_OFF;
+   cfg.rx_filter = (cpts_is_rx_enabled(cpts) ?
 HWTSTAMP_FILTER_PTP_V2_EVENT : HWTSTAMP_FILTER_NONE);
 
return copy_to_user(ifr->ifr_data, , sizeof(cfg)) ? -EFAULT : 0;
diff --git a/drivers/net/ethernet/ti/cpts.h b/drivers/net/ethernet/ti/cpts.h
index 416ba2c..29a1e80c 100644
--- a/drivers/net/ethernet/ti/cpts.h
+++ b/drivers/net/ethernet/ti/cpts.h
@@ -132,6 +132,27 @@ void cpts_rx_timestamp(struct cpts *cpts, struct sk_buff 
*skb);
 void cpts_tx_timestamp(struct cpts *cpts, struct sk_buff *skb);
 int cpts_register(struct device *dev, struct cpts *cpts, u32 mult, u32 shift);
 void cpts_unregister(struct cpts *cpts);
+
+static inline void cpts_rx_enable(struct cpts *cpts, int enable)
+{
+   cpts->rx_enable = enable;
+}
+
+static inline bool cpts_is_rx_enabled(struct cpts *cpts)
+{
+   return !!cpts->rx_enable;
+}
+
+static inline void cpts_tx_enable(struct cpts *cpts, int enable)
+{
+

[PATCH v4 11/13] clocksource: export the clocks_calc_mult_shift to use by timestamp code

2016-12-05 Thread Grygorii Strashko

From: Murali Karicheri 

The CPSW CPTS driver is capable of doing timestamping on tx/rx packets and
requires to know mult and shift factors for timestamp conversion from raw
value to nanoseconds (ptp clock). Now these mult and shift factors are
calculated manually and provided through DT, which makes very hard to
support of a lot number of platforms, especially if CPTS refclk is not the
same for some kind of boards and depends on efuse settings (Keystone 2
platforms). Hence, export clocks_calc_mult_shift() to allow drivers like
CPSW CPTS (and other ptp drivesr) to benefit from automaitc calculation of
mult and shift factors.

Cc: John Stultz 
Signed-off-by: Murali Karicheri 
Signed-off-by: Grygorii Strashko 
Acked-by: Thomas Gleixner 
---
 kernel/time/clocksource.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 7e4fad7..150242c 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -89,6 +89,7 @@ clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 
to, u32 maxsec)
*mult = tmp;
*shift = sft;
 }
+EXPORT_SYMBOL_GPL(clocks_calc_mult_shift);
 
 /*[Clocksource internal variables]-
  * curr_clocksource:
-- 
2.10.1

[PATCH v4 12/13] net: ethernet: ti: cpts: calc mult and shift from refclk freq

2016-12-05 Thread Grygorii Strashko

The cyclecounter mult and shift values can be calculated based on the
CPTS rfclk frequency and timekeepnig framework provides required algos
and API's.

Hence, calc mult and shift basing on CPTS rfclk frequency if both
cpts_clock_shift and cpts_clock_mult properties are not provided in DT (the
basis of calculation algorithm is borrowed from
__clocksource_update_freq_scale() commit 7d2f944a2b83 ("clocksource:
Provide a generic mult/shift factor calculation")). After this change
cpts_clock_shift and cpts_clock_mult DT properties will become optional.

Cc: John Stultz 
Cc: Thomas Gleixner 
Signed-off-by: Grygorii Strashko 
---
 Documentation/devicetree/bindings/net/cpsw.txt |  8 ++--
 drivers/net/ethernet/ti/cpts.c | 53 +++---
 2 files changed, 52 insertions(+), 9 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/cpsw.txt 
b/Documentation/devicetree/bindings/net/cpsw.txt
index 5ad439f..ebda7c9 100644
--- a/Documentation/devicetree/bindings/net/cpsw.txt
+++ b/Documentation/devicetree/bindings/net/cpsw.txt
@@ -20,8 +20,6 @@ Required properties:
 - slaves   : Specifies number for slaves
 - active_slave : Specifies the slave to use for time stamping,
  ethtool and SIOCGMIIPHY
-- cpts_clock_mult  : Numerator to convert input clock ticks into 
nanoseconds
-- cpts_clock_shift : Denominator to convert input clock ticks into 
nanoseconds
 
 Optional properties:
 - ti,hwmods: Must be "cpgmac0"
@@ -35,7 +33,11 @@ Optional properties:
  For example in dra72x-evm, pcf gpio has to be
  driven low so that cpsw slave 0 and phy data
  lines are connected via mux.
-
+- cpts_clock_mult  : Numerator to convert input clock ticks into 
nanoseconds
+- cpts_clock_shift : Denominator to convert input clock ticks into 
nanoseconds
+ Mult and shift will be calculated basing on CPTS
+ rftclk frequency if both cpts_clock_shift and
+ cpts_clock_mult properties are not provided.
 
 Slave Properties:
 Required properties:
diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 59c09a4..361d13a 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -409,21 +409,60 @@ void cpts_unregister(struct cpts *cpts)
 }
 EXPORT_SYMBOL_GPL(cpts_unregister);
 
+static void cpts_calc_mult_shift(struct cpts *cpts)
+{
+   u64 frac, maxsec, ns;
+   u32 freq, mult, shift;
+
+   freq = clk_get_rate(cpts->refclk);
+
+   /* Calc the maximum number of seconds which we can run before
+* wrapping around.
+*/
+   maxsec = cpts->cc.mask;
+   do_div(maxsec, freq);
+   /* limit conversation rate to 10 sec as higher values will produce
+* too small mult factors and so reduce the conversion accuracy
+*/
+   if (maxsec > 10)
+   maxsec = 10;
+
+   if (cpts->cc_mult || cpts->cc.shift)
+   return;
+
+   clocks_calc_mult_shift(, , freq, NSEC_PER_SEC, maxsec);
+
+   cpts->cc_mult = mult;
+   cpts->cc.mult = mult;
+   cpts->cc.shift = shift;
+
+   frac = 0;
+   ns = cyclecounter_cyc2ns(>cc, freq, cpts->cc.mask, );
+
+   dev_info(cpts->dev,
+"CPTS: ref_clk_freq:%u calc_mult:%u calc_shift:%u error:%lld 
nsec/sec\n",
+freq, cpts->cc_mult, cpts->cc.shift, (ns - NSEC_PER_SEC));
+}
+
 static int cpts_of_parse(struct cpts *cpts, struct device_node *node)
 {
int ret = -EINVAL;
u32 prop;
 
-   if (of_property_read_u32(node, "cpts_clock_mult", ))
-   goto  of_error;
/* save cc.mult original value as it can be modified
 * by cpts_ptp_adjfreq().
 */
-   cpts->cc_mult = prop;
+   cpts->cc_mult = 0;
+   if (!of_property_read_u32(node, "cpts_clock_mult", ))
+   cpts->cc_mult = prop;
+
+   cpts->cc.shift = 0;
+   if (!of_property_read_u32(node, "cpts_clock_shift", ))
+   cpts->cc.shift = prop;
 
-   if (of_property_read_u32(node, "cpts_clock_shift", ))
-   goto  of_error;
-   cpts->cc.shift = prop;
+   if ((cpts->cc_mult && !cpts->cc.shift) ||
+   (!cpts->cc_mult && cpts->cc.shift))
+   goto of_error;
 
return 0;
 
@@ -463,6 +502,8 @@ struct cpts *cpts_create(struct device *dev, void __iomem 
*regs,
cpts->cc.mask = CLOCKSOURCE_MASK(32);
cpts->info = cpts_info;
 
+   cpts_calc_mult_shift(cpts);
+
return cpts;
 }
 EXPORT_SYMBOL_GPL(cpts_create);
-- 
2.10.1

[PATCH v4 01/13] net: ethernet: ti: cpts: switch to readl/writel_relaxed()

2016-12-05 Thread Grygorii Strashko

Switch to readl/writel_relaxed() APIs, because this is recommended
API and the CPTS IP is reused on Keystone 2 SoCs
where LE/BE modes are supported.

Signed-off-by: Grygorii Strashko 
Acked-by: Richard Cochran 
---
 drivers/net/ethernet/ti/cpts.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 85a55b4..a42c449 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -33,8 +33,8 @@
 
 #ifdef CONFIG_TI_CPTS
 
-#define cpts_read32(c, r)  __raw_readl(>reg->r)
-#define cpts_write32(c, v, r)  __raw_writel(v, >reg->r)
+#define cpts_read32(c, r)  readl_relaxed(>reg->r)
+#define cpts_write32(c, v, r)  writel_relaxed(v, >reg->r)
 
 static int event_expired(struct cpts_event *event)
 {
-- 
2.10.1

[PATCH v4 12/13] net: ethernet: ti: cpts: calc mult and shift from refclk freq

2016-12-05 Thread Grygorii Strashko

The cyclecounter mult and shift values can be calculated based on the
CPTS rfclk frequency and timekeepnig framework provides required algos
and API's.

Hence, calc mult and shift basing on CPTS rfclk frequency if both
cpts_clock_shift and cpts_clock_mult properties are not provided in DT (the
basis of calculation algorithm is borrowed from
__clocksource_update_freq_scale() commit 7d2f944a2b83 ("clocksource:
Provide a generic mult/shift factor calculation")). After this change
cpts_clock_shift and cpts_clock_mult DT properties will become optional.

Cc: John Stultz 
Cc: Thomas Gleixner 
Signed-off-by: Grygorii Strashko 
---
 Documentation/devicetree/bindings/net/cpsw.txt |  8 ++--
 drivers/net/ethernet/ti/cpts.c | 53 +++---
 2 files changed, 52 insertions(+), 9 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/cpsw.txt 
b/Documentation/devicetree/bindings/net/cpsw.txt
index 5ad439f..ebda7c9 100644
--- a/Documentation/devicetree/bindings/net/cpsw.txt
+++ b/Documentation/devicetree/bindings/net/cpsw.txt
@@ -20,8 +20,6 @@ Required properties:
 - slaves   : Specifies number for slaves
 - active_slave : Specifies the slave to use for time stamping,
  ethtool and SIOCGMIIPHY
-- cpts_clock_mult  : Numerator to convert input clock ticks into 
nanoseconds
-- cpts_clock_shift : Denominator to convert input clock ticks into 
nanoseconds
 
 Optional properties:
 - ti,hwmods: Must be "cpgmac0"
@@ -35,7 +33,11 @@ Optional properties:
  For example in dra72x-evm, pcf gpio has to be
  driven low so that cpsw slave 0 and phy data
  lines are connected via mux.
-
+- cpts_clock_mult  : Numerator to convert input clock ticks into 
nanoseconds
+- cpts_clock_shift : Denominator to convert input clock ticks into 
nanoseconds
+ Mult and shift will be calculated basing on CPTS
+ rftclk frequency if both cpts_clock_shift and
+ cpts_clock_mult properties are not provided.
 
 Slave Properties:
 Required properties:
diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 59c09a4..361d13a 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -409,21 +409,60 @@ void cpts_unregister(struct cpts *cpts)
 }
 EXPORT_SYMBOL_GPL(cpts_unregister);
 
+static void cpts_calc_mult_shift(struct cpts *cpts)
+{
+   u64 frac, maxsec, ns;
+   u32 freq, mult, shift;
+
+   freq = clk_get_rate(cpts->refclk);
+
+   /* Calc the maximum number of seconds which we can run before
+* wrapping around.
+*/
+   maxsec = cpts->cc.mask;
+   do_div(maxsec, freq);
+   /* limit conversation rate to 10 sec as higher values will produce
+* too small mult factors and so reduce the conversion accuracy
+*/
+   if (maxsec > 10)
+   maxsec = 10;
+
+   if (cpts->cc_mult || cpts->cc.shift)
+   return;
+
+   clocks_calc_mult_shift(, , freq, NSEC_PER_SEC, maxsec);
+
+   cpts->cc_mult = mult;
+   cpts->cc.mult = mult;
+   cpts->cc.shift = shift;
+
+   frac = 0;
+   ns = cyclecounter_cyc2ns(>cc, freq, cpts->cc.mask, );
+
+   dev_info(cpts->dev,
+"CPTS: ref_clk_freq:%u calc_mult:%u calc_shift:%u error:%lld 
nsec/sec\n",
+freq, cpts->cc_mult, cpts->cc.shift, (ns - NSEC_PER_SEC));
+}
+
 static int cpts_of_parse(struct cpts *cpts, struct device_node *node)
 {
int ret = -EINVAL;
u32 prop;
 
-   if (of_property_read_u32(node, "cpts_clock_mult", ))
-   goto  of_error;
/* save cc.mult original value as it can be modified
 * by cpts_ptp_adjfreq().
 */
-   cpts->cc_mult = prop;
+   cpts->cc_mult = 0;
+   if (!of_property_read_u32(node, "cpts_clock_mult", ))
+   cpts->cc_mult = prop;
+
+   cpts->cc.shift = 0;
+   if (!of_property_read_u32(node, "cpts_clock_shift", ))
+   cpts->cc.shift = prop;
 
-   if (of_property_read_u32(node, "cpts_clock_shift", ))
-   goto  of_error;
-   cpts->cc.shift = prop;
+   if ((cpts->cc_mult && !cpts->cc.shift) ||
+   (!cpts->cc_mult && cpts->cc.shift))
+   goto of_error;
 
return 0;
 
@@ -463,6 +502,8 @@ struct cpts *cpts_create(struct device *dev, void __iomem 
*regs,
cpts->cc.mask = CLOCKSOURCE_MASK(32);
cpts->info = cpts_info;
 
+   cpts_calc_mult_shift(cpts);
+
return cpts;
 }
 EXPORT_SYMBOL_GPL(cpts_create);
-- 
2.10.1

[PATCH v4 01/13] net: ethernet: ti: cpts: switch to readl/writel_relaxed()

2016-12-05 Thread Grygorii Strashko

Switch to readl/writel_relaxed() APIs, because this is recommended
API and the CPTS IP is reused on Keystone 2 SoCs
where LE/BE modes are supported.

Signed-off-by: Grygorii Strashko 
Acked-by: Richard Cochran 
---
 drivers/net/ethernet/ti/cpts.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 85a55b4..a42c449 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -33,8 +33,8 @@
 
 #ifdef CONFIG_TI_CPTS
 
-#define cpts_read32(c, r)  __raw_readl(>reg->r)
-#define cpts_write32(c, v, r)  __raw_writel(v, >reg->r)
+#define cpts_read32(c, r)  readl_relaxed(>reg->r)
+#define cpts_write32(c, v, r)  writel_relaxed(v, >reg->r)
 
 static int event_expired(struct cpts_event *event)
 {
-- 
2.10.1

[PATCH v4 09/13] net: ethernet: ti: cpts: rework initialization/deinitialization

2016-12-05 Thread Grygorii Strashko

The current implementation CPTS initialization and deinitialization
(represented by cpts_register/unregister()) does too many static
initialization from .ndo_open(), which is reasonable to do once at probe
time instead, and also require caller to allocate memory for struct cpts,
which is internal for CPTS driver in general.

This patch splits CPTS initialization and deinitialization on two parts:

- static initializtion cpts_create()/cpts_release() which expected to be
executed when parent driver is probed/removed;

- dynamic part cpts_register/unregister() which expected to be executed
when network device is opened/closed.

As result, current code of CPTS parent driver - CPSW - will be simplified
(and it also will allow simplify adding support for Keystone 2 devices in
the future), plus more initialization errors will be catched earlier. In
addition, this change allows to clean up cpts.h for the case when CPTS is
disabled.

Signed-off-by: Grygorii Strashko 
---
 drivers/net/ethernet/ti/cpsw.c |  24 +-
 drivers/net/ethernet/ti/cpts.c | 102 -
 drivers/net/ethernet/ti/cpts.h |  26 +--
 3 files changed, 95 insertions(+), 57 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index ec05e20..deb008a 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1486,9 +1486,7 @@ static int cpsw_ndo_open(struct net_device *ndev)
if (ret < 0)
goto err_cleanup;
 
-   if (cpts_register(cpsw->dev, cpsw->cpts,
- cpsw->data.cpts_clock_mult,
- cpsw->data.cpts_clock_shift))
+   if (cpts_register(cpsw->cpts))
dev_err(priv->dev, "error registering cpts device\n");
 
}
@@ -2796,6 +2794,7 @@ static int cpsw_probe(struct platform_device *pdev)
struct cpdma_params dma_params;
struct cpsw_ale_params  ale_params;
void __iomem*ss_regs;
+   void __iomem*cpts_regs;
struct resource *res, *ss_res;
const struct of_device_id   *of_id;
struct gpio_descs   *mode;
@@ -2823,12 +2822,6 @@ static int cpsw_probe(struct platform_device *pdev)
priv->dev  = >dev;
priv->msg_enable = netif_msg_init(debug_level, CPSW_DEBUG);
cpsw->rx_packet_max = max(rx_packet_max, 128);
-   cpsw->cpts = devm_kzalloc(>dev, sizeof(struct cpts), GFP_KERNEL);
-   if (!cpsw->cpts) {
-   dev_err(>dev, "error allocating cpts\n");
-   ret = -ENOMEM;
-   goto clean_ndev_ret;
-   }
 
mode = devm_gpiod_get_array_optional(>dev, "mode", GPIOD_OUT_LOW);
if (IS_ERR(mode)) {
@@ -2916,7 +2909,7 @@ static int cpsw_probe(struct platform_device *pdev)
switch (cpsw->version) {
case CPSW_VERSION_1:
cpsw->host_port_regs = ss_regs + CPSW1_HOST_PORT_OFFSET;
-   cpsw->cpts->reg  = ss_regs + CPSW1_CPTS_OFFSET;
+   cpts_regs   = ss_regs + CPSW1_CPTS_OFFSET;
cpsw->hw_stats   = ss_regs + CPSW1_HW_STATS;
dma_params.dmaregs   = ss_regs + CPSW1_CPDMA_OFFSET;
dma_params.txhdp = ss_regs + CPSW1_STATERAM_OFFSET;
@@ -2930,7 +2923,7 @@ static int cpsw_probe(struct platform_device *pdev)
case CPSW_VERSION_3:
case CPSW_VERSION_4:
cpsw->host_port_regs = ss_regs + CPSW2_HOST_PORT_OFFSET;
-   cpsw->cpts->reg  = ss_regs + CPSW2_CPTS_OFFSET;
+   cpts_regs   = ss_regs + CPSW2_CPTS_OFFSET;
cpsw->hw_stats   = ss_regs + CPSW2_HW_STATS;
dma_params.dmaregs   = ss_regs + CPSW2_CPDMA_OFFSET;
dma_params.txhdp = ss_regs + CPSW2_STATERAM_OFFSET;
@@ -2997,6 +2990,14 @@ static int cpsw_probe(struct platform_device *pdev)
goto clean_dma_ret;
}
 
+   cpsw->cpts = cpts_create(cpsw->dev, cpts_regs,
+cpsw->data.cpts_clock_mult,
+cpsw->data.cpts_clock_shift);
+   if (IS_ERR(cpsw->cpts)) {
+   ret = PTR_ERR(cpsw->cpts);
+   goto clean_ale_ret;
+   }
+
ndev->irq = platform_get_irq(pdev, 1);
if (ndev->irq < 0) {
dev_err(priv->dev, "error getting irq resource\n");
@@ -3112,6 +3113,7 @@ static int cpsw_remove(struct platform_device *pdev)
unregister_netdev(cpsw->slaves[1].ndev);
unregister_netdev(ndev);
 
+   cpts_release(cpsw->cpts);
cpsw_ale_destroy(cpsw->ale);
cpdma_ctlr_destroy(cpsw->dma);
cpsw_remove_dt(pdev);
diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index fe1bb7f..9356803 100644
---

[PATCH v4 09/13] net: ethernet: ti: cpts: rework initialization/deinitialization

2016-12-05 Thread Grygorii Strashko

The current implementation CPTS initialization and deinitialization
(represented by cpts_register/unregister()) does too many static
initialization from .ndo_open(), which is reasonable to do once at probe
time instead, and also require caller to allocate memory for struct cpts,
which is internal for CPTS driver in general.

This patch splits CPTS initialization and deinitialization on two parts:

- static initializtion cpts_create()/cpts_release() which expected to be
executed when parent driver is probed/removed;

- dynamic part cpts_register/unregister() which expected to be executed
when network device is opened/closed.

As result, current code of CPTS parent driver - CPSW - will be simplified
(and it also will allow simplify adding support for Keystone 2 devices in
the future), plus more initialization errors will be catched earlier. In
addition, this change allows to clean up cpts.h for the case when CPTS is
disabled.

Signed-off-by: Grygorii Strashko 
---
 drivers/net/ethernet/ti/cpsw.c |  24 +-
 drivers/net/ethernet/ti/cpts.c | 102 -
 drivers/net/ethernet/ti/cpts.h |  26 +--
 3 files changed, 95 insertions(+), 57 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index ec05e20..deb008a 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1486,9 +1486,7 @@ static int cpsw_ndo_open(struct net_device *ndev)
if (ret < 0)
goto err_cleanup;
 
-   if (cpts_register(cpsw->dev, cpsw->cpts,
- cpsw->data.cpts_clock_mult,
- cpsw->data.cpts_clock_shift))
+   if (cpts_register(cpsw->cpts))
dev_err(priv->dev, "error registering cpts device\n");
 
}
@@ -2796,6 +2794,7 @@ static int cpsw_probe(struct platform_device *pdev)
struct cpdma_params dma_params;
struct cpsw_ale_params  ale_params;
void __iomem*ss_regs;
+   void __iomem*cpts_regs;
struct resource *res, *ss_res;
const struct of_device_id   *of_id;
struct gpio_descs   *mode;
@@ -2823,12 +2822,6 @@ static int cpsw_probe(struct platform_device *pdev)
priv->dev  = >dev;
priv->msg_enable = netif_msg_init(debug_level, CPSW_DEBUG);
cpsw->rx_packet_max = max(rx_packet_max, 128);
-   cpsw->cpts = devm_kzalloc(>dev, sizeof(struct cpts), GFP_KERNEL);
-   if (!cpsw->cpts) {
-   dev_err(>dev, "error allocating cpts\n");
-   ret = -ENOMEM;
-   goto clean_ndev_ret;
-   }
 
mode = devm_gpiod_get_array_optional(>dev, "mode", GPIOD_OUT_LOW);
if (IS_ERR(mode)) {
@@ -2916,7 +2909,7 @@ static int cpsw_probe(struct platform_device *pdev)
switch (cpsw->version) {
case CPSW_VERSION_1:
cpsw->host_port_regs = ss_regs + CPSW1_HOST_PORT_OFFSET;
-   cpsw->cpts->reg  = ss_regs + CPSW1_CPTS_OFFSET;
+   cpts_regs   = ss_regs + CPSW1_CPTS_OFFSET;
cpsw->hw_stats   = ss_regs + CPSW1_HW_STATS;
dma_params.dmaregs   = ss_regs + CPSW1_CPDMA_OFFSET;
dma_params.txhdp = ss_regs + CPSW1_STATERAM_OFFSET;
@@ -2930,7 +2923,7 @@ static int cpsw_probe(struct platform_device *pdev)
case CPSW_VERSION_3:
case CPSW_VERSION_4:
cpsw->host_port_regs = ss_regs + CPSW2_HOST_PORT_OFFSET;
-   cpsw->cpts->reg  = ss_regs + CPSW2_CPTS_OFFSET;
+   cpts_regs   = ss_regs + CPSW2_CPTS_OFFSET;
cpsw->hw_stats   = ss_regs + CPSW2_HW_STATS;
dma_params.dmaregs   = ss_regs + CPSW2_CPDMA_OFFSET;
dma_params.txhdp = ss_regs + CPSW2_STATERAM_OFFSET;
@@ -2997,6 +2990,14 @@ static int cpsw_probe(struct platform_device *pdev)
goto clean_dma_ret;
}
 
+   cpsw->cpts = cpts_create(cpsw->dev, cpts_regs,
+cpsw->data.cpts_clock_mult,
+cpsw->data.cpts_clock_shift);
+   if (IS_ERR(cpsw->cpts)) {
+   ret = PTR_ERR(cpsw->cpts);
+   goto clean_ale_ret;
+   }
+
ndev->irq = platform_get_irq(pdev, 1);
if (ndev->irq < 0) {
dev_err(priv->dev, "error getting irq resource\n");
@@ -3112,6 +3113,7 @@ static int cpsw_remove(struct platform_device *pdev)
unregister_netdev(cpsw->slaves[1].ndev);
unregister_netdev(ndev);
 
+   cpts_release(cpsw->cpts);
cpsw_ale_destroy(cpsw->ale);
cpdma_ctlr_destroy(cpsw->dma);
cpsw_remove_dt(pdev);
diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index fe1bb7f..9356803 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++

[PATCH v4 07/13] net: ethernet: ti: cpts: clean up event list if event pool is empty

2016-12-05 Thread Grygorii Strashko

From: WingMan Kwok 

When a CPTS user does not exit gracefully by disabling cpts
timestamping and leaving a joined multicast group, the system
continues to receive and timestamps the ptp packets which eventually
occupy all the event list entries.  When this happns, the added code
tries to remove some list entries which are expired.

Signed-off-by: WingMan Kwok 
Signed-off-by: Grygorii Strashko 
---
 drivers/net/ethernet/ti/cpts.c | 26 --
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index d3c1ac5..7ab1fa7 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -57,6 +57,26 @@ static int cpts_fifo_pop(struct cpts *cpts, u32 *high, u32 
*low)
return -1;
 }
 
+static int cpts_purge_events(struct cpts *cpts)
+{
+   struct list_head *this, *next;
+   struct cpts_event *event;
+   int removed = 0;
+
+   list_for_each_safe(this, next, >events) {
+   event = list_entry(this, struct cpts_event, list);
+   if (event_expired(event)) {
+   list_del_init(>list);
+   list_add(>list, >pool);
+   ++removed;
+   }
+   }
+
+   if (removed)
+   pr_debug("cpts: event pool cleaned up %d\n", removed);
+   return removed ? 0 : -1;
+}
+
 /*
  * Returns zero if matching event type was found.
  */
@@ -69,10 +89,12 @@ static int cpts_fifo_read(struct cpts *cpts, int match)
for (i = 0; i < CPTS_FIFO_DEPTH; i++) {
if (cpts_fifo_pop(cpts, , ))
break;
-   if (list_empty(>pool)) {
-   pr_err("cpts: event pool is empty\n");
+
+   if (list_empty(>pool) && cpts_purge_events(cpts)) {
+   pr_err("cpts: event pool empty\n");
return -1;
}
+
event = list_first_entry(>pool, struct cpts_event, list);
event->tmo = jiffies + 2;
event->high = hi;
-- 
2.10.1

[PATCH v4 04/13] net: ethernet: ti: cpts: fix unbalanced clk api usage in cpts_register/unregister

2016-12-05 Thread Grygorii Strashko

There are two issues with TI CPTS code which are reproducible when TI
CPSW ethX device passes few up/down iterations:
- cpts refclk prepare counter continuously incremented after each
up/down iteration;
- devm_clk_get(dev, "cpts") is called many times.

Hence, fix these issues by using clk_disable_unprepare() in
cpts_clk_release() and skipping devm_clk_get() if cpts refclk has been
acquired already.

Signed-off-by: Grygorii Strashko 
Acked-by: Richard Cochran 
---
 drivers/net/ethernet/ti/cpts.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 8cb0369..61198f1 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -230,18 +230,20 @@ static void cpts_overflow_check(struct work_struct *work)
 
 static void cpts_clk_init(struct device *dev, struct cpts *cpts)
 {
-   cpts->refclk = devm_clk_get(dev, "cpts");
-   if (IS_ERR(cpts->refclk)) {
-   dev_err(dev, "Failed to get cpts refclk\n");
-   cpts->refclk = NULL;
-   return;
+   if (!cpts->refclk) {
+   cpts->refclk = devm_clk_get(dev, "cpts");
+   if (IS_ERR(cpts->refclk)) {
+   dev_err(dev, "Failed to get cpts refclk\n");
+   cpts->refclk = NULL;
+   return;
+   }
}
clk_prepare_enable(cpts->refclk);
 }
 
 static void cpts_clk_release(struct cpts *cpts)
 {
-   clk_disable(cpts->refclk);
+   clk_disable_unprepare(cpts->refclk);
 }
 
 static int cpts_match(struct sk_buff *skb, unsigned int ptp_class,
-- 
2.10.1

[PATCH v4 07/13] net: ethernet: ti: cpts: clean up event list if event pool is empty

2016-12-05 Thread Grygorii Strashko

From: WingMan Kwok 

When a CPTS user does not exit gracefully by disabling cpts
timestamping and leaving a joined multicast group, the system
continues to receive and timestamps the ptp packets which eventually
occupy all the event list entries.  When this happns, the added code
tries to remove some list entries which are expired.

Signed-off-by: WingMan Kwok 
Signed-off-by: Grygorii Strashko 
---
 drivers/net/ethernet/ti/cpts.c | 26 --
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index d3c1ac5..7ab1fa7 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -57,6 +57,26 @@ static int cpts_fifo_pop(struct cpts *cpts, u32 *high, u32 
*low)
return -1;
 }
 
+static int cpts_purge_events(struct cpts *cpts)
+{
+   struct list_head *this, *next;
+   struct cpts_event *event;
+   int removed = 0;
+
+   list_for_each_safe(this, next, >events) {
+   event = list_entry(this, struct cpts_event, list);
+   if (event_expired(event)) {
+   list_del_init(>list);
+   list_add(>list, >pool);
+   ++removed;
+   }
+   }
+
+   if (removed)
+   pr_debug("cpts: event pool cleaned up %d\n", removed);
+   return removed ? 0 : -1;
+}
+
 /*
  * Returns zero if matching event type was found.
  */
@@ -69,10 +89,12 @@ static int cpts_fifo_read(struct cpts *cpts, int match)
for (i = 0; i < CPTS_FIFO_DEPTH; i++) {
if (cpts_fifo_pop(cpts, , ))
break;
-   if (list_empty(>pool)) {
-   pr_err("cpts: event pool is empty\n");
+
+   if (list_empty(>pool) && cpts_purge_events(cpts)) {
+   pr_err("cpts: event pool empty\n");
return -1;
}
+
event = list_first_entry(>pool, struct cpts_event, list);
event->tmo = jiffies + 2;
event->high = hi;
-- 
2.10.1

[PATCH v4 04/13] net: ethernet: ti: cpts: fix unbalanced clk api usage in cpts_register/unregister

2016-12-05 Thread Grygorii Strashko

There are two issues with TI CPTS code which are reproducible when TI
CPSW ethX device passes few up/down iterations:
- cpts refclk prepare counter continuously incremented after each
up/down iteration;
- devm_clk_get(dev, "cpts") is called many times.

Hence, fix these issues by using clk_disable_unprepare() in
cpts_clk_release() and skipping devm_clk_get() if cpts refclk has been
acquired already.

Signed-off-by: Grygorii Strashko 
Acked-by: Richard Cochran 
---
 drivers/net/ethernet/ti/cpts.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 8cb0369..61198f1 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -230,18 +230,20 @@ static void cpts_overflow_check(struct work_struct *work)
 
 static void cpts_clk_init(struct device *dev, struct cpts *cpts)
 {
-   cpts->refclk = devm_clk_get(dev, "cpts");
-   if (IS_ERR(cpts->refclk)) {
-   dev_err(dev, "Failed to get cpts refclk\n");
-   cpts->refclk = NULL;
-   return;
+   if (!cpts->refclk) {
+   cpts->refclk = devm_clk_get(dev, "cpts");
+   if (IS_ERR(cpts->refclk)) {
+   dev_err(dev, "Failed to get cpts refclk\n");
+   cpts->refclk = NULL;
+   return;
+   }
}
clk_prepare_enable(cpts->refclk);
 }
 
 static void cpts_clk_release(struct cpts *cpts)
 {
-   clk_disable(cpts->refclk);
+   clk_disable_unprepare(cpts->refclk);
 }
 
 static int cpts_match(struct sk_buff *skb, unsigned int ptp_class,
-- 
2.10.1

[PATCH v4 08/13] net: ethernet: ti: cpts: drop excessive writes to CTRL and INT_EN regs

2016-12-05 Thread Grygorii Strashko

CPTS module and IRQs are always enabled when CPTS is registered,
before starting overflow check work, and disabled during
deregistration, when overflow check work has been canceled already.
So, It doesn't require to (re)enable CPTS module and IRQs in
cpts_overflow_check().

Signed-off-by: Grygorii Strashko 
Acked-by: Richard Cochran 
---
 drivers/net/ethernet/ti/cpts.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 7ab1fa7..fe1bb7f 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -243,8 +243,6 @@ static void cpts_overflow_check(struct work_struct *work)
struct timespec64 ts;
struct cpts *cpts = container_of(work, struct cpts, overflow_work.work);
 
-   cpts_write32(cpts, CPTS_EN, control);
-   cpts_write32(cpts, TS_PEND_EN, int_enable);
cpts_ptp_gettime(>info, );
pr_debug("cpts overflow check at %lld.%09lu\n", ts.tv_sec, ts.tv_nsec);
schedule_delayed_work(>overflow_work, CPTS_OVERFLOW_PERIOD);
-- 
2.10.1

[PATCH v4 08/13] net: ethernet: ti: cpts: drop excessive writes to CTRL and INT_EN regs

2016-12-05 Thread Grygorii Strashko

CPTS module and IRQs are always enabled when CPTS is registered,
before starting overflow check work, and disabled during
deregistration, when overflow check work has been canceled already.
So, It doesn't require to (re)enable CPTS module and IRQs in
cpts_overflow_check().

Signed-off-by: Grygorii Strashko 
Acked-by: Richard Cochran 
---
 drivers/net/ethernet/ti/cpts.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c
index 7ab1fa7..fe1bb7f 100644
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -243,8 +243,6 @@ static void cpts_overflow_check(struct work_struct *work)
struct timespec64 ts;
struct cpts *cpts = container_of(work, struct cpts, overflow_work.work);
 
-   cpts_write32(cpts, CPTS_EN, control);
-   cpts_write32(cpts, TS_PEND_EN, int_enable);
cpts_ptp_gettime(>info, );
pr_debug("cpts overflow check at %lld.%09lu\n", ts.tv_sec, ts.tv_nsec);
schedule_delayed_work(>overflow_work, CPTS_OVERFLOW_PERIOD);
-- 
2.10.1

Re: [PATCH 1/1 v2] isdn: hisax: set error code on failure

2016-12-05 Thread David Miller

From: Pan Bian 
Date: Sun,  4 Dec 2016 18:43:31 +0800

> From: Pan Bian 
> 
> In function hfc4s8s_probe(), the value of return variable err should be
> negative on failures. However, when the call to request_region() returns
> NULL, the value of err is 0. This patch fixes the bug, assigning
> "-EBUSY" to err on the path that request_region() fails.
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188931
> 
> Signed-off-by: Pan Bian 

Applied.

Re: [PATCH 1/1 v2] isdn: hisax: set error code on failure

2016-12-05 Thread David Miller

From: Pan Bian 
Date: Sun,  4 Dec 2016 18:43:31 +0800

> From: Pan Bian 
> 
> In function hfc4s8s_probe(), the value of return variable err should be
> negative on failures. However, when the call to request_region() returns
> NULL, the value of err is 0. This patch fixes the bug, assigning
> "-EBUSY" to err on the path that request_region() fails.
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188931
> 
> Signed-off-by: Pan Bian 

Applied.

Re: [PATCH 5/7] perf tools: Force fixdep compilation at the start of the build

2016-12-05 Thread Arnaldo Carvalho de Melo

Em Sun, Dec 04, 2016 at 09:42:56PM +0100, Jiri Olsa escreveu:
> The fixdep tool needs to be built before everything else,
> because it fixes every object dependency file.
> 
> We handle this currently by making all objects to depend
> on fixdep, which is error prone and is easily forgotten
> when new object is added.
> 
> Instead of this, this patch force fixdep tool to be built
> as the first target in the separate make session. This way
> we don't need to handle extra fixdep dependencies and we are
> certain there's no fixdep race with any parallel make job.

With your updated version I'm getting this:

$ make -k O=/tmp/build/perf -C tools/perf install-bin

  LINK /tmp/build/perf/libperf-gtk.so
  INSTALL  GTK UI
install: cannot change permissions of ‘/lib64’: Operation not permitted
install: cannot create regular file '/lib64/libperf-gtk.so': Permission denied
Makefile.perf:738: recipe for target 'install-gtk' failed
make[2]: *** [install-gtk] Error 1
make[2]: *** Waiting for unfinished jobs
Makefile.perf:383: recipe for target 'sub-make' failed
make[1]: *** [sub-make] Error 2
Makefile:108: recipe for target 'install-bin' failed
make: *** [install-bin] Error 2
make: Leaving directory '/home/acme/git/linux/tools/perf'
 
> Link: http://lkml.kernel.org/n/tip-b6wi5cssl8apu74o2kbfh...@git.kernel.org
> Signed-off-by: Jiri Olsa 
> ---
>  tools/perf/Makefile.perf | 54 
> +---
>  1 file changed, 42 insertions(+), 12 deletions(-)
> 
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 112f4f7bb200..25d9d6653746 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -179,6 +179,27 @@ ifeq ($(filter-out 
> $(NON_CONFIG_TARGETS),$(MAKECMDGOALS)),)
>  endif
>  endif
>  
> +# The fixdep build - we force fixdep tool to be built as
> +# the first target in the separate make session not to be
> +# disturbed by any parallel make jobs. Once fixdep is done
> +# we issue the requested build with FIXDEP=1 variable.
> +#
> +# The fixdep build is disabled for $(NON_CONFIG_TARGETS)
> +# targets, because it's not necessary.
> +
> +ifdef FIXDEP
> +  force_fixdep := 0
> +else
> +  force_fixdep := $(config)
> +endif
> +
> +# Disable config, we are forcing fixdep rule first
> +# and it does not need config data.
> +
> +ifeq ($(force_fixdep),1)
> +  config := 0
> +endif
> +
>  # Set FEATURE_TESTS to 'all' so all possible feature checkers are executed.
>  # Without this setting the output feature dump file misses some features, for
>  # example, liberty. Select all checkers so we won't get an incomplete feature
> @@ -324,10 +345,22 @@ LIBS = -Wl,--whole-archive $(PERFLIBS) 
> -Wl,--no-whole-archive -Wl,--start-group
>  
>  export INSTALL SHELL_PATH
>  
> +export srctree OUTPUT RM CC LD AR CFLAGS V BISON FLEX AWK
> +export HOSTCC HOSTLD HOSTAR
> +include $(srctree)/tools/build/Makefile.include
> +
>  ### Build rules
>  
>  SHELL = $(SHELL_PATH)
>  
> +ifeq ($(force_fixdep),1)
> +goals := $(filter-out all sub-make, $(MAKECMDGOALS))
> +
> +$(goals) all: sub-make
> +
> +sub-make: fixdep
> + $(Q)$(MAKE) FIXDEP=1 -f Makefile.perf $(goals)
> +else
>  all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) 
> $(OTHER_PROGRAMS)
>  
>  $(OUTPUT)python/perf.so: $(PYTHON_EXT_SRCS) $(PYTHON_EXT_DEPS) 
> $(LIBTRACEEVENT_DYNAMIC_LIST)
> @@ -348,10 +381,6 @@ strip: $(PROGRAMS) $(OUTPUT)perf
>  
>  PERF_IN := $(OUTPUT)perf-in.o
>  
> -export srctree OUTPUT RM CC LD AR CFLAGS V BISON FLEX AWK
> -export HOSTCC HOSTLD HOSTAR
> -include $(srctree)/tools/build/Makefile.include
> -
>  JEVENTS   := $(OUTPUT)pmu-events/jevents
>  JEVENTS_IN:= $(OUTPUT)pmu-events/jevents-in.o
>  
> @@ -470,7 +499,7 @@ $(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) 
> $(LIBTRACEEVENT_DYNAMIC_L
>   $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) 
> $(LIBTRACEEVENT_DYNAMIC_LIST_LDFLAGS) \
>   $(PERF_IN) $(PMU_EVENTS_IN) $(LIBS) -o $@
>  
> -$(GTK_IN): fixdep FORCE
> +$(GTK_IN): FORCE
>   $(Q)$(MAKE) $(build)=gtk
>  
>  $(OUTPUT)libperf-gtk.so: $(GTK_IN) $(PERFLIBS)
> @@ -515,7 +544,7 @@ endif
>  __build-dir = $(subst $(OUTPUT),,$(dir $@))
>  build-dir   = $(if $(__build-dir),$(__build-dir),.)
>  
> -prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h fixdep archheaders
> +prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders
>  
>  $(OUTPUT)%.o: %.c prepare FORCE
>   $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(build-dir) $@
> @@ -565,7 +594,7 @@ $(patsubst perf-%,%.o,$(PROGRAMS)): $(wildcard */*.h)
>  
>  LIBPERF_IN := $(OUTPUT)libperf-in.o
>  
> -$(LIBPERF_IN): prepare fixdep FORCE
> +$(LIBPERF_IN): prepare FORCE
>   $(Q)$(MAKE) $(build)=libperf
>  
>  $(LIB_FILE): $(LIBPERF_IN)
> @@ -573,10 +602,10 @@ $(LIB_FILE): $(LIBPERF_IN)
>  
>  LIBTRACEEVENT_FLAGS += plugin_dir=$(plugindir_SQ)
>  
> -$(LIBTRACEEVENT): fixdep FORCE
> +$(LIBTRACEEVENT): FORCE
>   $(Q)$(MAKE) -C

Re: [PATCH 5/7] perf tools: Force fixdep compilation at the start of the build

2016-12-05 Thread Arnaldo Carvalho de Melo

Em Sun, Dec 04, 2016 at 09:42:56PM +0100, Jiri Olsa escreveu:
> The fixdep tool needs to be built before everything else,
> because it fixes every object dependency file.
> 
> We handle this currently by making all objects to depend
> on fixdep, which is error prone and is easily forgotten
> when new object is added.
> 
> Instead of this, this patch force fixdep tool to be built
> as the first target in the separate make session. This way
> we don't need to handle extra fixdep dependencies and we are
> certain there's no fixdep race with any parallel make job.

With your updated version I'm getting this:

$ make -k O=/tmp/build/perf -C tools/perf install-bin

  LINK /tmp/build/perf/libperf-gtk.so
  INSTALL  GTK UI
install: cannot change permissions of ‘/lib64’: Operation not permitted
install: cannot create regular file '/lib64/libperf-gtk.so': Permission denied
Makefile.perf:738: recipe for target 'install-gtk' failed
make[2]: *** [install-gtk] Error 1
make[2]: *** Waiting for unfinished jobs
Makefile.perf:383: recipe for target 'sub-make' failed
make[1]: *** [sub-make] Error 2
Makefile:108: recipe for target 'install-bin' failed
make: *** [install-bin] Error 2
make: Leaving directory '/home/acme/git/linux/tools/perf'
 
> Link: http://lkml.kernel.org/n/tip-b6wi5cssl8apu74o2kbfh...@git.kernel.org
> Signed-off-by: Jiri Olsa 
> ---
>  tools/perf/Makefile.perf | 54 
> +---
>  1 file changed, 42 insertions(+), 12 deletions(-)
> 
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 112f4f7bb200..25d9d6653746 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -179,6 +179,27 @@ ifeq ($(filter-out 
> $(NON_CONFIG_TARGETS),$(MAKECMDGOALS)),)
>  endif
>  endif
>  
> +# The fixdep build - we force fixdep tool to be built as
> +# the first target in the separate make session not to be
> +# disturbed by any parallel make jobs. Once fixdep is done
> +# we issue the requested build with FIXDEP=1 variable.
> +#
> +# The fixdep build is disabled for $(NON_CONFIG_TARGETS)
> +# targets, because it's not necessary.
> +
> +ifdef FIXDEP
> +  force_fixdep := 0
> +else
> +  force_fixdep := $(config)
> +endif
> +
> +# Disable config, we are forcing fixdep rule first
> +# and it does not need config data.
> +
> +ifeq ($(force_fixdep),1)
> +  config := 0
> +endif
> +
>  # Set FEATURE_TESTS to 'all' so all possible feature checkers are executed.
>  # Without this setting the output feature dump file misses some features, for
>  # example, liberty. Select all checkers so we won't get an incomplete feature
> @@ -324,10 +345,22 @@ LIBS = -Wl,--whole-archive $(PERFLIBS) 
> -Wl,--no-whole-archive -Wl,--start-group
>  
>  export INSTALL SHELL_PATH
>  
> +export srctree OUTPUT RM CC LD AR CFLAGS V BISON FLEX AWK
> +export HOSTCC HOSTLD HOSTAR
> +include $(srctree)/tools/build/Makefile.include
> +
>  ### Build rules
>  
>  SHELL = $(SHELL_PATH)
>  
> +ifeq ($(force_fixdep),1)
> +goals := $(filter-out all sub-make, $(MAKECMDGOALS))
> +
> +$(goals) all: sub-make
> +
> +sub-make: fixdep
> + $(Q)$(MAKE) FIXDEP=1 -f Makefile.perf $(goals)
> +else
>  all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) 
> $(OTHER_PROGRAMS)
>  
>  $(OUTPUT)python/perf.so: $(PYTHON_EXT_SRCS) $(PYTHON_EXT_DEPS) 
> $(LIBTRACEEVENT_DYNAMIC_LIST)
> @@ -348,10 +381,6 @@ strip: $(PROGRAMS) $(OUTPUT)perf
>  
>  PERF_IN := $(OUTPUT)perf-in.o
>  
> -export srctree OUTPUT RM CC LD AR CFLAGS V BISON FLEX AWK
> -export HOSTCC HOSTLD HOSTAR
> -include $(srctree)/tools/build/Makefile.include
> -
>  JEVENTS   := $(OUTPUT)pmu-events/jevents
>  JEVENTS_IN:= $(OUTPUT)pmu-events/jevents-in.o
>  
> @@ -470,7 +499,7 @@ $(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(PMU_EVENTS_IN) 
> $(LIBTRACEEVENT_DYNAMIC_L
>   $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) 
> $(LIBTRACEEVENT_DYNAMIC_LIST_LDFLAGS) \
>   $(PERF_IN) $(PMU_EVENTS_IN) $(LIBS) -o $@
>  
> -$(GTK_IN): fixdep FORCE
> +$(GTK_IN): FORCE
>   $(Q)$(MAKE) $(build)=gtk
>  
>  $(OUTPUT)libperf-gtk.so: $(GTK_IN) $(PERFLIBS)
> @@ -515,7 +544,7 @@ endif
>  __build-dir = $(subst $(OUTPUT),,$(dir $@))
>  build-dir   = $(if $(__build-dir),$(__build-dir),.)
>  
> -prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h fixdep archheaders
> +prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders
>  
>  $(OUTPUT)%.o: %.c prepare FORCE
>   $(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(build-dir) $@
> @@ -565,7 +594,7 @@ $(patsubst perf-%,%.o,$(PROGRAMS)): $(wildcard */*.h)
>  
>  LIBPERF_IN := $(OUTPUT)libperf-in.o
>  
> -$(LIBPERF_IN): prepare fixdep FORCE
> +$(LIBPERF_IN): prepare FORCE
>   $(Q)$(MAKE) $(build)=libperf
>  
>  $(LIB_FILE): $(LIBPERF_IN)
> @@ -573,10 +602,10 @@ $(LIB_FILE): $(LIBPERF_IN)
>  
>  LIBTRACEEVENT_FLAGS += plugin_dir=$(plugindir_SQ)
>  
> -$(LIBTRACEEVENT): fixdep FORCE
> +$(LIBTRACEEVENT): FORCE
>   $(Q)$(MAKE) -C $(TRACE_EVENT_DIR)

Re: [PATCH] usbip: fix warning in vhci_hcd_probe/lockdep_init_map

2016-12-05 Thread Shuah Khan

Hi Andrey,

On 12/05/2016 12:56 PM, Shuah Khan wrote:
> vhci_hcd calls sysfs_create_group() with dynamically allocated sysfs
> attributes triggering the lock-class key not persistent warning. Call
> sysfs_attr_init() for dynamically allocated sysfs attributes to fix it.
> 
> vhci_hcd vhci_hcd: USB/IP Virtual Host Controller
> vhci_hcd vhci_hcd: new USB bus registered, assigned bus number 2
> BUG: key 88006a7e8d18 not in .data!
> [ cut here ]
> WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3131
> lockdep_init_map+0x60c/0x770
> DEBUG_LOCKS_WARN_ON(1)[1.567044] Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc7+ #58
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>  88006bce6eb8 81f96c8a 0a02 11000d79cd6a
>  ed000d79cd62 00046bce6ed8 41b58ab3 8598af40
>  81f969f8  41b58ab3 0200
> Call Trace:
>  [< inline >] __dump_stack lib/dump_stack.c:15
>  [] dump_stack+0x292/0x398 lib/dump_stack.c:51
>  [] __warn+0x19f/0x1e0 kernel/panic.c:550
>  [] warn_slowpath_fmt+0xc5/0x110 kernel/panic.c:565
>  [] lockdep_init_map+0x60c/0x770 
> kernel/locking/lockdep.c:3131
>  [] __kernfs_create_file+0x114/0x2a0 fs/kernfs/file.c:954
>  [] sysfs_add_file_mode_ns+0x225/0x520 fs/sysfs/file.c:305
>  [< inline >] create_files fs/sysfs/group.c:64
>  [] internal_create_group+0x239/0x8f0 fs/sysfs/group.c:134
>  [] sysfs_create_group+0x1f/0x30 fs/sysfs/group.c:156
>  [] vhci_start+0x5b4/0x7a0 drivers/usb/usbip/vhci_hcd.c:978
>  [] usb_add_hcd+0x8da/0x1c60 drivers/usb/core/hcd.c:2867
>  [] vhci_hcd_probe+0x97/0x130
> drivers/usb/usbip/vhci_hcd.c:1103
>  ---
>  ---
> ---[ end trace c33c7b202cf3aac8 ]---
> 
> Signed-off-by: Shuah Khan 
> Reported-by: Andrey Konovalov 

Here is the fix. Fixed the warning I reproduced on my system.
Let me know if it works for you.

thanks,
-- Shuah

> ---
>  drivers/usb/usbip/vhci_sysfs.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/usb/usbip/vhci_sysfs.c b/drivers/usb/usbip/vhci_sysfs.c
> index c404017..b96e5b1 100644
> --- a/drivers/usb/usbip/vhci_sysfs.c
> +++ b/drivers/usb/usbip/vhci_sysfs.c
> @@ -361,6 +361,7 @@ static void set_status_attr(int id)
>   status->attr.attr.name = status->name;
>   status->attr.attr.mode = S_IRUGO;
>   status->attr.show = status_show;
> + sysfs_attr_init(>attr.attr);
>  }
>  
>  static int init_status_attrs(void)
>

Re: [PATCH] usbip: fix warning in vhci_hcd_probe/lockdep_init_map

2016-12-05 Thread Shuah Khan

Hi Andrey,

On 12/05/2016 12:56 PM, Shuah Khan wrote:
> vhci_hcd calls sysfs_create_group() with dynamically allocated sysfs
> attributes triggering the lock-class key not persistent warning. Call
> sysfs_attr_init() for dynamically allocated sysfs attributes to fix it.
> 
> vhci_hcd vhci_hcd: USB/IP Virtual Host Controller
> vhci_hcd vhci_hcd: new USB bus registered, assigned bus number 2
> BUG: key 88006a7e8d18 not in .data!
> [ cut here ]
> WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3131
> lockdep_init_map+0x60c/0x770
> DEBUG_LOCKS_WARN_ON(1)[1.567044] Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc7+ #58
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>  88006bce6eb8 81f96c8a 0a02 11000d79cd6a
>  ed000d79cd62 00046bce6ed8 41b58ab3 8598af40
>  81f969f8  41b58ab3 0200
> Call Trace:
>  [< inline >] __dump_stack lib/dump_stack.c:15
>  [] dump_stack+0x292/0x398 lib/dump_stack.c:51
>  [] __warn+0x19f/0x1e0 kernel/panic.c:550
>  [] warn_slowpath_fmt+0xc5/0x110 kernel/panic.c:565
>  [] lockdep_init_map+0x60c/0x770 
> kernel/locking/lockdep.c:3131
>  [] __kernfs_create_file+0x114/0x2a0 fs/kernfs/file.c:954
>  [] sysfs_add_file_mode_ns+0x225/0x520 fs/sysfs/file.c:305
>  [< inline >] create_files fs/sysfs/group.c:64
>  [] internal_create_group+0x239/0x8f0 fs/sysfs/group.c:134
>  [] sysfs_create_group+0x1f/0x30 fs/sysfs/group.c:156
>  [] vhci_start+0x5b4/0x7a0 drivers/usb/usbip/vhci_hcd.c:978
>  [] usb_add_hcd+0x8da/0x1c60 drivers/usb/core/hcd.c:2867
>  [] vhci_hcd_probe+0x97/0x130
> drivers/usb/usbip/vhci_hcd.c:1103
>  ---
>  ---
> ---[ end trace c33c7b202cf3aac8 ]---
> 
> Signed-off-by: Shuah Khan 
> Reported-by: Andrey Konovalov 

Here is the fix. Fixed the warning I reproduced on my system.
Let me know if it works for you.

thanks,
-- Shuah

> ---
>  drivers/usb/usbip/vhci_sysfs.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/usb/usbip/vhci_sysfs.c b/drivers/usb/usbip/vhci_sysfs.c
> index c404017..b96e5b1 100644
> --- a/drivers/usb/usbip/vhci_sysfs.c
> +++ b/drivers/usb/usbip/vhci_sysfs.c
> @@ -361,6 +361,7 @@ static void set_status_attr(int id)
>   status->attr.attr.name = status->name;
>   status->attr.attr.mode = S_IRUGO;
>   status->attr.show = status_show;
> + sysfs_attr_init(>attr.attr);
>  }
>  
>  static int init_status_attrs(void)
>

Re: [PATCH 1/1] net: ethernet: broadcom: fix improper return value

2016-12-05 Thread David Miller

From: Pan Bian 
Date: Sun,  4 Dec 2016 14:29:29 +0800

> From: Pan Bian 
> 
> Marco BNX2X_ALLOC_AND_SET(arr, lbl, func) calls kmalloc() to allocate
> memory, and jumps to label "lbl" if the allocation fails. Label "lbl"
> first cleans memory and then returns variable rc. Before calling the
> macro, the value of variable rc is 0. Because 0 means no error, the
> callers of bnx2x_init_firmware() may be misled. This patch fixes the bug,
> assigning "-ENOMEM" to rc before calling macro NX2X_ALLOC_AND_SET().
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189141
> 
> Signed-off-by: Pan Bian 

Applied, but...

> @@ -13505,6 +13505,7 @@ static int bnx2x_init_firmware(struct bnx2x *bp)
>  
>   /* Initialize the pointers to the init arrays */
>   /* Blob */
> + rc = -ENOMEM;
>   BNX2X_ALLOC_AND_SET(init_data, request_firmware_exit, be32_to_cpu_n);
>  
>   /* Opcodes */

These kinds of macros which internally change control flow should always
be avoided.

Re: Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Logan Gunthorpe




On 05/12/16 12:46 PM, Jason Gunthorpe wrote:

NVMe might have to deal with pci-e hot-unplug, which is a similar
problem-class to the GPU case..


Sure, but if the NVMe device gets hot-unplugged it means that all the 
CMB mappings are useless and need to be torn down. This probably means 
killing any process that has mappings open.



In any event the allocator still needs to track which regions are in
use and be able to hook 'free' from userspace. That does suggest it
should be integrated into the nvme driver and not a bolt on driver..


Yup, that's correct. And yes, I've never suggested this to be a bolt on 
driver -- I always expected for it to get integrated into the nvme 
driver. (iopmem was not meant for this.)


Logan

Re: [PATCH 1/1] net: ethernet: broadcom: fix improper return value

2016-12-05 Thread David Miller

From: Pan Bian 
Date: Sun,  4 Dec 2016 14:29:29 +0800

> From: Pan Bian 
> 
> Marco BNX2X_ALLOC_AND_SET(arr, lbl, func) calls kmalloc() to allocate
> memory, and jumps to label "lbl" if the allocation fails. Label "lbl"
> first cleans memory and then returns variable rc. Before calling the
> macro, the value of variable rc is 0. Because 0 means no error, the
> callers of bnx2x_init_firmware() may be misled. This patch fixes the bug,
> assigning "-ENOMEM" to rc before calling macro NX2X_ALLOC_AND_SET().
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189141
> 
> Signed-off-by: Pan Bian 

Applied, but...

> @@ -13505,6 +13505,7 @@ static int bnx2x_init_firmware(struct bnx2x *bp)
>  
>   /* Initialize the pointers to the init arrays */
>   /* Blob */
> + rc = -ENOMEM;
>   BNX2X_ALLOC_AND_SET(init_data, request_firmware_exit, be32_to_cpu_n);
>  
>   /* Opcodes */

These kinds of macros which internally change control flow should always
be avoided.

Re: Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Logan Gunthorpe




On 05/12/16 12:46 PM, Jason Gunthorpe wrote:

NVMe might have to deal with pci-e hot-unplug, which is a similar
problem-class to the GPU case..


Sure, but if the NVMe device gets hot-unplugged it means that all the 
CMB mappings are useless and need to be torn down. This probably means 
killing any process that has mappings open.



In any event the allocator still needs to track which regions are in
use and be able to hook 'free' from userspace. That does suggest it
should be integrated into the nvme driver and not a bolt on driver..


Yup, that's correct. And yes, I've never suggested this to be a bolt on 
driver -- I always expected for it to get integrated into the nvme 
driver. (iopmem was not meant for this.)


Logan

[PATCH] usbip: fix warning in vhci_hcd_probe/lockdep_init_map

2016-12-05 Thread Shuah Khan

vhci_hcd calls sysfs_create_group() with dynamically allocated sysfs
attributes triggering the lock-class key not persistent warning. Call
sysfs_attr_init() for dynamically allocated sysfs attributes to fix it.

vhci_hcd vhci_hcd: USB/IP Virtual Host Controller
vhci_hcd vhci_hcd: new USB bus registered, assigned bus number 2
BUG: key 88006a7e8d18 not in .data!
[ cut here ]
WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3131
lockdep_init_map+0x60c/0x770
DEBUG_LOCKS_WARN_ON(1)[1.567044] Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc7+ #58
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 88006bce6eb8 81f96c8a 0a02 11000d79cd6a
 ed000d79cd62 00046bce6ed8 41b58ab3 8598af40
 81f969f8  41b58ab3 0200
Call Trace:
 [< inline >] __dump_stack lib/dump_stack.c:15
 [] dump_stack+0x292/0x398 lib/dump_stack.c:51
 [] __warn+0x19f/0x1e0 kernel/panic.c:550
 [] warn_slowpath_fmt+0xc5/0x110 kernel/panic.c:565
 [] lockdep_init_map+0x60c/0x770 kernel/locking/lockdep.c:3131
 [] __kernfs_create_file+0x114/0x2a0 fs/kernfs/file.c:954
 [] sysfs_add_file_mode_ns+0x225/0x520 fs/sysfs/file.c:305
 [< inline >] create_files fs/sysfs/group.c:64
 [] internal_create_group+0x239/0x8f0 fs/sysfs/group.c:134
 [] sysfs_create_group+0x1f/0x30 fs/sysfs/group.c:156
 [] vhci_start+0x5b4/0x7a0 drivers/usb/usbip/vhci_hcd.c:978
 [] usb_add_hcd+0x8da/0x1c60 drivers/usb/core/hcd.c:2867
 [] vhci_hcd_probe+0x97/0x130
drivers/usb/usbip/vhci_hcd.c:1103
 ---
 ---
---[ end trace c33c7b202cf3aac8 ]---

Signed-off-by: Shuah Khan 
Reported-by: Andrey Konovalov 
---
 drivers/usb/usbip/vhci_sysfs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/usb/usbip/vhci_sysfs.c b/drivers/usb/usbip/vhci_sysfs.c
index c404017..b96e5b1 100644
--- a/drivers/usb/usbip/vhci_sysfs.c
+++ b/drivers/usb/usbip/vhci_sysfs.c
@@ -361,6 +361,7 @@ static void set_status_attr(int id)
status->attr.attr.name = status->name;
status->attr.attr.mode = S_IRUGO;
status->attr.show = status_show;
+   sysfs_attr_init(>attr.attr);
 }
 
 static int init_status_attrs(void)
-- 
2.7.4

[PATCH] usbip: fix warning in vhci_hcd_probe/lockdep_init_map

2016-12-05 Thread Shuah Khan

vhci_hcd calls sysfs_create_group() with dynamically allocated sysfs
attributes triggering the lock-class key not persistent warning. Call
sysfs_attr_init() for dynamically allocated sysfs attributes to fix it.

vhci_hcd vhci_hcd: USB/IP Virtual Host Controller
vhci_hcd vhci_hcd: new USB bus registered, assigned bus number 2
BUG: key 88006a7e8d18 not in .data!
[ cut here ]
WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3131
lockdep_init_map+0x60c/0x770
DEBUG_LOCKS_WARN_ON(1)[1.567044] Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc7+ #58
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 88006bce6eb8 81f96c8a 0a02 11000d79cd6a
 ed000d79cd62 00046bce6ed8 41b58ab3 8598af40
 81f969f8  41b58ab3 0200
Call Trace:
 [< inline >] __dump_stack lib/dump_stack.c:15
 [] dump_stack+0x292/0x398 lib/dump_stack.c:51
 [] __warn+0x19f/0x1e0 kernel/panic.c:550
 [] warn_slowpath_fmt+0xc5/0x110 kernel/panic.c:565
 [] lockdep_init_map+0x60c/0x770 kernel/locking/lockdep.c:3131
 [] __kernfs_create_file+0x114/0x2a0 fs/kernfs/file.c:954
 [] sysfs_add_file_mode_ns+0x225/0x520 fs/sysfs/file.c:305
 [< inline >] create_files fs/sysfs/group.c:64
 [] internal_create_group+0x239/0x8f0 fs/sysfs/group.c:134
 [] sysfs_create_group+0x1f/0x30 fs/sysfs/group.c:156
 [] vhci_start+0x5b4/0x7a0 drivers/usb/usbip/vhci_hcd.c:978
 [] usb_add_hcd+0x8da/0x1c60 drivers/usb/core/hcd.c:2867
 [] vhci_hcd_probe+0x97/0x130
drivers/usb/usbip/vhci_hcd.c:1103
 ---
 ---
---[ end trace c33c7b202cf3aac8 ]---

Signed-off-by: Shuah Khan 
Reported-by: Andrey Konovalov 
---
 drivers/usb/usbip/vhci_sysfs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/usb/usbip/vhci_sysfs.c b/drivers/usb/usbip/vhci_sysfs.c
index c404017..b96e5b1 100644
--- a/drivers/usb/usbip/vhci_sysfs.c
+++ b/drivers/usb/usbip/vhci_sysfs.c
@@ -361,6 +361,7 @@ static void set_status_attr(int id)
status->attr.attr.name = status->name;
status->attr.attr.mode = S_IRUGO;
status->attr.show = status_show;
+   sysfs_attr_init(>attr.attr);
 }
 
 static int init_status_attrs(void)
-- 
2.7.4

Re: [PATCH 1/1] atm: fix improper return value

2016-12-05 Thread David Miller

From: Pan Bian 
Date: Sun,  4 Dec 2016 13:45:15 +0800

> From: Pan Bian 
> 
> It returns variable "error" when ioremap_nocache() returns a NULL
> pointer. The value of "error" is 0 then, which will mislead the callers
> to believe that there is no error. This patch fixes the bug, returning
> "-ENOMEM".
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189021
> 
> Signed-off-by: Pan Bian 

Applied.

Re: [PATCH 1/1] atm: fix improper return value

2016-12-05 Thread David Miller

From: Pan Bian 
Date: Sun,  4 Dec 2016 13:45:15 +0800

> From: Pan Bian 
> 
> It returns variable "error" when ioremap_nocache() returns a NULL
> pointer. The value of "error" is 0 then, which will mislead the callers
> to believe that there is no error. This patch fixes the bug, returning
> "-ENOMEM".
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189021
> 
> Signed-off-by: Pan Bian 

Applied.

Re: [PATCH 1/1] net: irda: set error code on failures

2016-12-05 Thread David Miller

From: Pan Bian 
Date: Sun,  4 Dec 2016 13:27:40 +0800

> From: Pan Bian 
> 
> When the calls to kzalloc() fail, the value of return variable ret may
> be 0. 0 means success in this context. This patch fixes the bug,
> assigning "-ENOMEM" to ret before calling kzalloc().
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188971
> 
> Signed-off-by: Pan Bian 

Applied.

Re: [PATCH 1/1] net: irda: set error code on failures

2016-12-05 Thread David Miller

From: Pan Bian 
Date: Sun,  4 Dec 2016 13:27:40 +0800

> From: Pan Bian 
> 
> When the calls to kzalloc() fail, the value of return variable ret may
> be 0. 0 means success in this context. This patch fixes the bug,
> assigning "-ENOMEM" to ret before calling kzalloc().
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188971
> 
> Signed-off-by: Pan Bian 

Applied.

Re: [PATCH 1/1] net: caif: remove ineffective check

2016-12-05 Thread David Miller

From: Pan Bian 
Date: Sun,  4 Dec 2016 12:15:44 +0800

> The check of the return value of sock_register() is ineffective.
> "if(!err)" seems to be a typo. It is better to propagate the error code
> to the callers of caif_sktinit_module(). This patch removes the check
> statment and directly returns the result of sock_register().
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188751
> Signed-off-by: Pan Bian 

Applied.

Re: [PATCH 1/1] net: caif: remove ineffective check

2016-12-05 Thread David Miller

From: Pan Bian 
Date: Sun,  4 Dec 2016 12:15:44 +0800

> The check of the return value of sock_register() is ineffective.
> "if(!err)" seems to be a typo. It is better to propagate the error code
> to the callers of caif_sktinit_module(). This patch removes the check
> statment and directly returns the result of sock_register().
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188751
> Signed-off-by: Pan Bian 

Applied.

Re: [PATCH] net: ethernet: ti: cpdma: use desc_read in chan_process instead of raw read

2016-12-05 Thread David Miller

From: Grygorii Strashko 
Date: Mon, 5 Dec 2016 12:47:16 -0600

> 
> 
> On 12/02/2016 08:05 PM, Ivan Khoronzhuk wrote:
>> There is desc_read() macros to read desc fields, so no need to
>> use __raw_readl();
>>
>> Signed-off-by: Ivan Khoronzhuk 
> 
> 
> I'm going to update it all at once as part of [1].
> 
> [1] https://lkml.org/lkml/2016/12/1/781

Ok.

Re: [PATCH] net: ethernet: ti: cpdma: use desc_read in chan_process instead of raw read

2016-12-05 Thread David Miller

From: Grygorii Strashko 
Date: Mon, 5 Dec 2016 12:47:16 -0600

> 
> 
> On 12/02/2016 08:05 PM, Ivan Khoronzhuk wrote:
>> There is desc_read() macros to read desc fields, so no need to
>> use __raw_readl();
>>
>> Signed-off-by: Ivan Khoronzhuk 
> 
> 
> I'm going to update it all at once as part of [1].
> 
> [1] https://lkml.org/lkml/2016/12/1/781

Ok.

Re: Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Jason Gunthorpe

On Mon, Dec 05, 2016 at 12:27:20PM -0700, Logan Gunthorpe wrote:
> 
> 
> On 05/12/16 12:14 PM, Jason Gunthorpe wrote:
> >But CMB sounds much more like the GPU case where there is a
> >specialized allocator handing out the BAR to consumers, so I'm not
> >sure a general purpose chardev makes a lot of sense?
> 
> I don't think it will ever need to be as complicated as the GPU case. There
> will probably only ever be a relatively small amount of memory behind the
> CMB and really the only users are those doing P2P work. Thus the specialized
> allocator could be pretty simple and I expect it would be fine to just
> return -ENOMEM if there is not enough memory.

NVMe might have to deal with pci-e hot-unplug, which is a similar
problem-class to the GPU case..

In any event the allocator still needs to track which regions are in
use and be able to hook 'free' from userspace. That does suggest it
should be integrated into the nvme driver and not a bolt on driver..

Jason

Re: Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Jason Gunthorpe

On Mon, Dec 05, 2016 at 12:27:20PM -0700, Logan Gunthorpe wrote:
> 
> 
> On 05/12/16 12:14 PM, Jason Gunthorpe wrote:
> >But CMB sounds much more like the GPU case where there is a
> >specialized allocator handing out the BAR to consumers, so I'm not
> >sure a general purpose chardev makes a lot of sense?
> 
> I don't think it will ever need to be as complicated as the GPU case. There
> will probably only ever be a relatively small amount of memory behind the
> CMB and really the only users are those doing P2P work. Thus the specialized
> allocator could be pretty simple and I expect it would be fine to just
> return -ENOMEM if there is not enough memory.

NVMe might have to deal with pci-e hot-unplug, which is a similar
problem-class to the GPU case..

In any event the allocator still needs to track which regions are in
use and be able to hook 'free' from userspace. That does suggest it
should be integrated into the nvme driver and not a bolt on driver..

Jason

[PATCH 1/2] Revert "f2fs: use percpu_counter for # of dirty pages in inode"

2016-12-05 Thread Jaegeuk Kim

This reverts commit 1beba1b3a953107c3ff5448ab4e4297db4619c76.

The perpcu_counter doesn't provide atomicity in single core and consume more
DRAM. That incurs fs_mark test failure due to ENOMEM.

Cc: sta...@vger.kernel.org
Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/f2fs.h  | 10 +-
 fs/f2fs/file.c  |  2 +-
 fs/f2fs/super.c |  7 +--
 3 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 8a659493796b..35dbab157ec3 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -431,7 +431,7 @@ struct f2fs_inode_info {
/* Use below internally in f2fs*/
unsigned long flags;/* use to pass per-file flags */
struct rw_semaphore i_sem;  /* protect fi info */
-   struct percpu_counter dirty_pages;  /* # of dirty pages */
+   atomic_t dirty_pages;   /* # of dirty pages */
f2fs_hash_t chash;  /* hash value of given file name */
unsigned int clevel;/* maximum level of given file name */
nid_t i_xattr_nid;  /* node id that contains xattrs */
@@ -1269,7 +1269,7 @@ static inline void inc_page_count(struct f2fs_sb_info 
*sbi, int count_type)
 
 static inline void inode_inc_dirty_pages(struct inode *inode)
 {
-   percpu_counter_inc(_I(inode)->dirty_pages);
+   atomic_inc(_I(inode)->dirty_pages);
inc_page_count(F2FS_I_SB(inode), S_ISDIR(inode->i_mode) ?
F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA);
 }
@@ -1285,7 +1285,7 @@ static inline void inode_dec_dirty_pages(struct inode 
*inode)
!S_ISLNK(inode->i_mode))
return;
 
-   percpu_counter_dec(_I(inode)->dirty_pages);
+   atomic_dec(_I(inode)->dirty_pages);
dec_page_count(F2FS_I_SB(inode), S_ISDIR(inode->i_mode) ?
F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA);
 }
@@ -1295,9 +1295,9 @@ static inline s64 get_pages(struct f2fs_sb_info *sbi, int 
count_type)
return atomic_read(>nr_pages[count_type]);
 }
 
-static inline s64 get_dirty_pages(struct inode *inode)
+static inline int get_dirty_pages(struct inode *inode)
 {
-   return percpu_counter_sum_positive(_I(inode)->dirty_pages);
+   return atomic_read(_I(inode)->dirty_pages);
 }
 
 static inline int get_blocktype_secs(struct f2fs_sb_info *sbi, int block_type)
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 9053a9c5a47e..4c87261c1cd2 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1533,7 +1533,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
goto out;
 
f2fs_msg(F2FS_I_SB(inode)->sb, KERN_WARNING,
-   "Unexpected flush for atomic writes: ino=%lu, npages=%lld",
+   "Unexpected flush for atomic writes: ino=%lu, npages=%u",
inode->i_ino, get_dirty_pages(inode));
ret = filemap_write_and_wait_range(inode->i_mapping, 0, LLONG_MAX);
if (ret)
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 2852a0b34d64..7591d2d7b612 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -571,13 +571,9 @@ static struct inode *f2fs_alloc_inode(struct super_block 
*sb)
 
init_once((void *) fi);
 
-   if (percpu_counter_init(>dirty_pages, 0, GFP_NOFS)) {
-   kmem_cache_free(f2fs_inode_cachep, fi);
-   return NULL;
-   }
-
/* Initialize f2fs-specific inode info */
fi->vfs_inode.i_version = 1;
+   atomic_set(>dirty_pages, 0);
fi->i_current_depth = 1;
fi->i_advise = 0;
init_rwsem(>i_sem);
@@ -703,7 +699,6 @@ static void f2fs_i_callback(struct rcu_head *head)
 
 static void f2fs_destroy_inode(struct inode *inode)
 {
-   percpu_counter_destroy(_I(inode)->dirty_pages);
call_rcu(>i_rcu, f2fs_i_callback);
 }
 
-- 
2.11.0

[PATCH 1/2] Revert "f2fs: use percpu_counter for # of dirty pages in inode"

2016-12-05 Thread Jaegeuk Kim

This reverts commit 1beba1b3a953107c3ff5448ab4e4297db4619c76.

The perpcu_counter doesn't provide atomicity in single core and consume more
DRAM. That incurs fs_mark test failure due to ENOMEM.

Cc: sta...@vger.kernel.org
Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/f2fs.h  | 10 +-
 fs/f2fs/file.c  |  2 +-
 fs/f2fs/super.c |  7 +--
 3 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 8a659493796b..35dbab157ec3 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -431,7 +431,7 @@ struct f2fs_inode_info {
/* Use below internally in f2fs*/
unsigned long flags;/* use to pass per-file flags */
struct rw_semaphore i_sem;  /* protect fi info */
-   struct percpu_counter dirty_pages;  /* # of dirty pages */
+   atomic_t dirty_pages;   /* # of dirty pages */
f2fs_hash_t chash;  /* hash value of given file name */
unsigned int clevel;/* maximum level of given file name */
nid_t i_xattr_nid;  /* node id that contains xattrs */
@@ -1269,7 +1269,7 @@ static inline void inc_page_count(struct f2fs_sb_info 
*sbi, int count_type)
 
 static inline void inode_inc_dirty_pages(struct inode *inode)
 {
-   percpu_counter_inc(_I(inode)->dirty_pages);
+   atomic_inc(_I(inode)->dirty_pages);
inc_page_count(F2FS_I_SB(inode), S_ISDIR(inode->i_mode) ?
F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA);
 }
@@ -1285,7 +1285,7 @@ static inline void inode_dec_dirty_pages(struct inode 
*inode)
!S_ISLNK(inode->i_mode))
return;
 
-   percpu_counter_dec(_I(inode)->dirty_pages);
+   atomic_dec(_I(inode)->dirty_pages);
dec_page_count(F2FS_I_SB(inode), S_ISDIR(inode->i_mode) ?
F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA);
 }
@@ -1295,9 +1295,9 @@ static inline s64 get_pages(struct f2fs_sb_info *sbi, int 
count_type)
return atomic_read(>nr_pages[count_type]);
 }
 
-static inline s64 get_dirty_pages(struct inode *inode)
+static inline int get_dirty_pages(struct inode *inode)
 {
-   return percpu_counter_sum_positive(_I(inode)->dirty_pages);
+   return atomic_read(_I(inode)->dirty_pages);
 }
 
 static inline int get_blocktype_secs(struct f2fs_sb_info *sbi, int block_type)
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 9053a9c5a47e..4c87261c1cd2 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1533,7 +1533,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
goto out;
 
f2fs_msg(F2FS_I_SB(inode)->sb, KERN_WARNING,
-   "Unexpected flush for atomic writes: ino=%lu, npages=%lld",
+   "Unexpected flush for atomic writes: ino=%lu, npages=%u",
inode->i_ino, get_dirty_pages(inode));
ret = filemap_write_and_wait_range(inode->i_mapping, 0, LLONG_MAX);
if (ret)
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 2852a0b34d64..7591d2d7b612 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -571,13 +571,9 @@ static struct inode *f2fs_alloc_inode(struct super_block 
*sb)
 
init_once((void *) fi);
 
-   if (percpu_counter_init(>dirty_pages, 0, GFP_NOFS)) {
-   kmem_cache_free(f2fs_inode_cachep, fi);
-   return NULL;
-   }
-
/* Initialize f2fs-specific inode info */
fi->vfs_inode.i_version = 1;
+   atomic_set(>dirty_pages, 0);
fi->i_current_depth = 1;
fi->i_advise = 0;
init_rwsem(>i_sem);
@@ -703,7 +699,6 @@ static void f2fs_i_callback(struct rcu_head *head)
 
 static void f2fs_destroy_inode(struct inode *inode)
 {
-   percpu_counter_destroy(_I(inode)->dirty_pages);
call_rcu(>i_rcu, f2fs_i_callback);
 }
 
-- 
2.11.0

[PATCH 2/2] f2fs: call sync_fs when f2fs is idle

2016-12-05 Thread Jaegeuk Kim

The sync_fs in f2fs_balance_fs_bg must avoid interrupting current user requests.

Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/segment.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index d5141a06b9a3..8affc5621181 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -383,12 +383,15 @@ void f2fs_balance_fs_bg(struct f2fs_sb_info *sbi)
else
build_free_nids(sbi, false);
 
+   if (!is_idle(sbi))
+   return;
+
/* checkpoint is the only way to shrink partial cached entries */
if (!available_free_memory(sbi, NAT_ENTRIES) ||
!available_free_memory(sbi, INO_ENTRIES) ||
excess_prefree_segs(sbi) ||
excess_dirty_nats(sbi) ||
-   (is_idle(sbi) && f2fs_time_over(sbi, CP_TIME))) {
+   f2fs_time_over(sbi, CP_TIME)) {
if (test_opt(sbi, DATA_FLUSH)) {
struct blk_plug plug;
 
-- 
2.11.0

[PATCH 2/2] f2fs: call sync_fs when f2fs is idle

2016-12-05 Thread Jaegeuk Kim

The sync_fs in f2fs_balance_fs_bg must avoid interrupting current user requests.

Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/segment.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index d5141a06b9a3..8affc5621181 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -383,12 +383,15 @@ void f2fs_balance_fs_bg(struct f2fs_sb_info *sbi)
else
build_free_nids(sbi, false);
 
+   if (!is_idle(sbi))
+   return;
+
/* checkpoint is the only way to shrink partial cached entries */
if (!available_free_memory(sbi, NAT_ENTRIES) ||
!available_free_memory(sbi, INO_ENTRIES) ||
excess_prefree_segs(sbi) ||
excess_dirty_nats(sbi) ||
-   (is_idle(sbi) && f2fs_time_over(sbi, CP_TIME))) {
+   f2fs_time_over(sbi, CP_TIME)) {
if (test_opt(sbi, DATA_FLUSH)) {
struct blk_plug plug;
 
-- 
2.11.0

Re: [PATCH] Yama: allow access for the current ptrace parent

2016-12-05 Thread Kees Cook

On Mon, Dec 5, 2016 at 11:13 AM, Josh Stone  wrote:
> On 12/02/2016 03:27 PM, Kees Cook wrote:
>>> +   /* If there's already an active tracing relationship, then make an
>>
>> I'll adjust the comment style here and add it to my tree for -next.
>
> Thanks!
>
> I guess the tweak is that it should have an empty "/*" line?
>
> FWIW, checkpatch.pl doesn't warn about this -- perhaps it should?
> I only see the opposite check for NETWORKING_BLOCK_COMMENT_STYLE.

Hrm, I thought it did warn. But yeah, the networking subsystem uses
this style but everywhere else doesn't. :P Wheee. :)

-Kees

-- 
Kees Cook
Nexus Security

Re: [PATCH] Yama: allow access for the current ptrace parent

2016-12-05 Thread Kees Cook

On Mon, Dec 5, 2016 at 11:13 AM, Josh Stone  wrote:
> On 12/02/2016 03:27 PM, Kees Cook wrote:
>>> +   /* If there's already an active tracing relationship, then make an
>>
>> I'll adjust the comment style here and add it to my tree for -next.
>
> Thanks!
>
> I guess the tweak is that it should have an empty "/*" line?
>
> FWIW, checkpatch.pl doesn't warn about this -- perhaps it should?
> I only see the opposite check for NETWORKING_BLOCK_COMMENT_STYLE.

Hrm, I thought it did warn. But yeah, the networking subsystem uses
this style but everywhere else doesn't. :P Wheee. :)

-Kees

-- 
Kees Cook
Nexus Security

Re: [PATCH] overlayfs: ignore empty NFSv4 ACLs in ext4 upperdir

2016-12-05 Thread Andreas Grünbacher

2016-12-05 17:25 GMT+01:00 J. Bruce Fields :
> On Mon, Dec 05, 2016 at 04:36:03PM +0100, Miklos Szeredi wrote:
>> On Mon, Dec 5, 2016 at 4:19 PM, J. Bruce Fields  wrote:
>> >> Can NFS people comment on this?  Where does the nfs4_acl come from?
>> >
>> > This is the interface the NFS client provides for applications to modify
>> > NFSv4 ACLs on servers that support them.
>>
>> Fine, but why are we seeing this xattr on exports where no xattrs are
>> set on the exported fs?
>
> I don't know.  I took another look at the original patch and don't see
> any details on the server setup: which server is it (knfsd, ganesha,
> netapp, ...)?  How is it configured?
>
>> >> What can overlayfs do if it's a non-empty ACL?
>> >
>> > As little as possible.  You can't copy it up, can you?  So any attempt
>> > to support it is going to be incomplete.
>>
>> Right.
>>
>> >
>> >> Does knfsd translate posix ACL into NFS acl?  If so, we can translate
>> >> back.  Should we do a generic POSIX<->NFS acl translator?
>> >
>> > knsd does translate between POSIX and NFSv4 ACLs.  It's a complicated
>>
>> This does explain the nfs4_acl xattr on the client.  Question: if it's
>> empty, why have it at all?
>
> I'm honestly not sure what's going on there.  I'd be curious to see a
> network trace if possible.

I do see "system.nfs4_acl" attributes on knfsd exported filesystems
that support POSIX ACLs (for ext4: "mount -o acl"). For exported
filesystem that don't support POSIX ACLs (ext4: mount -o noacl), that
attribute is missing. The attribute shouldn't be empty though; when
the file has no real ACL, "system.nfs4_acl" represents the file mode
permissions. The "system.nfs4_acl" attribute exposes the information
on the wire; there is no resonable way to translate that into an ACL
on another filesystem, really.

Patrick, what does 'getfattr -m- -d /nfs/file' give you?

>> > algorithm, and lossy (in the NFSv4->POSIX direction).  The client
>> > developers have been understandably reluctant to have anything to do
>> > with it.
>> >
>> > So, I think listxattr should omit system.nfs4_acl, and attempts to
>> > set/get the attribute should error out.  The same should apply to any
>> > "system." attribute not supported by both filesystems, I think?
>>
>> Basically that's what happens now.  The problem is that nfsv4 mounts
>> seem always have these xattrs, even when the exported fs doesn't have
>> anything.
>
> I said "both", that's a logical "and".  Whether or not nfs claims
> support would then be irrelevant in this case, since ext4 doesn't
> support system.nfs4_acl.
>
>> We could do the copy up even if the NFS4->POSIX translation was
>> possible (which is the case with POSIX ACL translated by knfsd).  We'd
>> just get back the original ACL, so that's OK.
>
> Note that knfsd is an exception, most NFSv4-acl-supporting servers
> aren't translating from POSIX ACLs.
>
>> NFS is only supported as lower (read-only) layer, so we don't care
>> about changing the ACL on the server.
>
> Out of curiosity, how do you check permissions after copy up?
>
> The client doesn't do much permissions-checking normally, because it's
> hard to get right--even in the absence of ACLs, it may not understand
> the server's owners and groups completely.
>
> I guess that's fine, you may be happy to let people write to the file
> without permissions to the lower file, since the writes aren't going
> there anyway.
>
> So, I don't know what want here.
>
> You're not going to want to use the ACL for actual permission-checking,
> and you can't modify it, so it doesn't seem very useful.

IMO the only thing overlayfs can do is hide those attributes from the
user and ignore them when copying up. Still, the attributes shouldn't
be empty.

Andreas

Re: [PATCH] overlayfs: ignore empty NFSv4 ACLs in ext4 upperdir

2016-12-05 Thread Andreas Grünbacher

2016-12-05 17:25 GMT+01:00 J. Bruce Fields :
> On Mon, Dec 05, 2016 at 04:36:03PM +0100, Miklos Szeredi wrote:
>> On Mon, Dec 5, 2016 at 4:19 PM, J. Bruce Fields  wrote:
>> >> Can NFS people comment on this?  Where does the nfs4_acl come from?
>> >
>> > This is the interface the NFS client provides for applications to modify
>> > NFSv4 ACLs on servers that support them.
>>
>> Fine, but why are we seeing this xattr on exports where no xattrs are
>> set on the exported fs?
>
> I don't know.  I took another look at the original patch and don't see
> any details on the server setup: which server is it (knfsd, ganesha,
> netapp, ...)?  How is it configured?
>
>> >> What can overlayfs do if it's a non-empty ACL?
>> >
>> > As little as possible.  You can't copy it up, can you?  So any attempt
>> > to support it is going to be incomplete.
>>
>> Right.
>>
>> >
>> >> Does knfsd translate posix ACL into NFS acl?  If so, we can translate
>> >> back.  Should we do a generic POSIX<->NFS acl translator?
>> >
>> > knsd does translate between POSIX and NFSv4 ACLs.  It's a complicated
>>
>> This does explain the nfs4_acl xattr on the client.  Question: if it's
>> empty, why have it at all?
>
> I'm honestly not sure what's going on there.  I'd be curious to see a
> network trace if possible.

I do see "system.nfs4_acl" attributes on knfsd exported filesystems
that support POSIX ACLs (for ext4: "mount -o acl"). For exported
filesystem that don't support POSIX ACLs (ext4: mount -o noacl), that
attribute is missing. The attribute shouldn't be empty though; when
the file has no real ACL, "system.nfs4_acl" represents the file mode
permissions. The "system.nfs4_acl" attribute exposes the information
on the wire; there is no resonable way to translate that into an ACL
on another filesystem, really.

Patrick, what does 'getfattr -m- -d /nfs/file' give you?

>> > algorithm, and lossy (in the NFSv4->POSIX direction).  The client
>> > developers have been understandably reluctant to have anything to do
>> > with it.
>> >
>> > So, I think listxattr should omit system.nfs4_acl, and attempts to
>> > set/get the attribute should error out.  The same should apply to any
>> > "system." attribute not supported by both filesystems, I think?
>>
>> Basically that's what happens now.  The problem is that nfsv4 mounts
>> seem always have these xattrs, even when the exported fs doesn't have
>> anything.
>
> I said "both", that's a logical "and".  Whether or not nfs claims
> support would then be irrelevant in this case, since ext4 doesn't
> support system.nfs4_acl.
>
>> We could do the copy up even if the NFS4->POSIX translation was
>> possible (which is the case with POSIX ACL translated by knfsd).  We'd
>> just get back the original ACL, so that's OK.
>
> Note that knfsd is an exception, most NFSv4-acl-supporting servers
> aren't translating from POSIX ACLs.
>
>> NFS is only supported as lower (read-only) layer, so we don't care
>> about changing the ACL on the server.
>
> Out of curiosity, how do you check permissions after copy up?
>
> The client doesn't do much permissions-checking normally, because it's
> hard to get right--even in the absence of ACLs, it may not understand
> the server's owners and groups completely.
>
> I guess that's fine, you may be happy to let people write to the file
> without permissions to the lower file, since the writes aren't going
> there anyway.
>
> So, I don't know what want here.
>
> You're not going to want to use the ACL for actual permission-checking,
> and you can't modify it, so it doesn't seem very useful.

IMO the only thing overlayfs can do is hide those attributes from the
user and ignore them when copying up. Still, the attributes shouldn't
be empty.

Andreas

Re: [PATCH RFC] hlist_add_tail_rcu disable sparse warning

2016-12-05 Thread Michael S. Tsirkin

On Wed, Nov 23, 2016 at 10:48:19PM +0200, Michael S. Tsirkin wrote:
> sparse is unhappy about this code in hlist_add_tail_rcu:
> 
> struct hlist_node *i, *last = NULL;
> 
> for (i = hlist_first_rcu(h); i; i = hlist_next_rcu(i))
> last = i;
> 
> This is because hlist_next_rcu and hlist_next_rcu return
> __rcu pointers.
> 
> The following trivial patch disables the warning
> without changing the behaviour in any way.
> 
> Signed-off-by: Michael S. Tsirkin 

So after reviewing this, there's no rcu-ness involved
except when we assign the next pointer on the last item.
This is because as the following comment says
 * The caller must take whatever precautions are necessary
 * (such as holding appropriate locks) to avoid racing
 * with another list-mutation primitive, such as hlist_add_head_rcu()
 * or hlist_del_rcu(), running on this same list.

I conclude this patch is actually the right thing to do, by comparison,
__hlist_for_each_rcu suggested by Dave Miller would be confusing since
it's designed to run in the rcu read side critical section.

I'll repost as non-RFC unless I hear otherwise.

> ---
> 
> I would appreciate review to confirm the function doesn't
> do anything unsafe though.
> 
> In particular, should this use __hlist_for_each_rcu instead?
> I note that __hlist_for_each_rcu does rcu_dereference
> internally, which is missing here.
> 
> compile-tested only.
> 
> diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> index 8beb98d..33574db 100644
> --- a/include/linux/rculist.h
> +++ b/include/linux/rculist.h
> @@ -511,7 +511,7 @@ static inline void hlist_add_tail_rcu(struct hlist_node 
> *n,
>  {
>   struct hlist_node *i, *last = NULL;
>  
> - for (i = hlist_first_rcu(h); i; i = hlist_next_rcu(i))
> + for (i = h->first; i; i = i->next)
>   last = i;
>  
>   if (last) {

Re: [PATCH RFC] hlist_add_tail_rcu disable sparse warning

2016-12-05 Thread Michael S. Tsirkin

On Wed, Nov 23, 2016 at 10:48:19PM +0200, Michael S. Tsirkin wrote:
> sparse is unhappy about this code in hlist_add_tail_rcu:
> 
> struct hlist_node *i, *last = NULL;
> 
> for (i = hlist_first_rcu(h); i; i = hlist_next_rcu(i))
> last = i;
> 
> This is because hlist_next_rcu and hlist_next_rcu return
> __rcu pointers.
> 
> The following trivial patch disables the warning
> without changing the behaviour in any way.
> 
> Signed-off-by: Michael S. Tsirkin 

So after reviewing this, there's no rcu-ness involved
except when we assign the next pointer on the last item.
This is because as the following comment says
 * The caller must take whatever precautions are necessary
 * (such as holding appropriate locks) to avoid racing
 * with another list-mutation primitive, such as hlist_add_head_rcu()
 * or hlist_del_rcu(), running on this same list.

I conclude this patch is actually the right thing to do, by comparison,
__hlist_for_each_rcu suggested by Dave Miller would be confusing since
it's designed to run in the rcu read side critical section.

I'll repost as non-RFC unless I hear otherwise.

> ---
> 
> I would appreciate review to confirm the function doesn't
> do anything unsafe though.
> 
> In particular, should this use __hlist_for_each_rcu instead?
> I note that __hlist_for_each_rcu does rcu_dereference
> internally, which is missing here.
> 
> compile-tested only.
> 
> diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> index 8beb98d..33574db 100644
> --- a/include/linux/rculist.h
> +++ b/include/linux/rculist.h
> @@ -511,7 +511,7 @@ static inline void hlist_add_tail_rcu(struct hlist_node 
> *n,
>  {
>   struct hlist_node *i, *last = NULL;
>  
> - for (i = hlist_first_rcu(h); i; i = hlist_next_rcu(i))
> + for (i = h->first; i; i = i->next)
>   last = i;
>  
>   if (last) {

Re: [PATCHv13 0/3] rdmacg: IB/core: rdma controller support

2016-12-05 Thread Tejun Heo

Parav, it's a bit too late for this cycle.  Let's target v4.11.  I'll
review the patches after the merge window.  Please ping me if I don't.

Thanks.

-- 
tejun

Re: [PATCHv13 0/3] rdmacg: IB/core: rdma controller support

2016-12-05 Thread Tejun Heo

Parav, it's a bit too late for this cycle.  Let's target v4.11.  I'll
review the patches after the merge window.  Please ping me if I don't.

Thanks.

-- 
tejun

Re: ILP32 for ARM64: testing with glibc testsuite

2016-12-05 Thread Steve Ellcey

On Mon, 2016-12-05 at 11:07 +0100, Andreas Schwab wrote:
> On Dez 05 2016, "Zhangjian (Bamvor)" 
> wrote:
> 
> > 
> > Is there some progresses on it? We could collabrate to fix those
> > issues.
> All the elf/nptl/rt fails should be fixed by the recent binutils
> fixes.
> 
> Andreas.

I am using binutils ToT and Yury's latest patch (https://sourceware.org
/ml/binutils/2016-12/msg00039.html) and I am still seeing some nptl and
rt failures in the glibc testsuite, specifically:

FAIL: nptl/tst-cancel26
FAIL: nptl/tst-cancel27
FAIL: nptl/tst-stack4
FAIL: rt/tst-mqueue1
FAIL: rt/tst-mqueue2
FAIL: rt/tst-mqueue4
FAIL: rt/tst-mqueue7

Steve Ellcey
sell...@caviumnetworks.com

Re: ILP32 for ARM64: testing with glibc testsuite

2016-12-05 Thread Steve Ellcey

On Mon, 2016-12-05 at 11:07 +0100, Andreas Schwab wrote:
> On Dez 05 2016, "Zhangjian (Bamvor)" 
> wrote:
> 
> > 
> > Is there some progresses on it? We could collabrate to fix those
> > issues.
> All the elf/nptl/rt fails should be fixed by the recent binutils
> fixes.
> 
> Andreas.

I am using binutils ToT and Yury's latest patch (https://sourceware.org
/ml/binutils/2016-12/msg00039.html) and I am still seeing some nptl and
rt failures in the glibc testsuite, specifically:

FAIL: nptl/tst-cancel26
FAIL: nptl/tst-cancel27
FAIL: nptl/tst-stack4
FAIL: rt/tst-mqueue1
FAIL: rt/tst-mqueue2
FAIL: rt/tst-mqueue4
FAIL: rt/tst-mqueue7

Steve Ellcey
sell...@caviumnetworks.com

Re: usb/gadget: use-after-free in gadgetfs_setup

2016-12-05 Thread Alan Stern

On Mon, 5 Dec 2016, Andrey Konovalov wrote:

> Hi!
> 
> I've got the following error report while running the syzkaller fuzzer.
> 
> On commit 3c49de52d5647cda8b42c4255cf8a29d1e22eff5 (Dec 2).
> 
> BUG: KASAN: use-after-free in gadgetfs_setup+0x208a/0x20e0 at addr
> 88003dfe5bf2
> Read of size 2 by task syz-executor0/22994
> CPU: 3 PID: 22994 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #16
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>  88006df06a18 81f96aba e0528500 11000dbe0cd6
>  ed000dbe0cce 88006df068f0 41b58ab3 8598b4c8
>  81f96828 11000dbe0ccd 88006df06708 88006df06748
> Call Trace:
>   [  201.343209]  [< inline >] __dump_stack lib/dump_stack.c:15
>   [  201.343209]  [] dump_stack+0x292/0x398
> lib/dump_stack.c:51
>  [] kasan_object_err+0x1c/0x70 mm/kasan/report.c:159
>  [< inline >] print_address_description mm/kasan/report.c:197
>  [] kasan_report_error+0x1f0/0x4e0 mm/kasan/report.c:286
>  [< inline >] kasan_report mm/kasan/report.c:306
>  [] __asan_report_load_n_noabort+0x3a/0x40
> mm/kasan/report.c:337
>  [< inline >] config_buf drivers/usb/gadget/legacy/inode.c:1298
>  [] gadgetfs_setup+0x208a/0x20e0
> drivers/usb/gadget/legacy/inode.c:1368
>  [] dummy_timer+0x11f0/0x36d0
> drivers/usb/gadget/udc/dummy_hcd.c:1858
>  [] call_timer_fn+0x241/0x800 kernel/time/timer.c:1308
>  [< inline >] expire_timers kernel/time/timer.c:1348
>  [] __run_timers+0xa06/0xec0 kernel/time/timer.c:1641
>  [] run_timer_softirq+0x21/0x80 kernel/time/timer.c:1654
>  [] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
>  [< inline >] invoke_softirq kernel/softirq.c:364
>  [] irq_exit+0x19e/0x1d0 kernel/softirq.c:405
>  [< inline >] exiting_irq arch/x86/include/asm/apic.h:659
>  [] smp_apic_timer_interrupt+0x7b/0xa0
> arch/x86/kernel/apic/apic.c:960
>  [] apic_timer_interrupt+0x8c/0xa0
> arch/x86/entry/entry_64.S:489
>   [  201.343209]  [] ?
> _raw_spin_unlock_irqrestore+0x119/0x1a0
>  [] try_to_wake_up+0x174/0x1160 kernel/sched/core.c:2099
>  [< inline >] wake_up_process kernel/sched/core.c:2165
>  [] wake_up_q+0x8a/0xe0 kernel/sched/core.c:469
>  [] futex_wake+0x5be/0x6c0 kernel/futex.c:1453
>  [] do_futex+0x11bd/0x1f00 kernel/futex.c:3219
>  [< inline >] SYSC_futex kernel/futex.c:3275
>  [] SyS_futex+0x2fc/0x400 kernel/futex.c:3243
>  [] entry_SYSCALL_64_fastpath+0x1f/0xc2

Can you test whether the patch below fixes this problem?

Alan Stern



Index: usb-4.x/drivers/usb/gadget/legacy/inode.c
===
--- usb-4.x.orig/drivers/usb/gadget/legacy/inode.c
+++ usb-4.x/drivers/usb/gadget/legacy/inode.c
@@ -1762,6 +1762,10 @@ dev_config (struct file *fd, const char
}
spin_unlock_irq(>lock);
 
+   /* Registered but not yet bound to a UDC driver? */
+   if (dev->gadget_registered)
+   return -EIO;
+
if (len < (USB_DT_CONFIG_SIZE + USB_DT_DEVICE_SIZE + 4))
return -EINVAL;

Re: usb/gadget: use-after-free in gadgetfs_setup

2016-12-05 Thread Alan Stern

On Mon, 5 Dec 2016, Andrey Konovalov wrote:

> Hi!
> 
> I've got the following error report while running the syzkaller fuzzer.
> 
> On commit 3c49de52d5647cda8b42c4255cf8a29d1e22eff5 (Dec 2).
> 
> BUG: KASAN: use-after-free in gadgetfs_setup+0x208a/0x20e0 at addr
> 88003dfe5bf2
> Read of size 2 by task syz-executor0/22994
> CPU: 3 PID: 22994 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #16
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>  88006df06a18 81f96aba e0528500 11000dbe0cd6
>  ed000dbe0cce 88006df068f0 41b58ab3 8598b4c8
>  81f96828 11000dbe0ccd 88006df06708 88006df06748
> Call Trace:
>   [  201.343209]  [< inline >] __dump_stack lib/dump_stack.c:15
>   [  201.343209]  [] dump_stack+0x292/0x398
> lib/dump_stack.c:51
>  [] kasan_object_err+0x1c/0x70 mm/kasan/report.c:159
>  [< inline >] print_address_description mm/kasan/report.c:197
>  [] kasan_report_error+0x1f0/0x4e0 mm/kasan/report.c:286
>  [< inline >] kasan_report mm/kasan/report.c:306
>  [] __asan_report_load_n_noabort+0x3a/0x40
> mm/kasan/report.c:337
>  [< inline >] config_buf drivers/usb/gadget/legacy/inode.c:1298
>  [] gadgetfs_setup+0x208a/0x20e0
> drivers/usb/gadget/legacy/inode.c:1368
>  [] dummy_timer+0x11f0/0x36d0
> drivers/usb/gadget/udc/dummy_hcd.c:1858
>  [] call_timer_fn+0x241/0x800 kernel/time/timer.c:1308
>  [< inline >] expire_timers kernel/time/timer.c:1348
>  [] __run_timers+0xa06/0xec0 kernel/time/timer.c:1641
>  [] run_timer_softirq+0x21/0x80 kernel/time/timer.c:1654
>  [] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
>  [< inline >] invoke_softirq kernel/softirq.c:364
>  [] irq_exit+0x19e/0x1d0 kernel/softirq.c:405
>  [< inline >] exiting_irq arch/x86/include/asm/apic.h:659
>  [] smp_apic_timer_interrupt+0x7b/0xa0
> arch/x86/kernel/apic/apic.c:960
>  [] apic_timer_interrupt+0x8c/0xa0
> arch/x86/entry/entry_64.S:489
>   [  201.343209]  [] ?
> _raw_spin_unlock_irqrestore+0x119/0x1a0
>  [] try_to_wake_up+0x174/0x1160 kernel/sched/core.c:2099
>  [< inline >] wake_up_process kernel/sched/core.c:2165
>  [] wake_up_q+0x8a/0xe0 kernel/sched/core.c:469
>  [] futex_wake+0x5be/0x6c0 kernel/futex.c:1453
>  [] do_futex+0x11bd/0x1f00 kernel/futex.c:3219
>  [< inline >] SYSC_futex kernel/futex.c:3275
>  [] SyS_futex+0x2fc/0x400 kernel/futex.c:3243
>  [] entry_SYSCALL_64_fastpath+0x1f/0xc2

Can you test whether the patch below fixes this problem?

Alan Stern



Index: usb-4.x/drivers/usb/gadget/legacy/inode.c
===
--- usb-4.x.orig/drivers/usb/gadget/legacy/inode.c
+++ usb-4.x/drivers/usb/gadget/legacy/inode.c
@@ -1762,6 +1762,10 @@ dev_config (struct file *fd, const char
}
spin_unlock_irq(>lock);
 
+   /* Registered but not yet bound to a UDC driver? */
+   if (dev->gadget_registered)
+   return -EIO;
+
if (len < (USB_DT_CONFIG_SIZE + USB_DT_DEVICE_SIZE + 4))
return -EINVAL;

Re: [PATCH 1/6] net: ethernet: ti: netcp: add support of cpts

2016-12-05 Thread Richard Cochran

On Mon, Dec 05, 2016 at 12:25:57PM -0600, Grygorii Strashko wrote:
> >> --- a/Documentation/devicetree/bindings/net/keystone-netcp.txt
> >> +++ b/Documentation/devicetree/bindings/net/keystone-netcp.txt
> >> @@ -113,6 +113,15 @@ Optional properties:
> >>will only initialize these ports and attach PHY
> >>driver to them if needed.
> >>  
> >> + Properties related to cpts configurations.
> >> +  - cpts_clock_mult/cpts_clock_shift:
> > 
> > Needs vendor prefix. Don't use '_'.
>
> This module is used as part of OMAP and Keystone SoCs, so names for
> this props is ABI already :(

Your automatic calculation makes these unnecessary, and so you can
drop these altogether.

Also, maybe you should mark them as deprecated in cpsw.txt?

(The underscores were my fault, sorry)

Thanks,
Richard

Re: [PATCH 1/6] net: ethernet: ti: netcp: add support of cpts

2016-12-05 Thread Richard Cochran

On Mon, Dec 05, 2016 at 12:25:57PM -0600, Grygorii Strashko wrote:
> >> --- a/Documentation/devicetree/bindings/net/keystone-netcp.txt
> >> +++ b/Documentation/devicetree/bindings/net/keystone-netcp.txt
> >> @@ -113,6 +113,15 @@ Optional properties:
> >>will only initialize these ports and attach PHY
> >>driver to them if needed.
> >>  
> >> + Properties related to cpts configurations.
> >> +  - cpts_clock_mult/cpts_clock_shift:
> > 
> > Needs vendor prefix. Don't use '_'.
>
> This module is used as part of OMAP and Keystone SoCs, so names for
> this props is ABI already :(

Your automatic calculation makes these unnecessary, and so you can
drop these altogether.

Also, maybe you should mark them as deprecated in cpsw.txt?

(The underscores were my fault, sorry)

Thanks,
Richard

Re: Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Logan Gunthorpe




On 05/12/16 12:14 PM, Jason Gunthorpe wrote:

But CMB sounds much more like the GPU case where there is a
specialized allocator handing out the BAR to consumers, so I'm not
sure a general purpose chardev makes a lot of sense?


I don't think it will ever need to be as complicated as the GPU case. 
There will probably only ever be a relatively small amount of memory 
behind the CMB and really the only users are those doing P2P work. Thus 
the specialized allocator could be pretty simple and I expect it would 
be fine to just return -ENOMEM if there is not enough memory.


Also, if it was implemented this way, if there was a need to make the 
allocator more complicated it could easily be added later as the 
userspace interface is just mmap to obtain a buffer.


Logan

Re: Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Logan Gunthorpe




On 05/12/16 12:14 PM, Jason Gunthorpe wrote:

But CMB sounds much more like the GPU case where there is a
specialized allocator handing out the BAR to consumers, so I'm not
sure a general purpose chardev makes a lot of sense?


I don't think it will ever need to be as complicated as the GPU case. 
There will probably only ever be a relatively small amount of memory 
behind the CMB and really the only users are those doing P2P work. Thus 
the specialized allocator could be pretty simple and I expect it would 
be fine to just return -ENOMEM if there is not enough memory.


Also, if it was implemented this way, if there was a need to make the 
allocator more complicated it could easily be added later as the 
userspace interface is just mmap to obtain a buffer.


Logan

Re: [PATCH v8 1/3] perf annotate: Show raw form for jump instruction with indirect target

2016-12-05 Thread Arnaldo Carvalho de Melo

Em Mon, Dec 05, 2016 at 09:26:45PM +0530, Ravi Bangoria escreveu:
> For jump instructions that does not include target address as direct
> operand, show the original disassembled line for them. This is needed
> for certain powerpc jump instructions that use target address in a
> register (such as bctr, btar, ...).

Please, mention the name of the function where you copy annotated
examples from, so that I can reproduce it here, using the files you
provided (perf.data and vmlinux for powerpc).

Searching one such function now...
 
> Before:
>  ld r12,32088(r12)
>  mtctr  r12
>   v  bctr   ca2c
>  stdr2,24(r1)
>  addis  r12,r2,-1
> 
> After:
>  ld r12,32088(r12)
>  mtctr  r12
>   v  bctr
>  stdr2,24(r1)
>  addis  r12,r2,-1
> 
> Suggested-by: Michael Ellerman 
> Signed-off-by: Ravi Bangoria 
> ---
> Changes in v8:
>   - v7: https://lkml.org/lkml/2016/9/21/436
>   - Rebase to acme/perf/core
>   - No logical changes. (Cross arch annotate patches are in. This patch
> is for hardening annotate for powerpc.)
> 
>  tools/perf/util/annotate.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index 4012b1d..ea7e0de 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -237,6 +237,9 @@ static int jump__parse(struct arch *arch __maybe_unused, 
> struct ins_operands *op
>  static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
>  struct ins_operands *ops)
>  {
> + if (!ops->target.addr)
> + return ins__raw_scnprintf(ins, bf, size, ops);
> +
>   return scnprintf(bf, size, "%-6.6s %" PRIx64, ins->name, 
> ops->target.offset);
>  }
>  
> -- 
> 2.4.11

Re: [PATCH v8 1/3] perf annotate: Show raw form for jump instruction with indirect target

2016-12-05 Thread Arnaldo Carvalho de Melo

Em Mon, Dec 05, 2016 at 09:26:45PM +0530, Ravi Bangoria escreveu:
> For jump instructions that does not include target address as direct
> operand, show the original disassembled line for them. This is needed
> for certain powerpc jump instructions that use target address in a
> register (such as bctr, btar, ...).

Please, mention the name of the function where you copy annotated
examples from, so that I can reproduce it here, using the files you
provided (perf.data and vmlinux for powerpc).

Searching one such function now...
 
> Before:
>  ld r12,32088(r12)
>  mtctr  r12
>   v  bctr   ca2c
>  stdr2,24(r1)
>  addis  r12,r2,-1
> 
> After:
>  ld r12,32088(r12)
>  mtctr  r12
>   v  bctr
>  stdr2,24(r1)
>  addis  r12,r2,-1
> 
> Suggested-by: Michael Ellerman 
> Signed-off-by: Ravi Bangoria 
> ---
> Changes in v8:
>   - v7: https://lkml.org/lkml/2016/9/21/436
>   - Rebase to acme/perf/core
>   - No logical changes. (Cross arch annotate patches are in. This patch
> is for hardening annotate for powerpc.)
> 
>  tools/perf/util/annotate.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index 4012b1d..ea7e0de 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -237,6 +237,9 @@ static int jump__parse(struct arch *arch __maybe_unused, 
> struct ins_operands *op
>  static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
>  struct ins_operands *ops)
>  {
> + if (!ops->target.addr)
> + return ins__raw_scnprintf(ins, bf, size, ops);
> +
>   return scnprintf(bf, size, "%-6.6s %" PRIx64, ins->name, 
> ops->target.offset);
>  }
>  
> -- 
> 2.4.11

Re: Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Jason Gunthorpe

On Mon, Dec 05, 2016 at 10:48:58AM -0800, Dan Williams wrote:
> On Mon, Dec 5, 2016 at 10:39 AM, Logan Gunthorpe  wrote:
> > On 05/12/16 11:08 AM, Dan Williams wrote:
> >>
> >> I've already recommended that iopmem not be a block device and instead
> >> be a device-dax instance. I also don't think it should claim the PCI
> >> ID, rather the driver that wants to map one of its bars this way can
> >> register the memory region with the device-dax core.
> >>
> >> I'm not sure there are enough device drivers that want to do this to
> >> have it be a generic /sys/.../resource_dmableX capability. It still
> >> seems to be an exotic one-off type of configuration.
> >
> >
> > Yes, this is essentially my thinking. Except I think the userspace interface
> > should really depend on the device itself. Device dax is a good  choice for
> > many and I agree the block device approach wouldn't be ideal.
> >
> > Specifically for NVME CMB: I think it would make a lot of sense to just hand
> > out these mappings with an mmap call on /dev/nvmeX. I expect CMB buffers
> > would be volatile and thus you wouldn't need to keep track of where in the
> > BAR the region came from. Thus, the mmap call would just be an allocator
> > from BAR memory. If device-dax were used, userspace would need to lookup
> > which device-dax instance corresponds to which nvme drive.
> 
> I'm not opposed to mapping /dev/nvmeX.  However, the lookup is trivial
> to accomplish in sysfs through /sys/dev/char to find the sysfs path
> of

But CMB sounds much more like the GPU case where there is a
specialized allocator handing out the BAR to consumers, so I'm not
sure a general purpose chardev makes a lot of sense?

Jason

Re: Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Jason Gunthorpe

On Mon, Dec 05, 2016 at 10:48:58AM -0800, Dan Williams wrote:
> On Mon, Dec 5, 2016 at 10:39 AM, Logan Gunthorpe  wrote:
> > On 05/12/16 11:08 AM, Dan Williams wrote:
> >>
> >> I've already recommended that iopmem not be a block device and instead
> >> be a device-dax instance. I also don't think it should claim the PCI
> >> ID, rather the driver that wants to map one of its bars this way can
> >> register the memory region with the device-dax core.
> >>
> >> I'm not sure there are enough device drivers that want to do this to
> >> have it be a generic /sys/.../resource_dmableX capability. It still
> >> seems to be an exotic one-off type of configuration.
> >
> >
> > Yes, this is essentially my thinking. Except I think the userspace interface
> > should really depend on the device itself. Device dax is a good  choice for
> > many and I agree the block device approach wouldn't be ideal.
> >
> > Specifically for NVME CMB: I think it would make a lot of sense to just hand
> > out these mappings with an mmap call on /dev/nvmeX. I expect CMB buffers
> > would be volatile and thus you wouldn't need to keep track of where in the
> > BAR the region came from. Thus, the mmap call would just be an allocator
> > from BAR memory. If device-dax were used, userspace would need to lookup
> > which device-dax instance corresponds to which nvme drive.
> 
> I'm not opposed to mapping /dev/nvmeX.  However, the lookup is trivial
> to accomplish in sysfs through /sys/dev/char to find the sysfs path
> of

But CMB sounds much more like the GPU case where there is a
specialized allocator handing out the BAR to consumers, so I'm not
sure a general purpose chardev makes a lot of sense?

Jason

Re: [PATCH] Yama: allow access for the current ptrace parent

2016-12-05 Thread Josh Stone

On 12/02/2016 03:27 PM, Kees Cook wrote:
>> +   /* If there's already an active tracing relationship, then make an
> 
> I'll adjust the comment style here and add it to my tree for -next.

Thanks!

I guess the tweak is that it should have an empty "/*" line?

FWIW, checkpatch.pl doesn't warn about this -- perhaps it should?
I only see the opposite check for NETWORKING_BLOCK_COMMENT_STYLE.

Re: [PATCH] Yama: allow access for the current ptrace parent

2016-12-05 Thread Josh Stone

On 12/02/2016 03:27 PM, Kees Cook wrote:
>> +   /* If there's already an active tracing relationship, then make an
> 
> I'll adjust the comment style here and add it to my tree for -next.

Thanks!

I guess the tweak is that it should have an empty "/*" line?

FWIW, checkpatch.pl doesn't warn about this -- perhaps it should?
I only see the opposite check for NETWORKING_BLOCK_COMMENT_STYLE.

Re: bio linked list corruption.

2016-12-05 Thread Vegard Nossum

On 5 December 2016 at 18:55, Linus Torvalds
 wrote:
> On Mon, Dec 5, 2016 at 9:09 AM, Vegard Nossum  wrote:
>>
>> The warning shows that it made it past the list_empty_careful() check
>> in finish_wait() but then bugs out on the >task_list
>> dereference.
>>
>> Anything stick out?
>
> I hate that shmem waitqueue garbage. It's really subtle.
>
> I think the problem is that "wake_up_all()" in shmem_fallocate()
> doesn't necessarily wake up everything. It wakes up TASK_NORMAL -
> which does include TASK_UNINTERRUPTIBLE, but doesn't actually mean
> "everything on the list".
>
> I think that what happens is that the waiters somehow move from
> TASK_UNINTERRUPTIBLE to TASK_RUNNING early, and this means that
> wake_up_all() will ignore them, leave them on the list, and now that
> list on stack is no longer empty at the end.
>
> And the way *THAT* can happen is that the task is on some *other*
> waitqueue as well, and that other waiqueue wakes it up. That's not
> impossible, you can certainly have people on wait-queues that still
> take faults.
>
> Or somebody just uses a directed wake_up_process() or something.
>
> Since you apparently can recreate this fairly easily, how about trying
> this stupid patch?
>
> NOTE! This is entirely untested. I may have screwed this up entirely.
> You get the idea, though - just remove the wait queue head from the
> list - the list entries stay around, but nothing points to the stack
> entry (that we're going to free) any more.
>
> And add the warning to see if this actually ever triggers (and because
> I'd like to see the callchain when it does, to see if it's another
> waitqueue somewhere or what..)

[ cut here ]
WARNING: CPU: 22 PID: 14012 at mm/shmem.c:2668 shmem_fallocate+0x9a7/0xac0
Kernel panic - not syncing: panic_on_warn set ...

CPU: 22 PID: 14012 Comm: trinity-c73 Not tainted 4.9.0-rc7+ #220
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
8801e32af970 81fb08c1 83e74b60 8801e32afa48
83ed7600 847103e0 8801e32afa38 81515244
41b58ab3 844e21da 81515061 8151591e
Call Trace:
[] dump_stack+0x83/0xb2
[] panic+0x1e3/0x3ad
[] __warn+0x1bf/0x1e0
[] warn_slowpath_null+0x2c/0x40
[] shmem_fallocate+0x9a7/0xac0
[] vfs_fallocate+0x350/0x620
[] SyS_madvise+0x432/0x1290
[] do_syscall_64+0x1af/0x4d0
[] entry_SYSCALL64_slow_path+0x25/0x25
[ cut here ]

Attached a full log.


Vegard


0.txt.gz
Description: GNU Zip compressed data

Re: bio linked list corruption.

2016-12-05 Thread Vegard Nossum

On 5 December 2016 at 18:55, Linus Torvalds
 wrote:
> On Mon, Dec 5, 2016 at 9:09 AM, Vegard Nossum  wrote:
>>
>> The warning shows that it made it past the list_empty_careful() check
>> in finish_wait() but then bugs out on the >task_list
>> dereference.
>>
>> Anything stick out?
>
> I hate that shmem waitqueue garbage. It's really subtle.
>
> I think the problem is that "wake_up_all()" in shmem_fallocate()
> doesn't necessarily wake up everything. It wakes up TASK_NORMAL -
> which does include TASK_UNINTERRUPTIBLE, but doesn't actually mean
> "everything on the list".
>
> I think that what happens is that the waiters somehow move from
> TASK_UNINTERRUPTIBLE to TASK_RUNNING early, and this means that
> wake_up_all() will ignore them, leave them on the list, and now that
> list on stack is no longer empty at the end.
>
> And the way *THAT* can happen is that the task is on some *other*
> waitqueue as well, and that other waiqueue wakes it up. That's not
> impossible, you can certainly have people on wait-queues that still
> take faults.
>
> Or somebody just uses a directed wake_up_process() or something.
>
> Since you apparently can recreate this fairly easily, how about trying
> this stupid patch?
>
> NOTE! This is entirely untested. I may have screwed this up entirely.
> You get the idea, though - just remove the wait queue head from the
> list - the list entries stay around, but nothing points to the stack
> entry (that we're going to free) any more.
>
> And add the warning to see if this actually ever triggers (and because
> I'd like to see the callchain when it does, to see if it's another
> waitqueue somewhere or what..)

[ cut here ]
WARNING: CPU: 22 PID: 14012 at mm/shmem.c:2668 shmem_fallocate+0x9a7/0xac0
Kernel panic - not syncing: panic_on_warn set ...

CPU: 22 PID: 14012 Comm: trinity-c73 Not tainted 4.9.0-rc7+ #220
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
8801e32af970 81fb08c1 83e74b60 8801e32afa48
83ed7600 847103e0 8801e32afa38 81515244
41b58ab3 844e21da 81515061 8151591e
Call Trace:
[] dump_stack+0x83/0xb2
[] panic+0x1e3/0x3ad
[] __warn+0x1bf/0x1e0
[] warn_slowpath_null+0x2c/0x40
[] shmem_fallocate+0x9a7/0xac0
[] vfs_fallocate+0x350/0x620
[] SyS_madvise+0x432/0x1290
[] do_syscall_64+0x1af/0x4d0
[] entry_SYSCALL64_slow_path+0x25/0x25
[ cut here ]

Attached a full log.


Vegard


0.txt.gz
Description: GNU Zip compressed data

Re: [PATCH v2 2/2] eeprom: Add IDT 89HPESx driver bindings file

2016-12-05 Thread Serge Semin

On Mon, Dec 05, 2016 at 11:27:07AM -0600, Rob Herring  wrote:
> On Mon, Dec 5, 2016 at 9:25 AM, Serge Semin  wrote:
> > On Mon, Dec 05, 2016 at 08:46:21AM -0600, Rob Herring  
> > wrote:
> >> On Tue, Nov 29, 2016 at 01:38:21AM +0300, Serge Semin wrote:
> >> > See cover-letter for changelog
> >> >
> >> > Signed-off-by: Serge Semin 
> >> >
> >> > ---
> >> >  .../devicetree/bindings/misc/idt_89hpesx.txt   | 41 
> >> > ++
> >>
> >> There's not a better location for this? I can't tell because you don't
> >> describe what the device is.
> >>
> >
> > The device is PCIe-switch EEPROM driver with additional debug-interface to
> > access the switch CSRs. EEPROM is accesses via a separate i2c-slave
> > interface of the switch.
> >
> > There might be another place to put the binding file in. There is a special
> > location for EEPROM drivers bindings - 
> > Documentation/devicetree/bindings/eeprom/ .
> > But as far as I understood from the files put in there, it's intended for
> > pure EEPROM drivers only. On the other hand the directory I've chosen:
> > Documentation/devicetree/bindings/misc/
> > mostly intended for some unusual devices. My device isn't usual, since it
> > has CSRs debug-interface as well. Additionally I've found
> > eeprom-93xx46.txt binding file there, which describes EEPROM bindings.
> >
> > Anyway if you find the file should be placed in
> > Documentation/devicetree/bindings/eeprom/ instead, I'll move it, it's not
> > that a big problem.
> >

What about this comment? Shall the file be left at the path I placed it?

> >> >  1 file changed, 41 insertions(+)
> >> >  create mode 100644 
> >> > Documentation/devicetree/bindings/misc/idt_89hpesx.txt
> >> >
> >> > diff --git a/Documentation/devicetree/bindings/misc/idt_89hpesx.txt 
> >> > b/Documentation/devicetree/bindings/misc/idt_89hpesx.txt
> >> > index 000..469cc93
> >> > --- /dev/null
> >> > +++ b/Documentation/devicetree/bindings/misc/idt_89hpesx.txt
> >> > @@ -0,0 +1,41 @@
> >> > +EEPROM / CSR SMBus-slave interface of IDT 89HPESx devices
> >> > +
> >> > +Required properties:
> >> > +  - compatible : should be ","
> >> > +Basically there is only one manufacturer: idt, but some
> >> > +compatible devices may be produced in future. Following 
> >> > devices
> >> > +are supported: 89hpes8nt2, 89hpes12nt3, 89hpes24nt6ag2,
> >> > +89hpes32nt8ag2, 89hpes32nt8bg2, 89hpes12nt12g2, 
> >> > 89hpes16nt16g2,
> >> > +89hpes24nt24g2, 89hpes32nt24ag2, 89hpes32nt24bg2;
> >> > +89hpes12n3, 89hpes12n3a, 89hpes24n3, 89hpes24n3a;
> >> > +89hpes32h8, 89hpes32h8g2, 89hpes48h12, 89hpes48h12g2,
> >> > +89hpes48h12ag2, 89hpes16h16, 89hpes22h16, 89hpes22h16g2,
> >> > +89hpes34h16, 89hpes34h16g2, 89hpes64h16, 89hpes64h16g2,
> >> > +89hpes64h16ag2;
> >> > +89hpes12t3g2, 89hpes24t3g2, 89hpes16t4, 89hpes4t4g2,
> >> > +89hpes10t4g2, 89hpes16t4g2, 89hpes16t4ag2, 89hpes5t5,
> >> > +89hpes6t5, 89hpes8t5, 89hpes8t5a, 89hpes24t6, 89hpes6t6g2,
> >> > +89hpes24t6g2, 89hpes16t7, 89hpes32t8, 89hpes32t8g2,
> >> > +89hpes48t12, 89hpes48t12g2.
> >> > +Current implementation of the driver doesn't have any 
> >> > device-
> >>
> >> Driver capabilties are irrelevant to bindings.
> >>
> >
> > Why? I've told in the comment, that the devices actually differ by the CSRs
> > map. Even though it's not reflected in the code at the moment, the CSRs
> > read/write restrictions can be added by some concerned programmer in
> > future. But If I left something like "compatible : idt,89hpesx" device
> > only, it will be problematic to add that functionality.
> 
> Bindings describe the h/w, not what the Linux, FreeBSD, etc. driver
> does. You don't want to be changing the binding doc when the driver
> changes.
> 
> > Howbeit If you think it's not necessary and "compatible = idt,89hpesx" is
> > ok, it's perfectly fine for me to make it this way. The property will be
> > even simpler, than current approach.
> 
> NO! That's not at all what I'm suggesting. Specific compatible strings
> are the right way to go for the reasons you give. You just don't need
> to state why here (because it is true for all bindings).
> 

Oh, I just misunderstood what you said. I'll discard the comment.

> >> > +specific functionalities. But since each of them differs
> >> > +by registers mapping, CSRs read/write restrictions can be
> >> > +added in future.
> >> > +  - reg :   I2C address of the IDT 89HPES device.
> >> > +
> >> > +Optional properties:
> >> > +  - read-only : Parameterless property disables writes to the EEPROM
> >> > +  - idt,eesize : Size of EEPROM device connected to IDT 89HPES 
> >> > i2c-master bus
> >> > +(default value is 4096 bytes if option isn't specified)

Re: [PATCH v2 2/2] eeprom: Add IDT 89HPESx driver bindings file

2016-12-05 Thread Serge Semin

On Mon, Dec 05, 2016 at 11:27:07AM -0600, Rob Herring  wrote:
> On Mon, Dec 5, 2016 at 9:25 AM, Serge Semin  wrote:
> > On Mon, Dec 05, 2016 at 08:46:21AM -0600, Rob Herring  
> > wrote:
> >> On Tue, Nov 29, 2016 at 01:38:21AM +0300, Serge Semin wrote:
> >> > See cover-letter for changelog
> >> >
> >> > Signed-off-by: Serge Semin 
> >> >
> >> > ---
> >> >  .../devicetree/bindings/misc/idt_89hpesx.txt   | 41 
> >> > ++
> >>
> >> There's not a better location for this? I can't tell because you don't
> >> describe what the device is.
> >>
> >
> > The device is PCIe-switch EEPROM driver with additional debug-interface to
> > access the switch CSRs. EEPROM is accesses via a separate i2c-slave
> > interface of the switch.
> >
> > There might be another place to put the binding file in. There is a special
> > location for EEPROM drivers bindings - 
> > Documentation/devicetree/bindings/eeprom/ .
> > But as far as I understood from the files put in there, it's intended for
> > pure EEPROM drivers only. On the other hand the directory I've chosen:
> > Documentation/devicetree/bindings/misc/
> > mostly intended for some unusual devices. My device isn't usual, since it
> > has CSRs debug-interface as well. Additionally I've found
> > eeprom-93xx46.txt binding file there, which describes EEPROM bindings.
> >
> > Anyway if you find the file should be placed in
> > Documentation/devicetree/bindings/eeprom/ instead, I'll move it, it's not
> > that a big problem.
> >

What about this comment? Shall the file be left at the path I placed it?

> >> >  1 file changed, 41 insertions(+)
> >> >  create mode 100644 
> >> > Documentation/devicetree/bindings/misc/idt_89hpesx.txt
> >> >
> >> > diff --git a/Documentation/devicetree/bindings/misc/idt_89hpesx.txt 
> >> > b/Documentation/devicetree/bindings/misc/idt_89hpesx.txt
> >> > index 000..469cc93
> >> > --- /dev/null
> >> > +++ b/Documentation/devicetree/bindings/misc/idt_89hpesx.txt
> >> > @@ -0,0 +1,41 @@
> >> > +EEPROM / CSR SMBus-slave interface of IDT 89HPESx devices
> >> > +
> >> > +Required properties:
> >> > +  - compatible : should be ","
> >> > +Basically there is only one manufacturer: idt, but some
> >> > +compatible devices may be produced in future. Following 
> >> > devices
> >> > +are supported: 89hpes8nt2, 89hpes12nt3, 89hpes24nt6ag2,
> >> > +89hpes32nt8ag2, 89hpes32nt8bg2, 89hpes12nt12g2, 
> >> > 89hpes16nt16g2,
> >> > +89hpes24nt24g2, 89hpes32nt24ag2, 89hpes32nt24bg2;
> >> > +89hpes12n3, 89hpes12n3a, 89hpes24n3, 89hpes24n3a;
> >> > +89hpes32h8, 89hpes32h8g2, 89hpes48h12, 89hpes48h12g2,
> >> > +89hpes48h12ag2, 89hpes16h16, 89hpes22h16, 89hpes22h16g2,
> >> > +89hpes34h16, 89hpes34h16g2, 89hpes64h16, 89hpes64h16g2,
> >> > +89hpes64h16ag2;
> >> > +89hpes12t3g2, 89hpes24t3g2, 89hpes16t4, 89hpes4t4g2,
> >> > +89hpes10t4g2, 89hpes16t4g2, 89hpes16t4ag2, 89hpes5t5,
> >> > +89hpes6t5, 89hpes8t5, 89hpes8t5a, 89hpes24t6, 89hpes6t6g2,
> >> > +89hpes24t6g2, 89hpes16t7, 89hpes32t8, 89hpes32t8g2,
> >> > +89hpes48t12, 89hpes48t12g2.
> >> > +Current implementation of the driver doesn't have any 
> >> > device-
> >>
> >> Driver capabilties are irrelevant to bindings.
> >>
> >
> > Why? I've told in the comment, that the devices actually differ by the CSRs
> > map. Even though it's not reflected in the code at the moment, the CSRs
> > read/write restrictions can be added by some concerned programmer in
> > future. But If I left something like "compatible : idt,89hpesx" device
> > only, it will be problematic to add that functionality.
> 
> Bindings describe the h/w, not what the Linux, FreeBSD, etc. driver
> does. You don't want to be changing the binding doc when the driver
> changes.
> 
> > Howbeit If you think it's not necessary and "compatible = idt,89hpesx" is
> > ok, it's perfectly fine for me to make it this way. The property will be
> > even simpler, than current approach.
> 
> NO! That's not at all what I'm suggesting. Specific compatible strings
> are the right way to go for the reasons you give. You just don't need
> to state why here (because it is true for all bindings).
> 

Oh, I just misunderstood what you said. I'll discard the comment.

> >> > +specific functionalities. But since each of them differs
> >> > +by registers mapping, CSRs read/write restrictions can be
> >> > +added in future.
> >> > +  - reg :   I2C address of the IDT 89HPES device.
> >> > +
> >> > +Optional properties:
> >> > +  - read-only : Parameterless property disables writes to the EEPROM
> >> > +  - idt,eesize : Size of EEPROM device connected to IDT 89HPES 
> >> > i2c-master bus
> >> > +(default value is 4096 bytes if option isn't specified)
> >> > +  - idt,eeaddr : Custom address of EEPROM device
> >> > +(If not

Re: scsi: use-after-free in bio_copy_from_iter

2016-12-05 Thread Al Viro

On Mon, Dec 05, 2016 at 04:17:53PM +0100, Johannes Thumshirn wrote:
> 633 hp = >header;
> [...]
> 646 hp->dxferp = (char __user *)buf + cmd_size;

> So the memory for hp->dxferp comes from:
> 633 hp = >header;

> >From my debug instrumentation I see that the dxferp ends up in the
> iovec_iter's kvec->iov_base and the faulting address is always dxferp + n *
> 4k with n in [1, 16] (and we're copying 16 4k pages from the iovec into the
> bio).

_Address_ of hp->dxferp comes from that assignment; the value is 'buf'
argument of sg_write() + small offset.  In this case, it should point
inside a pipe buffer, which is, indeed, at a kernel address.  Who'd
allocated srp is irrelevant.

And if you end up dereferencing more than one page worth there, you do have
a problem - pipe buffers are not going to be that large.  Could you slap
WARN_ON((size_t)input_size > count);
right after the calculation of input_size in sg_write() and see if it triggers
on your reproducer?

Re: scsi: use-after-free in bio_copy_from_iter

2016-12-05 Thread Al Viro

On Mon, Dec 05, 2016 at 04:17:53PM +0100, Johannes Thumshirn wrote:
> 633 hp = >header;
> [...]
> 646 hp->dxferp = (char __user *)buf + cmd_size;

> So the memory for hp->dxferp comes from:
> 633 hp = >header;

> >From my debug instrumentation I see that the dxferp ends up in the
> iovec_iter's kvec->iov_base and the faulting address is always dxferp + n *
> 4k with n in [1, 16] (and we're copying 16 4k pages from the iovec into the
> bio).

_Address_ of hp->dxferp comes from that assignment; the value is 'buf'
argument of sg_write() + small offset.  In this case, it should point
inside a pipe buffer, which is, indeed, at a kernel address.  Who'd
allocated srp is irrelevant.

And if you end up dereferencing more than one page worth there, you do have
a problem - pipe buffers are not going to be that large.  Could you slap
WARN_ON((size_t)input_size > count);
right after the calculation of input_size in sg_write() and see if it triggers
on your reproducer?

< 3 4 5 6 7 8 9 10 11 12 >

701 - 800 of 1768 matches

Mail list logo